The paper trail: steps towards a reference model for the metadata ecology, presentation at the CoLIS5 workshop. Presentation with Jane Barton. http://mwi.cdlr.strath.ac.uk/Colisworkshop.htm
Archiving- from June 2005.
Please note: this presentation is currently all rights reserved until I contact the other author.
1. The paper trail: steps towards a reference model for the metadata ecology R. John Robertson & Jane Barton Centre for Digital Library Research University of Strathclyde, UK
2.
3.
4. Tracking the object DC2003/ DCMI E-lis Strathprints E-Prints Soton E-Prints UK Arc (ODU) Oaister Metalis HAIRST Citebase Worldcat CDLR pubs Erpanet Stephen’s Web zniff Tardis list CIS pubs Erpanet Other resource lists
5. Tracking the object’s metadata DC2003/ DCMI E-lis Strathprints E-Prints Soton E-Prints UK Arc (ODU) Oaister Metalis HAIRST Citebase Worldcat CDLR pubs Erpanet Stephen’s Web zniff Tardis list CIS pubs Erpanet Other resource lists
6. Tracking the object’s metadata DC2003/ DCMI E-lis Strathprints E-Prints Soton E-Prints UK Arc (ODU) Oaister Metalis HAIRST Citebase Worldcat CDLR pubs Erpanet Stephen’s Web zniff Tardis list CIS pubs Erpanet Other resource lists
7.
8. Author metadata lifecycle for the object DC2003 Strathprints CDLR pubs CIS pubs Erpanet Title Authors Abstract Publisher Date Url Pages Title Authors Publisher Date Url bibtex Review process Review process
9. Worldcat metadata lifecycle for the object E-Prints Soton Worldcat Title Authors Publisher Date Url DC MaRC Yahoo search Authority files Controlled vocabulary tools Subject
10. Resource list metadata lifecycle for the object DC2003 Erpanet Stephen’s Web Tardis zniff Other resource lists Title Authors Date Url Title Url
11. Harvester metadata lifecycle for the object DC2003 E-lis Strathprints E-Prints Soton E-Prints UK Oaister Metalis HAIRST Citebase Erpanet GET GET GET GET GET GET GET GET Review process
12.
13. Analysis of discovery metadata

Element \ Repository   DC2003  e-prints Soton  e-lis  strathprints  e-Prints UK  metalis
Title                  y       y               y      y             y            y
Author                 y       y               y      n             y            y
Date                   y       y               p      n             n            p
Conference paper url   y       p               y      n             n            n

Element                HAIRST  worldcat  cdlr  zniff  stephen's web  erpanet
Title                  y       y         y     n      y              y
Author                 p       y         y     n      n              n
Date                   n       y         y     n      y              y
Conference paper url   n       n         y     y      y              y
14. Analysis of Citation completeness

Repository       Citation score
DC2003           6
e-prints Soton   5
e-lis            9
strathprints     2
e-Prints UK      2
metalis          2
Tardis list      6

Repository       Citation score
HAIRST           1
worldcat         5
cdlr             6
zniff            2
stephen's web    5
erpanet          4
resource list    2 (typically)

Citation elements: Author, Title, Date, Conference, Conference date / location, Publisher, Editors, Publication place, Pages, URL
15.
16.
17. Illustrating the metadata ecology: metadata lifecycle Strathprints Review process HAIRST GET Author/ depositor
18. Illustrating the metadata ecology: extended metadata lifecycle Strathprints CDLR pubs CIS pubs Review process HAIRST GET
19. Illustrating the metadata ecology: metadata relationships Strathprints CDLR pubs CIS pubs Review process HAIRST GET E-lis Metalis Review process GET Metadata relationship
Introduction to exercise. The first thing you do is look for your own work or people you know: new paper, book, search tool. Looking for…[citation]. Indulge the example: Jane's paper. Highlight simple object, simple purpose of metadata: discovery and citation. E-print, single version; in the wider community represented here it's an easy object in itself: not an LO, image, video, museum object, dataset or anything else. The desirable metadata is geared towards resource discovery and to some extent citation, and that's it. Key to diagrams: 4 types represented in our discovery environment; as you'd expect, a variety of entities in the wider environment. Places where a digital copy of the paper sits; harvesters of metadata; metadata creators; automatic web tools.
Which object do they link to? Note paper is OA on DC/conf site. Explain contents of slide: what are all these things? Explain why Erpanet x2. Examine links and comment. Visibility in Google? OAI harvesters aren't pointing to the 'real' object.
Metadata transfer. Note on zniff. No md connections in catalogues or resource lists: no m2m relationship. Where m2m, inspection suggests little human intervention. Note on the unexpected isolation of Worldcat. The obvious observation: far fewer connections, all OAI based; no apparent importing into repositories or catalogues.
Created by author. Created within 15 feet of the author.
Note that this is a very simple example (lifecycles and associated workflows would be more complex outside of the e-print situation). Further note that metadata for e-prints are likely to serve an obvious purpose, resource discovery/management, and will not occur that much outside of this domain. Lifecycle: note it includes specification of repository, design of workflow, implementation. Lifecycles are also straightforward: single iteration most likely, so the lifecycle is the ingest workflow; otherwise metadata fairly static. These 4 lifecycles provide a snapshot of the metadata creation processes involved in creating the paper trail we've just seen.
[Check md set for diff reps.] No revisions. Time taken by redoing this. Md referred back to by reps' editors if questions. Intellectual decision about content of object.
Set up of Worldcat. Choice of controlled etc. Periodic updating of these tools and md review. Choice to use AACR2: resource types and semantics follow from that. Located electronic copy. Examples of authority files and controlled vocabularies used: name authority LCNAF; LCSH; resource type. Export though unused key. Subject classification process. This should produce one of the 'best' records. Exporting native MARC/XML/DC/OAI potential/link usage and other through Yahoo. Should be able to import…
Other resource list example needed, as this is both the most frequent and the thinnest. Title (often as link) and abstract in some form are often the only md.
Harvester set-up: choices of what to store/harvest, software, and set-up choices. What changes in harvester repository? Transformations: subject; date normalisation; url of source.
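A minimal sketch of the harvester steps noted here: issuing an OAI-PMH GET request and then transforming the harvested record (date normalisation, recording the url of source). It is illustrative only, not the set-up of any harvester in the paper trail. The base URL, the list of date formats and the field names `date` and `source_url` are assumptions; `verb=ListRecords` and `metadataPrefix=oai_dc` are standard OAI-PMH request arguments.

```python
from datetime import datetime
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the GET request URL a harvester would issue for ListRecords."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Assumed input formats; a real harvester would tune this list per source.
ASSUMED_DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%d %B %Y", "%Y"]


def normalise_date(raw):
    """Return an ISO 8601 date (or bare year), or None if the format is unknown."""
    text = raw.strip()
    for fmt in ASSUMED_DATE_FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue
        # Keep year-only dates as a year rather than inventing a month and day.
        return str(parsed.year) if fmt == "%Y" else parsed.date().isoformat()
    return None


def transform(record, source_url):
    """Copy a harvested record, normalising its date and noting its source."""
    out = dict(record)
    out["date"] = normalise_date(record.get("date", ""))
    out["source_url"] = source_url  # hypothetical field name
    return out


# Hypothetical repository endpoint and record, for illustration only.
request_url = list_records_url("https://repository.example.org/oai")
harvested = transform({"title": "A title", "date": "28/09/2003"}, request_url)
print(request_url)
print(harvested)
```

Returning None for an unrecognised date, rather than guessing, keeps the harvester from silently inventing metadata that the source repository never held.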
A lot of metadata activity: investment not trivial. So the paper is found in all these places; what's the return on the effort expended? If it was someone else looking for the paper, what do they need to know? Citation needs [list from later]. Assessment made on what md is found/visible.
Based on visible md
Scores not finished. What to comment on: HAIRST; Worldcat; Zniff; Metalis. Other odd points: pages; conference location/dates; publication editor; subject terms (variety); PURL.
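A hedged sketch of how a citation completeness score could be tallied. It assumes the score is simply a count of how many of the ten citation elements listed on slide 14 are visible in a record; the slides do not state the scoring rule, so this is an assumption. The example record is hypothetical and not taken from any repository surveyed.

```python
# The ten citation elements listed on slide 14.
CITATION_ELEMENTS = [
    "author", "title", "date", "conference", "conference date / location",
    "publisher", "editors", "publication place", "pages", "url",
]


def citation_score(record):
    """Count citation elements with a non-empty visible value (assumed rule)."""
    score = 0
    for element in CITATION_ELEMENTS:
        value = record.get(element)
        if value is not None and str(value).strip():
            score += 1
    return score


# Hypothetical record: three elements visible, two present but empty.
example = {
    "author": "A. Author",
    "title": "A title",
    "date": "2003",
    "url": "",
    "pages": None,
}
print(citation_score(example))
```

Under this rule a bare resource-list entry with only a title and a link would score 2.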
DELOS workshop breakout session on developing a reference model for repositories: note focus on individual repositories. Noted 'repository' is an accepted term for certain sorts of collections against which services can be offered; choice of technical solutions for delivering repository services. Discussed repositories as social constructs with many roles; considered it would be useful to build up a list of services that would be offered under each of these roles, and to identify those which are 'common services' and those which are specific to a particular sort of repository. Johns Hopkins U. is currently working on "A Technology Analysis of Repositories and Services", funded by the Mellon Foundation. They are:
* gathering and writing repository use cases and scenarios
* mapping functionality to various repository interfaces
* looking at the attributes of a repository interface layer
Presentations at CNI and DLF. Currently inviting use cases and scenarios for repositories.
Use Heatherbank example: physical exhibition digitised and repurposed as a virtual museum exhibit based on a collection of digital library objects and as an interconnected set of learning objects; two object lifecycles developed in parallel to produce an extended object lifecycle. Related object lifecycles arise from other uses of the exhibition content, e.g. some photographs have previously been used as illustrations in books and journal articles; individual digital library objects or learning objects may subsequently be reused or repurposed elsewhere.