Metadata

John Hargreaves introduces the topic of metadata

Published in: Education, Technology

  • What about “standards”? Best practice says “use standards”, but it is not always clear what a standard is. Is it something approved by a standards body (a de jure standard)? Few of the commonly used metadata schemas and vocabularies have gone all the way through this process; Dublin Core was only finally approved as an ISO standard earlier this year (2003). Or is it something with community endorsement or widespread practice (a de facto standard)? We will try to avoid using “standards”; if we slip into it, understand that we mean “standards” in a broad sense, with quotation marks around the term.
  • For some projects the choice is clear
  • Metadata is derived from 2 sources…
  • New European schema
  • North American; has influenced a number of other schemas
  • Ask the audience to suggest reasons for controlling vocabs?
  • Ask the audience to suggest ways to control vocabs
  • Finally, moving away from controlled vocabularies to uncontrolled “keywords”: words supplied by the cataloguer or even the user. It’s NOT an either/or situation; metadata frameworks can accommodate both. Opinion is divided on whether formal controlled vocabularies or keywords are better. The research suggests that controlled vocabularies are better than uncontrolled keywords, but a mix of both is even better.
  • Image file header information: BMP, PCX, JPEG, FLI/FLC, and AVI files include headers that define the image size, number of colors, and other information needed to display the image. GPS: time and location could be used for a lot of things. DRM data. Audio annotation: it has obvious implications for file size and hasn’t been widely accepted, but it is an option that may be useful in some circumstances. DOIs are unique numbers assigned to each unique published object; these numbers are stored in a server that knows where on the net to find the specific document, which is very useful for DRM. Histories: particularly for medical/legal images. User-generated metadata: cookies can help personalise the online experience by recalling specific information on subsequent visits, which simplifies the process of recording information and saves you time; most Web browsers automatically accept cookies. Logging and tracking usage of the system: the data could be used to analyse the number of, and reasons for, failed searches, which may indicate that your chosen schema or vocabulary is ineffective. It can also be used to enable customisation & notification - MORE HERE. User-contributed metadata: talk about in more detail > next slide.
  • Closure and evaluation checklist: review the topics covered on the timetable. At the end of your presentation you might ask, “How will that work in your institution?” Ask participants to share what they feel has been the most valuable thing they have learnt, what they will do with their new learning in the coming week, and what they want to understand better.
  • Metadata

    1. METADATA. John Hargreaves, Technical Support Officer, JISC Digital Media
    2. Before we begin
       • Why are we going to all this trouble?
       • Staffordshire Past Track
       • http://www.staffspasttrack.org.uk
    3. Metadata
       • Definition
       • Types
       • Location
       • Schemas
       • Vocabularies
       • Examples
    4. Definition
       • Common definition: information about information
       • For my purposes: structured information about information
    5. Structure
       • Organised information is created using:
         • Schemas (element sets)
         • Vocabularies (values)
    6. Purposes
       • Finding, identifying and understanding a resource: Descriptive/Discovery metadata, e.g. “Title”, “Subject”
       • Creating, managing and preserving a resource: Administrative, Technical, and Preservation metadata, e.g. “Format”, “Filesize”
    7. Purposes
       • Organising and relating resources: Structural and Packaging metadata, e.g. “Is part of”, “Master image location”
       • Using a resource: Usage and User-contributed metadata, e.g. “Published in”, “License requirements”, “User rating”
    8. Purposes
       • Funder requirements…
       • Images for Education - A JISC funded project.
       • http://imagesforeducation.org.uk/
    9. Metadata - Attributes
       • Different ‘levels’ of a resource (e.g. item, component, collection)
       • Different ‘layers’ within a resource (e.g. physical resources, intermediaries, digital resources)
       • Things outside the resource (e.g. rights ownership)
    10. Metadata can have different origins…
        • “Implicit” – derived from the image itself (typically technical data)
        • “Explicit” – brought to the image (typically descriptive metadata; might be ‘legacy’ data, or newly created)
        • New metadata might be:
          • Provided by an image contributor
          • Added by a cataloguer
          • Added by a user
          • All of the above
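As a rough illustration of “implicit” metadata, the sketch below derives technical values from an image file's own header. It assumes an uncompressed BMP with the common 40-byte BITMAPINFOHEADER. The function names and the keys of the returned dictionary are hypothetical, chosen to echo the “Format” and “Filesize” examples on the Purposes slide.

```python
import struct


def make_tiny_bmp(width: int, height: int) -> bytes:
    """Build a minimal all-black 24-bit BMP in memory so the example needs no file."""
    row_size = (width * 3 + 3) & ~3            # each pixel row is padded to 4 bytes
    pixel_data = bytes(row_size * height)
    offset = 14 + 40                           # file header + BITMAPINFOHEADER
    file_header = struct.pack(
        "<2sIHHI", b"BM", offset + len(pixel_data), 0, 0, offset
    )
    info_header = struct.pack(
        "<IiiHHIIiiII",
        40, width, height, 1, 24, 0, len(pixel_data), 2835, 2835, 0, 0,
    )
    return file_header + info_header + pixel_data


def read_bmp_technical_metadata(data: bytes) -> dict:
    """Return implicit (technical) metadata read from a BMP header.

    The dictionary keys are illustrative, not taken from any schema.
    """
    if data[:2] != b"BM":
        raise ValueError("not a BMP file")
    # BITMAPINFOHEADER starts at byte 14: size, width, height, planes, bit count
    _, width, height, _, bits = struct.unpack_from("<IiiHH", data, 14)
    return {
        "Format": "image/bmp",
        "Width": width,
        "Height": abs(height),                 # negative height means top-down rows
        "BitsPerPixel": bits,
        "Filesize": len(data),
    }


sample = make_tiny_bmp(4, 2)
technical = read_bmp_technical_metadata(sample)
print(technical)
```

None of these values needs a cataloguer. “Explicit” metadata such as a title or subject still has to be brought to the image from outside.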
    11. … and can exist in different locations
        • Embedded within the digital resource itself
        • Held in a traditional database
        • Within an XML encoding:
          <image> <ID> Jga-0019a </ID> <Title> Sanctuary of Apollo </Title> </image>
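To show how metadata held in an XML encoding might be read back, here is a minimal sketch that parses the record from the slide using Python's standard library. The helper name `record_to_dict` is an assumption made for this example. A real record would normally follow a published schema rather than these ad hoc element names.

```python
import xml.etree.ElementTree as ET

# The record exactly as it appears on the slide.
RECORD_XML = (
    "<image> <ID> Jga-0019a </ID> "
    "<Title> Sanctuary of Apollo </Title> </image>"
)


def record_to_dict(xml_text: str) -> dict:
    """Map each child element of the record to its trimmed text value."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


record = record_to_dict(RECORD_XML)
print(record)
```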
    12. “Standards”
        • Commonly, consistently applied formats or processes; measurable; well documented; endorsed by somebody
        • JISC Digital Media recommends:
          • Where there are clear standards, use them
          • Where standards are unclear/competing, follow models of good practice within your ‘community’
          • Where there are no standards/models, create your own (and document them carefully!)
          • Watch this space…
    13. Be very aware of differences
        • How do they deal with different “layers” within a resource? (e.g. images of images of images…)
        • What purposes are they serving? (description, administration, preservation…)
    14. Choosing, adapting, and mapping schemas
        • Ideally we’d pull a schema off the shelf and begin cataloguing
        • Choice is clear for some collections but difficult for others (esp. where collection spans resource types or communities)
        • Adaptation is common and generally necessary (but needs to be done carefully!)
        • You might be combining several standard schemas or developing your own and mapping to standards for particular purposes
    15. Dublin Core
        • International (ISO 15836:2003) cross-community standard for describing digital resources: http://dublincore.org/
        • Concentrates on descriptive/discovery metadata
        • “1:1 rule” (1 record for 1 thing)
        • Frequently adapted, mapped-to, used to achieve interoperability
        • Elements: Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights
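A simple Dublin Core record could be serialised along the following lines. This is a sketch only: the `record` wrapper element, the function name, and the `image/jpeg` value are assumptions made for illustration. The fifteen element names and the `http://purl.org/dc/elements/1.1/` namespace come from the Dublin Core element set itself.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The fifteen elements listed on the slide, in lower case as used in XML.
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]


def build_dc_record(values: dict) -> ET.Element:
    """Build one record describing one thing (the "1:1 rule").

    `values` maps an element name to a list of strings, since every
    Dublin Core element is optional and repeatable.
    """
    unknown = set(values) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    record = ET.Element("record")              # hypothetical wrapper element
    for name in DC_ELEMENTS:
        for value in values.get(name, []):
            element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
            element.text = value
    return record


dc_record = build_dc_record({
    "title": ["Sanctuary of Apollo"],
    "identifier": ["Jga-0019a"],
    "format": ["image/jpeg"],                  # illustrative value only
})
print(ET.tostring(dc_record, encoding="unicode"))
```

Rejecting unknown element names is one way to keep local adaptations visible rather than letting them slip silently into exported records.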
    16. VRA Core
        • Visual Resources Association
        • Version 4.0 is now also available
        • Concentrates on descriptive/discovery metadata
        • For art and cultural images
        • Influenced by Dublin Core
        • 1:1 rule (Work/Image)
        • Frequently adapted
        • http://www.vraweb.org/
        • Elements: Record Type, Type, Title, Measurements, Material, Technique, Creator, Date, Location, ID Number, Style/Period, Culture, Subject, Relation, Description, Source, Rights
    17. SEPIADES
        • Safeguarding European Photographic Images for Access
          • For photographic collections
          • Very extensive, with many sub-categories
          • Covers description and administration, physical works and their digital reproductions
          • Multi-level description which can describe a whole collection at many levels at once (based on archival metadata)
          • http://www.knaw.nl/ecpa/sepia/workinggroups/wp5/sepiadestool/sepiadesdef.pdf
    18. CDWA
        • Categories for the Description of Works of Art
        • Describes art works or cultural objects
          • Museum/gallery community
          • Extensive with many sub-categories
          • Covers description and administration, original works and their reproductions
          • Can describe complex objects with multiple parts
          • Note that there is a ‘lite’ version
          • http://www.getty.edu/research/conducting_research/standards/cdwa/index.html
    19. Some Established Mappings
        • Mapping metadata schemas:
          • Getty crosswalks: http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html
          • UKOLN resources: http://www.ukoln.ac.uk/metadata/
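In code, a mapping (crosswalk) between schemas can be as simple as a lookup table. The sketch below is a hypothetical example: the table is illustrative, is not copied from the Getty or UKOLN crosswalks, and should be checked against a published mapping before use. The sample values “marble”, “carving” and “Greek” are invented for the demonstration.

```python
from collections import defaultdict

# Illustrative mapping from some VRA Core-style field names to Dublin Core
# element names. An assumption for this sketch, not an official crosswalk.
ILLUSTRATIVE_CROSSWALK = {
    "Title": "title",
    "Creator": "creator",
    "Date": "date",
    "Subject": "subject",
    "ID Number": "identifier",
    "Material": "format",
    "Technique": "format",
    "Rights": "rights",
}


def crosswalk(record: dict, mapping: dict) -> tuple:
    """Return (mapped, unmapped): target fields and whatever had no target."""
    mapped = defaultdict(list)
    unmapped = {}
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            unmapped[field] = value            # keep it visible, do not drop it
        else:
            mapped[target].append(value)
    return dict(mapped), unmapped


source_record = {
    "Title": "Sanctuary of Apollo",
    "ID Number": "Jga-0019a",
    "Material": "marble",                      # invented sample value
    "Technique": "carving",                    # invented sample value
    "Culture": "Greek",                        # invented sample value
}
mapped_fields, unmapped_fields = crosswalk(source_record, ILLUSTRATIVE_CROSSWALK)
print(mapped_fields)
print(unmapped_fields)
```

Two source fields collapse into a single `format` element and `Culture` has no target in this table. Mapping from a richer schema to a simpler one usually loses detail in this way, which is why adaptation needs to be done carefully.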
    20. Vocabularies (image courtesy of stock.xchng)
    21. Why Use Controlled Vocabularies?
        • Better retrieval
        • Improved cataloguing efficiency and consistency
        • ‘Disambiguate’ the language (e.g. ‘bank’)
        • Put things in their place (e.g. classify, identify relationships)
        • Support interoperability (improved cross-searching and metadata sharing)
    22. Ways to Control Vocabularies
        • Data entry rules or guidelines
        • Formal subject headings
        • Thesauri
        • Classifications
        • Authority lists (people, places, events…)
        • In-house keyword lists
        • Uncontrolled cataloguer-added keywords?
        • Combination of approaches
    23. What about ‘Uncontrolled’ Keywords?
        • Made up by a cataloguer at the point of cataloguing
        • Not an either/or situation – metadata can accommodate both
        • A mix of both can assist with retrieval
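One way a record might hold both kinds of term is sketched below. The small in-house term list and the function name are hypothetical, not taken from any published thesaurus. Variant spellings are normalised to a preferred term, the ambiguous ‘bank’ is qualified, and anything not on the list is kept as an uncontrolled keyword.

```python
# Hypothetical in-house list: variant (lower case) -> preferred term.
PREFERRED_TERMS = {
    "photo": "Photographs",
    "photograph": "Photographs",
    "river bank": "Banks (rivers)",
    "riverbank": "Banks (rivers)",
    "savings bank": "Banks (financial institutions)",
}


def index_keywords(raw_keywords: list) -> tuple:
    """Split keywords into (controlled, uncontrolled) lists without duplicates."""
    controlled, uncontrolled = [], []
    for keyword in raw_keywords:
        cleaned = " ".join(keyword.split())    # trim and collapse whitespace
        preferred = PREFERRED_TERMS.get(cleaned.lower())
        if preferred is not None:
            if preferred not in controlled:
                controlled.append(preferred)
        elif cleaned and cleaned not in uncontrolled:
            uncontrolled.append(cleaned)
    return controlled, uncontrolled


controlled_terms, free_keywords = index_keywords(
    ["Photo", "photograph", " River  Bank ", "misty morning"]
)
print(controlled_terms)
print(free_keywords)
```

A search interface could then match queries against both lists, which is the mix of approaches the slide recommends.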
    24. Alternative Vocabularies
        • Consider some more creative approaches:
          • Ask some of your users to ‘catalogue’ a representative sample of your collection
          • Get your users to do the cataloguing!
          • Get the technology to do the cataloguing! (e.g. CBIR)
          • Draw on vocabularies from other communities, traditions and disciplines
          • Use an alternative vocabulary source (e.g. a children’s encyclopaedia, book index)
    25. CBIR & Community Involvement
        • Exploring Flickr by colour: http://labs.systemone.at/retrievr/
        • Using Flickr to catalogue a collection: http://www.flickr.com/photos/Library_of_Congress/
        • Galaxy Zoo: http://www.galaxyzoo.org/
    26. Further Support and Guidance
        • Web site: http://www.jiscdigitalmedia.ac.uk/
        • Helpdesk: http://www.jiscdigitalmedia.ac.uk/helpdesk/
        • JISC Mail: https://www.jiscmail.ac.uk/cgi-bin/webadmin?A1=ind0907&L=JISCDIGITALMEDIA
