Does metadata matter?
 

A lunchtime seminar for Eduserv staff.


Usage Rights

© All Rights Reserved

Comments

  • I think this is a key slide for the presentation - you need metadata where machines can't do a good job of deriving a self-description (and a ranking)
  • Actually z39.50 was supplemented by these some time ago - and various other things such as OpenSearch have come along since.
  • yes, i'm an idiot - 1200 is the 13th century, not the 11th!

    Does metadata matter? Presentation Transcript

    • Does metadata matter?
      • or…
      • should we be interested in metadata and, if so, why?
      • I’m going to try to deliver 130 slides in 30 minutes
      • then you can ask questions
      • (yes… I really did say “130 slides”)
      • non-technical
      • metadata jargon
      • first, some history…
    •  
      • metadata is…
      • machine-readable
      • descriptive
      • for the purposes of…
      • resource discovery
      • resource management
      • delivery / access control
      • use / re-use
      • long term preservation
    •  
      • MARC - Machine-Readable Catalogue
    •  
      • still the predominant
      • metadata standard
      • (in the library world)
    •  
      • a distributed search standard called…
      • Z39.50
      • so that multiple library catalogues can be searched from one place
      • AACR2 currently being replaced by…
      • RDA
      • (more generic – i.e. not just books!)
      • Z39.50 supplemented by SRW and SRU
      • (Web-friendly variants)
      • FRBR
      • none of which need bother you…
      • other than to note that…
      • metadata tends to get more complicated the longer you think about it
      • 1994
    •  
      • a few tens of thousands of pages
      • but recognised that finding stuff was going to start getting difficult
      • people (mainly librarians) began trying to catalogue it by hand
    • http://www.intute.ac.uk/
      • meanwhile…
      • AltaVista
      • (first major Web search engine – circa 1995)
      • people began to realise that the metadata they embedded into Web pages might be important
      • hang on…
      • did I just say “metadata embedded into Web pages”?
      • <html>
      • <head>
      • <title>A web page</title>
      • </head>
      • <body>
      • </body>
      • </html>
      • <html>
      • <head>
      • <title>A web page</title>
      • <meta name="keywords" content="some, key, words" />
      • <meta name="description" content="a summary" />
      • </head>
      • <body>
      • </body>
      • </html>
      • birth of the SEO industry
      • then came Google
      • and the rest, as they say, is history
      • Google takes note of links between pages
      • Google PageRank
      • but places less emphasis on embedded metadata
      • metaspam
      • <meta name="keywords" content="coca cola" />
      • metacrap
      • <title>put your title here</title>
      • despite that, work continued on embedded metadata
      • most notably in the form of…
      • Dublin Core
      • (circa 1995)
      • initially 15 metadata elements
      • a.k.a. properties
      • a.k.a. attribute/value pairs
      • contributor
      • coverage
      • creator
      • date
      • description
      • format
      • identifier
      • language
      • publisher
      • relation
      • rights
      • source
      • subject
      • title
      • type
      • embedded into Web pages
      • or encoded using XML
      • intention was to improve indexing by search engines
      • but people forgot about…
      • “metaspam” and “metacrap”
      • the search engines didn’t!
      • and so, by and large,
      • search engines still ignore embedded metadata
      • despite that, there has been fairly widespread adoption in policy terms
      • particularly in e-Government
      • (e.g. UK eGMS)
      • but also in other areas – education, health, environmental agencies, libraries, cultural heritage sector, …
      • growth of rules around metadata content
      • (i.e. cataloguing rules)
      • (everyone’s rules are different)
      • and growth in use of additional elements for particular communities
      • (and everyone’s additions are different)
      • such usage documented in the form of “application profiles”
      • Dublin Core Metadata Initiative
      • coordinating “standards” body
      • (note again: growing complexity over time)
      • meanwhile…
      • the W3C developed the Resource Description Framework (RDF)
      • (circa 1999)
      • the standard for the “Semantic Web”
      • Tim Berners-Lee’s vision for a machine-readable Web of data
      • allowing software to navigate and reason about Web content automatically
      • a Web of “Linked Data”
      • RDF, RDFS, OWL, FOAF, …
      • (but also Microformats, RSS, …)
      • meanwhile…
      • the e-learning community was busy developing its own standards
      • IEEE LOM
      • (Learning Object Metadata)
      • same as DC
      • …but different!
      • (different elements, different syntax)
      • <cough />
      • a brief aside
      • identifiers are important
      • URI (Uniform Resource Identifier) is the identifier system of the Web
      • but some issues…
      • e.g. around persistence
      • to cut a long story short
      • it turns out that ‘http’ URIs (a.k.a. URLs) are the worst kind of Web identifier…
      • …apart from all the others
      • (not everyone agrees with that!)
      • <breath />
      • repositories
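
Looking back at the embedded-metadata slides above (the keywords and description tags, and Dublin Core elements embedded into Web pages), here is a minimal sketch of how software might read that metadata back out of a page. It uses only Python's standard library. The sample page, the author name, and the helper names `MetaExtractor`, `extract_metadata` and `dublin_core_only` are illustrative assumptions, not anything from the talk. The `DC.` name prefix follows a common convention for putting Dublin Core into HTML meta tags; a real page may use other conventions.

```python
from html.parser import HTMLParser

# Hypothetical sample page: the talk's keywords/description example plus
# two Dublin Core elements written with the common "DC." name prefix.
SAMPLE_PAGE = """<html>
<head>
<title>A web page</title>
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
<meta name="keywords" content="some, key, words" />
<meta name="description" content="a summary" />
<meta name="DC.title" content="A web page" />
<meta name="DC.creator" content="A. N. Author" />
</head>
<body>
</body>
</html>"""


class MetaExtractor(HTMLParser):
    """Collect (name, content) pairs from <meta> elements."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name is not None and content is not None:
            self.pairs.append((name, content))


def extract_metadata(html_text):
    """Return every name/content pair embedded in the page."""
    parser = MetaExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.pairs


def dublin_core_only(pairs):
    """Keep only the 'DC.'-prefixed pairs, with the prefix removed."""
    return {name[3:]: content for name, content in pairs
            if name.startswith("DC.")}


if __name__ == "__main__":
    pairs = extract_metadata(SAMPLE_PAGE)
    for name, content in pairs:
        print(f"{name}: {content}")
    print(dublin_core_only(pairs))
```

Note that nothing here checks whether the embedded values are true, which is exactly the “metaspam” and “metacrap” problem: the page author can put anything in those attributes.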
    •  
      • but arXiv was not the only repository
      • recognised need for aggregating metadata from different repositories into a single place so that it could be searched
      • OAI-PMH
      • (a protocol for metadata harvesting)
      • harvesting metadata into repository search engines
      • of which OAIster is best known
      • (but it isn’t really used much)
      • and the major search engines like Google don’t support the OAI-PMH
      • because it isn’t mainstream enough
      • (and because of “metaspam” and “metacrap”)
      • and so to 2008…
      • political agenda around institutional repositories
      • in order to store, manage and disclose…
      • institutional assets
      • research papers
      • learning objects
      • research data
      • exposing metadata about these things using the OAI-PMH to search services
      • which services?
      • err… OAIster :-(
      • what kind of metadata?
      • typically DC
      • or a variant of DC
      • because simple DC is really too simple to be very useful
      • but unfortunately…
      • …DC variants tend to be more complex
      • and therefore metadata can’t be created very easily by ordinary researchers
      • but that’s another story!
      • ok…
      • why are we interested?
      • are we interested?
      • metadata is everywhere
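
To make the harvesting slides above more concrete, here is a rough sketch of what an OAI-PMH harvest of simple DC might look like, again using only Python's standard library. The repository address, the record contents, and the function names `list_records_url` and `parse_records` are made up for illustration. The sample response is hand-written and heavily simplified (a real response carries further elements and may be split across several requests), and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 and its simple Dublin Core format.
NAMESPACES = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written, simplified response for a hypothetical repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1234</identifier>
        <datestamp>2008-01-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example research paper</dc:title>
          <dc:creator>Author, A. N.</dc:creator>
          <dc:type>Text</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL a harvester would fetch to list a repository's records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Turn a ListRecords response into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall(".//oai:record", NAMESPACES):
        identifier = record.findtext("oai:header/oai:identifier",
                                     namespaces=NAMESPACES)
        fields = {}
        for element in record.findall("oai:metadata/oai_dc:dc/*", NAMESPACES):
            # Tags look like "{namespace}title"; keep only the local name.
            local_name = element.tag.split("}", 1)[1]
            # DC elements are repeatable, so each field holds a list.
            fields.setdefault(local_name, []).append(element.text)
        records.append({"identifier": identifier, "fields": fields})
    return records


if __name__ == "__main__":
    # Hypothetical repository address, for illustration only.
    print(list_records_url("http://repository.example.org/oai"))
    for record in parse_records(SAMPLE_RESPONSE):
        print(record["identifier"], record["fields"])
```

The harvested fields are just the simple DC elements listed earlier, which hints at why simple DC is felt to be too simple: there is nowhere obvious to say, for example, which version of a paper this is.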
    •  
      • metadata in sites like the Science Museum is mostly locked away within the site
      • can expect growing pressure to expose it on the Web for others to “mash up”
      • in HE, we are operating in an “institutional repository” political environment
      • we (should?) have an interest in repositories of research publications and research data
      • particularly research data
      • metadata comes to the fore in scenarios where content is non-textual (e.g. data) and where required information can’t easily be derived from textual content (e.g. author name)
      • thank you
      • OK I lied…
      • …it was 128 slides
      • unless you count these last two