Metadata Training for Staff and Librarians for the New Data Environment
 

Presented at the 2011 DLF Forum in Baltimore, Maryland.

Usage Rights

CC Attribution-NonCommercial-ShareAlike License

  • No absolutes with metadata – all relative to what you need to do with your data. Can be very different for different applications.
  • Data facilitates information gathering. It can highlight information, as in the use of facets, or topic maps. These are connections that cannot be done manually in a reasonable amount of time, and therefore are connections that users do not see without the help of machines.
  • The open world assumption is the opposite of the closed world assumption, which holds that any statement that is not known to be true is false.

Presentation Transcript

  • Today’s Task
    • Part 1: Audiences, current training strategies, cost-effectiveness
    • Part 2: A taste of the training
      • “From Metadata to a Web of Data”
    • Part 3: Structured feedback session
      • Can you help us make this better?
    DLF Forum, Nov. 2, 2011
  • Why Are We Doing This?
    • Increasing frustration with webinars
      • Not particularly good for anything but introductions
      • Very few opportunities for interaction or follow-up
    • One-day seminars at various institutions and conferences also seem limited in terms of participation
    • ‘Older’ model of repeatable workshops (with a group of trainers) is still useful if tweaked
      • Better opportunities for participation and learning
  • Goals
    • Offer direct training for libraries in a format that encourages participatory learning
      • Building on the successful library workshop model is one option
    • Encourage other library organizations and conference planners to include training options in their regular meetings
      • Generally requires members to lobby for workshops, pre-conferences, etc.
  • Part I: Intro to Metadata
    • Questions:
      • Do we have a shared understanding of metadata?
      • What are some of the practical definitions and modes of thinking that you can use in practice?
      • What is the basis for understanding the technology context of today’s data?
  • Intro to Metadata
    • What is metadata?
      • not: data about data
    • Instead: Data with a purpose
      • constructed (human-made, artificial)
      • constructive (designed for a purpose, not theoretical)
      • computable (all metadata today will be used by computer applications as well as managed and understood by humans)
  • Exercise 1: Data With a Purpose
    • Each group has a book on the table. What metadata is needed for:
      • A warehouse that will ship books to bookstores
      • A brick-and-mortar bookstore that orders books, displays and sells them
      • An online bookstore that will take orders and ship books to customers
    • Look over your lists—it will cost you $1 for every metadata field you create. If you use this field in your operation, you get back the $1
      • Have you changed your mind?
  • Part II: Understanding DATA
    • Goals:
      • Understand the difference between data and text by thinking about computability
      • Learn some basic data types
      • Recognize data types in library data
  • Standard Data Types
    • Text – ‘text’ (we know this one!)
    • Defined data types:
      • Date (& time)
      • Currency
      • Numbers (integers, etc.)
    • Controlled lists: finite sets of values to use
      • Languages (ISO)
      • Countries (ISO)
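
A minimal sketch, in Python, of what changes when values are treated as typed data rather than text. The record, its field names, and its values are invented for illustration and do not come from any library schema; the language list is a small sample of codes, not a full ISO list.

```python
from datetime import date
from decimal import Decimal

# Hypothetical record: every value arrives as plain text.
text_record = {
    "published": "1851-10-18",
    "price": "12.50",
    "pages": "635",
    "language": "eng",
}

# A tiny controlled list (sample codes only, chosen for illustration).
LANGUAGES = {"eng": "English", "fre": "French", "ger": "German"}

def to_typed(record):
    """Convert text values into typed values a program can compute with."""
    if record["language"] not in LANGUAGES:
        raise ValueError(f"unknown language code: {record['language']}")
    return {
        "published": date.fromisoformat(record["published"]),
        "price": Decimal(record["price"]),
        "pages": int(record["pages"]),
        "language": record["language"],
    }

typed = to_typed(text_record)
age_in_years = 2011 - typed["published"].year  # arithmetic on a date part
two_copies = typed["price"] * 2                # arithmetic on currency
```

As text, "12.50" can only be displayed or matched; as a decimal number it can be summed, compared, and sorted. The controlled list plays the same role for codes: a value outside the list is rejected instead of silently stored.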
  • Why Data?
    • Enables machine processing of amounts of data too large for humans to grasp (which is just about all of our information)
      • processing across patron files, or bibliographic database
      • processing on retrieved sets (e.g. extracting facets)
    • Enables libraries to move beyond ‘artisanal metadata’ towards more efficient and cost-effective assignment of tasks to humans and machines
      • Comes with new sources of data and new collaborations
  • Data Use Examples
    • Making decisions
      • If user for more than 5 years, then …  
      • If book height greater than x, then …
    • Making connections
      • These books have the same author
      • These books have the same (or a similar) topic
      • These CDs have the same orchestra
      • This place of publication has lat/long info and can be located on a map
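
The two kinds of use above can be sketched in a few lines of Python. Everything here is assumed for illustration: the rows, the `author_id` values, and the 30 cm threshold (the slide leaves the threshold as "x").

```python
from collections import defaultdict

# Hypothetical catalog rows; author_id stands in for a shared identifier.
books = [
    {"title": "Moby Dick", "author_id": "A1", "height_cm": 21},
    {"title": "Typee", "author_id": "A1", "height_cm": 32},
    {"title": "Walden", "author_id": "A2", "height_cm": 19},
]

OVERSIZE_CM = 30  # assumed threshold for this sketch

def shelf_for(book):
    """Making a decision: route tall books to oversize shelving."""
    return "oversize" if book["height_cm"] > OVERSIZE_CM else "regular"

def group_by_author(rows):
    """Making a connection: collect titles that share an author identifier."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["author_id"]].append(row["title"])
    return dict(groups)

shelves = {b["title"]: shelf_for(b) for b in books}
by_author = group_by_author(books)
```

Both functions depend on the values being data: a height recorded as a number, and an author recorded as an identifier rather than a transcribed name.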
  • Things: What Your Metadata Talks About
    • Book
    • Author
    • Place
    • Person (in subject)
    • Historical period
    • All of these exist outside your metadata, and are independent of it
      • You can talk about these ‘things’ in many different contexts
    • If you assign them identifiers that can be shared with others, then you have a ‘thing’ or entity
      • Things become points of connection between metadata descriptions (e.g., all books by the same author)
  • Strings: Limited Connections
    • Metadata statements using strings don’t represent (to machines) something outside the metadata
      • They aren’t linkable to other things or strings
      • They often can’t be effectively parsed by machines
    • Transcribed data in traditional library metadata is often ‘strings’
      • Titles are good examples
    • Some strings are intended to identify something else (controlled author names, for instance) but may be used for display as well
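
A small Python sketch of the difference between strings and things. The records and the `example.org` URI are made up for illustration; the point is only that a machine can match identifiers exactly, while two spellings of a name do not match.

```python
# Two hypothetical descriptions that record the author as a string.
record_a = {"title": "Moby Dick", "author": "Melville, Herman, 1819-1891"}
record_b = {"title": "Typee", "author": "Herman Melville"}

# String comparison fails even though a human sees the same person.
same_by_string = record_a["author"] == record_b["author"]

# The same descriptions using a shared identifier (placeholder URI).
MELVILLE = "http://example.org/person/melville"
thing_a = {"title": "Moby Dick", "author": MELVILLE}
thing_b = {"title": "Typee", "author": MELVILLE}
same_by_thing = thing_a["author"] == thing_b["author"]

# Display labels are kept separately, keyed by the identifier.
labels = {MELVILLE: "Herman Melville"}
display_name = labels[thing_a["author"]]
```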
  • Exercise 2: Things & Strings
    • Start with a simple file
    • Each group has a ‘record’ (BBC, etc.—not MARC)
    • A general description is provided of the purpose of the data
    • Tasks:
      • Pick out the strings and things in your example
      • Bonus points: any data types?
      • Reporting by groups and discussion
  • Identifiers
    • Uniquely identify a variety of resources
      • On the web they use http and domain names
    • Advantages
      • Language independent
      • Display independent
      • Unambiguous
    • Usage should be oriented towards machines, hidden from humans
      • Humans have different requirements
  • Identifiers: What They Identify
    • Easier to attach an identifier than understand what it actually identifies
      • ISBN – identifies publisher’s product
      • LCCN – identifies LC-created metadata; ≠ ISBN even though it may have very similar metadata to the publisher’s
      • DOI – identifies item in DOI system, but may link to a general sales page
  • Identifiers must …
    • Be unique within a domain (private db; web)
    • Be consistent (identifier must always ID the same thing; DO NOT RE-USE!)
    • Be persistent (must live as long as thing it identifies)
    • Be in a standard format
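
One way to picture the "unique" and "do not re-use" requirements is a registry that mints identifiers from a counter and keeps retired ones reserved. This is a hypothetical sketch in Python, not a description of how any real identifier system works; the class name and the `example.org` prefix are assumptions.

```python
import itertools

class IdentifierRegistry:
    """Mints identifiers in one format and never hands out the same one twice."""

    def __init__(self, prefix):
        self._prefix = prefix
        self._counter = itertools.count(1)
        self._things = {}

    def mint(self, thing):
        """Create a new identifier for a thing."""
        ident = f"{self._prefix}{next(self._counter)}"
        self._things[ident] = thing
        return ident

    def retire(self, ident):
        """Remove the thing; the counter never goes back, so the ID is not re-used."""
        self._things.pop(ident)

    def resolve(self, ident):
        """Return the thing for an identifier, or None if it is retired or unknown."""
        return self._things.get(ident)

registry = IdentifierRegistry("http://example.org/id/")
first = registry.mint("a book")
registry.retire(first)
second = registry.mint("a different book")
```

After the first identifier is retired, the next one minted is still different, so an old reference can never silently point at a new thing.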
  • Note on “Consistent”
    • The same thing may have more than one identifier – this happens naturally in the creation of metadata. It’s not a huge problem as long as you have a way of saying that:
    • A = B
    • … so that you can bring together the identifiers for the same thing. (cf. VIAF; also xISBN)
    • This is the basis for mapping between vocabularies so that metadata can be more easily re-used
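
A minimal Python sketch of how "A = B" statements can bring identifiers together. It uses a standard union-find structure, which is my choice for the illustration and not something the slides prescribe; the identifiers are made up.

```python
def merge_identifiers(equivalences):
    """Group identifiers that are declared to name the same thing."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the chain as we walk it
            x = parent[x]
        return x

    for a, b in equivalences:
        parent[find(a)] = find(b)

    clusters = {}
    for ident in parent:
        clusters.setdefault(find(ident), set()).add(ident)
    return list(clusters.values())

# Made-up identifiers from three hypothetical sources.
pairs = [
    ("lib:123", "viaf:999"),
    ("viaf:999", "wiki:Melville"),
    ("lib:456", "viaf:111"),
]
clusters = merge_identifiers(pairs)
```

Note that "lib:123" and "wiki:Melville" end up in the same cluster even though no statement links them directly; the equivalence carries through the shared middle identifier.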
  • Identifier Readability
    • Opaque: no meaning to the identifier, ex.: LCCN example (just a number)
    • Readable: makes sense to a human, ex.: Wikipedia page IDs (include page name or partial page name)
    • Can be both: system can add readable bit to opaque identifier, ex.: Open Library thing IDs
    • Choices here are controversial, and have a big impact on multilingual efforts
  • The Open World Assumption
    • “The open world assumption (OWA) is used in knowledge representation to codify the informal notion that in general no single agent or observer has complete knowledge, and therefore cannot make the closed world assumption.”
    • --Wikipedia
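
The practical difference between the two assumptions shows up when a question is asked about a statement that is not in the data. The Python sketch below is a simplification with made-up statements: it uses `None` to stand for "unknown" and does not model explicitly negated statements.

```python
# Known statements as (subject, predicate, object) tuples; all values are made up.
known = {("book1", "hasAuthor", "Herman Melville")}

def ask_closed_world(statement):
    """Closed world: anything not known to be true is treated as false."""
    return statement in known

def ask_open_world(statement):
    """Open world: a missing statement is unknown, not false."""
    return True if statement in known else None  # None stands for "unknown"

question = ("book1", "hasTranslator", "Jane Doe")
closed_answer = ask_closed_world(question)
open_answer = ask_open_world(question)
```

Under the closed world assumption the book has no translator; under the open world assumption, someone else may simply not have said so yet.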
  • Things with relationships to other things
    • Thing -- Relationship -- Thing
  • Things with relationships to other things
    • Thing -- Relationship -- Thing
    • Subject -- Predicate (verb) -- Object
  • Subject and predicate must be URIs
    • Object can be a URI or a "string"
    • A URI is a thing
    • Some examples:
      • book -- has author -- [lcname#]
      • book -- has author -- "John Doe"
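
A minimal Python sketch of the rule that subject and predicate must be URIs while the object may be either a URI or a string. The `Triple` type, the `is_uri` check, and the `example.org` namespace are assumptions made for this illustration; real RDF tooling distinguishes URIs from literals more rigorously than a prefix test.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str

def is_uri(value):
    """Crude check used only in this sketch: treat http(s) values as URIs."""
    return value.startswith(("http://", "https://"))

def is_valid(triple):
    """Subject and predicate must be URIs; the object may be a URI or a string."""
    return is_uri(triple.subject) and is_uri(triple.predicate)

EX = "http://example.org/"  # placeholder namespace, not a real vocabulary

linked = Triple(EX + "book/1", EX + "hasAuthor", EX + "person/melville")
literal = Triple(EX + "book/1", EX + "hasAuthor", "John Doe")
broken = Triple("Moby Dick", EX + "hasAuthor", "John Doe")
```

Both `linked` and `literal` are acceptable statements, but only `linked` points at another thing that further statements can be made about.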
  • [diagram that shows this -- i have a slide]
  • Triples or Graphs?
    • Machines work with triples
      • Statements about the same thing have the same subject
    • Graphs are easier for humans to understand
      • In libraries we’re not used to visualizing data as graphs
      • More used to databases, files, hierarchies
    • Making this new world work for us is as much about changing how we think as it is changing what we do
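
To make the triples-versus-graph point concrete, here is a short Python sketch that groups triples by subject and then walks the resulting graph. The triples and the `ex:` prefix are made up; the walk shows that a thing can lead on to further statements, while a string is a dead end.

```python
from collections import defaultdict

# Made-up triples; values starting with "ex:" are things, the rest are strings.
triples = [
    ("ex:book1", "hasTitle", "Moby Dick"),
    ("ex:book1", "hasAuthor", "ex:melville"),
    ("ex:melville", "hasDeathDate", "1891"),
    ("ex:book2", "hasAuthor", "Herman Melville"),  # author given as a string
]

def build_graph(rows):
    """Group statements by subject: each subject becomes a node with outgoing edges."""
    graph = defaultdict(list)
    for s, p, o in rows:
        graph[s].append((p, o))
    return graph

def reachable(graph, start):
    """Follow objects that are themselves subjects; strings end the walk."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(o for _, o in graph.get(node, []))
    return seen

graph = build_graph(triples)
from_book1 = reachable(graph, "ex:book1")
from_book2 = reachable(graph, "ex:book2")
```

Starting from `ex:book1` the walk reaches the author's death date; starting from `ex:book2` it stops at the name string.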
  • “Graph Thinking”
    • Graph relationships are different than tree relationships …
    • Source: http://milicicvuk.com/blog/2011/10/04/the-web-is-just-a-bunch-of-trees-plus-shorcuts/
  • Exercise 3: Statements
    • Present a set of triples and ask participants to turn them into sentences
      • Ex.: Book has title ‘Moby Dick’
      • Ex.: Book has author [lcna] or ‘Herman Melville’
      • Ex.: Author has death date XXXX
    • Suggest participants try drawing graphs to represent statements with the same subject
    • Suggest that participants represent how ‘strings’ create dead ends and ‘things’ can be linked
  • Exercise 4: Statements
    • Give each group a web page with a description
    • Ask them to organize the data as statements
    • See if the site you are using has data for persons, subjects or places
    • Discussion
      • How hard was it to find the ‘things’?
      • Did you always have the predicates you needed?
      • How different is this from today’s metadata?
  • Properties and Classes
    • Record-based metadata is often in the form of ‘records’, using elements from only one schema
    • Statement-based metadata is often more flexible
      • Proper declaration, definition and management of the elements is very important
      • Mix and match is part of the value
    • Some current schemas might find the transition from records to statements more challenging
      • Especially where the definition of the property depends on its place in a hierarchy (MODS and ONIX for example)
  • Hierarchy (top-down organization)
    • A → Military assets → Dogs ≠ B → Pets → Dogs
    • A: Military assets (Guns, Dogs)
    • B: Pets (Cats, Dogs)
  • Caveats
    • Unless … there is a definition of dog and it can be used in either hierarchy
    • But if the meaning is defined by the hierarchy, the hierarchy is part of its meaning
  • Bottom-up organization
    • “Dogs” has meaning on its own, and can be used in multiple contexts.
    • Diagram: Dogs linked to both Military assets and Pets
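
The contrast between the two kinds of organization can be sketched in Python. In the top-down case a term is identified by its full path, so the two occurrences of "Dogs" are different elements; in the bottom-up case "Dogs" has its own identifier and is reused. The `example.org` URIs are placeholders invented for this sketch.

```python
# Top-down: the meaning of a term includes its place in the hierarchy.
top_down_a = ("A", "Military assets", "Dogs")
top_down_b = ("B", "Pets", "Dogs")
same_in_hierarchy = top_down_a == top_down_b

# Bottom-up: "Dogs" is defined once and used in more than one context.
DOGS = "http://example.org/concept/dogs"
GUNS = "http://example.org/concept/guns"
CATS = "http://example.org/concept/cats"

contexts = {
    "Military assets": {DOGS, GUNS},
    "Pets": {DOGS, CATS},
}
shared = contexts["Military assets"] & contexts["Pets"]
```

The intersection finds the shared concept only because both contexts refer to the same identifier; comparing the hierarchical paths finds nothing in common.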
  • Exercise 6: Mix & Match
    • Each group is assigned an entity to describe in metadata
    • Around the room are poster-sized depictions of various vocabularies and their definitions
    • Groups are instructed to study their task, determine what elements they need, then get up and look at the posters
      • Getting up and contemplating the posters encourages conversation!
      • Discussion: How do you decide what’s fit for purpose?
  • Overview of Training Plan
  • Feedback
    • Important questions as we continue to build this program
      • Does the program plan seem useful? If not, what’s missing?
      • Does the content of the session seem at an appropriate level? What could be improved?
    • What advice can you give about bringing this program to libraries?
      • Is there a place for F2F training in your budgets?
      • Would you pay for personalized online training for staff or local trainers?
    • Slide Credits:
    • Karen Coyle
    • Diane Hillmann
    • Contact info: [email_address]
    • Metadata Matters: http://managemetadata.com/blog