Metadata Training for Staff and Librarians for the New Data Environment


Presented at the 2011 DLF Forum in Baltimore, Maryland.

Published in: Education, Technology

  1. 2. Today’s Task
      - Part 1: Audiences, current training strategies, cost-effectiveness
      - Part 2: A taste of the training
        - “From Metadata to a Web of Data”
      - Part 3: Structured feedback session
        - Can you help us make this better?
      DLF Forum, Nov. 2, 2011
  2. 3. Why Are We Doing This?
      - Increasing frustration with webinars
        - Not particularly good for anything but introductions
        - Very few opportunities for interaction or follow-up
      - One-day seminars at various institutions and conferences also seem limited in terms of participation
      - ‘Older’ model of repeatable workshops (with a group of trainers) is still useful if tweaked
        - Better opportunities for participation and learning
  3. 4. Goals
      - Offer direct training for libraries in a format that encourages participatory learning
        - Building on the successful library workshop model is one option
      - Encourage other library organizations and conference planners to include training options in their regular meetings
        - Generally requires members to lobby for workshops, pre-conferences, etc.
  4. 5. Part I: Intro to Metadata
      - Questions:
        - Do we have a shared understanding of metadata?
        - What are some of the practical definitions and modes of thinking that you can use in practice?
        - What is the basis for understanding the technology context of today’s data?
  5. 6. Intro to Metadata
      - What is metadata?
        - not: data about data
      - Instead: Data with a purpose
        - constructed (human-made, artificial)
        - constructive (designed for a purpose, not theoretical)
        - computable (all metadata today will be used by computer applications as well as managed and understood by humans)
  6. 7. Exercise 1: Data With a Purpose
      - Each group has a book on the table. What metadata is needed for:
        - A warehouse that will ship books to bookstores
        - A brick-and-mortar bookstore that orders books, displays and sells them
        - An online bookstore that will take orders and ship books to customers
      - Look over your lists: it will cost you $1 for every metadata field you create. If you use this field in your operation, you get back the $1
        - Have you changed your mind?
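The pricing rule in this exercise can be worked through as a small calculation. The sketch below is illustrative only: the function `net_cost` and the field names are invented for the example, and it assumes a refund applies only to a field the group actually created.

```python
def net_cost(created_fields, used_fields, price_per_field=1):
    """Return the dollars lost on fields that were created but never used."""
    created = set(created_fields)
    used = set(used_fields) & created  # refunds apply only to fields you created
    return price_per_field * (len(created) - len(used))


# Hypothetical field list for the warehouse scenario.
warehouse_created = ["isbn", "title", "weight", "dimensions", "cover_color"]
warehouse_used = ["isbn", "weight", "dimensions"]

print(net_cost(warehouse_created, warehouse_used))  # 2 unused fields -> 2
```

Under this rule a field is free only if it earns its place in the operation, which is the point of asking groups whether they have changed their minds.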
  7. 9. Part II: Understanding DATA
      - Goals:
        - Understand the difference between data and text by thinking about computability
        - Learn some basic data types
        - Recognize data types in library data
  8. 10. Standard Data Types
      - Text – ‘text’ (we know this one!)
      - Defined data types:
        - Date (& time)
        - Currency
        - Numbers (integers, etc.)
      - Controlled lists: finite sets of values to use
        - Languages (ISO)
        - Countries (ISO)
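To make the difference between text and typed data concrete, here is a minimal Python sketch. The record layout, the function `parse_record`, and the sample values are assumptions made for illustration; the language codes are a small subset of ISO 639-1.

```python
from datetime import date
from decimal import Decimal

# A small controlled list: a finite set of allowed values (ISO 639-1 codes).
LANGUAGES = {"en": "English", "fr": "French", "de": "German"}


def parse_record(raw):
    """Convert a dict of text values into typed values, rejecting bad input."""
    language = raw["language"]
    if language not in LANGUAGES:
        raise ValueError(f"unknown language code: {language!r}")
    return {
        "title": raw["title"],                              # text stays text
        "published": date.fromisoformat(raw["published"]),  # date
        "price": Decimal(raw["price"]),                     # currency amount
        "pages": int(raw["pages"]),                         # integer
        "language": language,                               # controlled value
    }


# Illustrative values only.
raw = {"title": "Example Book", "published": "2011-11-02",
       "price": "12.50", "pages": "320", "language": "en"}
record = parse_record(raw)
print(record["published"].year, record["price"] * 2, record["pages"] + 1)
```

Once the values are typed, a program can compare dates, add prices, and reject a language code that is not on the list, none of which is reliable on plain text.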
  9. 11. Why Data?
      - Enables machine processing of amounts of data too large for humans to grasp (which is just about all of our information)
        - processing across patron files, or bibliographic database
        - processing on retrieved sets (e.g. extracting facets)
      - Enables libraries to move beyond ‘artisanal metadata’ towards more efficient and cost-effective assignment of tasks to humans and machines
        - Comes with new sources of data and new collaborations
  10. 12. Data Use Examples
      - Making decisions
        - If user for more than 5 years, then …
        - If book height greater than x, then …
      - Making connections
        - These books have the same author
        - These books have the same (or a similar) topic
        - These CDs have the same orchestra
        - This place of publication has lat/long info and can be located on a map
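Both kinds of use can be sketched in a few lines of Python. Everything here is hypothetical: the function names, the `author_id` field, and the sample records are invented to show one decision (membership longer than five years) and one connection (books sharing an author).

```python
from collections import defaultdict
from datetime import date


def is_long_term_user(joined, today, years=5):
    """Decision: has this user been registered for more than `years` years?"""
    try:
        threshold = joined.replace(year=joined.year + years)
    except ValueError:
        # Joined on Feb 29 and the target year is not a leap year.
        threshold = joined.replace(year=joined.year + years, day=28)
    return today > threshold


def group_by_author(books):
    """Connection: bring together titles that share an author identifier."""
    groups = defaultdict(list)
    for book in books:
        groups[book["author_id"]].append(book["title"])
    return dict(groups)


# Illustrative records; the identifiers are made up.
books = [
    {"title": "Moby Dick", "author_id": "author:1"},
    {"title": "Typee", "author_id": "author:1"},
    {"title": "Walden", "author_id": "author:2"},
]

print(is_long_term_user(date(2005, 1, 1), date(2011, 11, 2)))
print(group_by_author(books))
```

The decision depends on the join date being a real date, and the connection depends on the author being recorded as a shared value rather than free text.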
  11. 14. Things: What Your Metadata Talks About
      - Book
      - Author
      - Place
      - Person (in subject)
      - Historical period
      - All of these exist outside your metadata, and are independent of it
        - You can talk about these ‘things’ in many different contexts
      - If you assign them identifiers that can be shared with others, then you have a ‘thing’ or entity
        - Things become points of connection between metadata descriptions (e.g., all books by the same author)
  12. 15. Strings: Limited Connections
      - Metadata statements using strings don’t represent (to machines) something outside the metadata
        - They aren’t linkable to other things or strings
        - They often can’t be effectively parsed by machines
      - Transcribed data in traditional library metadata is often ‘strings’
        - Titles are good examples
      - Some strings are intended to identify something else (controlled author names, for instance) but may be used for display as well
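The limits of strings show up as soon as a program tries to compare them. In this sketch the records and the `http://example.org/` identifier are hypothetical; the point is only that two spellings of one name do not match, while a shared identifier does.

```python
# The same person, recorded as a display string in two different styles.
string_records = [
    {"title": "Moby Dick", "author": "Melville, Herman, 1819-1891"},
    {"title": "Typee", "author": "Herman Melville"},
]

# The same person, recorded as a shared identifier (a made-up URI).
MELVILLE = "http://example.org/person/melville"
thing_records = [
    {"title": "Moby Dick", "author": MELVILLE},
    {"title": "Typee", "author": MELVILLE},
]


def same_author(a, b):
    """A machine can only compare values; it cannot tell that two
    different strings were meant to name one person."""
    return a["author"] == b["author"]


print(same_author(*string_records))  # False: the strings differ
print(same_author(*thing_records))   # True: the identifier is shared
```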
  13. 16. Exercise 2: Things & Strings
      - Start with a simple file
      - Each group has a ‘record’ (BBC, etc.; not MARC)
      - A general description is provided of the purpose of the data
      - Tasks:
        - Pick out the strings and things in your example
        - Bonus points: any data types?
        - Reporting by groups and discussion
  14. 18. Identifiers
      - Uniquely identify a variety of resources
        - On the web they use http and domain names
      - Advantages
        - Language independent
        - Display independent
        - Unambiguous
      - Usage should be oriented towards machines, hidden from humans
        - Humans have different requirements
  15. 19. Identifiers: What They Identify
      - Easier to attach an identifier than understand what it actually identifies
        - ISBN – identifies publisher’s product
        - LCCN – identifies LC-created metadata; not equivalent to the ISBN, even though the metadata may be very similar to the publisher’s
        - DOI – identifies item in DOI system, but may link to a general sales page
  16. 20. Identifiers must …
      - Be unique within a domain (private db; web)
      - Be consistent (identifier must always ID the same thing; DO NOT RE-USE!)
      - Be persistent (must live as long as thing it identifies)
      - Be in a standard format
  17. 21. Note on “Consistent”
      - The same thing may have more than one identifier – this happens naturally in the creation of metadata. It’s not a huge problem as long as you have a way of saying that:
      - A = B
      - … so that you can bring together the identifiers for the same thing. (cf. VIAF; also xISBN)
      - This is the basis for mapping between vocabularies so that metadata can be more easily re-used
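One way to act on a set of "A = B" statements is to cluster identifiers that are connected by them. This is a sketch using a simple union-find structure, not a description of how VIAF or xISBN work; the function `build_clusters` and the identifiers are invented for illustration.

```python
def build_clusters(equivalences):
    """Group identifiers connected by 'A = B' statements into clusters."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the chain as we walk it
            x = parent[x]
        return x

    for a, b in equivalences:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for identifier in parent:
        clusters.setdefault(find(identifier), set()).add(identifier)
    return list(clusters.values())


# Hypothetical identifiers from three sources.
equivalences = [
    ("a:123", "b:xyz"),
    ("b:xyz", "c:42"),
    ("a:999", "c:7"),
]
for cluster in build_clusters(equivalences):
    print(sorted(cluster))
```

Note that `a:123` and `c:42` end up together even though no statement links them directly; equivalence carries through `b:xyz`.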
  18. 22. Identifier Readability
      - Opaque: no meaning to the identifier, ex.: LCCN example (just a number)
      - Readable: makes sense to a human, ex.: Wikipedia page IDs (include page name or partial page name)
      - Can be both: system can add readable bit to opaque identifier, ex.: Open Library thing IDs
      - Choices here are controversial, and have a big impact on multilingual efforts
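The "can be both" option can be sketched as an opaque identifier with a readable suffix that the system ignores when resolving. The format below (opaque part, slash, slug) and both function names are assumptions for illustration, not the scheme of any particular service.

```python
import re


def add_readable_part(opaque_id, label):
    """Append a human-readable slug; the opaque part alone still identifies the thing."""
    slug = re.sub(r"[^A-Za-z0-9]+", "_", label).strip("_")
    return f"{opaque_id}/{slug}"


def opaque_part(identifier):
    """Recover the opaque identifier, ignoring any readable suffix."""
    return identifier.split("/", 1)[0]


combined = add_readable_part("X12345", "Moby Dick")
print(combined)               # X12345/Moby_Dick
print(opaque_part(combined))  # X12345
```

Because only the opaque part is compared, the readable suffix can change, or be translated, without breaking the identifier.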
  19. 24. The Open World Assumption
      - “The open world assumption (OWA) is used in knowledge representation to codify the informal notion that in general no single agent or observer has complete knowledge, and therefore cannot make the closed world assumption.”
      - -- Wikipedia
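The practical difference between the two assumptions is what a system answers when a statement is simply absent. A minimal sketch, with made-up statements and function names:

```python
# The only statement this system happens to hold (identifiers are made up).
KNOWN_STATEMENTS = {
    ("book:1", "hasAuthor", "person:melville"),
}


def closed_world(statement):
    """Closed world: anything not recorded is treated as false."""
    return statement in KNOWN_STATEMENTS


def open_world(statement):
    """Open world: anything not recorded is unknown (None), not false."""
    return True if statement in KNOWN_STATEMENTS else None


missing = ("book:1", "hasTranslator", "person:someone")
print(closed_world(missing))  # False
print(open_world(missing))    # None
```

Under the open world assumption, the absence of a translator statement says nothing about whether the book has a translator; someone else may hold that data.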
  20. 25. Things with relationships to other things
      - [Diagram: two Things joined by a Relationship]
  21. 26. Things with relationships to other things
      - [Diagram: Thing (Subject), Relationship (Predicate, or verb), Thing (Object)]
  22. 27. Subject and Predicate Must be URIs
      - Object can be a URI or a "string"
      - A URI is a thing
      - Some examples:
        - book -- has author -- [lcname#]
        - book -- has author -- "John Doe"
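The rule on this slide can be expressed as a small validation step. The sketch below is an assumption-laden illustration: `is_uri` is a deliberately simplified check that only accepts URIs with a scheme and a host (as http URIs have), and the `http://example.org/` URIs are placeholders.

```python
from typing import NamedTuple
from urllib.parse import urlparse


def is_uri(value):
    """Simplified check: accept only values with a scheme and a host."""
    parsed = urlparse(value)
    return bool(parsed.scheme and parsed.netloc)


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str  # either a URI (a thing) or a plain string


def make_triple(subject, predicate, obj):
    """Build a triple, insisting that subject and predicate are URIs."""
    if not is_uri(subject):
        raise ValueError(f"subject must be a URI: {subject!r}")
    if not is_uri(predicate):
        raise ValueError(f"predicate must be a URI: {predicate!r}")
    return Triple(subject, predicate, obj)


BOOK = "http://example.org/book/1"
HAS_AUTHOR = "http://example.org/vocab/hasAuthor"

with_thing = make_triple(BOOK, HAS_AUTHOR, "http://example.org/person/1")
with_string = make_triple(BOOK, HAS_AUTHOR, "John Doe")
print(is_uri(with_thing.obj), is_uri(with_string.obj))  # True False
```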
  23. 28. [diagram that shows this -- i have a slide]
  24. 30. Triples or Graphs?
      - Machines work with triples
        - Statements about the same thing have the same subject
      - Graphs are easier for humans to understand
        - In libraries we’re not used to visualizing data as graphs
        - More used to databases, files, hierarchies
      - Making this new world work for us is as much about changing how we think as it is changing what we do
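Triples and graphs are two views of the same data: grouping triples by subject gives the graph. A minimal sketch with invented identifiers and helper names:

```python
from collections import defaultdict


def to_graph(triples):
    """Group triples by subject: each subject maps to its (predicate, object) pairs."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


def reachable(graph, start):
    """Return every node (thing or string) reachable from `start`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        # Only nodes that appear as subjects have outgoing edges to follow.
        for _, obj in graph.get(node, []):
            stack.append(obj)
    return seen


# Made-up identifiers; plain strings stand in for literal values.
triples = [
    ("ex:book1", "hasTitle", "Moby Dick"),
    ("ex:book1", "hasAuthor", "ex:melville"),
    ("ex:melville", "hasName", "Herman Melville"),
]
graph = to_graph(triples)
print(sorted(reachable(graph, "ex:book1")))
```

Starting from the book, the walk reaches the author's name only because the author is recorded as a thing with statements of its own.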
  25. 31. “Graph Thinking”
      - Graph relationships are different from tree relationships …
  26. 32. Exercise 3: Statements
      - Present a set of triples and ask participants to turn them into sentences
        - Ex.: Book has title ‘Moby Dick’
        - Ex.: Book has author [lcna] or ‘Herman Melville’
        - Ex.: Author has death date XXXX
      - Suggest participants try drawing graphs to represent statements with the same subject
      - Suggest that participants represent how ‘strings’ create dead ends and ‘things’ can be linked
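The exercise can also be tried in code. In this sketch the `Literal` class, the `ex:` identifiers, and the helper functions are all assumptions made for illustration; a `Literal` marks a value that is a string rather than a thing.

```python
class Literal(str):
    """A plain string value: display text, not a linkable thing."""


# Illustrative statements with made-up identifiers.
triples = [
    ("ex:book1", "has title", Literal("Moby Dick")),
    ("ex:book1", "has author", "ex:melville"),
    ("ex:melville", "has name", Literal("Herman Melville")),
]


def to_sentence(subject, predicate, obj):
    """Render one triple as a sentence, quoting string values."""
    shown = f"'{obj}'" if isinstance(obj, Literal) else obj
    return f"{subject} {predicate} {shown}"


def dead_ends(triples):
    """Objects that are strings: nothing further can be said about them."""
    return [o for _, _, o in triples if isinstance(o, Literal)]


def linkable(triples):
    """Objects that are things and also appear as the subject of a statement."""
    subjects = {s for s, _, _ in triples}
    return [o for _, _, o in triples if not isinstance(o, Literal) and o in subjects]


for triple in triples:
    print(to_sentence(*triple))
print(dead_ends(triples), linkable(triples))
```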
  27. 33. Exercise 4: Statements
      - Give each group a web page with a description
      - Ask them to organize the data as statements
      - See if the site you are using has data for persons, subjects or places
      - Discussion
        - How hard was it to find the ‘things’?
        - Did you always have the predicates you needed?
        - How different is this from today’s metadata?
  28. 35. Properties and Classes
      - Record-based metadata is often in the form of ‘records’, using elements from only one schema
      - Statement-based metadata is often more flexible
        - Proper declaration, definition and management of the elements is very important
        - Mix and match is part of the value
      - Some current schemas might find the transition from records to statements more challenging
        - Especially where the definition of the property depends on its place in a hierarchy (MODS and ONIX for example)
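Mixing and matching means that one description can draw its properties from several vocabularies. The sketch below uses the Dublin Core Terms and FOAF namespaces; the subject URIs under `http://example.org/` and the helper `vocabularies_used` are hypothetical.

```python
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"

BOOK = "http://example.org/book/1"      # placeholder URI
PERSON = "http://example.org/person/1"  # placeholder URI

# One description, properties drawn from two vocabularies.
statements = [
    (BOOK, DCTERMS + "title", "Moby Dick"),
    (BOOK, DCTERMS + "creator", PERSON),
    (PERSON, FOAF + "name", "Herman Melville"),
]


def vocabularies_used(statements, known=(DCTERMS, FOAF)):
    """Return the known namespaces that the predicates are drawn from."""
    return {ns for _, predicate, _ in statements
            for ns in known if predicate.startswith(ns)}


print(sorted(vocabularies_used(statements)))
```

Each property carries its own definition through its URI, so it keeps its meaning when it sits next to properties from another vocabulary.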
  29. 36. Hierarchy (top-down organization)
      - A → Military assets → Dogs ≠ B → Pets → Dogs
      - [Diagram: tree A, Military assets with children Guns and Dogs; tree B, Pets with children Cats and Dogs]
  30. 37. Caveats
      - Unless … there is a definition of dog and it can be used in either hierarchy
      - But if the meaning is defined by the hierarchy, the hierarchy is part of its meaning
  31. 38. Bottom-up organization
      - “Dogs” has meaning on its own, and can be used in multiple contexts.
      - [Diagram: Dogs linked to both Military assets and Pets]
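The contrast between the two organizations can be sketched in Python. All names and `http://example.org/` URIs are invented: in the top-down case the meaning of "Dogs" is its full path, while in the bottom-up case one concept has its own identifier and is reused.

```python
# Top-down: an element is identified by its path, so the two "Dogs" differ.
military_dogs = ("Military assets", "Dogs")
pet_dogs = ("Pets", "Dogs")
print(military_dogs == pet_dogs)  # False

# Bottom-up: each concept is defined once, with its own (made-up) identifier.
DOGS = "http://example.org/concept/dogs"
GUNS = "http://example.org/concept/guns"
CATS = "http://example.org/concept/cats"

contexts = {
    "Military assets": {GUNS, DOGS},
    "Pets": {CATS, DOGS},
}


def contexts_using(concept):
    """List every context in which the same concept is used."""
    return sorted(name for name, members in contexts.items() if concept in members)


print(contexts_using(DOGS))  # ['Military assets', 'Pets']
```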
  32. 39. Exercise 6: Mix & Match
      - Each group is assigned an entity to describe in metadata
      - Around the room are poster-sized depictions of various vocabularies and their definitions
      - Groups are instructed to study their task, determine what elements they need, then get up and look at the posters
        - Getting up and contemplating the posters encourages conversation!
        - Discussion: How do you decide what’s fit for purpose?
  33. 40. Overview of Training Plan
  34. 41. Feedback
      - Important questions as we continue to build this program
        - Does the program plan seem useful? If not, what’s missing?
        - Does the content of the session seem at an appropriate level? What could be improved?
      - What advice can you give about bringing this program to libraries?
        - Is there a place for F2F training in your budgets?
        - Would you pay for personalized online training for staff or local trainers?
  35. 42.
      - Slide Credits:
      - Karen Coyle
      - Diane Hillmann
      - Contact info: [email_address]
      - Metadata Matters: