Metadata Workshop



Slides from my Metadata Workshop at Content Strategy Applied 2012. The session included several hands-on exercises, which is where much of the interesting conversation took place.

Published in: Technology, Self Improvement


  1. 1. METADATA WORKSHOP
     Rachel Lovinger (@rlovinger), Content Strategy Applied, March 1, 2012
     Photo by wizetux
  2. 2. ABOUT ME: RACHEL LOVINGER
     • Associate Content Strategy Director, Razorfish, New York
     • Co-editor of scatter/gather, a content strategy blog:
     • Author of Nimble: A Razorfish Report on Publishing in the Digital Age (June 2010): (@NimbleRF on Twitter)
     ©2012 Razorfish. All rights reserved. Photo by Rohanna Mertens
  5. 5. MY DEFINITION
     • Data is the core communication of a piece of content.
     • Metadata is information about the content that provides structure, context and meaning.
  6. 6. Structure
  7. 7. Context
  8. 8. Meaning
  9. 9. TYPES OF METADATA
     • Structural Metadata
       • Models the content types and attributes
     • Administrative Metadata
       • Indicates how, when and by whom the content was created
       • Defines how it can and will be used, its status, who can access it
     • Descriptive Metadata
       • Describes the subject matter of the content
  10. 10. EXAMPLE METADATA
      Title: Ta-dah!
      Description: That’s a serious jello mold!
      Tags: jello, layers, delicious
      Appears in: Dinner (set)
      Created by: Dan DeLuca
      Taken on: February 14, 2010
      Taken with: Fujifilm FinePix F70EXR
      Usage Rights: CC-BY Some rights reserved
      Source URL:
      Photo by Dan DeLuca
  11. 11. STRUCTURAL METADATA (shown alongside the example metadata from slide 10)
      • Models the content types and attributes
      • Answers the question “What constitutes a piece of content?”
      • Example types: article, product, photo
      Photo by Dan DeLuca
  12. 12. ADMINISTRATIVE METADATA (shown alongside the example metadata from slide 10)
      • Often machine generated
      • Answers questions about the creation & status of the content
      • Examples: Author, publish date, status, rights and access
      Photo by Dan DeLuca
  13. 13. DESCRIPTIVE METADATA (shown alongside the example metadata from slide 10)
      • Describes the subject matter of the content
      • Answers the question “What is this content about?”
      • Examples: Keywords, subjects, title, description and abstract
      Photo by Dan DeLuca
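To make the three types concrete, here is a minimal Python sketch that sorts the fields of the jello-mold photo into administrative and descriptive groups. The assignment of each field to a group is my own reading of the slide examples (an assumption, not something the slides spell out), and `split_by_type` is a hypothetical helper name.

```python
# Illustrative only: which field belongs to which metadata type is an assumption.
photo = {
    "Title": "Ta-dah!",
    "Description": "That’s a serious jello mold!",
    "Tags": ["jello", "layers", "delicious"],
    "Appears in": "Dinner (set)",
    "Created by": "Dan DeLuca",
    "Taken on": "February 14, 2010",
    "Taken with": "Fujifilm FinePix F70EXR",
    "Usage Rights": "CC-BY Some rights reserved",
}

ADMINISTRATIVE = {"Created by", "Taken on", "Taken with", "Usage Rights"}
DESCRIPTIVE = {"Title", "Description", "Tags"}


def split_by_type(record):
    """Group one record's fields by (assumed) metadata type."""
    groups = {"administrative": {}, "descriptive": {}, "other": {}}
    for name, value in record.items():
        if name in ADMINISTRATIVE:
            groups["administrative"][name] = value
        elif name in DESCRIPTIVE:
            groups["descriptive"][name] = value
        else:
            groups["other"][name] = value  # e.g. a relationship to a photo set
    return groups


# Structural metadata is the model itself: the list of fields a photo can have.
structural_model = sorted(photo)
groups = split_by_type(photo)
```

In this sketch the structural metadata is not a group of values at all; it is the shape of the record, which is why it is derived from the field names.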
  15. 15. EXERCISE
      Group Discussion: How is metadata used?
  16. 16. SEARCH
  17. 17. BROWSE
  19. 19. AGGREGATION
  20. 20. SYNDICATION
  23. 23. HOW METADATA IS USED
      • Search
      • Browse
      • Contextual Linking
      • Aggregation
      • Syndication
      • Access Permissions
      • Personalized Content
      • Advanced Functionality
  26. 26. CREATING STRUCTURAL METADATA
      Start with the Content Management System
      1. Determine the content types.
      2. Determine the elements that make up each type.
      3. Determine any potential relationships between content types.
  27. 27. CONTENT MANAGEMENT SYSTEMS
      Separate the information from the presentation
  28. 28. DETERMINE THE CONTENT TYPES
      • Which types of content are different enough that they might warrant a unique structure and/or layout?
      • Article, quiz, slideshow, recipe and event are all fairly distinct.
      © A List Apart, Jeff Baker and Alex Graham, Washington Post, Food Network, and Barnes & Noble
  29. 29. DETERMINE THE ELEMENTS OF EACH TYPE
      • Figure out the separate elements, or attributes, of each one.
      • Think about how each segment of information will be used.
      • EX: An Event is made up of Event Name, Date & Time, and Location
      Event © Barnes & Noble
  30. 30. DETERMINE RELATIONSHIPS BETWEEN TYPES
      • Content can be linked or embedded within another item.
      • EX: The book and the author each have their own page (Book Page, Author Page)
      Event © Barnes & Noble
  31. 31. EXERCISE
      Individual Task: Structural Metadata
      • Identify the content attributes & relationships in a recipe
      Recipe © Food Network
  32. 32. STRUCTURAL METADATA: RECIPE
      Content Attributes:
      • Title
      • Author
      • Tags
      • Time
      • Level
      • Yield
      • Ingredients
      • Directions
      Recipe © Food Network
  33. 33. STRUCTURAL METADATA: RECIPE
      Relationships:
      • Show
      • Episode
      • Photo
      • Sub-Recipes
      • Glossary Terms
      • Related Guides/Menus
      • Recipes Like This
      Recipe © Food Network
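One way to picture the recipe's structural metadata is as a typed record: attributes become fields, and relationships become references to other content items. The Python sketch below is only an illustration under my own assumptions. The class names, field types, the `shopping_list` helper and the sample data are all hypothetical, and only two of the relationships (Episode and Sub-Recipes) are modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Episode:
    """A related content type; a recipe links to it rather than copying it."""
    show: str
    title: str


@dataclass
class Recipe:
    # Content attributes (from the slide list)
    title: str
    author: str
    time: str
    level: str
    recipe_yield: str
    ingredients: List[str]
    directions: List[str]
    tags: List[str] = field(default_factory=list)
    # Relationships to other content items
    episode: Optional[Episode] = None
    sub_recipes: List["Recipe"] = field(default_factory=list)


def shopping_list(recipe: Recipe) -> List[str]:
    """Collect ingredients from a recipe and, recursively, its sub-recipes."""
    items = list(recipe.ingredients)
    for sub in recipe.sub_recipes:
        items.extend(shopping_list(sub))
    return items


# Hypothetical sample data, not taken from the Food Network recipe.
sauce = Recipe("Tomato Sauce", "A. Cook", "20 min", "Easy", "2 cups",
               ["tomatoes", "garlic"], ["Simmer."])
pasta = Recipe("Pasta Dinner", "A. Cook", "35 min", "Easy", "4 servings",
               ["spaghetti"], ["Boil pasta.", "Add sauce."],
               tags=["dinner"], episode=Episode("Example Show", "Pilot"),
               sub_recipes=[sauce])
```

Because the sub-recipe is a separate item rather than pasted text, functionality such as a combined shopping list can follow the relationship instead of parsing prose.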
  35. 35. CREATING ADMINISTRATIVE METADATA
      Consider how the content is used, published, and delivered.
      1. Identify functionality driven by administrative aspects of the content.
      2. Determine preferred formats for administrative attributes.
      Note: Administrative metadata tends to be used a lot more in digital records of offline material, so you’ll see it utilized a lot in library and archive work.
  36. 36. CONSIDERATIONS
      • Where did the content come from?
      • Are there restrictions on how it can be used?
      • Is the content time-sensitive or evergreen?
      • Who can access it?
      • When it’s archived or indexed, how will it be ordered?
      • Does the content have to adhere to any legal regulations?
  37. 37. IDENTIFY FUNCTIONALITY
      Dynamic functionality based on the status, date, permissions, or other administrative aspects of the content.
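As a rough illustration of functionality driven by administrative metadata, the Python sketch below decides which items a visitor can see from three fields: status, publish date and access roles. The field names, role names, sample items and the `visible_items` helper are all hypothetical.

```python
from datetime import datetime

# Hypothetical content items carrying administrative metadata.
ARTICLES = [
    {"title": "Spring preview", "status": "published",
     "publish_at": datetime(2012, 3, 1, 11, 0), "roles": {"public"}},
    {"title": "Draft notes", "status": "draft",
     "publish_at": datetime(2012, 2, 1, 9, 0), "roles": {"public"}},
    {"title": "Subscriber Q&A", "status": "published",
     "publish_at": datetime(2012, 2, 20, 9, 0), "roles": {"subscriber"}},
    {"title": "Summer issue", "status": "published",
     "publish_at": datetime(2012, 6, 1, 9, 0), "roles": {"public"}},
]


def visible_items(items, user_roles, now):
    """Keep items that are published, already live, and open to the user."""
    return [
        item for item in items
        if item["status"] == "published"      # status
        and item["publish_at"] <= now         # date
        and item["roles"] & user_roles        # permissions (shared role)
    ]


today = datetime(2012, 3, 2)
public_view = [a["title"] for a in visible_items(ARTICLES, {"public"}, today)]
```

None of these rules look at what the content is about; they rely entirely on data about its creation and status.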
  38. 38. DETERMINE FORMATS & VALUES
      • Text Field
      • Numbers
      • DateTime
        • Can appear in a variety of formats, for example:
          - YYYY-MM-DDThh:mm:ss[.mmm]
          - 2012-03-01T11:00:00
      • Boolean
        • True or False
      • Selection List
      Note: In order to reliably sort or filter content by administrative data, use the appropriate format so that the data can be compared in a meaningful way.
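The note about formats can be shown with a small Python sketch: dates stored in the YYYY-MM-DDThh:mm:ss form sort chronologically once parsed, while the same dates stored as free text sort alphabetically. The sample records and the `by_publish_date` helper are hypothetical.

```python
from datetime import datetime

# Hypothetical records with publish dates in the YYYY-MM-DDThh:mm:ss format.
items = [
    {"title": "A", "published": "2012-02-03T11:00:00"},
    {"title": "B", "published": "2011-12-15T09:30:00"},
    {"title": "C", "published": "2012-01-20T16:45:00"},
]


def by_publish_date(records):
    """Sort records chronologically by parsing the DateTime field."""
    return sorted(records, key=lambda r: datetime.fromisoformat(r["published"]))


chronological = [r["title"] for r in by_publish_date(items)]

# The same three dates as free text: sorting compares letters, not time,
# so "February" lands before "January".
free_text = ["February 3, 2012", "December 15, 2011", "January 20, 2012"]
alphabetical = sorted(free_text)
```

Here `chronological` comes out as B, C, A, while the free-text list puts the February date ahead of the January one.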
  39. 39. EXERCISE
      Group Discussion: What kinds of functionality would be supported by the following Administrative Metadata?
      • Publish date
      • Status
      • Source
      • Author
      • Version
      • Copyright
  41. 41. CREATING DESCRIPTIVE METADATA
      What is the content about?
      1. Determine the high level dimensions.
      2. Determine the level of depth needed to support functionality.
      3. Fill out the details in each dimension.
      Note: There are a lot of things that can be described about any given content. To constrain the scope, the aspects that are being described should also be based on data that’s needed to drive functionality.
  42. 42. IDENTIFY FUNCTIONALITY
      Dynamic functionality based on descriptive aspects of content.
  43. 43. HIGH LEVEL DIMENSIONS
      [Diagram: Home Decorating Vocabulary, with the dimensions Decorating Styles, Decorating Solutions, Rooms, Source Publications, Room Details, Home Items]
  44. 44. LEVEL OF DEPTH NEEDED
      • Floors
        • Concrete Floors
        • Laminate Floors
          - Stone Laminate Floors
          - Wood Laminate Floors
        • Metal Floors
        • Resilient Floors
          - Cork Floors
          - Leather Floors
          - Linoleum Floors
          - Rubber Floors
          - Vinyl Floors
        • Stone Floors
          - Brick Floors
          - Granite Floors
          - Limestone Floors
          - Loose Material Floors
          - Marble Floors
          - Onyx Floors
          - Quartzite Floors
          - Slate Floors
          - Terrazzo Floors
          - Travertine Floors
        • Wood Floors
          - Bamboo Floors
          - Cherry Floors
          - Mahogany Floors
          - Maple Floors
          - Oak Floors
          - Pecan Floors
          - Pine Floors
          - Teak Floors
  45. 45. FILL OUT THE DETAILS
      • Hierarchical lists can be captured in a Word document
      • Spreadsheets for more detailed lists
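Once a hierarchical list moves out of a document and into a system, it is often held as a tree. The Python sketch below stores a small excerpt of the Floors hierarchy as nested dictionaries and looks up the chain of broader terms for any term. The nesting shown is my reading of the slide, and `path_to` is a hypothetical helper.

```python
# A small excerpt of the Floors hierarchy; the nesting is an assumption.
VOCAB = {
    "Floors": {
        "Laminate Floors": {
            "Stone Laminate Floors": {},
            "Wood Laminate Floors": {},
        },
        "Stone Floors": {
            "Granite Floors": {},
            "Marble Floors": {},
            "Slate Floors": {},
        },
        "Wood Floors": {
            "Bamboo Floors": {},
            "Oak Floors": {},
        },
    }
}


def path_to(term, tree=VOCAB, trail=()):
    """Return the list of terms from the top of the tree down to `term`."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == term:
            return list(here)
        found = path_to(term, children, here)
        if found:
            return found
    return None  # term is not in the vocabulary


oak_path = path_to("Oak Floors")
```

With the path available, content tagged "Oak Floors" can also be surfaced on pages for "Wood Floors" and "Floors" without being tagged three times.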
  46. 46. EXERCISE
      Team Task: Superheroes
      • Identify the high level dimensions to describe superheroes
      Characters © DC, Marvel & New England Comics
  47. 47. SUPERHEROES
      • Name
        • Real Name
        • Primary Alias
        • Other Aliases
      • Group Affiliation
      • Source of Power
      • Type of Power
      • Identity
        • Public/Secret
      • Gender
      • Citizenship
      • Place of Birth
      • Current Residence
      • Marital Status
      • Relatives
      Chart © Pop Chart Lab
  48. 48. SUPERHEROES – META-METADATA
      Data about the characters as characters
      • Creator
      • Licensed by
      • First appearance
      • Comics appearances
      • Movie appearances
      • Television appearances
      • Other appearances
      • Portrayed by
  49. 49. BREAK
  51. 51. SOURCING VOCABULARIES
      Where do you get all this metadata?
      • Industry Standards
      • Commercial & Open Vocabularies
      • Your own content
  53. 53. INDUSTRY STANDARDS
      • Provide a good starting point for structural metadata that you can build upon
      • Make your content more compliant with tools and APIs that also use the standards
      • Enable more effective Search Engine Optimization
  54. 54. THREE FOUNDATIONAL STANDARDS
      Many of the other standards are based on these:
      • RDF/RDFa – a non-hierarchical structure for expressing metadata
      • Dublin Core – a set of core attributes that can be used for any type of content
      • Schema.org – a collection of frameworks for a wide range of content types, developed by a collaboration between Google, Bing & Yahoo!
  55. 55. RDF: RESOURCE DESCRIPTION FRAMEWORK
      • Provides a structure (aka framework) for describing identified things (aka resources)
      • Composed of three basic elements
        • Resources – the things being described (Ex: “Men In Black”)
        • Properties – the relationships between things (Ex: “hasStar”)
        • Classes – the buckets used to group the things (Ex: “Movie”)
      • Elements combine to make simple statements called Triples
        • Men In Black is a Movie
        • Will Smith is an Actor
        • Men In Black stars Will Smith
      Triple: <MenInBlack> <hasStar> <WillSmith>
      [Diagram: Men In Black typeOf Movie; Will Smith typeOf Actor; Men In Black hasStar Will Smith]
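The triple idea is simple enough to sketch without any RDF tooling. In the Python below, each statement from the slide is a (subject, property, object) tuple, and a query is just a filter over the list. This is an illustration of the data shape only, not an RDF library; the `objects` and `subjects` helpers are hypothetical names.

```python
# The three statements from the slide as (subject, property, object) tuples.
triples = [
    ("MenInBlack", "typeOf", "Movie"),
    ("WillSmith", "typeOf", "Actor"),
    ("MenInBlack", "hasStar", "WillSmith"),
]


def objects(subject, prop):
    """Everything that `subject` points to through `prop`."""
    return [o for s, p, o in triples if s == subject and p == prop]


def subjects(prop, obj):
    """Everything that points to `obj` through `prop`."""
    return [s for s, p, o in triples if p == prop and o == obj]


stars = objects("MenInBlack", "hasStar")
movies = subjects("typeOf", "Movie")
```

Because every statement has the same three-part shape, new kinds of facts can be added without changing the structure, which is what makes it non-hierarchical.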
  56. 56. RDFA: RDF IN ATTRIBUTES
      • Allows RDF attributes and properties to be included in XHTML and HTML documents.
      Without RDFa:
        <div>
          <h2>The trouble with Bob</h2>
          <h3>Alice</h3>
          …
        </div>
      With RDFa:
        <div xmlns:dc="">
          <h2 property="dc:title">The trouble with Bob</h2>
          <h3 property="dc:creator">Alice</h3>
          …
        </div>
      • For more information see:
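To show why the `property` attributes matter, here is a minimal Python sketch that reads them back out of the markup using the standard library's `html.parser`. It is not a real RDFa processor: it ignores namespaces, nesting and everything else in the RDFa rules, and the `PropertyCollector` class is a hypothetical name. The sample markup leaves out the `xmlns:dc` declaration because its value is not given on the slide.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect the text of elements that carry a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.found = {}
        self._current = None  # property name of the element being read

    def handle_starttag(self, tag, attrs):
        self._current = dict(attrs).get("property")

    def handle_data(self, data):
        if self._current and data.strip():
            self.found[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


markup = """
<div>
  <h2 property="dc:title">The trouble with Bob</h2>
  <h3 property="dc:creator">Alice</h3>
</div>
"""

collector = PropertyCollector()
collector.feed(markup)
collector.close()
metadata = collector.found
```

A person sees the same headline and byline either way; the attributes are what let a program tell the title from the creator.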
  57. 57. DUBLIN CORE METADATA INITIATIVE
      • A metadata framework for describing any type of content
      • Example attributes:
        • Name: The unique term that identifies the item
        • Label: The human-readable label assigned to the term
        • Definition: A description of the term
      • Example properties:
        • abstract: A summary of the item
        • audience: The intended audience for the item
        • creator: A person, organization or service responsible for creating the item
        • license: Indicates usage rights for the item
        • subject: The topic of the item
      • For more information see:
  58. 58. SCHEMA.ORG
      • A collaboration between Google, Bing and Yahoo!
      • Intended to improve the display of search results by directly including valuable data
      • Includes formats for marking up the following common types of content (as well as many others):
        • Creative works (such as Books, Movies, Music, Recipes, etc.)
        • Non-text objects (such as Audio, Image, Video)
        • Events
        • Organizations
        • Persons
        • Places
        • Products & Offers
        • Reviews
      • For more information see:
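As a rough sketch of what schema.org markup for a Recipe can look like, the Python below builds a JSON-LD description. JSON-LD is one of the syntaxes schema.org supports; the slides do not say which syntax the workshop had in mind, so that choice is my assumption. The `recipe_jsonld` helper and the sample values are hypothetical, while `Recipe`, `Person`, `name`, `author`, `recipeIngredient` and `recipeYield` are terms from the schema.org vocabulary.

```python
import json


def recipe_jsonld(title, author, ingredients, recipe_yield):
    """Describe a recipe with schema.org terms, as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Recipe",
        "name": title,
        "author": {"@type": "Person", "name": author},
        "recipeIngredient": list(ingredients),
        "recipeYield": recipe_yield,
    }


# Hypothetical sample values.
doc = recipe_jsonld("Tomato Sauce", "A. Cook", ["tomatoes", "garlic"], "2 cups")
print(json.dumps(doc, indent=2))
```

The structural metadata worked out for a content type maps fairly directly onto a standard like this, which is the sense in which a standard gives you a starting point.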
  59. 59. STANDARDS FOR SPECIFIC TYPES OF CONTENT
      • For Journalism
        • PRISM = Publishing Requirements for Industry Standard Metadata
        • NewsML for news, news metadata, and news management metadata
        • rNews uses RDFa to express news-specific metadata in news content
      • For Images
        • EXIF = Exchangeable Image File Format, data embedded by digital cameras
        • XMP = Extensible Metadata Platform, developed by Adobe
        • IPTC Photo Metadata for professional news and stock photos
      • For Videos
        • MPEG-7 from the Moving Picture Experts Group, adds data to audio and video
        • Media RSS, a flavor of RSS that allows for detailed info about media
      • For Social Connections
        • FOAF = Friend of a Friend, describes people, their connections & creations
        • SIOC = Semantically-Interlinked Online Communities, incorporates social networks
      • For Products
        • GoodRelations for E-commerce
  60. 60. RNEWS
      • Uses RDFa to express news-specific metadata
      • Coordinated with
      • Used by the New York Times
      • For more information see:
      Article © The Wall Street Journal
  61. 61. EXERCISE
      Team Task: Apply a standard
      • Determine how the properties of rNews would be applied to the sample content
      • Start by looking through the specification to see which properties seem likely to be applicable
      Article © The Wall Street Journal
  63. 63. COMMERCIAL & OPEN VOCABULARIES
      • Provide descriptive metadata, often for a specific knowledge domain
      • Can hook into other data or content that may be used to augment your own
      • Will probably need to be expanded or modified
      • Commercial vocabularies may be expensive to license, but are commercially supported
      • Open vocabularies are free to use, but may not be as well supported
  64. 64. USEFUL RESOURCES
      • WAND Inc – commercial taxonomies and tools
      • WordNet – a lexical database for English
      • Taxonomy Warehouse – a searchable directory of commercial and open taxonomies
      • Linked Data – open data sets on the web
  65. 65. LINKED OPEN DATA – FEBRUARY 2008
      Diagram by Richard Cyganiak and Anja Jentzsch
  66. 66. LINKED OPEN DATA – SEPTEMBER 2011
      Diagram by Richard Cyganiak and Anja Jentzsch
  68. 68. YOUR OWN CONTENT
      • Navigation, current classification and other site functionality should be leveraged as a starting point for any new metadata development
      • Offline resources can also provide inputs
      • Evaluate current site organization to make sure you’re not carrying over structures that are obsolete
      • Consider future functionality to make sure you have accounted for additional needs
      • Entity extraction tools (there are many out there) can evaluate large amounts of content and automatically generate metadata
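Real entity extraction tools are far more sophisticated than anything that fits on a slide, but the basic idea of generating metadata from content can be sketched as simple vocabulary matching. In the Python below, the vocabulary entries, the sample sentence and the `suggest_tags` helper are all hypothetical, and the approach (whole-phrase matching against a controlled list) is my simplification, not a description of how any particular tool works.

```python
import re

# Hypothetical controlled vocabulary: lowercase phrase -> preferred label.
VOCABULARY = {
    "granite floors": "Granite Floors",
    "oak floors": "Oak Floors",
    "kitchen": "Kitchen",
}


def suggest_tags(text, vocabulary=VOCABULARY):
    """Suggest vocabulary labels whose phrase appears in the text."""
    lowered = text.lower()
    return sorted(
        label for phrase, label in vocabulary.items()
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
    )


tags = suggest_tags("We replaced the kitchen tile with granite floors.")
```

Output like this is best treated as a list of suggestions for a person to review rather than finished metadata.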
  69. 69. YOUR OWN CONTENT
      © NME
  70. 70. FINAL EXERCISE
      Group Discussion: Identify Sources of Metadata
      • Pretend you’re going to redesign
      • Identify possible sources of metadata you would use, including standards, commercial or open data sets, and data from the site itself.
      © NME