Semantic Web, Cataloging, & Metadata


An introduction to metadata, the semantic web, the social web, and how it all fits together, by Robin Fay.

Published in: Technology

  1. Robin Fay
  2. Linked: Semantic web & metadata. Robin Fay, 2010/03. Robin Fay / georgiawebgurl @ Twitter, LinkedIn, Google Buzz, Blogger, SlideShare, etc. Slides at SLIDESHARE.NET/ROBINFAY
  3. - Brief introduction
       - More materials available at
     - Semantic web
       - How we got here
       - Introduction
       - Metadata: where it fits
       - A little semantic web terminology
       - Challenges of the semantic web
       - Looking forward
  4. - Social media/social networking/social web
       - Focus is on providing a user-centered experience
       - Content created by users
       - Metadata frequently created by users: assigning rights (Creative Commons licenses, as on Flickr), assigning metadata (tags, descriptions), and more
       - Read/write web
     - Semantic web, aka Web 3.0
       - Focus on data and harvesting
       - Machine driven; rules based
       - Relies on structure
       - Portal/personal web
       - Very early stages
       - Structured metadata; controlled vocabularies
  5. So:
     - Web 1.0 = we used the web as a tool
       - Primarily HTML based; few database-driven sites; little interactivity; metadata created by web designers and software, little by users
     - Web 2.0 = we interacted with, enhanced, and controlled our experience
       - Many database-driven websites, search engines within websites, widgets; highly customizable
       - Lots of user-created metadata: users tag, assign copyright information, and more
     - Web 3.0 = we teach it, it learns
       - Metadata and structure are the keys: controlled vocabularies; databases talk to each other and share data; crosswalking is the NORM
  6. - The web as we know it (and think of it) links together documents (HTML, PDF, dynamic documents created from databases, etc.).
     - The semantic web links together data.

     Metadata provides the connections as well as the description of content. In library catalog terms, think of holdings records and bibliographic data.
  7. Robin Fay, Univ. of Georgia, Metadata 101. Contributed metadata = harvested, input, or generated > metadata schemas, crosswalking, etc. MARC can be exported as XML, and non-MARC metadata is often written in XML, a flexible markup language.
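The crosswalking mentioned on this slide can be sketched in a few lines. The example below maps a handful of MARC fields to Dublin Core elements and writes the result as XML. The field selection, the sample record, and the `crosswalk` helper are assumptions made for illustration; a real crosswalk handles subfields, indicators, and many more tags.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Illustrative subset of a MARC-to-Dublin Core mapping (not a full crosswalk).
MARC_TO_DC = {
    "100": "creator",
    "245": "title",
    "260": "publisher",
    "650": "subject",
}

def crosswalk(marc_record):
    """Convert a list of (tag, value) pairs into a Dublin Core XML element."""
    root = ET.Element("record")
    for tag, value in marc_record:
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue  # no mapping for this tag; skip it
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return root

# Hypothetical sample record, simplified to one value per field.
sample = [
    ("100", "Fay, Robin"),
    ("245", "Semantic web & metadata"),
    ("650", "Semantic Web"),
    ("999", "local note"),
]

dc_record = crosswalk(sample)
print(ET.tostring(dc_record, encoding="unicode"))
```

The unmapped local field (`999`) is dropped silently here; a production tool would more likely log it so that no data is lost without notice.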
  8. - At its core, the semantic web comprises:
       - a set of design principles,
       - collaborative working groups,
       - and a variety of enabling technologies.
     - Some elements of the semantic web are expressed as prospective future possibilities that are yet to be implemented or realized
     - AND
     - Other elements of the semantic web are expressed in formal specifications (Wikipedia, 2009)
  9. - So, the semantic web is not so much a place or even a destination. It is a goal: to create a seamless user experience with accurate results.
       - To succeed it needs tools, people, and standards. Sounds like cataloging, right?
       - Metadata will play an even more important part in describing resources.
  10. In brief:
      - Types of metadata:
        - Descriptive
        - Structural
        - Administrative
      - Many forms of metadata include elements of each of these; however, which ones depends on the schema.
      - A schema is a set of rules covering the elements and requirements for coding. Common schemas in the library world include Dublin Core, TEI, EAD, and others. Schemas used in the semantic web include Dublin Core, FOAF (Friend of a Friend), and many others.
  11. Karen Coyle, 2004
  12. - Semantic web and web metadata frequently come from outside the library community, working in parallel with it or, sometimes, at odds with it.
      - Metadata in libraries covers a wide range: MARC, EAD, Dublin Core, and HTML/XML markup for websites.
      - Metadata in libraries is traditionally focused on description and holdings (coverage). A library catalog record or a metadata record in a database is primarily descriptive.
  13. - Much library metadata is highly structured and created by trained professionals. In the library world, MARC has been a long-term standard. While it can be rigid, its structured nature can make it easier to crosswalk and harvest into other databases.
      - SEO (Search Engine Optimization) is a common term in the web world; SEO experts assign descriptive and administrative (usually copyright) metadata to websites, generally aiming for higher search rankings. Because search engine algorithms change regularly, SEO is a highly dynamic field, which can lead to inconsistencies in how metadata is applied, making it harder for databases and search engines to harvest.
      - In a nutshell, most library metadata has rules and standards; metadata in the web world is often (but not always) more flexible. The semantic web will need to manage (and make sense of!) all of these types of metadata.
      - Okay, let’s hit the semantic web terminology!
  14. - RDF = Resource Description Framework
      - RDFS = Resource Description Framework Schema
      - OWL = Web Ontology Language
      - URI = Uniform Resource Identifier

      Many terms associated with the semantic web are used in, or based upon, the information architecture, database, information science, and library science fields: controlled vocabularies, structural elements, etc.
  15. RDF = Resource Description Framework
      - is a general-purpose language for representing information on the Web (a metadata data model)
      - is a W3C specification
      - is a conceptual description
      - is based upon making statements about web resources (triples)
      - is commonly written in XML (RDF/XML), although RDF itself is a data model rather than an XML format
      - Think sentence structure:
        - subject, predicate (verb), object
          - My dog eats dogfood.

      Let’s look at some examples.
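The subject, predicate, object structure maps naturally onto a three-part record. The following is a minimal sketch, not an RDF library: the `Triple` type, the `describe` helper, the `example.org` URIs, and the dog's name are all invented for illustration.

```python
from collections import namedtuple

# A triple is one statement: subject, predicate, object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# The URIs below are made up for illustration (example.org is reserved for examples).
statements = [
    Triple("http://example.org/my-dog", "http://example.org/eats", "http://example.org/dogfood"),
    Triple("http://example.org/my-dog", "http://example.org/name", "Rex"),
]

def describe(resource, triples):
    """Return every (predicate, object) pair stated about a resource."""
    return [(t.predicate, t.object) for t in triples if t.subject == resource]

for predicate, obj in describe("http://example.org/my-dog", statements):
    print(predicate, "->", obj)
```

Note that subjects and predicates are identified by URIs, while an object can be either another URI (dogfood as a resource) or a plain literal value (the name).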
  16. [Figure: RDF concept]
  17. [Figure: W3C model. Simple RDF concept marked up]
  18. - So, we have the framework, but how do we apply it?
      - RDFS = Resource Description Framework Schema
        - A schema is
          - an outline: a schematic or preliminary plan
          - a structure described in a formal language supported by the database management system; in a relational database (such as MySQL), the schema defines the tables, the fields in each table, and the relationships between fields and tables
          - a description of the structure and rules a document must satisfy for an XML document type
          - (define: schema, Google)
          - Dublin Core is a schema
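One way to see what "a schema is a set of rules" means in practice is to check a record against the list of elements the schema defines. The sketch below uses the fifteen elements of the Dublin Core Metadata Element Set; the `unknown_elements` helper and the sample record are hypothetical, and real schemas also constrain values and structure, not just element names.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

def unknown_elements(record):
    """Return the keys of a record that the schema does not define."""
    return sorted(set(record) - DUBLIN_CORE_ELEMENTS)

# Hypothetical record; "author" is not a Dublin Core element ("creator" is).
record = {
    "title": "Semantic web & metadata",
    "author": "Robin Fay",
    "date": "2010-03",
}

print(unknown_elements(record))
```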
  19. - Popular choices for schemas include
        - Creative Commons: embeds RDF into MP3s
        - FOAF (Friend of a Friend): address books, contact lists, etc.
        - MusicBrainz: music CD information
        - DC (Dublin Core): multi-format, cross-domain
        - RDF Site Summary: anything with an RSS feed
      - Schemas are sometimes called vocabularies
      - There are many, many schemas…

      A couple of examples
  20. [Figure: FOAF]
  21. … okay, so that doesn’t sound so bad… But it can get even more defined… [Figure: Very Simple Dublin Core in RDF] We can combine FOAF + DC.
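Since the slide's markup did not survive in this transcript, here is a rough sketch of what combining Dublin Core and FOAF in RDF/XML can look like. It is not the original slide content. The three namespace URIs are the published ones for RDF, Dublin Core 1.1, and FOAF; the `build_description` helper and the resource URI are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"
for prefix, uri in (("rdf", RDF), ("dc", DC), ("foaf", FOAF)):
    ET.register_namespace(prefix, uri)

def build_description(resource_uri, title, creator_name):
    """Describe one resource with a DC title and a FOAF person as its creator."""
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": resource_uri})
    ET.SubElement(desc, f"{{{DC}}}title").text = title
    creator = ET.SubElement(desc, f"{{{DC}}}creator")
    person = ET.SubElement(creator, f"{{{FOAF}}}Person")
    ET.SubElement(person, f"{{{FOAF}}}name").text = creator_name
    return root

# The resource URI is a placeholder, not the real address of the slides.
doc = build_description(
    "http://example.org/slides/semantic-web",
    "Semantic web & metadata",
    "Robin Fay",
)
rdf_xml = ET.tostring(doc, encoding="unicode")
print(rdf_xml)
```

Dublin Core supplies the description of the document (title, creator), while FOAF supplies the description of the person, so each vocabulary does the job it was designed for.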
  22. OWL = Web Ontology Language
      - Invented to link ontologies, which are classification systems
      - Attempts to define objects and their relationships
      - Different “flavors”
      - “interpreted as a set of ‘individuals’ and a set of ‘property assertions’ which relate these individuals to each other” (Wikipedia, 2009)
      - Not a requirement
      - Sounds familiar to catalogers, right?
  24. Because this is data driven, we can query it using SPARQL, a standard query language. It all fits together, except…
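To give a feel for what a SPARQL query does, the sketch below shows a small query as text and then imitates its single triple pattern in plain Python. The `match` function is a toy written for this example and covers only one pattern; it is not a SPARQL engine, and the people and URIs in the data are hypothetical.

```python
# What the query would look like in SPARQL (shown for reference, not executed here).
SPARQL_QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE { ?person foaf:name ?name . }
"""

FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

# Hypothetical data: (subject, predicate, object) triples.
triples = [
    ("http://example.org/alice", FOAF_NAME, "Alice"),
    ("http://example.org/bob", FOAF_NAME, "Bob"),
    ("http://example.org/alice", FOAF_KNOWS, "http://example.org/bob"),
]

def match(pattern, data):
    """Yield variable bindings for one triple pattern; terms starting with '?' are variables."""
    for triple in data:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if bindings.setdefault(term, value) != value:
                    break  # same variable bound to two different values
            elif term != value:
                break  # constant term does not match
        else:
            yield bindings

names = [b["?name"] for b in match(("?person", FOAF_NAME, "?name"), triples)]
print(names)
```

The query names no table or file. It describes the shape of the statements it wants, which is what allows the same query to run over data merged from many sources.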
  25. Who is going to create all of this new data? We are:
      - Libraries, museums, and other organizations already have huge record sets of structured data for their resources.
      - Our databases are structured and rule based.
      - Social networking sites create metadata: think tags.
      - RSS feeds are structured data which can be harvested. API!
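Because an RSS feed is structured XML, harvesting it takes very little code. The sketch below parses a small made-up RSS 2.0 feed and pulls out titles, links, and tags. The feed content and the `harvest` helper are assumptions for illustration; a real harvester would fetch the feed over HTTP and handle missing elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; a real harvester would fetch this over HTTP.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example library blog</title>
    <item>
      <title>New digital collection</title>
      <link>http://example.org/posts/1</link>
      <category>digital libraries</category>
      <category>metadata</category>
    </item>
    <item>
      <title>Cataloging workshop</title>
      <link>http://example.org/posts/2</link>
      <category>cataloging</category>
    </item>
  </channel>
</rss>"""

def harvest(feed_xml):
    """Extract title, link, and tags from each item in an RSS 2.0 feed."""
    channel = ET.fromstring(feed_xml).find("channel")
    return [
        {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "tags": [c.text for c in item.findall("category")],
        }
        for item in channel.findall("item")
    ]

records = harvest(FEED)
for record in records:
    print(record)
```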
  26. Examples of descriptive metadata include
      - MARC bibliographic records;
      - tags, titles, and notes in Flickr, del.icio.us, and other social networking sites;
      - metadata embedded in the code of websites;
      - metadata assigned by machine about tags (and other user-generated metadata) from websites;
      - other tagging projects, such as those at OCLC, LibraryThing, and some digital library projects;
      - … really any website where a user (or authorized user, such as a cataloger or member participant) can create or edit description, keywords/tags, title, creator information, and more.

      Let’s look at a couple of examples of metadata here on the web…
  27. [Figure: Descriptive metadata; Administrative metadata]
  29. - The social web/social media/social networking is about people. Its focus has been less on standards, controlled vocabularies, RULES… After all, here comes everybody…
      - …but that is not exactly true. For blog posts to display sequentially and for discussions to be threaded, there must be an underlying order: a structure, rules.
  30. - Much of the technology (databases, servers, etc.) already exists, at least for the initial stages of the semantic web.
      - A great deal of the criticism of the semantic web hinges on either artificial intelligence OR human inability. In other words, the computers can’t learn, and we won’t code correctly for the computers, either.
  31. …but it’s not completely hopeless…
  32. - Linked data is: “about using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data.”
      - Think: related records, series records, authority files
      - Libraries already link data.
      - Projects such as the NYT Linked Open Data project and the Virtual International Authority File (VIAF) project are sources of controlled vocabularies.
      - Verified digital identity accounts, such as OpenID and ClaimID, help differentiate names.
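The way an authority file links records can be sketched as a lookup from variant name forms to one shared identifier. Everything below is illustrative: the identifiers are placeholders rather than real authority file URIs, and `link_records` is a hypothetical helper. Names with no authority entry are grouped under `None`.

```python
# Hypothetical authority file: one identifier per person, with variant name forms.
AUTHORITY = {
    "http://example.org/authority/1": ["Twain, Mark", "Clemens, Samuel Langhorne"],
    "http://example.org/authority/2": ["Fay, Robin"],
}

# Invert the file so any variant name leads to the shared identifier.
NAME_INDEX = {
    name.casefold(): uri
    for uri, names in AUTHORITY.items()
    for name in names
}

def link_records(records):
    """Group record titles by the authority identifier of their author."""
    linked = {}
    for title, author in records:
        uri = NAME_INDEX.get(author.casefold())
        linked.setdefault(uri, []).append(title)
    return linked

records = [
    ("Adventures of Huckleberry Finn", "Twain, Mark"),
    ("Life on the Mississippi", "Clemens, Samuel Langhorne"),
    ("Semantic web & metadata", "Fay, Robin"),
]

linked = link_records(records)
for uri, titles in linked.items():
    print(uri, titles)
```

Two records that use different forms of the same author's name end up under one identifier, which is the kind of connection linked data aims to make across databases.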
  33. “The thesaurus consists of more than a million terms organized into five controlled vocabularies: subjects, personal names, organizations, geographic locations and the titles of creative works (books, movies, plays, etc).” (NYT Blogs)
  34. In order for the semantic web to reach its potential:
      - We need smarter computers.
      - We need better tools.
      - We need to share.
      - We need better web standards.
      - We need a better behind-the-scenes structure (RDF).
      - We need more built-in metadata generation tools on social networking sites.
      - We’ll need better search engines.
      - …but we are on the way…
  35. - Questions? Thoughts?
      - Many many links @
      - Presentation also at