Metadata and ontologies


Slides from the Introduction and Theoretical Foundations of New Media course of the Interactive Media and Knowledge Environments master program (Tallinn University).

Published in: Education, Technology


  • 1. Introduction and Theoretical Foundations of New Media
    Metadata and Ontologies
  • 2. Contents
    The semantic web
    The internet of things
    David Lamas, TLU, 2011
  • 3. Metadata
  • 4. Metadata
    So, why is metadata relevant?
    Or… why should we care about metadata?
  • 5. Metadata
    As a concept, it is not new
    Metadata has long been used for managing document collections such as the ones kept by libraries
    But the term itself was only coined in 1968
    By Philip Bagley, a pioneer of computerized document retrieval
  • 6. Metadata
    Literally a set of data that describes and gives information about other data; in our context, metadata is:
    Machine-readable
    For the purposes of resource…
    Access control
    Long-term preservation
  • 7. Metadata
    Or in other words, metadata allows for the description of the…
    Structure; and
    of selected resources with all contents in context to ease the further use of the resource
  • 8. MARC
    Or… Machine Readable Catalogue
    Is still the main metadata standard in the library world although it is not a full cataloguing scheme being
  • 9. UDC, AACR2 and RDA
    Universal Decimal Classification
    A multilingual classification scheme for all fields of knowledge
    Available at…
    Anglo-American Cataloguing Rules
    For use in the construction of catalogues
    Available at…
    Resource Description and Access
    Available at…
  • 10. Z39.50, SRW and SRU
    Z39.50 is a client–server protocol for searching and retrieving information, widely used in library environments
    Search & Retrieve Web Service
    An intended standard web-based text-searching interface
    Search/Retrieval via URL
    A standard XML-focused search protocol for Internet search queries, which uses the Contextual Query Language
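    To make the idea of a search expressed entirely in a URL more concrete, here is a minimal sketch of how an SRU searchRetrieve request might be assembled. The endpoint address and the helper name build_sru_url are assumptions made for illustration; only the parameter names (operation, version, query, maximumRecords) and the CQL query syntax come from the SRU and CQL specifications.

```python
from urllib.parse import urlencode

# Hypothetical SRU endpoint, used only for illustration.
BASE_URL = "https://sru.example.org/catalogue"


def build_sru_url(cql_query, max_records=10):
    """Return a searchRetrieve URL carrying the given CQL query."""
    params = {
        "operation": "searchRetrieve",       # the SRU search operation
        "version": "1.2",                    # protocol version requested
        "query": cql_query,                  # the query, written in CQL
        "maximumRecords": str(max_records),  # cap on returned records
    }
    # urlencode percent-encodes spaces, quotes and '=' inside the query.
    return BASE_URL + "?" + urlencode(params)


# A CQL query that searches the Dublin Core title index.
print(build_sru_url('dc.title = "metadata"'))
```

    The server would answer such a request with an XML document containing the matching records, which is what makes the protocol XML-focused.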
  • 11. But…
    This should not bother you other than to note that…
    Metadata tends to get more complicated the longer you think about it
  • 12. As for the web…
    It was recognized early on that finding what you needed was going to get difficult
    We’re talking about the mid-nineties, when the web’s size was referred to in terms of tens of thousands
    Users, mainly information sciences specialists, began trying to catalogue it by hand
    Do you remember Yahoo’s earlier versions?
  • 13. As for the web…
    The first search engines appeared and authors began to realize that the metadata they embedded into web pages might be important
    <title>A web page</title>
    <meta name="keywords" content="some, key, words" />
    <meta name="description" content="a summary" />
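    As a rough illustration of what an early search engine could do with this embedded metadata, the sketch below reads the title and the named meta elements out of a page using only Python's standard html.parser module. The class name HeadMetadataParser, the function extract_metadata and the sample page are assumptions made for this example, not part of the original slides.

```python
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Collects the <title> text and every <meta name=... content=...> pair."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            if name:
                self.meta[name.lower()] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata(html):
    """Return a dict with the page title and its named meta elements."""
    parser = HeadMetadataParser()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), **parser.meta}


# Sample page built from the elements shown on the slide.
PAGE = (
    "<html><head><title>A web page</title>"
    '<meta name="keywords" content="some, key, words" />'
    '<meta name="description" content="a summary" />'
    "</head><body></body></html>"
)

print(extract_metadata(PAGE))
```

    Nothing in this process checks whether the keywords or the description actually match the page content; the indexer simply trusts what the author wrote.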

  • 14. As for the web…
    Then came Google
    And metadata lost some relevance as Google’s PageRank algorithm takes note of links between pages but places less emphasis on embedded metadata to avoid…
    <meta name="description" content="a summary" />
    <title>put your title here</title>
  • 15. Dublin Core
    Despite the initial drawbacks, work continued on embedded metadata and the Dublin Core was and still is one of the main players with its 15 elements…
    Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights
    …embedded into web pages or encoded using XML
    The initial intention was to improve indexing by search engines
    But whereas its promoters forgot about metaspam and metacrap, the search engines didn’t
    And so, the main search engines still ignore embedded metadata
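    For readers who have not seen Dublin Core embedded in a web page, the following sketch generates the corresponding meta elements from a simple record. It follows the common DCMI convention of a schema.DC link plus DC.-prefixed meta names; the function name dublin_core_meta_tags and the sample record are assumptions made for illustration.

```python
from html import escape

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dublin_core_meta_tags(record):
    """Render a {element: value} record as HTML link and meta elements."""
    unknown = sorted(set(record) - set(DC_ELEMENTS))
    if unknown:
        raise ValueError("not Dublin Core elements: " + ", ".join(unknown))
    # The link element declares which vocabulary the DC. prefix refers to.
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />']
    for element in DC_ELEMENTS:
        if element in record:
            value = escape(record[element], quote=True)
            lines.append('<meta name="DC.%s" content="%s" />' % (element, value))
    return "\n".join(lines)


# Illustrative record describing these slides.
SLIDES = {
    "title": "Metadata and Ontologies",
    "creator": "David Lamas",
    "date": "2011",
}

print(dublin_core_meta_tags(SLIDES))
```

    Every element is optional and may be left out, which is part of what makes the scheme easy to adopt and, at the same time, easy to fill in carelessly.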
  • 16. Dublin Core
  • 17. Metadata
    Remarkably, there has been fairly widespread adoption of metadata principles, especially in policy terms, namely in government
    (look into an interesting example)
    And in:
    Cultural heritage
    Environmental agencies, and…
    Libraries, of course
  • 18. Metadata
    This resulted in the…
    Growth of metadata cataloguing rules
    (although every community has its own rules)
    Growth in use of additional elements for particular communities
    (and again, every community’s additions are different)
    Adoption of application profiles to document the distinct cataloguing rules and additions
    Institution of the Dublin Core Metadata Initiative as
    an organization engaged in the development of interoperable metadata standards that support a broad range of purposes and business models
  • 19. Metadata
    But the Dublin Core isn’t alone, far from it
    Many other standards were and are being developed such as these, just to name two:
    RDF (Resource Description Framework)
    LOM (Learning Object Metadata)
  • 20. Resource Description Framework
    Developed by the W3C, the Resource Description Framework (RDF) is the envisioned standard for the semantic web
    Its goal is to allow software to automatically navigate and reason about web content, thus enabling…
    A web of (linked) data
  • 21. Resource Description Framework
  • 22. Learning Object Metadata
    Learning Object Metadata is a data model
    Usually encoded in XML, it is used to describe learning objects and similar digital resources used to support learning.
  • 23. Learning Object Metadata
  • 24. Metadata
    As said in the beginning…
    Metadata tends to get more complicated the longer we think about it
    The lack of coherence and cohesion in current metadata efforts, both within standards and within communities, is a good example
    And that is why we will next look into ontologies
    So… do we care about metadata?
    Why are we interested?
  • 25. Metadata
    I guess the answer is yes, we care.
    And yes, we are interested, because metadata is everywhere
    Sometimes it is explicitly available,
    Other times it is hidden or not so readily available, but anyway…
    It would be foolish not to make use of it
  • 26. Metadata
    Further, there is increasing pressure to expose metadata on the web for others to mash up, and this is especially true today in settings such as…
    Research; and
    And finally, metadata becomes paramount in scenarios where
    content is data; or
    the required information cannot easily be derived from content
  • 27. Ontologies
  • 28. Ontologies
    One way of dealing with the lack of coherence and cohesion in current metadata efforts, both within standards and within communities, is to evolve towards an ontology-based metadata approach
    But what does this mean?
  • 29. Ontologies
    An ontology is a logical theory which gives an explicit partial account of a conceptualization
    An intensional semantic structure which encodes the implicit rules constraining the structure of a piece of reality
    In this light, the aim of an ontology is to define which primitives, provided with their associated semantics, are necessary for knowledge representation in a given context
    Thomas R. Gruber (1993). Toward principles for the design of ontologies used for knowledge sharing. Originally in N. Guarino and R. Poli (Eds.), International Workshop on Formal Ontology, Padova, Italy. Revised August 1993. Published in International Journal of Human-Computer Studies, Volume 43, Issue 5-6, Nov./Dec. 1995, pages 907-928, special issue on the role of formal ontology in information technology.
  • 30. Ontologies
    Ontologies are usually characterized by their…
    The extent to which the primitives mobilized by the perceived usage scenarios are covered by the ontology
    The extent to which ontological primitives are precisely identified
    The extent to which primitives are precisely and formally defined
    The extent to which primitives are described in a formal language
  • 31. Ontologies
    And ontologies are not… taxonomies
    But a taxonomy might be perceived as a specific case of an ontology
    A taxonomy is a particular classification arranged in a hierarchical structure
    Typically it is organized by supertype/subtype relationships, also called generalization/specialization relationships
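    A taxonomy of this kind can be sketched as a simple mapping from each subtype to its supertype. The terms and the function names ancestors and is_a below are made up for illustration; the point is that the only relationship such a structure can express is generalization/specialization.

```python
# Each subtype points to its single supertype (illustrative terms only).
SUPERTYPE = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}


def ancestors(term):
    """Return the chain of supertypes from the nearest one up to the root."""
    chain = []
    while term in SUPERTYPE:
        term = SUPERTYPE[term]
        chain.append(term)
    return chain


def is_a(term, candidate):
    """True if candidate is the term itself or one of its supertypes."""
    return term == candidate or candidate in ancestors(term)


print(ancestors("dog"))           # walks up the hierarchy
print(is_a("sparrow", "animal"))  # specialization holds transitively
print(is_a("dog", "bird"))        # unrelated branches do not match
```

    An ontology can go further than this by also describing other kinds of relations between terms and constraints on them, which a plain hierarchy has no place for.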
  • 32. Why ontologies?
  • 33. Why ontologies?
  • 34. Why ontologies?
  • 35. Why ontologies?
    In short, we interpret, machines don’t
    As such, an effort must be undertaken in order to support adequate usage of digital resources
    So, what’s missing?
    Among others…
    The possibility to share a common understanding of the structure of information within a specific domain
    The possibility to reuse domain knowledge
    The possibility to make domain assumptions explicit
    The possibility to analyze domain knowledge
  • 36. Ontologies and the web
    It is estimated that by 2010…
    70% of public web pages will have some level of metadata, but only
    20% will use more extensive semantic web approaches such as ontology-based metadata
    But why should we care?
  • 37. Ontologies and the web
    An emerging ontological approach is OWL or…
    Web Ontology Language
    A vocabulary extension of the Resource Description Framework, which adds more vocabulary for describing characteristics of properties and classes or relations between classes
  • 38. Web Ontology Language
    OWL enables ontology-based information sharing and manipulation together with RDF and XML
    In reverse order…
    XML allows users to add arbitrary structure to their documents but says nothing about what such structures mean
    RDF enables expression of meaning over XML (and other) structures
    Using subject, predicate (verb) and object triples
    OWL enables machines to comprehend semantic documents and data
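    The layering described above can be illustrated with a very small in-memory triple store. This is only a sketch under stated assumptions: it uses plain Python tuples rather than a real RDF library, the ex: names are hypothetical, and the only inference shown follows rdf:type and rdfs:subClassOf, two terms from the RDF and RDF Schema vocabularies that OWL builds upon.

```python
# Statements as (subject, predicate, object) triples; ex: names are made up.
TRIPLES = {
    ("ex:slides", "dc:title", "Metadata and Ontologies"),
    ("ex:slides", "dc:creator", "ex:DavidLamas"),
    ("ex:slides", "rdf:type", "ex:Presentation"),
    ("ex:Presentation", "rdfs:subClassOf", "ex:Document"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


def types_of(triples, resource):
    """Asserted classes plus every superclass reached via rdfs:subClassOf."""
    found = set()
    pending = [o for _, _, o in match(triples, resource, "rdf:type")]
    while pending:
        cls = pending.pop()
        if cls not in found:
            found.add(cls)
            pending.extend(o for _, _, o in match(triples, cls, "rdfs:subClassOf"))
    return found


print(match(TRIPLES, predicate="dc:creator"))
print(types_of(TRIPLES, "ex:slides"))
```

    No triple states that the slides are a document; the program derives it from the class hierarchy, which is a small taste of the machine reasoning that richer vocabularies aim to support.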
  • 39. Web Ontology Language
  • 40. Ontologies
    That said, while they address some of the weaknesses of current metadata efforts, present-day ontologies still largely depend on explicit human intervention to be useful
    And that is why we will next look into folksonomies
  • 41. Folksonomies
  • 42. Folksonomies
    Are mainly a bottom-up social classification system
    A way to organize and share contents by tagging resources
    Synonyms are…
    Ethno-classification; and
    Collaborative tagging
  • 43. Folksonomies
    Folksonomies are created by users and have…
    No structure
    No fixed vocabulary
    No explicit relationships between terms, and
    No authority
  • 44. Folksonomies
    Folksonomies are also…
    Distributed, and
    Collaboratively built and maintained
    You can tag items owned by others
    You can get instant feedback
    All items for the same tag
    All tags for the same item
    You can adapt your tags to the group norm
    But you are never forced
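    The two kinds of instant feedback mentioned above, all items for the same tag and all tags for the same item, can be sketched with a pair of indexes kept in step. The class name Folksonomy, its methods and the sample users and items are assumptions made for illustration.

```python
from collections import defaultdict


class Folksonomy:
    """Stores (user, item, tag) assignments and indexes them both ways."""

    def __init__(self):
        self._assignments = set()
        self._items_by_tag = defaultdict(set)
        self._tags_by_item = defaultdict(set)

    def tag(self, user, item, tag):
        """Any user may tag any item, including items owned by others."""
        self._assignments.add((user, item, tag))
        self._items_by_tag[tag].add(item)
        self._tags_by_item[item].add(tag)

    def items_for(self, tag):
        """All items that carry the given tag."""
        return set(self._items_by_tag.get(tag, ()))

    def tags_for(self, item):
        """All tags that anyone has attached to the given item."""
        return set(self._tags_by_item.get(item, ()))


shared = Folksonomy()
shared.tag("ana", "slides-metadata", "metadata")
shared.tag("ben", "slides-metadata", "ontologies")
shared.tag("ben", "paper-owl", "ontologies")

print(shared.items_for("ontologies"))
print(shared.tags_for("slides-metadata"))
```

    There is no vocabulary check anywhere in the tag method, which mirrors the point that users can follow the group norm but are never forced to.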
  • 45. Folksonomies
    Some of their apparent benefits are…
    Being cheap and easy to build and use
    Being able to adapt very quickly to changes and users’ needs
    They scale well
    Foster serendipity
    Semantic browsing instead of searching
    Lower the cooperation barriers
  • 46. Folksonomies
    But they have limits such as…
    Semantic ambiguity
    Polysemy, synonymy, cardinality and the use of acronyms
    Syntax free
    Spaces and multiple words are used without rules
    Different languages can be used for the same tag
    Being potentially shortsighted
    Fail to depict the general overview
    Lack of (or minimal) structure
    No explicit relationships between otherwise related tags
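    Some of the syntax problems listed above can be reduced mechanically, as in the sketch below, which folds case and removes spaces, underscores and hyphens so that spelling variants of a tag collapse into one key. The function names normalize_tag and group_variants are assumptions for this example, and the approach does nothing for polysemy, synonymy or tags written in different languages.

```python
import re
import unicodedata


def normalize_tag(raw):
    """Collapse case and separator variants of a tag into one key."""
    text = unicodedata.normalize("NFKC", raw).casefold().strip()
    # Drop whitespace, underscores and hyphens used as word separators.
    return re.sub(r"[\s_\-]+", "", text)


def group_variants(tags):
    """Map each normalized key to the sorted list of raw spellings seen."""
    groups = {}
    for tag in tags:
        groups.setdefault(normalize_tag(tag), set()).add(tag)
    return {key: sorted(variants) for key, variants in groups.items()}


RAW_TAGS = ["Semantic Web", "semantic_web", "semantic-web", "semanticweb", "ontology"]

print(group_variants(RAW_TAGS))
```

    Normalization like this trades a little expressiveness for consistency; the semantic ambiguities remain and need either human agreement or a more formal vocabulary.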
  • 47. Folksonomies and ontologies
    Folksonomies:
    Large corpus
    Informal categories
    Unstable entities
    Unclear edges
    Naïve cataloguers
    No authority
    Uncoordinated users
    Amateur users
    Critical mass needed
    Ontologies:
    Small corpus
    Formal categories
    Stable entities
    Restricted entities
    Clear edges
    Expert cataloguers
    Authoritative sources of judgment
    Coordinated users
    Expert users
  • 48. Folksonomies and ontologies
    How do we choose?
    Folksonomies are useful when all that is needed is the ability to link items to topics
    Ontologies are useful when what is needed is to formally define meaning
    But… do we need to choose?
    Not really; at least that is what current research is exploring
  • 49. Folksonomies and ontologies
    Research directions include
    The combination of the folksonomy and ontology approaches into a hybrid system where the most consensual constructs would last while others would be forgotten or redefined
    An approach that combines the ease and adaptability of folksonomy with the formality and semantic richness of an ontology
    Quantitative tag analysis and qualitative use analysis in current online social networking services
    To understand whether tag usage converges or not
    To understand how a folksonomy is formed
    To… any ideas?
  • 50. Semantic web
  • 51. Semantic Web
    The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help
    One of the major obstacles to this has been the fact that most information on the Web is designed for human consumption, and even if it was derived from a database with well-defined meanings (in at least some terms) for its columns, the structure of the data is not evident to a robot browsing the web
    Leaving aside the artificial intelligence problem of training machines to behave like people, the Semantic Web approach instead develops languages for expressing information in a machine processable form.
  • 52. Internet of things
  • 53. The internet of things
    The internet of things might be described as a self-configuring wireless network of sensors whose purpose would be to interconnect all things
    The concept is attributed to the former Auto-ID Center, founded in 1999 and based at the time at MIT
    An alternative view focuses instead on making all things addressable by the existing naming protocols
    In the current vision, objects themselves do not interact, but they may now be referred to by other agents, such as centralized servers acting for their human users
  • 54. Metadata and Ontologies recap
    The semantic web
    The internet of things