ALOE - Combining User Generated Content and Traditional Metadata


A presentation for the workshop "Metadata 2.0" in Leuven, Belgium (2008/02/07).

Published in: Business, Technology

  1. Combining User Generated Content and Traditional Metadata. Martin Memmel, German Research Center for AI, Kaiserslautern, Germany. Leuven, 07.02.2008
  2. From scarcity to abundance – a short time travel
  3. Information was scarce and in the hand of few
  4. Gutenberg: Invention of movable type printing (1439)
  5. Biblioteca Angelica, Roma
  6. Plan for the National Library (1785) Étienne-Louis Boullée (1728-1799)
  7. Welcome to the digital age!
  8. Online Education Database
  9. The Long Tail [Anderson 2004] "We sold more books today that didn't sell at all yesterday than we sold today of all the books that did sell yesterday."
  12. We need information about the available resources!
  13. Web2.0 & User Generated Content
  15. Spell with flickr:
  24. We are experiencing a paradigm shift!
  26. From Traditional Media to Social Media • from consumers to producers (prosumers) • more democracy, less control • it's about the user • users are active, contribution is easy • everybody can reach a broad audience • networking, communication • open, public, sharing
  27. Is there also a paradigm shift in the creation of metadata?
  28. #views #favorites ratings tags comments
  29. Information generated explicitly by users • tags, comments, ratings • relations (e.g., sets) • profiles • conversations • … gives an understanding of the individuals who contributed (social browsing!)
  30. Human Computation
  33. Collective Intelligence
  35. Problems with metadata
  36. “Metacrap” [Doctorow 2001] • People lie • People are lazy • People are stupid • Mission: Impossible – know thyself • Schemas aren’t neutral • Metrics influence results • There’s more than one way to describe something!
  37. Metadata is context-dependent!
      • the description of a resource strongly depends on
        - the role of the reader
        - the time at which the document is considered
        - the terminology and language in which it is written
        - the tasks the reader is currently working on
        - the expertise and experience available
      • valuable / useless / annoying?
  38. Some philosophy… Wittgenstein ‘Die Bedeutung eines Wortes ist sein Gebrauch in der Sprache’ (‘The meaning of a word is its use in the language’) Transferred into the world of (digital) resources: ‘The meaning of a resource is its use in the community’
  39. Combining different ways to generate metadata
  40. How can metadata be generated?
      • metadata generated by experts
      • metadata generated by users and user interactions
        - generated explicitly (e.g. tags, comments, ratings)
        - collected by tracking and observation components (sensors, log files, context information)
      • metadata generated automatically
        - content analysis: statistics-based methods, NLP, ontology-based approaches, …
        - inference
  41. Benefits and weaknesses of the approaches
      • expert metadata (centralized approaches)
        - expertise about the respective domain, formats, standards
        - can cope with complex, difficult, time-consuming tasks
        - expensive
        - static (lifecycle? context?)
        - often biased, subjective (motivations?)
      • user generated metadata (distributed approaches)
        - flat, simple (cheap, fast)
        - different opinions, viewpoints (collective intelligence)
      • metadata generated automatically
        - scalable
        - restricted to the available knowledge
        - usually only works reasonably well with text
      [MemmelSchirruTomadakiWolpers2008]
  42. Combine the strengths of each approach!
  43. Resource profiles – combining metadata
  44. Resource profiles [Downes 2004]: a ‘multi-faceted, wide ranging description of a resource’. A resource profile
      • does not conform to a particular XML schema, but is a patchwork of metadata formats assembled as needed to form the description most appropriate for the given resource
      • is not authored by a particular author; it consists of a large set of information authored by many people
      • may be distributed, in pieces, across a multitude of locations
      • is never single, canonical, or authoritative: no one profile is the definitive one for a given resource
  46. ALOE – a social resource and metadata hub
      • possibility to integrate and aggregate any kind of existing resources and metadata
        - wherever they are located
        - via upload or reference (persistence?)
        - not “just” several repositories
        - not by a single authority!
      • possibility to contribute new resources and metadata
      • access, preview, (inline) player, editor…
      • integration of advanced functionalities
  47. Resource types • images (bmp, gif, jpg, png, tif, …) • documents (pdf, odt, odp, sxw, doc, ppt, …) • videos (avi, mpeg, mov, …) • web pages • audio (aac, mp3, …)
  48. Resource locations • the Web • intranet • desktop • …
  49. ALOE – some features • upload and share arbitrary types of digital resources • share and organize bookmarks • tag, rate, and comment on resources and bookmarks • initiate groups and communicate with other users • publish as private, public, or only for certain groups • find resources with different types of search filters • rank search results according to different criteria • associate arbitrary metadata sets with resources • Web Service API (SOAP, REST)
  50. Application: Tag Recommendations
  51. Tag recommendations using multiple sources
  53. Thank you for your attention!
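
The resource profiles of slide 44 and the multi-source tag recommendations of slide 51 can be illustrated with a small sketch. It is not ALOE's implementation: the class names, the source labels ("expert", "user", "automatic"), the example URI, and the weights are all assumptions made for illustration, since the slides do not describe how sources are combined. The idea shown is only that a profile is a collection of metadata fragments from many authors, and that tags can be ranked by summing a per-source weight.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetadataFragment:
    """One piece of metadata about a resource (hypothetical structure)."""
    source: str        # assumed labels: "expert", "user", or "automatic"
    author: str        # who or what produced the fragment
    tags: List[str]


@dataclass
class ResourceProfile:
    """A patchwork of fragments; no single fragment is authoritative."""
    resource_uri: str
    fragments: List[MetadataFragment] = field(default_factory=list)

    def add(self, fragment: MetadataFragment) -> None:
        self.fragments.append(fragment)

    def recommend_tags(self, weights: Dict[str, float], limit: int = 3) -> List[str]:
        """Rank tags by the summed weight of the sources that supplied them."""
        scores: Dict[str, float] = defaultdict(float)
        for fragment in self.fragments:
            weight = weights.get(fragment.source, 0.0)
            # Normalize case and count each tag once per fragment.
            for tag in {t.lower() for t in fragment.tags}:
                scores[tag] += weight
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        return [tag for tag, _ in ranked[:limit]]


# Illustrative weights only; the slides do not specify any weighting scheme.
SOURCE_WEIGHTS = {"expert": 2.0, "user": 1.0, "automatic": 0.5}

profile = ResourceProfile("http://example.org/resource/1")
profile.add(MetadataFragment("expert", "librarian", ["Metadata", "E-Learning"]))
profile.add(MetadataFragment("user", "alice", ["metadata", "web2.0"]))
profile.add(MetadataFragment("user", "bob", ["web2.0", "tagging"]))
profile.add(MetadataFragment("automatic", "keyword-extractor", ["tagging", "metadata"]))

recommended = profile.recommend_tags(SOURCE_WEIGHTS)
print(recommended)
```

A weighted sum is only one possible way to combine the sources; changing the assumed weights shifts the balance between expert, user, and automatically generated metadata.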