ALOE - Combining User Generated Content and Traditional Metadata

A presentation for the workshop "Metadata 2.0" in Leuven, Belgium (2008/02/07).

Usage Rights

CC Attribution-NonCommercial-ShareAlike License

Presentation Transcript

  • Leuven, 07.02.2008: Combining User Generated Content and Traditional Metadata. Martin Memmel, German Research Center for AI, Kaiserslautern, Germany. www.dfki.de/~memmel
  • From scarcity to abundance – a short time travel
  • Information was scarce and in the hand of few http://de.wikipedia.org/wiki/Skriptorium
  • Gutenberg: Invention of movable type printing (1439) http://etc.usf.edu/clipart/11300/11358/gutenberg_11358.htm
  • Biblioteca Angelica, Roma http://curiousexpeditions.org/2007/09/a_librophiliacs_love_letter_1.html
  • Plan for the National Library (1785) Étienne-Louis Boullée (1728-1799)
  • Welcome to the digital age!
  • Online Education Database http://oedb.org/library/features/236-open-courseware-collections
  • The Long Tail [Anderson 2004] "We sold more books today that didn't sell at all yesterday than we sold today of all the books that did sell yesterday." http://en.wikipedia.org/wiki/The_Long_Tail
  • http://www.markluthringer.com/RidgemontTypologies/cellphones.html
  • http://www.google.com
  • We need information about the available resources!
  • Web2.0 & User Generated Content
  • http://flickr.com/photos/stabilo-boss
  • Spell with flickr: http://metaatem.net/words/
  • http://en.wikipedia.org/wiki/User_generated_content
  • http://elearningnuts.de
  • http://www.flickr.com/photos/pavel1998/407853059/
  • http://youtube.com/watch?v=36bXgqmQ6rE
  • http://del.icio.us/search/?fr=del_icio_us&p=learntec&type=all
  • http://www.facebook.com
  • http://www.redesignme.org/
  • http://labs.adobe.com/technologies/knowhow/
  • We are experiencing a paradigm shift!
  • http://www.time.com/time/magazine/article/0,9171,1569514,00.html
  • From Traditional Media to Social Media
    • from consumers to producers (prosumers)
    • more democracy, less control
    • it's about the user
    • users are active, contribution is easy
    • everybody can reach a broad audience
    • networking, communication
    • open, public, sharing
  • Is there also a paradigm shift in the creation of metadata?
  • #views #favorites ratings tags comments http://aloe-project.de/
  • Information generated explicitly by users
    • tags, comments, ratings
    • relations (e.g., sets)
    • profiles
    • conversations
    • …
    gives an understanding of the individuals who contributed (social browsing!)
  • Human Computation
  • http://www.espgame.org/
  • http://recaptcha.net/
  • Collective Intelligence
  • http://www.flickr.com/photos/azlijamil01/231592469/
  • Problems with metadata
  • “Metacrap” [Doctorow 2001]
    • People lie
    • People are lazy
    • People are stupid
    • Mission: Impossible – know thyself
    • Schemas aren’t neutral
    • Metrics influence results
    • There’s more than one way to describe something!
  • Metadata is context-dependent!
    • the description of a resource strongly depends on
      - the role of the reader
      - the time at which the document is considered
      - the terminology and language in which it is written
      - the tasks the reader is currently working on
      - the expertise and experience available
    • valuable / useless / annoyance?
  • Some philosophy… Wittgenstein ‘Die Bedeutung eines Wortes ist sein Gebrauch in der Sprache’ (‘The meaning of a word is its use in the language’) Transferred into the world of (digital) resources: ‘The meaning of a resource is its use in the community’
  • Combining different ways to generate metadata
  • How can metadata be generated?
    • metadata generated by experts
    • metadata generated by users and user interactions
      - generated explicitly (e.g. tags, comments, ratings)
      - collected by tracking and observation components (sensors, log files, context information)
    • metadata generated automatically
      - content analysis
        • statistics-based methods
        • NLP
        • ontology-based approaches
        • …
      - inference
  • Benefits, weaknesses of the approaches [MemmelSchirruTomadakiWolpers2008]
    • expert metadata (centralistic approaches)
      - expertise about respective domain, formats, standards
      - cope with complex, difficult, time-consuming tasks
      - expensive
      - static (lifecycle? context?)
      - often biased, subjective (motivations?)
    • user generated metadata (distributed approaches)
      - flat, simple (cheap, fast)
      - different opinions, viewpoints (collective intelligence)
    • metadata generated automatically
      - scalable
      - restricted to the available knowledge
      - usually only works reasonably well with text
  • Combine the strengths of each approach!
  • Resource profiles – combining metadata
  • [Downes 2004] Resource profiles
    A ‘multi-faceted, wide ranging description of a resource’, characterised by the following features:
    • does not conform to a particular XML schema, but is a patchwork of metadata formats which are assembled as needed in order to form a description that is most appropriate for the given resource
    • not authored by a particular author: it consists of a large set of information which is authored by many people
    • may be distributed, in pieces, across a multitude of locations
    • there is no single canonical or authoritative resource profile associated with a given resource
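One way to picture a resource profile as a data structure is sketched below. This is only an illustration of the features listed on the slide, not ALOE's or Downes' actual model: the class names `MetadataFragment` and `ResourceProfile`, their fields, and the `view` method are all hypothetical. The profile is a loose collection of fragments from many authors, locations, and formats, and a description is assembled on demand rather than stored as one canonical record.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class MetadataFragment:
    """One piece of a resource profile (all field names are hypothetical)."""
    author: str   # who contributed it: an expert, a user, or an automatic tool
    source: str   # where this fragment is stored
    schema: str   # the metadata format it follows, e.g. "LOM" or "tags"
    values: dict  # the metadata itself


@dataclass
class ResourceProfile:
    """A patchwork of fragments describing one resource."""
    resource_url: str
    fragments: list = field(default_factory=list)

    def add(self, fragment):
        # Anyone may contribute; no fragment is treated as authoritative.
        self.fragments.append(fragment)

    def view(self, schemas):
        """Assemble a description from the requested formats only."""
        merged = defaultdict(list)
        for fragment in self.fragments:
            if fragment.schema in schemas:
                for key, value in fragment.values.items():
                    # Keep every contribution together with its author.
                    merged[key].append((fragment.author, value))
        return dict(merged)
```

Calling `view({"tags"})` on a profile holding both expert records and user tags would return only the tag contributions, each paired with its author, which is the "assembled as needed" idea in miniature.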
  • http://aloe-project.de
  • ALOE – a social resource and metadata hub
    • possibility to integrate and aggregate any kind of existing resources and metadata
      - wherever they are located
      - via upload or reference (persistence?)
      - not “just” several repositories
      - not by a single authority!
    • possibility to contribute new resources and metadata
    • access, preview, (inline) player, editor…
    • integration of advanced functionalities
  • Resource types
    • images (bmp, gif, jpg, png, tif, …)
    • documents (pdf, odt, odp, sxw, doc, ppt, …)
    • videos (avi, mpeg, mov, …)
    • web pages
    • audio (aac, mp3, …)
  • Resource locations
    • the Web
    • intranet
    • desktop
    • …
  • ALOE – some features
    • upload and share arbitrary types of digital resources
    • share and organize bookmarks
    • tag, rate, and comment on resources and bookmarks
    • initiate groups and communicate with other users
    • publish as private, public, or only for certain groups
    • find resources with different types of search filters
    • rank search results according to different criteria
    • associate arbitrary metadata sets with resources
    • Web Service API (SOAP, REST)
  • Application: Tag Recommendations
  • Tag recommendations using multiple sources
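The slide does not say how the sources are combined, so the following is only a minimal sketch of one plausible scheme: a weighted vote across sources. The function name `recommend_tags`, the source names, and the weights are all assumptions made for illustration and do not describe ALOE's implementation.

```python
from collections import Counter


def recommend_tags(sources, weights, limit=5):
    """Rank candidate tags by a weighted vote across sources.

    sources: maps a source name to the tags it proposes (hypothetical names).
    weights: maps a source name to its trust weight; unknown sources get 1.0.
    """
    scores = Counter()
    for name, tags in sources.items():
        weight = weights.get(name, 1.0)
        # Normalise case and count each tag at most once per source.
        for tag in {tag.lower() for tag in tags}:
            scores[tag] += weight
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:limit]]
```

With this scheme, a tag proposed by several sources (say, the user's own earlier tags, other users' tags for the same resource, and automatic content analysis) rises above a tag proposed by only one, which reflects the idea of combining the strengths of each approach.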
  • http://www.flickr.com/photos/fliegender/sets/1161829
  • Thank you for your attention! http://www.dfki.de/~memmel http://elearningnuts.de http://aloe-project.de