ALOE - Combining User Generated Content and Traditional Metadata

A presentation for the workshop "Metadata 2.0" in Leuven, Belgium (2008/02/07).

Usage Rights

CC Attribution-NonCommercial-ShareAlike License

    Presentation Transcript

    • Leuven, 07.02.2008 Combining User Generated Content and Traditional Metadata Martin Memmel German Research Center for AI Kaiserslautern, Germany www.dfki.de/~memmel
    • From scarcity to abundance – a short time travel
    • Information was scarce and in the hand of few http://de.wikipedia.org/wiki/Skriptorium
    • Gutenberg: Invention of movable type printing (1439) http://etc.usf.edu/clipart/11300/11358/gutenberg_11358.htm
    • Biblioteca Angelica, Roma http://curiousexpeditions.org/2007/09/a_librophiliacs_love_letter_1.html
    • Plan for the National Library (1785) Étienne-Louis Boullée (1728-1799)
    • Welcome to the digital age!
    • Online Education Database http://oedb.org/library/features/236-open-courseware-collections
    • The Long Tail [Anderson 2004] "We sold more books today that didn't sell at all yesterday than we sold today of all the books that did sell yesterday." http://en.wikipedia.org/wiki/The_Long_Tail
    • http://www.markluthringer.com/RidgemontTypologies/cellphones.html
    • http://www.google.com
    • We need information about the available resources!
    • Web2.0 & User Generated Content
    • http://flickr.com/photos/stabilo-boss
    • Spell with flickr: http://metaatem.net/words/
    • http://en.wikipedia.org/wiki/User_generated_content
    • http://elearningnuts.de
    • http://www.flickr.com/photos/pavel1998/407853059/
    • http://youtube.com/watch?v=36bXgqmQ6rE
    • http://del.icio.us/search/?fr=del_icio_us&p=learntec&type=all
    • http://www.facebook.com
    • http://www.redesignme.org/
    • http://labs.adobe.com/technologies/knowhow/
    • We are experiencing a paradigm shift!
    • http://www.time.com/time/magazine/article/0,9171,1569514,00.html
    • From Traditional Media to Social Media • from consumers to producers (prosumers) • more democracy, less control • it's about the user • users are active, contribution is easy • everybody can reach a broad audience • networking, communication • open, public, sharing
    • Is there also a paradigm shift in the creation of metadata?
    • #views #favorites ratings tags comments http://aloe-project.de/
    • Information generated explicitly by users • tags, comments, ratings • relations (e.g., sets) • profiles • conversations • … gives an understanding of the individuals who contributed (social browsing!)
    • Human Computation
    • http://www.espgame.org/
    • http://recaptcha.net/
    • Collective Intelligence
    • http://www.flickr.com/photos/azlijamil01/231592469/
    • Problems with metadata
    • “Metacrap” [Doctorow 2001] • People lie • People are lazy • People are stupid • Mission: Impossible – know thyself • Schemas aren’t neutral • Metrics influence results • There’s more than one way to describe something!
    • Metadata is context-dependent! • the description of a resource strongly depends on - the role of the reader - the time at which the document is considered - the terminology and language in which it is written - the tasks he/she is currently working on - the expertise and experience available • valuable / useless / annoying?
    • Some philosophy… Wittgenstein ‘Die Bedeutung eines Wortes ist sein Gebrauch in der Sprache’ (‘The meaning of a word is its use in the language’) Transferred into the world of (digital) resources: ‘The meaning of a resource is its use in the community’
    • Combining different ways to generate metadata
    • How can metadata be generated? • metadata generated by experts • metadata generated by users and user interactions - generated explicitly (e.g. tags, comments, ratings) - collected by tracking and observation components (sensors, log files, context information) • metadata generated automatically - content analysis • statistics-based methods • NLP • ontology-based approaches • … - inference
    • Benefits, weaknesses of the approaches • expert metadata (centralistic approaches) • expertise about respective domain, formats, standards • cope with complex, difficult, time-consuming tasks • expensive • static (lifecycle? context?) • often biased, subjective (motivations?) • user generated metadata (distributed approaches) • flat, simple (cheap, fast) • different opinions, viewpoints (collective intelligence) • metadata generated automatically • scalable • restricted to the available knowledge • usually only works reasonably well with text [MemmelSchirruTomadakiWolpers2008]
    • Combine the strengths of each approach!
    • Resource profiles – combining metadata
    • [Downes 2004] Resource profiles A ‘multi-faceted, wide ranging description of a resource’ which is characterised by the following features: • does not conform to a particular XML schema, but is a patchwork of metadata formats which are assembled as needed in order to form a description that is most appropriate for the given resource • not authored by a particular author - it consists of a large set of information which is authored by many people • may be distributed, in pieces, across a multitude of locations • there is no single canonical or authoritative resource profile associated with a given resource
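
    The features listed for a resource profile can be pictured as a simple data structure. The following is only a minimal sketch under stated assumptions, not how ALOE or [Downes 2004] implement it: the class names, field names, schema labels ("LOM", "tags") and the example URI are all hypothetical and chosen for illustration. It shows fragments in different formats, contributed by different authors and stored in different locations, being assembled into a description only when needed.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class MetadataFragment:
    """One piece of metadata about a resource (all field names are hypothetical)."""
    author: str    # who contributed it: an expert, a user, or an automatic service
    location: str  # where this fragment is stored
    schema: str    # format label of the fragment, e.g. "LOM" or "tags"
    payload: dict  # the metadata itself, in whatever shape the schema uses


@dataclass
class ResourceProfile:
    """A patchwork of fragments for one resource; no fragment is authoritative."""
    resource_uri: str
    fragments: list = field(default_factory=list)

    def add(self, fragment):
        self.fragments.append(fragment)

    def view(self, schemas):
        """Assemble a description on demand from the requested schemas only."""
        merged = defaultdict(list)
        for frag in self.fragments:
            if frag.schema in schemas:
                merged[frag.schema].append(frag.payload)
        return dict(merged)

    def tag_counts(self):
        """Count how many 'tags' fragments mention each tag."""
        counts = Counter()
        for frag in self.fragments:
            if frag.schema == "tags":
                counts.update(set(frag.payload["tags"]))
        return counts


# Hypothetical example: one expert record and two user tag sets.
profile = ResourceProfile("http://example.org/slides.pdf")
profile.add(MetadataFragment("expert", "repository", "LOM", {"title": "Metadata 2.0"}))
profile.add(MetadataFragment("alice", "web", "tags", {"tags": ["metadata", "web2.0"]}))
profile.add(MetadataFragment("bob", "desktop", "tags", {"tags": ["metadata"]}))
```

    Because the profile is just a collection of fragments, different consumers can request different views (for example only the expert record, or only the aggregated tags) without any single description being treated as canonical.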
    • http://aloe-project.de
    • ALOE – a social resource and metadata hub • possibility to integrate and aggregate any kind of existing resources and metadata - wherever they are located - via upload or reference (persistence?) - not “just” several repositories - not by a single authority! • possibility to contribute new resources and metadata • access, preview, (inline)player, editor… • integration of advanced functionalities
    • Resource types images (bmp, gif, jpg, png, tif, …) documents (pdf, odt, odp, sxw, doc, ppt, …) videos (avi, mpeg, mov, …) web pages audio (aac, mp3, …)
    • Resource locations • the Web • intranet • desktop • …
    • ALOE – some features • upload and share arbitrary types of digital resources • share and organize bookmarks • tag, rate, and comment on resources and bookmarks • initiate groups and communicate with other users • publish as private, public, or only for certain groups • find resources with different types of search filters • rank search results according to different criteria • associate arbitrary metadata sets with resources • Web Service API (SOAP, REST)
    • Application: Tag Recommendations
    • Tag recommendations using multiple sources
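
    The slides do not say how the sources are combined, so the following is only one plausible sketch: a weighted sum of candidate scores per tag. The function name, the source names ("resource_tags", "user_tags", "content_keywords"), the weights and the scores are all assumptions made for illustration, not part of ALOE.

```python
from collections import defaultdict


def recommend_tags(candidates_by_source, weights, top_n=5):
    """Merge tag candidates from several sources into one ranked list.

    candidates_by_source maps a source name to {tag: score};
    weights maps a source name to its weight (default 1.0).
    """
    scores = defaultdict(float)
    for source, candidates in candidates_by_source.items():
        weight = weights.get(source, 1.0)
        for tag, score in candidates.items():
            # Normalise case so "Metadata" and "metadata" count as one tag.
            scores[tag.lower()] += weight * score
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:top_n]]


# Hypothetical candidates: tags other users gave the resource, tags the
# current user applies often, and keywords extracted from the content.
candidates = {
    "resource_tags": {"metadata": 3, "web2.0": 1},
    "user_tags": {"elearning": 2, "metadata": 1},
    "content_keywords": {"Metadata": 0.5, "folksonomy": 0.4},
}
weights = {"resource_tags": 1.0, "user_tags": 0.5, "content_keywords": 2.0}
suggested = recommend_tags(candidates, weights, top_n=3)
```

    A weighted sum is the simplest way to let explicit user input and automatically generated candidates reinforce each other; the weights would have to be tuned for a given community.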
    • http://www.flickr.com/photos/fliegender/sets/1161829
    • Thank you for your attention! http://www.dfki.de/~memmel http://elearningnuts.de http://aloe-project.de