ALOE - Combining User Generated Content and Traditional Metadata

A presentation for the workshop "Metadata 2.0" in Leuven, Belgium (2008/02/07).

Published in Business, Technology

Transcript

  • 1. Leuven, 07.02.2008
    Combining User Generated Content and Traditional Metadata
    Martin Memmel, German Research Center for AI, Kaiserslautern, Germany
    www.dfki.de/~memmel
  • 2. From scarcity to abundance – a short time travel
  • 3. Information was scarce and in the hand of few http://de.wikipedia.org/wiki/Skriptorium
  • 4. Gutenberg: Invention of movable type printing (1439) http://etc.usf.edu/clipart/11300/11358/gutenberg_11358.htm
  • 5. Biblioteca Angelica, Roma http://curiousexpeditions.org/2007/09/a_librophiliacs_love_letter_1.html
  • 6. Plan for the National Library (1785) Étienne-Louis Boullée (1728-1799)
  • 7. Welcome to the digital age!
  • 8. Online Education Database http://oedb.org/library/features/236-open-courseware-collections
  • 9. The Long Tail [Anderson 2004] "We sold more books today that didn't sell at all yesterday than we sold today of all the books that did sell yesterday." http://en.wikipedia.org/wiki/The_Long_Tail
  • 10. http://www.markluthringer.com/RidgemontTypologies/cellphones.html
  • 11. http://www.google.com
  • 12. We need information about the available resources!
  • 13. Web2.0 & User Generated Content
  • 14. http://flickr.com/photos/stabilo-boss
  • 15. Spell with flickr: http://metaatem.net/words/
  • 16. http://en.wikipedia.org/wiki/User_generated_content
  • 17. http://elearningnuts.de
  • 18. http://www.flickr.com/photos/pavel1998/407853059/
  • 19. http://youtube.com/watch?v=36bXgqmQ6rE
  • 20. http://del.icio.us/search/?fr=del_icio_us&p=learntec&type=all
  • 21. http://www.facebook.com
  • 22. http://www.redesignme.org/
  • 23. http://labs.adobe.com/technologies/knowhow/
  • 24. We are experiencing a paradigm shift!
  • 25. http://www.time.com/time/magazine/article/0,9171,1569514,00.html
  • 26. From Traditional Media to Social Media
      • from consumers to producers (prosumers)
      • more democracy, less control
      • it's about the user
      • users are active, contribution is easy
      • everybody can reach a broad audience
      • networking, communication
      • open, public, sharing
  • 27. Is there also a paradigm shift in the creation of metadata?
  • 28. #views #favorites ratings tags comments http://aloe-project.de/
  • 29. Information generated explicitly by users
      • tags, comments, ratings
      • relations (e.g., sets)
      • profiles
      • conversations
      • …
    This gives an understanding of the individuals who contributed (social browsing!)
  • 30. Human Computation
  • 31. http://www.espgame.org/
  • 32. http://recaptcha.net/
  • 33. Collective Intelligence
  • 34. http://www.flickr.com/photos/azlijamil01/231592469/
  • 35. Problems with metadata
  • 36. “Metacrap” [Doctorow 2001]
      • People lie
      • People are lazy
      • People are stupid
      • Mission: Impossible – know thyself
      • Schemas aren't neutral
      • Metrics influence results
      • There's more than one way to describe something!
  • 37. Metadata is context-dependent!
      • the description of a resource strongly depends on
          - the role of the reader
          - the time at which the document is considered
          - the terminology and language in which it is written
          - the tasks he/she is currently working on
          - the expertise and experience available
      • valuable / useless / annoying?
  • 38. Some philosophy…
    Wittgenstein: ‘Die Bedeutung eines Wortes ist sein Gebrauch in der Sprache’ (‘The meaning of a word is its use in the language’)
    Transferred into the world of (digital) resources: ‘The meaning of a resource is its use in the community’
  • 39. Combining different ways to generate metadata
  • 40. How can metadata be generated?
      • metadata generated by experts
      • metadata generated by users and user interactions
          - generated explicitly (e.g. tags, comments, ratings)
          - collected by tracking and observation components (sensors, log files, context information)
      • metadata generated automatically
          - content analysis
              • statistics-based methods
              • NLP
              • ontology-based approaches
              • …
          - inference
  • 41. Benefits, weaknesses of the approaches
      • expert metadata (centralized approaches)
          - expertise about respective domain, formats, standards
          - cope with complex, difficult, time-consuming tasks
          - expensive
          - static (lifecycle? context?)
          - often biased, subjective (motivations?)
      • user generated metadata (distributed approaches)
          - flat, simple (cheap, fast)
          - different opinions, viewpoints (collective intelligence)
      • metadata generated automatically
          - scalable
          - restricted to the available knowledge
          - usually only works reasonably well with text
    [MemmelSchirruTomadakiWolpers2008]
  • 42. Combine the strengths of each approach!
  • 43. Resource profiles – combining metadata
  • 44. [Downes 2004] Resource profiles
    A ‘multi-faceted, wide ranging description of a resource’ which is characterised by the following features:
      • it does not conform to a particular XML schema, but is a patchwork of metadata formats which are assembled as needed in order to form a description that is most appropriate for the given resource
      • it is not authored by a particular author; it consists of a large set of information which is authored by many people
      • it may be distributed, in pieces, across a multitude of locations
      • there is no single canonical or authoritative resource profile associated with a given resource
  • 45. http://aloe-project.de
  • 46. ALOE – a social resource and metadata hub
      • possibility to integrate and aggregate any kind of existing resources and metadata
          - wherever they are located
          - via upload or reference (persistence?)
          - not “just” several repositories
          - not by a single authority!
      • possibility to contribute new resources and metadata
      • access, preview, (inline) player, editor…
      • integration of advanced functionalities
  • 47. Resource types
      • images (bmp, gif, jpg, png, tif, …)
      • documents (pdf, odt, odp, sxw, doc, ppt, …)
      • videos (avi, mpeg, mov, …)
      • web pages
      • audio (aac, mp3, …)
  • 48. Resource locations
      • the Web
      • intranet
      • desktop
      • …
  • 49. ALOE – some features
      • upload and share arbitrary types of digital resources
      • share and organize bookmarks
      • tag, rate, and comment on resources and bookmarks
      • initiate groups and communicate with other users
      • publish as private, public, or only for certain groups
      • find resources with different types of search filters
      • rank search results according to different criteria
      • associate arbitrary metadata sets with resources
      • Web Service API (SOAP, REST)
  • 50. Application: Tag Recommendations
  • 51. Tag recommendations using multiple sources
  • 52. http://www.flickr.com/photos/fliegender/sets/1161829
  • 53. Thank you for your attention! http://www.dfki.de/~memmel http://elearningnuts.de http://aloe-project.de
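
The resource profile of slide 44 and the multi-source tag recommendations of slide 51 can be illustrated together with a small sketch. It is not ALOE's implementation or API: the class names, the example URLs, and the source weights are all hypothetical, and the scoring rule (sum the weights of every source that asserts a tag) is just one simple way to combine expert, user, and automatically generated metadata.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetadataRecord:
    """One piece of metadata about a resource, with its provenance."""
    source: str        # "expert", "user", or "automatic" (the kinds on slide 40)
    author: str        # who or what produced the record
    location: str      # where the record lives (hypothetical URLs below)
    tags: tuple = ()   # tags asserted by this record


@dataclass
class ResourceProfile:
    """A patchwork of records from many authors and locations (slide 44)."""
    resource_url: str
    records: list = field(default_factory=list)

    def add(self, record: MetadataRecord) -> None:
        self.records.append(record)

    def recommend_tags(self, weights: dict, limit: int = 3) -> list:
        """Score each tag by the summed weight of the sources asserting it."""
        scores = defaultdict(float)
        for record in self.records:
            for tag in record.tags:
                # Lower-case so that "Metadata" and "metadata" count as one tag.
                scores[tag.lower()] += weights.get(record.source, 0.0)
        # Highest score first; ties are broken alphabetically for stable output.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        return [tag for tag, _ in ranked[:limit]]


# Illustrative weights only: an assumption, not values from the talk.
SOURCE_WEIGHTS = {"expert": 3.0, "user": 1.0, "automatic": 0.5}

profile = ResourceProfile("http://example.org/slides.pdf")
profile.add(MetadataRecord("expert", "librarian", "http://example.org/catalogue",
                           ("metadata", "e-learning")))
profile.add(MetadataRecord("user", "alice", "http://example.org/bookmarks",
                           ("Metadata", "web2.0")))
profile.add(MetadataRecord("user", "bob", "http://example.org/bookmarks",
                           ("web2.0", "tagging")))
profile.add(MetadataRecord("automatic", "keyword-extractor", "http://example.org/index",
                           ("tagging", "metadata")))

print(profile.recommend_tags(SOURCE_WEIGHTS))
# prints ['metadata', 'e-learning', 'web2.0']
```

Keeping the source, author, and location on every record means the profile never needs a single canonical schema or authority, and the weights can be changed per context without touching the stored metadata.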