Folksonomies: Indexing and Retrieval in Libraries
Presentation given as part of the "Library and Information Studies" degree programme at Karl-Franzens-Universität Graz.

Published in: Education

Transcript

  • 1. Folksonomies: Indexing and Retrieval in Web 2.0 and in Libraries. Dr. phil. Isabella Peters, Heinrich-Heine-Universität Düsseldorf, Abteilung für Informationswissenschaft (Department of Information Science). Uni Graz, 17 December 2009
  • 2. Folksonomies: Indexing without Rules. “Anything goes” (“Against Method”, 1975; Paul K. Feyerabend, Austro-American philosopher). Tagging:
      • no rules
      • no methods, or even against methods
      • indexing a single document
          – synonyms: why not? (New York, NY, Big Apple, …)
          – homonyms: never heard! (not: Java [Programming Language] vs. Java [Island], but just Java)
          – translations: why not? (Singapore, Singapur, …)
          – typing errors: nobody is perfect (Syngapur)
          – hierarchical relations (hyponymy): why not? (Düsseldorf, North Rhine-Westphalia, Germany)
          – hierarchical relations (meronymy): why not? (tree, branch, leaf)
  • 3. Indexing – in general
  • 4. Tri-partite System of Folksonomies. Folksonomies always consist of three parts: 1) document (resource), 2) prosumer (user), 3) tag
  • 5. Users – Tags – Documents. [Diagram: users, tags and documents, with links labelled “shared users”, “thematically linked” (twice) and “shared documents”.]
  • 6. Shared Documents & Thematically Linked Users. [Diagram: “more like this ...”: thematically linked, similar documents, detection of documents; “more like me ...”: shared documents, similar users, detection of communities.]
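
The tri-partite model can be sketched as a list of (user, tag, document) triples. The snippet below is only an illustration under stated assumptions: the sample data and the helper names `index_by_user` and `linked_users` are hypothetical, and treating "thematically linked" as "sharing at least one tag" is one possible reading of the diagram.

```python
from collections import defaultdict

# Hypothetical tagging actions as (user, tag, document) triples.
TRIPLES = [
    ("anna", "london", "doc1"),
    ("anna", "travel", "doc1"),
    ("ben", "london", "doc2"),
    ("ben", "guide", "doc1"),
    ("cara", "java", "doc3"),
]


def index_by_user(triples):
    """Collect, per user, the set of tags used and the set of documents tagged."""
    tags_by_user = defaultdict(set)
    docs_by_user = defaultdict(set)
    for user, tag, doc in triples:
        tags_by_user[user].add(tag)
        docs_by_user[user].add(doc)
    return dict(tags_by_user), dict(docs_by_user)


def linked_users(ego, items_by_user):
    """Return all other users who share at least one item with ego."""
    ego_items = items_by_user.get(ego, set())
    return {
        user
        for user, items in items_by_user.items()
        if user != ego and items & ego_items
    }


tags_by_user, docs_by_user = index_by_user(TRIPLES)
print(linked_users("anna", tags_by_user))  # users linked via shared tags
print(linked_users("anna", docs_by_user))  # users linked via shared documents
```

The same two helpers work in the other direction (documents linked via shared tags or shared users) if the triples are indexed by document instead of by user.
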
  • 7. More like me! Or: More like This User!
      • starting point: single user (ego)
      • processing
          – (1) tag-specific similarity
              • all tags of ego: a(t)
              • all tags of another user B: b(t)
              • common tags of ego and another user B: g(t)
          – (2) document-specific similarity
              • all tagged documents of ego: a(d)
              • all tagged documents of another user B: b(d)
              • common tagged documents of ego and another user B: g(d)
          – calculation of similarity
              • tag-specific (Jaccard-Sneath): Sim(tag; Ego, B) = g(t) / [a(t) + b(t) – g(t)]
              • document-specific (Jaccard-Sneath): Sim(doc; Ego, B) = g(d) / [a(d) + b(d) – g(d)]
              • ranking of the users B_i by similarity to ego (say, top 10 tag-specific and top 10 document-specific users)
              • merging of both lists (exclusion of duplicates)
              • cluster analysis (k-nearest neighbours, single linkage, complete linkage, group average linkage)
          – result presentation: social network with ego in the centre
  • 8. More like me! Or: More like This User! [Figure: single linkage clustering of users around ego (fictitious example). Similarity labels in extraction order: Sim(tag) = 0.45, 0.21, 0.33, 0.15, 0.08, 0.17, 0.65; Sim(doc) = 0.36, 0.25, 0.29, 0.17, 0.11, 0.23, 0.55.]
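
The similarity and ranking steps of "More like me!" can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names and sample data are hypothetical, dropping users with similarity 0 is an added assumption, and the final cluster analysis step is left out.

```python
def jaccard_sneath(a, b):
    """Jaccard-Sneath coefficient g / (a + b - g) for two sets of items."""
    common = len(a & b)
    total = len(a) + len(b) - common
    return common / total if total else 0.0


def top_similar(ego, items_by_user, k=10):
    """Rank all other users by similarity to ego and keep the top k."""
    scores = [
        (jaccard_sneath(items_by_user[ego], items), user)
        for user, items in items_by_user.items()
        if user != ego
    ]
    # Highest similarity first; ties are broken alphabetically.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    # Assumption: users with no overlap at all are not candidates.
    return [user for score, user in scores[:k] if score > 0]


def candidate_neighbours(ego, tags_by_user, docs_by_user, k=10):
    """Merge the tag-specific and document-specific top-k lists without duplicates."""
    by_tag = top_similar(ego, tags_by_user, k)
    by_doc = top_similar(ego, docs_by_user, k)
    return list(dict.fromkeys(by_tag + by_doc))


# Hypothetical sample data: tags used and documents tagged per user.
TAGS = {"ego": {"a", "b", "c"}, "u1": {"a", "b"}, "u2": {"c", "d", "e"}, "u3": {"x"}}
DOCS = {"ego": {"d1"}, "u1": {"d2"}, "u2": {"d1"}, "u3": {"d1", "d2"}}

print(candidate_neighbours("ego", TAGS, DOCS))
```

The merged list would then be the input to a clustering step (for example single linkage) that arranges the candidates around ego.
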
  • 9. Narrow Folksonomies • only one tagger (the content creator) • no multiple tagging • example: YouTube
  • 10. Extended Narrow Folksonomies • more than one tagger • no multiple tagging • example: Flickr (“Add Tags” option). Source: Vander Wal (2005)
  • 11. Broad Folksonomies • more than one tagger • multiple tagging • example: Delicious. Source: Vander Wal (2005)
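
The three variants differ in who may tag a document and whether the same tag may be attached to it more than once. A minimal sketch, assuming a hypothetical `TaggedDocument` class and the kind labels `"narrow"`, `"extended_narrow"` and `"broad"` (none of these names come from the slides):

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class TaggedDocument:
    creator: str
    kind: str  # "narrow", "extended_narrow" or "broad"
    tags: Counter = field(default_factory=Counter)
    assigned: set = field(default_factory=set)

    def add_tag(self, user, tag):
        """Try to attach a tag; return True if the folksonomy type allows it."""
        if self.kind == "narrow" and user != self.creator:
            return False  # narrow: only the content creator may tag
        if (user, tag) in self.assigned:
            return False  # the same user cannot repeat the same tag
        if self.kind != "broad" and self.tags[tag] > 0:
            return False  # no multiple tagging outside broad folksonomies
        self.tags[tag] += 1
        self.assigned.add((user, tag))
        return True


broad = TaggedDocument(creator="alice", kind="broad")
broad.add_tag("alice", "london")
broad.add_tag("bob", "london")
print(broad.tags)  # per-tag counts are what tag distributions are built from
```

Only the broad variant yields per-document tag frequencies greater than one, which is why tag distributions are studied on services of that type.
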
  • 12. Folksonomies make use of Collective Intelligence
      • Collective Intelligence
          – “Wisdom of the Crowds” (Surowiecki)
          – “Hive Minds” (Kroski), “Vox populi” (Galton), “Crowdsourcing”
          – no discussions, diversity of opinions, decentralisation
          – users tag a document independently of each other
          – statistical aggregation of data
      • Collaborative Intelligence
          – discussions and consensus
          – prototype service: Wikipedia (but: 90 + 9 + 1 rule)
      • “Madness of the Crowds”
          – e.g., soccer fans, hooligans
          – no diversity of opinion, no independence, no decentralisation, no (statistical) aggregation
  • 13. Power Tags • Power Law Distribution • Inverse-logistic Distribution
  • 14. Power Law Tag Distribution. [Figure: tags for www.visitlondon.com; x-axis: tags, y-axis: users (0 to 70). A few Power Tags are followed by a Long Tail; 80/20 rule; f(x) = C / x^a.] Source: http://del.icio.us
  • 15. Inverse-logistic Tag Distribution. [Figure: tags for www.asis.org; x-axis: tags, y-axis: users (0 to 35). A Long Trunk of Power Tags is followed by a Long Tail; f(x) = e^(-C'(x-1)^b).] Source: http://del.icio.us
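
The two curve shapes can be compared numerically. The sketch below assumes the power law has the form f(x) = C / x^a and the inverse-logistic curve the form f(x) = e^(-C'(x-1)^b), where x is the rank of a tag; the function names and the parameter values are hypothetical and chosen only to make the shapes visible.

```python
import math


def power_law(rank, c, a):
    """Power law: f(x) = C / x^a for rank x = 1, 2, 3, ..."""
    return c / rank ** a


def inverse_logistic(rank, c_prime, b):
    """Inverse-logistic: f(x) = e^(-C'(x-1)^b) for rank x = 1, 2, 3, ..."""
    return math.exp(-c_prime * (rank - 1) ** b)


# Illustrative parameters only; both curves start at 1.0 for the top tag.
for rank in range(1, 11):
    p = power_law(rank, c=1.0, a=1.0)
    i = inverse_logistic(rank, c_prime=0.01, b=3)
    print(f"rank {rank:2d}  power law {p:.3f}  inverse-logistic {i:.3f}")
```

With these values the power law drops sharply right after the first rank, while the inverse-logistic curve stays near its maximum for several ranks (the "long trunk") before falling into the long tail.
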
  • 16. Use of Power Tags • Power Tags as a factor in relevance ranking: documents tagged with Power Tags appear higher in the ranking • Power Tags as candidate tags for Tag Gardening: which (semantic) relations do they have with co-occurring tags?
  • 17. Benefits of Indexing with Folksonomies
      • authentic user language: a solution to the “vocabulary problem”
      • up-to-dateness
      • multiple interpretations, many perspectives: bridging the semantic gap
      • increased access to information resources
      • follow the “desire lines” of users
      • cheap indexing method: shared indexing
      • the more taggers, the better the system becomes: network effects
      • capable of indexing mass information on the Web
      • resources for the development of knowledge organization systems
      • mass quality “control”
      • searching, browsing, serendipity
      • neologisms
      • identify communities and “small worlds”
      • collaborative recommender system
      • make people sensitive to information indexing
  • 18. Disadvantages of Indexing with Folksonomies
      • absence of controlled vocabulary
      • different basic levels (in the sense of Eleanor Rosch)
      • different interests: loss of context information
      • language merging
      • hidden paradigmatic relations
      • merging of formal (bibliographical) and aboutness tags
      • no specific fields
      • tags that express evaluations (“stupid”)
      • spam tags
      • syncategoremata (user-specific tags, “me”)
      • performative tags (“to do”, “to read”)
      • other misleading keywords
      Solution: Tag Gardening with methods of Information Linguistics, user collaboration in giving meaning to tags, and combination with existing knowledge organization systems
  • 19. Goal of Tag Gardening: Emergent Semantics. Source: Peters, I., & Weller, K. (2008). Tag Gardening for Folksonomy Enrichment and Maintenance. Webology, 5(3), Article 58, from http://www.webology.ir/2008/v5n3/a58.html.
  • 20. Maintenance of KOS and Folksonomy. [Diagram: Folksonomy and KOS connected by Tag Gardening: new terms, new relations.] Source: Christiaens, S. (2006). Metadata Mechanism: From Ontology to Folksonomy…and Back. Lecture Notes in Computer Science, 4277, 199–207.
  • 21. Feedback Loop in Practice: Tagging of OPACs. Two possibilities: 1) tagging of resources within the library’s website, 2) tagging of resources outside the library’s firewall
  • 22. Tagging of OPACs within the Library’s Website: PennTags. http://tags.library.upenn.edu/
  • 23. Tagging of OPACs within the Library’s Website: Ann Arbor District Library. http://www.aadl.org/catalog
  • 24. Tagging of OPACs within the Library’s Website: University Library Hildesheim. http://www.uni-hildesheim.de/mybib/all_tags
  • 25. Tagging of OPACs within the Library’s Website: advantages
      – user behaviour can be directly observed and exploited for the library’s own applications
      – the knowledge organization system (KOS) in use can profit from user behaviour and user language
      – users will be “attracted” to the library
      – the library will appear “trendy”
  • 26. Tagging of OPACs within the Library’s Website: disadvantages
      – development and implementation of the tagging service (costs and manpower) must be borne by the library
      – if only users may tag: librarians may lose their work motivation or feel useless
      – “lock-in” effect for users: no “fresh” ideas
  • 27. Tagging of Resources Outside the Library’s Firewall: LibraryThing. http://www.librarything.com/search
  • 28. Tagging of Resources Outside the Library’s Firewall: BibSonomy. http://www.bibsonomy.org/
  • 29. Tagging of Resources Outside the Library’s Firewall: advantages
      – development and implementation of the tagging service (costs and manpower) need not be borne by the library
      – the library may profit from the know-how of the provider of the tagging system
      – users may profit from the tagging activities of hundreds of other users: no lock-in
      – the library appears “trendy”
  • 30. Tagging of Resources Outside the Library’s Firewall: disadvantages
      – user behaviour cannot be observed or exploited
      – your users support another tagging service
      – the KOS in use cannot profit from user behaviour
  • 31. Excursus: Sentiment Tags • negative tags: “awful”, “foolish”, … • positive tags: “amazing”, “useful”, … • applicable for sentiment analysis of documents. Source: Yanbe, Y., Jatowt, A., Nakamura, S., & Tanaka, K. (2007). Can Social Bookmarking Enhance Search in the Web? In Proceedings of the 7th ACM/IEEE Joint Conference on Digital Libraries, Vancouver, Canada (pp. 107–116).
  • 32. Summary • knowing how folksonomies work is important for their adequate application in both knowledge representation and information retrieval • knowing why folksonomies work is a secret ☺
  • 33. Knowledge Representation and Information Retrieval • two sides of the same coin • Immanuel Kant: Thoughts without content are empty, intuitions without concepts are blind... • Knowledge Representation without Information Retrieval is empty. Information Retrieval without Knowledge Representation is blind. [Diagram: feedback loop between the two.]
  • 34. Folksonomies and Knowledge Organization Systems • two sides of the same coin • no rivals: they work best in combination! [Diagram: feedback loop between folksonomies (flexible, up-to-date, user-centric) and knowledge organization systems (precise, rigid, complete).]
  • 35. Greetings from Düsseldorf. Published in 2009 by Saur, de Gruyter. Contact: isabella.peters@uni-duesseldorf.de