Tags as tools for social classification

Presentation held at the 34th Annual Conference of the German Classification Society, Karlsruhe, 21-23 July 2010.

Transcript of "Tags as tools for social classification"

  1. Tags as Tools for Social Classification
     Dr. Isabella Peters, Department of Information Science, Institute for Language and Information, Heinrich-Heine-University Düsseldorf, Germany
     34th Annual Conference of the German Classification Society, July 2010
  2. Outline
       • Theoretical assumptions:
           • Social classification can be based on folksonomies
           • Power Tags are the most relevant tags
           • Tag distributions on resource level become stable
       • Three main research questions:
           • How to build social classifications (automatically)?
           • Are Power Tags most relevant for a resource?
           • (When do tag distributions become stable?)
       • Results
           • Based on a study with students of the University of Düsseldorf
  3. Assumption I
       • Social classification can be based on folksonomies
       • Folksonomy = sum of all tags of all users of a collaborative information service (e.g. delicious)
       • Platform folksonomy vs. resource folksonomy
       • Broad folksonomy (delicious) vs. narrow folksonomy (YouTube)
       • Social classification = collaborative knowledge representation with natural-language terms = "social categorization"
  4. Assumption I
       • Social classification can be based on folksonomies
       • Through its tags, a resource folksonomy reflects the collective intelligence of users in giving meaning to the resource
       • The most popular tags are the most important tags for the resource = Power Tags
       • Only observable in broad folksonomies, because of multiple tagging!
       • Folksonomies deliver concept candidates for social classification
  5. Method I: Social classification can be based on folksonomies
       • Aim: finding tag pairs for the construction of a social classification
       • Step 1: Calculating Power Tags for a resource
           • The number n of Power Tags depends on the type of tag distribution
               • Power law: n = exponent
               • Inverse-logistic distribution: n = tags left of the turning point
       [Figures: power-law distribution; inverse-logistic distribution]
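Step 1 for the power-law case could be sketched as follows. This is a minimal Python sketch under stated assumptions, not the method used in the study: the slide only says that n equals the exponent, so estimating the exponent with a least-squares fit on log-log rank/frequency data, rounding it to an integer, and the names `estimate_power_law_exponent` and `power_tags` are all hypothetical. The inverse-logistic case (tags left of the turning point) is not covered here.

```python
import math


def estimate_power_law_exponent(frequencies):
    """Estimate a power-law exponent from tag frequencies.

    Assumption: the exponent is the negated least-squares slope of
    log(frequency) against log(rank). Needs at least two positive values.
    """
    ranked = sorted(frequencies, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(freq) for freq in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def power_tags(tag_counts, n):
    """Return the n most frequent tags (ties broken alphabetically)."""
    ranked = sorted(tag_counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:n]]


# Hypothetical resource folksonomy: tag -> number of users who assigned it.
counts = {"folksonomy": 100, "tagging": 25, "web2.0": 11, "social": 6, "toread": 4}
exponent = estimate_power_law_exponent(counts.values())
n = max(1, round(exponent))  # assumption: round the exponent, keep at least one tag
print(power_tags(counts, n))
```

The sample counts are invented so that the distribution is close to a power law; real resource folksonomies would first need a check of which distribution type fits.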
  6. Method I: Social classification can be based on folksonomies
       • Step 2: Calculating co-occurrence for Power Tags and tags of the platform folksonomy
       • Basis = Power Tags I from resource level
       • Power Tags II = co-occurring tags from platform level
       • The tag pair is most valuable for social categorization because it reflects collective user intelligence
       [Figure: Power Tags I; Power Tags II]
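The co-occurrence step could look roughly like the sketch below. It assumes the platform folksonomy is available as a list of bookmarks, each a collection of tags, and that the number n of Power Tags II is already known; the slides do not specify a data format, so `cooccurring_tags`, `tag_pairs` and the sample bookmarks are hypothetical.

```python
from collections import Counter


def cooccurring_tags(posts, power_tag):
    """Count how often each tag appears in the same bookmark as power_tag."""
    counts = Counter()
    for tags in posts:
        tag_set = set(tags)
        if power_tag in tag_set:
            counts.update(tag_set - {power_tag})
    return counts


def tag_pairs(posts, power_tags_1, n):
    """Pair each Power Tag I with its n most frequent co-occurring tags."""
    pairs = []
    for pt1 in power_tags_1:
        ranked = sorted(
            cooccurring_tags(posts, pt1).items(),
            key=lambda item: (-item[1], item[0]),  # ties broken alphabetically
        )
        pairs.extend((pt1, tag) for tag, _ in ranked[:n])
    return pairs


# Hypothetical platform-level bookmarks (one tag set per bookmark).
posts = [
    {"android", "mobile", "google"},
    {"android", "mobile"},
    {"android", "google"},
    {"android", "phone"},
    {"web", "ajax"},
]
print(tag_pairs(posts, ["android"], n=2))
```

The resulting pairs are only candidates; deciding which semantic relation holds between the two tags remains the intellectual step described on the next slide.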
  7. Research Question I: How to build social classifications (automatically)?
       • Step 3: Determination of Power Tags I and II can be carried out automatically
           • 1) Identifying the distribution type
           • 2) Labeling the first n tags as Power Tags I
           • 3) Identifying co-occurring tags
           • 4) Identifying the distribution type
           • 5) Extracting the first n tags as Power Tags II
           • 6) Combining Power Tags I and Power Tags II as tag pairs
       • Step 4: Intellectual determination of the relationship between Power Tags I and Power Tags II → collaborative or individual
  8. Research Question I: How to build social classifications (automatically)?
       • Examples:
       • 1. a) Power Tags I
           • Android
       • 1. b) Power Tags II
           • Mobile
           • Google
       • 2. a) Power Tags I
           • Web 2.0
       • 2. b) Power Tags II
           • Tools
           • Social
           • Blog
           • Socialsoftware
           • Bookmarks
           • Community
           • Tagging
           • Web
           • AJAX
           • online
       Descriptor set: Android
           Abbr.  Term            Relation                  Relation type
           RT     Google          related term              association
           RT     mobile          related term              association
       Descriptor set: Web 2.0
           Abbr.  Term            Relation                  Relation type
           BT     web             broader term              hierarchy
           NTP    blog            narrower term partitive   meronymy
           NTP    bookmarks       narrower term partitive   meronymy
           NTP    tagging         narrower term partitive   meronymy
           NTP    community       narrower term partitive   meronymy
           NTP    ajax            narrower term partitive   meronymy
           RT     online          related term              association
           UF     Socialsoftware  used for                  synonymy
  9. Assumption II
       • Power Tags are the most relevant tags
       • To build social classifications based on Power Tags, an important precondition must be fulfilled:
           • Power Tags ARE the most relevant tags for a resource
       • Problem: relevance judgments as well as tagging behaviour are highly subjective and error-prone (regarding spelling etc.)
       • Is the collective intelligence of users capable of "ironing out" overly personal and erroneous tags so that all users are satisfied with high-frequency tags?
  10. Method II
       • Power Tags are the most relevant tags
       • Investigation of 30 resources downloaded from delicious in February 2010
       • Participants: 20 students of Information Science at the HHU Düsseldorf
       • All resources were tagged with "folksonomy" and tagged by at least 100 users
           • To guarantee that students are technically able to judge the relevance of tags
           • To guarantee that broad tag distributions can be used as a test sample
       • User evaluation
           • Tag is relevant for resource = indicated with 1
           • Tag is not relevant for resource = indicated with 0
           • Students had access to the resource
           • Students did not know the delicious rank of the tags
           • Relevance distribution of tags for every resource by student judgments
  11. Research Question II
       • Are Power Tags most relevant for a resource?
       • Determination of relevance: 50% or more of the students judged the tag as relevant
       • Extraction of the Top 10 delicious tags
       • How many students called these Top 10 tags relevant?
       • Calculation of the relative frequency of students' relevance judgments
       [Figure: average Pearson correlation ≈ 0.49, N = 30]
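The relevance rule and the relative frequency of judgments could be computed along these lines. This is a hedged sketch with invented data: the function names are hypothetical, and the slide does not state which two variables the Pearson coefficient was computed on, so pairing delicious tag counts with relevance shares is an assumption made only for illustration.

```python
import math


def relevance_shares(judgments):
    """Map each tag to the share of students who judged it relevant (1)."""
    return {tag: sum(votes) / len(votes) for tag, votes in judgments.items()}


def relevant_tags(judgments, threshold=0.5):
    """Tags judged relevant by at least `threshold` of the students."""
    shares = relevance_shares(judgments)
    return [tag for tag, share in shares.items() if share >= threshold]


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical judgments of four students (1 = relevant, 0 = not relevant).
judgments = {
    "folksonomy": [1, 1, 1, 0],
    "tagging": [1, 1, 0, 0],
    "toread": [0, 0, 1, 0],
}
# Hypothetical delicious counts for the same tags, in the same order.
delicious_counts = [120, 80, 15]

shares = relevance_shares(judgments)
print(relevant_tags(judgments))
print(pearson(delicious_counts, list(shares.values())))
```

In the study the coefficient is reported as an average over 30 resources, so a computation like this would be repeated per resource and the results averaged.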
  12. Research Question II
       • Are Power Tags most relevant for a resource?
       • Result: only the first two tags are relevant
       • Strong indication for Power Tags
       • Problems in relevance judgments
           • Bias towards German tags
           • No unification of spelling variants → solution: tag gardening (NLP)
           • No combination of phrase tags
  13. Assumption III
       • Tag distributions on resource level become stable
       • Studies showed that the shape of tag distributions remains stable after reaching a particular number of tags and users
           • Kipp & Campbell (2006)
           • Maarek et al. (2006)
           • Halpin, Robu, & Shepherd (2007)
           • Maass, Kowatsch, & Münster (2007)
           • Maier & Thalmann (2007)
  14. Assumption III
       • Tag distributions on resource level become stable
       • If this assumption is true and "stable" is taken to mean that
           • No rank permutations of tags appear anymore
           • The relative number of tags does not change anymore
       • it means that …
           • Power Tags I and II are like a controlled vocabulary for a resource
           • Users have reached consensus in describing and tagging the resource, which is visualized in Power Tags
           • Tags in the Long Tail of the distribution may be synonyms, tags with typing errors, narrower concepts, etc.
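The first stability criterion, that no rank permutations appear anymore, could be checked between two snapshots of a resource folksonomy as in the sketch below. The snapshot format (tag mapped to count), the alphabetical tie-breaking and the optional restriction to the top k ranks are assumptions for illustration; the names `ranked_tags` and `has_rank_permutation` are hypothetical.

```python
def ranked_tags(tag_counts, top_k=None):
    """Tags ordered by descending count (ties broken alphabetically)."""
    ranked = sorted(tag_counts.items(), key=lambda item: (-item[1], item[0]))
    tags = [tag for tag, _ in ranked]
    return tags if top_k is None else tags[:top_k]


def has_rank_permutation(earlier_counts, later_counts, top_k=None):
    """True if the tag ranking differs between the two snapshots."""
    return ranked_tags(earlier_counts, top_k) != ranked_tags(later_counts, top_k)


# Hypothetical snapshots of one resource at two points in time.
earlier = {"folksonomy": 5, "tagging": 3, "web2.0": 1}
later = {"folksonomy": 9, "tagging": 4, "web2.0": 5}

print(has_rank_permutation(earlier, later))           # whole ranking
print(has_rank_permutation(earlier, later, top_k=1))  # first rank only
```

Restricting the check to the top ranks reflects the idea that only the Power Tags need to settle, while the Long Tail may keep changing.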
  15. Open Research Question III
       • When do tag distributions become stable?
       • To automate classification processes, we need to know after which number of tagging users a tag distribution remains stable and when no changes in the ranking of tags appear anymore
       • After that we can extract Power Tags for social classification for the particular resource
  16. Open Research Question & Method III
       • When do tag distributions become stable?
       • Comparison of the tag distribution with n users and the final tag distribution (downloaded at a point in time)
       • Calculation of the relative frequency of every tag, $\text{rel. freq}(t_1 \dots t_n)$, for particular user numbers
       • Calculation of the average distance between the final tag distribution and the tag distribution with n users
           • Subtraction of $\sum \text{rel. freq}(t_n, fd)$ of the final distribution and $\sum \text{rel. freq}(t_n, td)$ of the tag distribution with n users
       • Stability is achieved when
         $\sum \text{rel. freq}(t_n, fd) - \sum \text{rel. freq}(t_n, td) < \text{threshold value}$
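The comparison on slide 16 could be prototyped as below. This is a sketch of one possible reading, not the study's procedure: since relative frequencies summed over all tags equal 1 in both distributions, the sketch assumes the distance is the sum of absolute per-tag differences, and it assumes the data is a chronological list of tag sets, one per tagging user. All names are hypothetical.

```python
from collections import Counter


def relative_frequencies(tag_assignments):
    """Relative frequency of each tag over all given users' tag sets."""
    counts = Counter(tag for tags in tag_assignments for tag in tags)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: count / total for tag, count in counts.items()}


def distance(final_freqs, partial_freqs):
    """Sum of absolute per-tag differences (an assumed distance measure)."""
    tags = set(final_freqs) | set(partial_freqs)
    return sum(
        abs(final_freqs.get(tag, 0.0) - partial_freqs.get(tag, 0.0)) for tag in tags
    )


def stable_from(tag_assignments, threshold):
    """Smallest user count n from which the distance to the final
    distribution stays below the threshold for every later user count."""
    final_freqs = relative_frequencies(tag_assignments)
    result = None
    for n in range(len(tag_assignments), 0, -1):
        partial = relative_frequencies(tag_assignments[:n])
        if distance(final_freqs, partial) < threshold:
            result = n
        else:
            break
    return result


# Hypothetical chronological tag assignments of four users for one resource.
assignments = [["a", "b"], ["a"], ["a", "b"], ["a", "b"]]
print(stable_from(assignments, threshold=0.1))
```

Walking backwards from the final distribution means a temporary dip below the threshold early on is not mistaken for lasting stability. A suitable threshold value is left open on the slide and would have to be determined empirically.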
  17. Conclusion
       • Social classification can be based on folksonomies: Power Tags are concept candidates
       • Extraction of Power Tags I and II pairs can be carried out automatically
       • Determination of the relationship inherent in tag pairs requires intellectual processing
       • Power Tags are the most relevant tags
       • Relevance of tags can be enhanced through unification and combination of similar tags (here: not synonyms but spelling variants) → tag gardening
       • Ongoing research: when do tag distributions become stable?
  18. Conclusion
       [Diagram: workflow with stages grouped under "Automatic processing" and "Intellectual processing"; labels in order of appearance:]
       • What type of tag distribution?
       • Tag distribution stable?
       • Extraction of Power Tags I & II
       • Pairs of relevant Power Tags
       • Candidate vocabulary
       • Definition of concepts and of semantic relations
       • Intellectual structuring
       • Social knowledge organization system
  19. Comments? Questions?
      Isabella Peters: isabella.peters@uni-duesseldorf.de
      Greetings from Düsseldorf!
      This presentation is available on SlideShare: http://www.slideshare.net/isabellapeters
  20. References
       • Halpin, H., Robu, V., & Shepherd, H. (2007). The Complex Dynamics of Collaborative Tagging. In C. L. Williamson, M. E. Zurko, P. F. Patel-Schneider, & P. J. Shenoy (Eds.), Proceedings of the 16th International WWW Conference, Banff, Alberta, Canada (pp. 211–220). New York: ACM.
       • Kipp, M., & Campbell, D. (2006). Patterns and Inconsistencies in Collaborative Tagging Systems: An Examination of Tagging Practices. In Proceedings of the 17th Annual Meeting of the American Society for Information Science and Technology, Austin, Texas, USA.
       • Maarek, Y., Marnasse, N., Navon, Y., & Soroka, V. (2006). Tagging the Physical World. In Proceedings of the Collaborative Web Tagging Workshop at WWW 2006, Edinburgh, Scotland.
       • Maass, W., Kowatsch, T., & Münster, T. (2007). Vocabulary Patterns in Free-for-all Collaborative Indexing Systems. In Proceedings of International Workshop on Emergent Semantics and Ontology Evolution, Busan, Korea (pp. 45–57).
       • Maier, R., & Thalmann, S. (2007). Kollaboratives Tagging zur inhaltlichen Beschreibung von Lern- und Wissensressourcen. In R. Tolksdorf & J. Freytag (Eds.), Proceedings of XML Tage, Berlin, Germany (pp. 75–86). Berlin: Freie Universität Berlin.
       • Peters, I. (2009). Folksonomies: Indexing and Retrieval in Web 2.0. Berlin: De Gruyter, Saur.
       • Peters, I., & Stock, W. G. (2010). "Power Tags" in Information Retrieval. Library Hi Tech, 28(1), 81–93.
       • Peters, I., & Weller, K. (2008). Tag Gardening for Folksonomy Enrichment and Maintenance. Webology, 5(3), Article 58, from http://www.webology.ir/2008/v5n3/a58.html.
       • Stock, W. G. (2006). On Relevance Distributions. Journal of the American Society for Information Science and Technology, 57(8), 1126–1129.