• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
Structure of tagging on Wikipedia pages and other networks -- Wikimania 2010 -- Gdansk
 

Structure of tagging on Wikipedia pages and other networks -- Wikimania 2010 -- Gdansk

on

  • 835 views

 

Statistics

Views

Total Views
835
Views on SlideShare
835
Embed Views
0

Actions

Likes
0
Downloads
5
Comments
2

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

CC Attribution-ShareAlike LicenseCC Attribution-ShareAlike License

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel

12 of 2 previous next

  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
  • please send me an email to illes.farkas@gmail.com with this request. thanks.
    Are you sure you want to
    Your message goes here
    Processing…
  • Please use a free license, I would Like to upload these slides to Commons and add them in Wikimania wiki.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    Structure of tagging on Wikipedia pages and other networks -- Wikimania 2010 -- Gdansk Structure of tagging on Wikipedia pages and other networks -- Wikimania 2010 -- Gdansk Presentation Transcript

    • Tagging structure in a protein-protein interaction network, a co-authorship network and the (English) Wikipedia Proteins Math co-authors Wikipedia articles Network Nodes + Links Node tags Directed Acyclic Graph Gergely Palla , Illés J. Farka s , P é ter Pollner , Imre Derényi, Tamás Vicsek Category tree FIFA World Cup FIFA World Cup Players Classification of co-authored papers Combinatorics Graph theory Biochemical functions Growth Cell growth E ötvös University and Hungarian Academy of Sciences (Budapest, Hungary) Interactions (“MIPS”) Co-authorships (“MathSciNet”) Hyperlinks in Wikipedia CFinder.org
    • G. Palla, I. J. Farkas, P. Pollner, I. Derényi, T. Vicsek  Fundamental statistical features and self-similar properties of tagged networks   New Journal of Physics   10 , 123026 (2008)  http://CFinder.org --> Publications Full version:
    • There is no clearly separated group of “most popular tags” Transition between popular and less popular tags is continuous Portion ( x ) of all nodes in the network Probability that a given tag and its descendants label a portion x of all nodes CFinder.org
    • There is no clear group of “most heavily tagged nodes” Transition between strongly and less strongly tagged nodes is continuous Number of tags ( n ) on a node (protein, author, wiki article) of the network Probability that a node has n tags CFinder.org
    • On the large scale there is no “link saturation” within a topic In fact, link density within a large topic is almost the same as outside CFinder.org Number of links among these nodes Number of Wikipedia articles selected from those labeled with “Japan” and descendant terms Maximum possible Wikipedia average
    • CFinder.org A B C D E F G H J Meaningful removal of loops from the category hierarchy How to achieve tree structure (DAG) with the lowest number of removals Example (Oct.2007):
      • Finding all loops is easy.
      • But which of their links should be removed?
      • Goals:
      • Remove lowest possible number of links
      • - Smallest “damage” to existing category hierarchy
      subcategory subcategory Category:Urdu Category:Hindustani
    • CFinder.org Meaningful removal of loops from the category hierarchy How to achieve tree structure (DAG) with the lowest number of removals Example (Oct.2007):
      • Identify “loop subgraph”
      • by iteratively removing nodes with 1 link
      • (2) Iteratively remove “least important” links
      • from the loop subgraph
      • (3) Add non-loop links again
      A B C D E F G H J subcategory subcategory
      • Finding all loops is easy.
      • But which of their links should be removed?
      • Goals:
      • Remove lowest possible number of links
      • - Smallest “damage” to existing category hierarchy
      Category:Urdu Category:Hindustani
    • Gergely Palla Illés Farka s P é ter Pollner Imre Derényi Tamás Vicsek
    • http://CFinder.org -- with network data and free analysis software We thank for support from: