Hyperlink Formation in Social Bookmarking Systems: 
Who is Who Online? 
1st European Conference 
on Social Networks (EUSN) 
1-4 July 2014, Barcelona, Spain 
http://www.eusn.org 
Juan D. Borrero, jdiego@uhu.es 
Estrella Gualda, estrella@uhu.es 
José Carpio, jose.carpio@dti.uhu.es 
University of Huelva, Spain
Why this paper?
Network centrality
• High centrality is a property of large networks
such as the Internet (Barabási and Albert, 1999;
Barabási et al., 2000).
• The Zipf/power law is a defining characteristic
of large-scale networks such as the Web (e.g.
Barabási and Albert, 1999), which implies a high
degree of network centralization.
• It is unclear whether this also holds for
sub-networks.
Why this paper?
Node centrality and preferences
• Web links confer “authority” (Kleinberg, 1999) or “endorsement” (Davenport and Cronin, 2000) and can indicate preferences (see Baldassarri and Diani, 2007).
• We argue that the prominence of a website, as measured by its status or public recognition, also determines its centrality.
• Little of the literature has taken into account implicit relations between nodes within online networks (e.g., Ackland and O’Neil, 2011).
• The ties among bookmarked websites can indicate preferences.
Why this paper?
Social Bookmarking
• Provides a huge amount of user-generated annotations for web content.
• Reflects the interests of millions of users.
• Collective knowledge (wisdom of crowds).
• Research areas: (Web) search and content classification, ontology building, trend detection, recommendation, sociology …
Source: http://blog.hubspot.com/blog/tabid/6307/bid/7372/9-Reasons-Why-Your-Social-Media-Strategy-Isn-t-Working.aspx/
Outline
• Context and Topic of Study
– Delicious
– Delicious as a hyperlink network
– Globalization of Agriculture
• Objectives and Hypotheses
• Methodology
– Data collection
– Analysis
• Results
– Network centralization
– Top authoritative nodes
– Visualization network
• Discussion and Conclusions
• Further Research
• Possible Applications
Context and Topic of Study 
Delicious 
A free social bookmarking web service for storing, sharing and discovering web 
bookmarks
Context and Topic of Study
Delicious
Collective nature
• Content is created, annotated and viewed by its users.
• Users can tag each of their bookmarks on the Delicious website, which provides knowledge about the bookmarked URL.
• Users can view bookmarks added or annotated by other users.
Context and Topic of Study
Delicious as a hyperlink network
The structure of a social tagging website such as Delicious can be viewed as a
network of three different, interconnected node types (a tripartite
network): the user who makes the annotation, the link to the resource
(URL), and one or more tags.
• We can see the hyperlink network (u→url).
• We can also see indirect links (e.g. between URLs, drawn as straight lines), which
represent a unipartite network.
[Figure: tripartite network diagram with user nodes (u, u’, u’’), URL nodes (url, url’, url’’) and tag nodes (T, T’, T’’, T’’’).]
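The step from the user→URL hyperlink network to the URL-URL unipartite network can be sketched as follows. This is a minimal illustration in Python, not the authors' code: the `bookmarks` sample data and the function name `project_urls` are hypothetical, and it assumes that two URLs are tied whenever at least one user bookmarked both, with the tie weight counting such users.

```python
from collections import defaultdict
from itertools import combinations


def project_urls(bookmarks):
    """Project (user, url) pairs onto a weighted URL-URL network.

    Returns a dict mapping a sorted (url_a, url_b) pair to the number
    of users who bookmarked both URLs (an assumed tie definition).
    """
    urls_by_user = defaultdict(set)
    for user, url in bookmarks:
        urls_by_user[user].add(url)

    weights = defaultdict(int)
    for urls in urls_by_user.values():
        # Every pair of URLs saved by the same user gains one shared user.
        for a, b in combinations(sorted(urls), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical toy data: three users and three URLs.
bookmarks = [
    ("u", "url"), ("u", "url'"),
    ("u'", "url"), ("u'", "url'"), ("u'", "url''"),
    ("u''", "url''"),
]
ties = project_urls(bookmarks)
```

In this toy example `url` and `url'` share two users, so their tie is the strongest; a user with a single bookmark (`u''`) contributes no tie.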
Context and Topic of Study
Topic
Globalization
• Implies a large market as a result of the reduction in transaction costs of international trade.
Implications: globalization of agriculture
– trade (foods, goods)
– prices (food, goods)
– food consumption (bulk products versus processed products)
– R&D
– rules and laws (subsidies, WTO related to poverty)
Effects: asymmetries
Discussion/diffusion happens more easily with Web 2.0
Objectives and Hypotheses
1. To analyze the centrality of the Delicious sub-network
2. To link node centrality with public recognition
3. To discover implicit relations between URLs
Objectives and Hypotheses
1. To analyze the centrality of the Delicious sub-network
H1: In the globalization of agriculture network on
Delicious (a sub-network), Zipf’s law holds.
2. To link node centrality with public recognition
H2: Websites with higher public recognition will be
more likely to receive a large number of links.
3. To discover implicit relations between URLs
H3: The ties among URLs can indicate preferences.
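H1 can be stated more precisely. A common form of Zipf's law, given here as an assumption since the slides do not fix one, relates the indegree $k_r$ of the URL ranked $r$ to its rank through a power law with exponent $\alpha > 0$ and constant $C$ (all three symbols are introduced here for illustration):

```latex
k_r = C \, r^{-\alpha}
\quad\Longleftrightarrow\quad
\log k_r = \log C - \alpha \log r
```

Under this form, a plot of indegree against rank on log-log axes is approximately a straight line with slope $-\alpha$.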
Methodology
Data collection
(A) Starting point: identify the search
attributes, using an authoritative source as
a baseline to find keywords
connected to the idea of
‘globalization of agriculture’.
(B) A Perl web-crawling program
gathered the sample of users,
URLs and tags from 22 April 2011 to
21 May 2011.
(C) Result: 3,668 users and 4,913
URLs.
(D) A Haskell program reduced
the amount of data by truncating the
URLs.
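Step (D) can be illustrated with a short sketch. The slides do not say how the URLs were cut, so this example assumes they were reduced to their host name, which is the form in which URLs appear in the results. It uses Python's standard `urllib.parse` rather than the authors' Haskell program, and the function name `truncate_url` and the sample URLs are hypothetical.

```python
from urllib.parse import urlsplit


def truncate_url(url):
    """Reduce a full URL to its lower-cased host name (assumed rule)."""
    # urlsplit only recognises the host when a scheme or '//' is present.
    if "//" not in url:
        url = "//" + url
    return urlsplit(url).hostname or ""


# Hypothetical bookmarks, two of them pointing at pages of the same site.
pages = [
    "http://www.nytimes.com/2011/04/22/example-a.html",
    "http://www.nytimes.com/2011/05/01/example-b.html",
    "www.economist.com/node/123",
]
hosts = sorted({truncate_url(p) for p in pages})
```

Collapsing pages onto hosts in this way merges bookmarks of different articles on the same site into one node, which is one plausible reason the URL count shrinks after this step.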
Methodology
Analysis
Social Network Analysis (SNA) with Pajek and
Gephi software:
1. Studying the properties of centrality
2. Visualizing through graphs
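Two of the centrality properties studied can be sketched in plain Python. This is an illustrative sketch, not the Pajek or Gephi implementation: the toy `graph`, the function names, and the closeness normalization (reachable nodes divided by the sum of shortest-path distances) are assumptions, and the tools used in the study may normalize differently.

```python
from collections import deque


def degree_centrality(graph):
    """Number of neighbours of each node in an undirected graph."""
    return {node: len(neigh) for node, neigh in graph.items()}


def closeness_centrality(graph):
    """(reachable nodes) / (sum of shortest-path distances) per node."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


# Hypothetical network: "a" is tied to every other URL.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a", "d"},
    "d": {"a", "c"},
}
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
```

Here node `a` has both the highest degree and the highest closeness, since it reaches every other node in one step.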
Results
H1: Network centralization
Hyperlink network (user→URL). The degree of variability in URL centrality scores according to indegree.
2,148 URLs arranged in rank order by number of
inbound links (a URL’s indegree is the sum of its inbound
links).
Only 10 of the 2,148 URLs (0.47%) account for 17.97% of links.
1% of the URLs (22 of 2,148) account for 26.50% of links.
Zipf’s law
The network is highly centralized around a few nodes.
Why do a few websites receive so many more links?
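The two quantities behind this slide, the share of links held by the top-ranked URLs and the straightness of the rank-indegree curve on log-log axes, can be computed as in the sketch below. The function names and the synthetic `indegrees` list are hypothetical and are not the study data; the least-squares slope is only a rough check of a power law, not a formal test.

```python
import math


def top_share(indegrees, k):
    """Fraction of all inbound links received by the k most-linked URLs."""
    ranked = sorted(indegrees, reverse=True)
    return sum(ranked[:k]) / sum(ranked)


def loglog_slope(indegrees):
    """Least-squares slope of log(indegree) against log(rank)."""
    ranked = sorted((d for d in indegrees if d > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(d) for d in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Synthetic indegrees following k_r = 120 / r exactly (not the study data).
indegrees = [120 / r for r in range(1, 61)]
share = top_share(indegrees, 6)   # share of links held by the top 10% of URLs
slope = loglog_slope(indegrees)   # close to -1 for this synthetic series
```

A slope near a constant negative value across the whole rank range, together with a large `top_share` for small `k`, is the pattern the slide describes as a highly centralized network.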
Results
H2: Top authoritative nodes in the Delicious “Globalization of agriculture” hyperlink
network (user→URL)
Ten most centralized websites
Six of them were well-known media outlets (online newspapers), alongside activist sites.

Rank  Indegree  URL                    Description          Alexa.com rank
1     259       www.nytimes.com        Online newspaper     130
2     170       www.independent.co.uk  Online newspaper     568
3     155       www.naomiklein.org     Activist media site  1,010,476
4     144       www.news.bbc.co.uk     Online newspaper     63
5     124       www.globalresearch.ca  Activist media site  10,795
6     95        www.spiegel.de         Online newspaper     170
7     94        www.guardian.co.uk     Online newspaper     14,493
8     94        www.economist.com      Online newspaper     1,747
9     87        www.corpwatch.org      Activist media site  335,338
10    172       www.theatlantic.com    Online magazine      1,063

Alexa.com rank of www.uab.cat, for comparison: 29,555
The popularity of the bookmarked website determines its centrality.
Results
H2: Top authoritative nodes in the unipartite network (URL-URL)
Ties based on users’ preferences when bookmarking URLs
Ten most centralized websites

Rank  Degree  URL                     Closeness  URL                     Betweenness  URL
1     537     www.nytimes.com         0.4421     www.nytimes.com         0.0930       www.nytimes.com
2     386     www.news.bbc.co.uk      0.4180     www.news.bbc.co.uk      0.0593       www.news.bbc.co.uk
3     337     www.economist.com       0.4068     www.guardian.co.uk      0.0366       www.globalresearch.ca
4     324     www.guardian.co.uk      0.3992     www.economist.com       0.0341       www.guardian.co.uk
5     286     www.ft.com              0.3886     www.rodrik.typepad.com  0.0293       www.naomiklein.org
6     257     www.rodrik.typepad.com  0.3868     www.ft.com              0.0290       www.economist.com
7     243     www.en.wikipedia.org    0.3854     www.en.wikipedia.org    0.0262       www.wikipedia.org
8     222     www.youtube.com         0.3820     www.spiegel.de          0.0207       www.youtube.com
9     218     www.spiegel.de          0.3814     www.washingtonpost.com  0.0191       www.spiegel.de
10    217     www.globalresearch.ca   0.3800     www.globalresearch.ca   0.0184       www.ft.com

Again, the majority were well-known media outlets (online
newspapers) and Web 2.0 sites (YouTube and Wikipedia; Alexa ranks 3 and 6, respectively).
The popularity of the bookmarked website determines its centrality.
Results
H3: Visualization of the URL-URL unipartite network.
Colour: communities. Layout: ForceAtlas2 from Gephi.
We found two different communities:
1. the mass media websites belong to the blue one, and
2. the main activist websites are included in the green cluster.
[Figure panels: complete network; main communities. Node size: degree; betweenness. Ties based on user preferences.]
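How well a two-community split such as the blue and green clusters separates a network can be scored with Newman's modularity Q. The slides do not say which algorithm produced the communities, so this sketch only assumes they come from a modularity-based method such as the one Gephi provides; the function name `modularity` and the toy network are hypothetical.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: iterable of (node_a, node_b) pairs, each tie listed once.
    community: dict mapping every node to its community label.
    """
    edges = list(edges)
    m = len(edges)
    inside = {}   # ties with both ends in the same community
    degree = {}   # sum of node degrees per community
    for a, b in edges:
        ca, cb = community[a], community[b]
        degree[ca] = degree.get(ca, 0) + 1
        degree[cb] = degree.get(cb, 0) + 1
        if ca == cb:
            inside[ca] = inside.get(ca, 0) + 1
    return sum(
        inside.get(c, 0) / m - (degree[c] / (2 * m)) ** 2
        for c in degree
    )


# Hypothetical toy network: two triangles joined by a single tie.
edges = [
    ("m1", "m2"), ("m2", "m3"), ("m1", "m3"),   # "blue" triangle
    ("a1", "a2"), ("a2", "a3"), ("a1", "a3"),   # "green" triangle
    ("m3", "a1"),                               # bridge between them
]
community = {"m1": "blue", "m2": "blue", "m3": "blue",
             "a1": "green", "a2": "green", "a3": "green"}
q = modularity(edges, community)
```

For this toy network Q equals 5/14 for the two-triangle split and 0 when every node is placed in a single community, so higher values indicate a clearer community structure.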
Discussion and conclusions
• Very unequal distribution of power among the URLs bookmarked on
the topic of globalization of agriculture (O1)
– The most bookmarked URLs seem to reflect users’ preferences when bookmarking
websites.
– The user→URL network reflects a strong centralization of preferences and the existence
of a long tail.
• Perhaps not by chance, the main nodes represent authoritative
websites in the world (O2)
– Well-known mass media and popular activist sites far surpassed the other bookmarked
resources.
• The collaborative practice of bookmarking websites in Delicious
can allow us to discover virtual communities (O3)
– The URL-URL unipartite network produces clusters of URLs.
– Identifying key collective actors (represented here through URLs) allows a
better understanding of leadership, influence processes, and power-related
structures.
Further research
• Why is ‘that’ URL so important in the network of
globalization of agriculture?
– Key URLs in this type of network could configure and reconfigure the
evolution of the network (TIME), and structure and even manipulate
the type of interchange of resources in Delicious or in similar
bookmarking sites.
• Go in depth about users.
– To identify key actors who share URLs.
• Distinction between scientists and other professionals.
– To distinguish users by the way they use certain tags.
• Distinction between scientists, other professionals, or other users?
• Identify users with the same tagging patterns, or URLs that were
similarly labelled: study structural equivalences.
• Is it by chance? Do the most prominent actors correspond to a profile of very active
and participative people? Do they usually work in this area (or have it as a hobby), and
is this why they tag so many URLs in Delicious?
Possible Applications
• Producing and “manipulating” public opinion (when
recommending and describing websites) and markets
– If we know the interests of users belonging to a network,
we could also make recommendations.
• For social practitioners, it is a good way to identify key
informants in a community through whom to
disseminate useful and important information.
• Important for researchers interested in formulating
strategies for intervention and mobilization.
• Applications in advertising, e-commerce, social
movements, security…
• …
