Institute of Theoretical Physics 
University of Cologne 
Diploma Thesis 
Analysis and Simulation
of Scientific Networks
Felix Putsch 
July 14, 2003 
supervised by Prof. D. Stauffer
Hereby, I confirm (according to §20(5) of the Prüfungsordnung of July 12, 1996) that I have
composed this diploma thesis alone, using no sources and tools other than those mentioned.
Citations have been marked.
* fxp@thp.uni-koeln.de
Though the mountains divide 
And the oceans are wide 
It's a small world after all 
R. M. and R. B. Sherman
Contents
1. Introduction
1.1. Networks
1.2. Six Degrees of Separation
1.3. Small World Effect
1.4. Science Collaboration Networks
2. Network Models
2.1. Measurements
2.1.1. Small World Effect
2.1.2. Clustering
2.1.3. Scale-Free Behavior
2.2. Regular Lattices
2.3. Erdős-Rényi Random Networks
2.4. Watts-Strogatz Small-World Networks
2.5. Barabási-Albert Network Model
2.6. New Approaches
3. Empirical Collaboration Network
3.1. Typology
3.1.1. Citation Graph
3.1.2. Collaboration Graph
3.2. Building the net
3.2.1. Proceeding
3.2.2. Visualization
3.3. Analysis
3.3.1. Authors per paper
3.3.2. Connections per author
3.3.3. Double vs. unique links
3.3.4. Cluster sizes
3.4. Comparison
3.4.1. Erdős-Rényi random graphs
3.4.2. Watts-Strogatz small-world networks
3.4.3. Barabási-Albert networks
4. Spin models
4.1. Leadership effect
4.1.1. Ising model
4.1.2. Phase transition
4.1.3. Degree distribution
4.1.4. Spin flip model
4.2. Cluster limited Ising models
4.2.1. Proceeding
4.2.2. Results
4.2.3. Bias adjustification
4.2.4. Linear Relationship
5. Barabási-Albert network models
5.1. Modified Barabási-Albert model
5.2. Simulation
5.2.1. Isolated clusters
5.2.2. Merging clusters
5.2.3. scale-free behavior
6. Conclusion
A. Acknowledgements
B. Source code
B.1. Ising model
B.2. Spin flip model
B.3. Modified Barabási-Albert model
C. Figures
Bibliography
1. Introduction
1.1. Networks
Today, the word "network" immediately evokes physically wired networks such as those formed by telephone lines or computer links. The concept of a network is, however, a good deal more general than this.
Mathematically speaking, networks are graphs, i.e. sets of nodes (of whatever kind) connected by edges (links, connections) between certain pairs.
This abstraction has been known for a long time. Probably the first paper on graph theory was written by Euler [1] on the so-called bridge problem of Königsberg. Euler discusses whether or not it is possible to make a round trip that crosses each of Königsberg's seven bridges exactly once (figure 1.1).
The concept of networks can be applied to many theoretical or experimental subjects [2–4]: nodes can be people [5], Internet servers [6], scientists [7, 8] or others, while the links range from e-mails [9] and friendships [5] to citations [10, 11] and more.
Thus, there are numerous different kinds of networks, physical ones (e.g. hard-wired) as well as logical ones (e.g. dependencies) or social ones (e.g. contacts, friendships), extending to topics far from wired networks [9, 12]. The area is under vigorous research. Good reviews can be found in [2–4, 13, 14].
1.2. Six Degrees of Separation
From personal experience, nearly everybody has encountered what we call the small world effect. There are numerous examples:
At a party, we find out that we are connected to some stranger we have just started talking to through only a few intermediaries (or, technically speaking, we are separated from him by only a few degrees). For example, he could be our next-door neighbors' colleague's son. We hear: "My god, what a small world."
Rumors are another example. We are astonished at the pace at which they spread. After a few hours, and thus only a few opportunities to pass a rumor on to others, the whole city seems to know.
Milgram [15] conducted an experiment on this. He instructed a set of people to try to send a letter to some stranger using only personal contacts. He found that an astonishingly short chain of social links is needed for this task, which entered everyday language as "Six Degrees of Separation". Recently, this has been reviewed on a more popular level by a German weekly newspaper [5].
1.3. Small World Effect
Six Degrees of Separation is only one manifestation of a more general principle: the small world effect [16–18].
Observations of many real-world networks in computer science, biology, chemistry, linguistics, sociology, etc. have revealed a crucial difference from regular lattices.
Regarding average (or sometimes maximum) path lengths on such networks, we would expect them to increase linearly with the number of nodes. Instead, we observe distances growing logarithmically with system size.
This behavior is not only an amusing effect but has far-reaching consequences [19]. Prominent examples are the Internet's stability against attacks [20], disease spreading [21, 22] and path-finding strategies [23, 24].
Figure 1.1.: Königsberg bridge problem: Is it possible to make a round trip, passing each bridge exactly once? [1]
1.4. Science Collaboration Networks
In the context of science, the network with scientists as nodes of the graph is of particular interest. This network belongs to the group of social networks, with humans as nodes. Unlike many other forms of social relationships, which are mostly quite difficult to capture objectively, the field of published papers is covered very extensively by the Science Citation Index [25] and is therefore easily available for research.
2. Network Models
There are many types of networks competing to describe observations made in socio-physics. After discussing which measurements describe a given network's structure, we will give a short overview of the models we consider the most important and discuss their advantages and possible disadvantages.
2.1. Measurements
2.1.1. Small World Effect
As illustrated in the introduction, we are interested in the relation between network size and average (or maximum) path length. We investigate whether it shows linear, logarithmic or other behavior. In the logarithmic case, the net is said to show the Small World Effect.
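To make this measurement concrete, the following is a minimal sketch of how an average path length could be computed with a breadth-first search from every node. It is an illustration only, not the code used in this thesis; the adjacency-dictionary representation and the helper names `average_path_length` and `ring_lattice` are assumptions made for the example.

```python
from collections import deque


def average_path_length(adj):
    """Mean shortest-path length over all connected ordered pairs of distinct nodes.

    `adj` maps each node to the set of its neighbors (assumed representation).
    """
    total = 0
    pairs = 0
    for source in adj:
        # Breadth-first search yields shortest distances in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1  # reachable nodes other than the source
    return total / pairs if pairs else 0.0


def ring_lattice(n):
    """Hypothetical helper: ring in which every node is linked to its two neighbors."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


if __name__ == "__main__":
    for n in (100, 200):
        print(n, average_path_length(ring_lattice(n)))
```

On such a ring the value roughly doubles when the number of nodes doubles (about 25.25 for 100 nodes and about 50.25 for 200 nodes), which is the linear behavior that a small-world network does not show.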
2.1.2. Clustering
In a friendship network, we often find the friends of one person to be friends themselves. This is true for most social networks and even for other ones. Links are not spread randomly but arranged in clusters.
To describe this mathematically, we introduce a clustering coefficient $C_i$ of a node $i$, describing the number $m$ of established links among its $k_i$ nearest neighbors as a fraction of the maximum possible number $M = \binom{k_i}{2} = k_i (k_i - 1)/2$, i.e.
$$C_i = \frac{m}{M}.$$
This value is averaged over all vertices to give a clustering coefficient $C$ for the whole network.
Typical values observed are far above the results expected for random networks [2, p. 50].
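The definition can be sketched in a few lines of code. This is an illustrative sketch, not the thesis code; the adjacency-dictionary input and the convention of assigning a coefficient of 0 to nodes with fewer than two neighbors (where the maximum number of links is zero) are assumptions of the example.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of realized links among the neighbors of `node`."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumed convention: undefined case counted as zero
    # m: number of neighbor pairs that are themselves linked
    m = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    max_links = k * (k - 1) / 2
    return m / max_links


def average_clustering(adj):
    """Clustering coefficient of the whole network: mean over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


if __name__ == "__main__":
    # Triangle 0-1-2 with one extra node 3 attached to node 2.
    adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(local_clustering(adj, 2))   # one of three neighbor pairs is linked
    print(average_clustering(adj))
```

In the small example, node 2 has three neighbors of which only one pair is linked, so its coefficient is 1/3, while the two other triangle nodes have coefficient 1.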
2.1.3. Scale-Free Behavior
A third observation regarding social networks concerns their degree distribution. Looking at the frequency of vertices with a given coordination number, we find not an exponential but a power law [26].
In all, we have three possible means to classify networks. Many social graphs show small path lengths, high clustering and scale-free behavior.
2.2. Regular Lattices
The simplest form of a lattice is a symmetrical arrangement of nodes connected by edges between all pairs of nodes (or all pairs of adjacent nodes), as shown in figure 2.1a,b. A reason to choose this linking is, for example, to simulate neighborhood relations in a town.
Such networks show a high degree of clustering, as desired. The average path lengths are very long, though, and scale with system size. So small world behavior cannot be found, which makes the model inappropriate for our needs.
Figure 2.1.: network types: a, b regular lattices, c random network, d scale-free graph [14]
2.3. Erdős-Rényi Random Networks
Random networks are the extreme on the other side of the spectrum. A number of nodes is wired by pure chance, i.e. we roll dice to select two nodes and place an edge between them.
Such graphs were first proposed by Solomonoff and Rapoport [27] and have been studied extensively by Erdős and Rényi [28]. Current results have been reviewed in [29]. A typical result can be seen in figure 2.1c.
We find that average path lengths grow logarithmically with network size. While this is as desired for small world simulation, there is obviously no clustering.
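The wiring rule described in this section can be sketched as follows. This is a minimal illustration, not the thesis code; the function name `random_graph`, the fixed number of edges and the rejection of duplicate edges are assumptions made for the example.

```python
import random


def random_graph(n, n_edges, seed=None):
    """Place `n_edges` distinct edges between randomly chosen pairs of `n` nodes."""
    if n_edges > n * (n - 1) // 2:
        raise ValueError("more edges requested than distinct node pairs exist")
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    placed = 0
    while placed < n_edges:
        # "Roll the dice": pick two different nodes uniformly at random.
        a, b = rng.sample(range(n), 2)
        if b not in adj[a]:  # assumption: an existing edge is not placed twice
            adj[a].add(b)
            adj[b].add(a)
            placed += 1
    return adj


if __name__ == "__main__":
    g = random_graph(20, 30, seed=1)
    print(sum(len(neighbors) for neighbors in g.values()) // 2)  # number of edges
```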
2.4. Watts-Strogatz Small-World Networks
It is a plausible idea to try combining both models presented so far in order to unite their respective advantages. A big step towards this goal was made by Watts and Strogatz [16].
Their model starts with a circular graph that is regularly wired (figure 2.2). Step by step, edges are chosen at random and rewired to an arbitrary destination node. Thus, a small fraction of the links become long-range ones. As an illustration, think of the inhabitants of a street of neighbors who have relationships with far-away relatives.
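The rewiring procedure can be sketched as below. This is a hedged illustration rather than a reference implementation: the parameters `k` (number of ring neighbors per node) and `p` (rewiring probability per edge), and the rule that a rewired edge keeps one end and avoids self-loops and duplicate links, are assumptions of the example and are not specified in the text above.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Ring of n nodes, each linked to its k nearest neighbors, edges rewired with probability p."""
    if k % 2 or not 0 < k < n:
        raise ValueError("k must be even and satisfy 0 < k < n")
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Regular ring: link every node to k/2 neighbors on each side.
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    # Rewiring: each original edge (i, j) gets a new random far end with probability p.
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            if j in adj[i] and rng.random() < p:
                candidates = [t for t in range(n) if t != i and t not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


if __name__ == "__main__":
    g = watts_strogatz(20, 4, 0.1, seed=1)
    print(sum(len(neighbors) for neighbors in g.values()) // 2)  # edge count stays n*k/2
```

Because every rewiring step removes one link and adds one new link, the total number of edges stays at n*k/2 for any value of p.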
At first, the model seems to fulfill our wishes. It shows small average path lengths as well as high clustering. Looking at the degree distribution, i.e. the frequency of nodes of a certain degree, however, we find strong differences from real-world data, as there is no scale-free behavior.
Figure 2.3.: Barabási and Albert [30]
2.5. Barabási-Albert Network Model
Barabási and Albert [30] introduced a new idea. Their model consists of two ingredients: growth and preferential attachment.
We start with a graph of m0 = 3 vertices, each connected to every other. Now, in each time step, we add a node that is connected to others by m = 3 links. The new node forms one end of each link; the other end is chosen at random from the existing network. The probability of a vertex being selected is proportional to the number of links already attached to it.
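The two ingredients, growth and preferential attachment, can be sketched as follows. This is a minimal sketch and not the source code listed in the appendix of this thesis. It assumes that the m links of a new node go to m different existing nodes, and it uses a list (`stubs`, a name chosen for this example) in which every node appears once per attached link, so that a uniform draw from the list selects a node with probability proportional to its degree.

```python
import random


def barabasi_albert(n, m=3, m0=3, seed=None):
    """Grow a network to n nodes by preferential attachment."""
    if m0 < 2 or not 1 <= m <= m0:
        raise ValueError("need m0 >= 2 and 1 <= m <= m0")
    rng = random.Random(seed)
    # Seed graph: m0 vertices, each connected to every other.
    adj = {i: set(range(m0)) - {i} for i in range(m0)}
    # Every node is listed once per link end attached to it.
    stubs = [v for v in adj for _ in adj[v]]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:  # assumption: m distinct targets per new node
            targets.add(rng.choice(stubs))
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs.extend((new, t))
    return adj


if __name__ == "__main__":
    g = barabasi_albert(1000, seed=0)
    degrees = sorted((len(neighbors) for neighbors in g.values()), reverse=True)
    print(degrees[:5])  # a few strongly connected nodes are expected to emerge
```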
To stay with the picture: if you already have lots of friends, you are more likely to gain new ones. The rich get richer.¹
These rules result in a network (figure 2.1d) that is capable of reproducing small-world behavior as well as being scale-free. Research has found good agreement with empirical networks, including e.g. the World Wide Web [6]. Good introductions can be found in [32, 33].
Clustering is present to a certain degree, but it is still much too small compared with experimental values.
¹ "Whoever has will be given more, and he will have an abundance." [31]
Figure 2.2.: Watts-Strogatz network model: We start with a regular lattice (a) formed into a ring (b) and re-wire a small fraction of links to random destinations (c) [18]
2.6. New Approaches
Recently, new network models have been developed to cope with the shortcomings of the existing ones.
Ravasz and Barabási [35] examined networks with a self-similar structure, imitating the idea of hierarchical organization in sociology. Their model combines high clustering and scale-free behavior, but does not show short path lengths.
Klemm and Eguíluz [34]² developed a promising model that meets all three demands in one network. The authors present a generalization of the Barabási-Albert model, adding aging of nodes and some random behavior.
Further research on this model is needed to verify whether it matches reality.
An overview of all models can be found in figure 2.4.
² cf. also [36]
Figure 2.4.: overview of recent network models (Erdős and Rényi [28], Watts and Strogatz [16], Barabási and Albert [30], Klemm and Eguíluz [34], Ravasz and Barabási [35]). Diagram labels: high clustering, short average path lengths, scale-free; regular lattice, random network (Erdős-Rényi), small world (Watts-Strogatz), Barabási-Albert, hierarchical model (Ravasz-Barabási), Klemm-Eguíluz.
3. Empirical Collaboration Network
In the context of science, the network with scientists as nodes of the graph is of particular interest.
3.1. Typology
First, we want to deal with the definition of a collaboration graph. As for the nodes, we have the choice of identifying each vertex either with an author or with a paper.
The second possibility is also an area of research [37], but we think that studying the relationships between scientists as the papers' authors offers more insight into how research works. So we will make each scientist a node of our network.
As far as the edges are concerned, there are basically two possible choices, both covered equally by the database used [25].
3.1.1. Citation Graph
We might choose to consider citations from one author to another as links [10], thus obtaining a directed graph.
Starting at a given paper, we can enlarge our network by following links recursively up to a certain depth, e.g. by depth-first or breadth-first algorithms. Each new work will cite several to many papers not yet included. Roughly speaking, the number of publications to include will rise exponentially with the maximum depth chosen.
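A depth-limited breadth-first traversal of this kind could look as follows. The sketch is illustrative only; the dictionary `cites`, which maps a paper to the list of papers it cites, and the function name `crawl_citations` are hypothetical and do not correspond to any interface of the database used.

```python
from collections import deque


def crawl_citations(start, cites, max_depth):
    """Return {paper: depth} for all papers reachable from `start` within `max_depth` steps."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        paper = queue.popleft()
        if depth[paper] == max_depth:
            continue  # do not follow citations beyond the chosen depth
        for cited in cites.get(paper, ()):
            if cited not in depth:  # include every paper only once
                depth[cited] = depth[paper] + 1
                queue.append(cited)
    return depth


if __name__ == "__main__":
    # Toy citation data (made up for illustration).
    cites = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": ["F"]}
    for d in range(4):
        print(d, len(crawl_citations("A", cites, d)))
```

Printing the number of collected papers for increasing depth shows how quickly the set grows even in a toy example.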
We quickly arrive at huge amounts of data. Additionally, there is no canonical end to the hunt for new links. In the extreme case we could be caught in a giant cluster containing all or nearly all of the papers ever published. We see no way to narrow this down in a reasonable manner without the risk of introducing arbitrary boundary conditions.
3.1.2. Collaboration Graph
The second possibility to define the edges of a graph is to create links by co-authorship of one or several papers [7, 8]. If $n$ scientists publish a paper together, they are connected to each other by $\binom{n}{2} = n(n-1)/2$ edges.
As an additional advantage, we have the choice of starting with an arbitrary set of authors and establishing links between them by looking at all papers they are involved in. This will result in a graph of limited size.
Of course, we should think carefully about a reasonable selection in order to avoid edge effects. We will discuss this in the next section.
3.2. Building the net
3.2.1. Proceeding
As a solution, we choose the following procedure: We start with one paper. As one part of our work will be the comparison of real-world data to Barabási-Albert networks, we take the corresponding paper [30] as the center of our investigation.
In order to determine the set of authors we want to deal with, we select all 185 papers that cite this paper.¹ Secondly, we construct a list of unique authors from all these papers. A first approach yields 559 scientists, some of whom turned out to be identical but appeared in particular papers with typos. We end up with a set of 555 authors, to whom we assign consecutive numbers.
The last step of the network creation process consists in establishing links between all these authors. This is done by selecting one paper after the other and introducing connections between each possible pair of this paper's authors (i.e. $\binom{n}{2}$ links for $n$ authors).
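The steps above can be sketched in code. This is a minimal illustration under stated assumptions, not the program used for this thesis: the input is assumed to be a list of author-name lists, one per paper, with spelling variants already merged by hand, and the function name `build_collaboration_graph` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_collaboration_graph(papers):
    """Return (author -> consecutive number, Counter of linked author pairs).

    The counter value of a pair is the number of papers the two authors share.
    """
    author_id = {}
    for authors in papers:
        for name in authors:
            # Assign consecutive numbers, starting at 1, in order of appearance.
            author_id.setdefault(name, len(author_id) + 1)
    links = Counter()
    for authors in papers:
        ids = sorted({author_id[name] for name in authors})
        # Every possible pair of this paper's authors receives one link.
        for pair in combinations(ids, 2):
            links[pair] += 1
    return author_id, links


if __name__ == "__main__":
    # Made-up author lists for illustration.
    papers = [["A", "B", "C"], ["A", "B"], ["D"]]
    author_id, links = build_collaboration_graph(papers)
    print(author_id)
    print(dict(links))
```

Keeping the number of shared papers per pair allows both the weighted and the unique link counts to be derived from the same structure later on.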
Eventually, this gives us a graph of 555 nodes representing scientific collaboration in the area of Barabási-Albert networks.
The network is relatively small compared to all the data in the Science Citation Index [25] (approx. $10^7$ papers). By studying the properties of this subnet, we hope to gain insight into what leads to the structure observed. Verification with bigger networks remains a task for the future.
3.2.2. Visualization
To get an idea of what we are dealing with, we visualize the graph using a spring model [38, 39]. To obtain manageable results, we remove a paper on the Human Genome Project [40] with 274 authors. A brief examination shows that this does no harm, as the scientists participating in this work did not cooperate with others in our graph and form a big cluster on their own. The result is shown in figure 3.1.
¹ We have to be careful not to mix citation data from different dates, as new papers are continuously added to the database. The basis of our investigation is October 21st, 2002.
3.3. Analysis
3.3.1. Authors per paper
authors  frequency
1        37
2        69
3        47
4        21
5        6
6        4
7        1
Table 3.1.: Authors per paper
The first thing we are interested in is the frequency distribution of the number of authors per paper. We expect to see many papers with few authors and few papers with many authors (table 3.1).
Figure 3.2.: frequency distribution of the number of authors per paper (frequency versus number of authors; plot labels: ~x^3.58*exp(-x/0.54) and ~x^-3.3)
Figure 3.1.: collaboration network with 555 nodes (plotted using GraphViz package [38, 39])
Indeed, the graph considered (figure 3.2) shows this behavior, with one remarkable exception: there are far too few papers written by only one author. This could be due to the fact that collaboration helps in science, but the more (scientific!) partners you have, the slower your scientific output becomes, as communication overhead increases.
In other words: establishing scientific relationships with other authors is not easy. You have to agree on the field of research, coordinate your efforts, etc. Postulating that cooperation with more scientists is always favorable, we can explain the statistics by the difficulty of finding new partners. This difficulty even increases with the number of co-workers you already have, as additional coordination is needed. The risk of research overlap rises, too.
The power law predicted by Lotka [11] with an exponent of 2 cannot be confirmed. This could be due to insufficient statistics for this test. Other recent studies of collaboration networks found an exponent of 2.1 or 2.4 [41], which is another indication that statistical errors dominate the results of our study.
3.3.2. Connections per author
Next, we study the number of connections per author, i.e. the number of other scientists with whom an author has ever published papers. This number is weighted by the number of papers, i.e. a coauthor with whom a scientist published n papers contributes n connections (table 3.2).
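The difference between the weighted count and a count of unique coauthors can be illustrated with a short sketch. It is not the thesis code; it assumes that the links are stored as a mapping from author pairs to the number of shared papers and that authors carry consecutive numbers starting at 1, and the function name `connection_counts` is chosen for this example.

```python
from collections import Counter


def connection_counts(links, n_authors):
    """Return two histograms {number of connections: number of authors}: weighted and unique."""
    weighted = Counter()
    unique = Counter()
    for (a, b), n_papers in links.items():
        # Weighted: a coauthor with n shared papers contributes n connections.
        weighted[a] += n_papers
        weighted[b] += n_papers
        # Unique: every coauthor counts once, however many papers are shared.
        unique[a] += 1
        unique[b] += 1
    authors = range(1, n_authors + 1)  # isolated authors get 0 connections
    weighted_hist = Counter(weighted[a] for a in authors)
    unique_hist = Counter(unique[a] for a in authors)
    return weighted_hist, unique_hist


if __name__ == "__main__":
    # Made-up links: authors 1 and 2 share two papers, author 4 is isolated.
    links = {(1, 2): 2, (1, 3): 1, (2, 3): 1}
    weighted_hist, unique_hist = connection_counts(links, n_authors=4)
    print(dict(weighted_hist))
    print(dict(unique_hist))
```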
Again, we expect to see the frequency decrease with an increasing number of connections. The experimental data (figure 3.3) show smaller frequencies for isolated authors who never publish with others as well as for authors with only one coauthor. This is comparable to the effect observed in the last graph. The most productive seem to be authors who work with two or three colleagues.
links  weighted  unique
0      16        16
1      51        67
2      75        69
3      61        67
4      24        27
5      21        22
6      10        2
7      5         6
8      4         3
9      2
10     3
11     1
13     2         1
14     1
15     1
17     1         1
20     2
29     1
273    274       274
Table 3.2.: Connections per author
Figure 3.3.: frequency distribution of the number of connections per author (black: weighted, grey: unique; frequency versus number of connections; plot labels: x^-2.85 and x^-3.53)
Statistical data in the region of highly connected authors show a truncated power law. The exponent of approximately 2.85 falls well within the range found in analyses of other scale-free networks (WWW: around 2.3 [6, 26, 42]). The sharp or exponential cutoff at very high connection numbers has been reported for other networks, too [7, 43]. Mossa et al. [44] offer an explanation using a model with limited (local) information on the network. Surely, no scientist knows all others, so this could lead to the observed effect.
3.3.3. Double vs. unique links
We are interested in how things change when we cease weighting connections by the number of papers published together, i.e. when we only take into account how many other unique scientists a researcher has published papers with. Results can be found in table 3.2.
We see that despite the different numbers, the results are qualitatively the same. Scientists working together with two other authors are the most productive.
Depending on whether your glass is half full or half empty, there are two contrary explanations:
1. It is common practice to name persons as authors of your work who did not contribute to it, out of a feeling of obligation, be they sponsors or others.
2. Science lives from cooperation. Working together on one subject increases
scienti

 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 

Analysis and Simulation of Scientific Networks

Institute of Theoretical Physics
University of Cologne
Diploma Thesis
Felix Putsch
July 14, 2003
supervised by Prof. D. Stauffer

Hereby, I confirm (according to the Prüfungsordnung of July 12, 1996, §20(5)) that I composed this diploma thesis alone, using no sources and tools other than those mentioned. Citations have been marked.

* fxp@thp.uni-koeln.de

Though the mountains divide
And the oceans are wide
It's a small world after all
R. M. and R. B. Sherman
Contents

1. Introduction
  1.1. Networks
  1.2. Six Degrees of Separation
  1.3. Small World Effect
  1.4. Science Collaboration Networks
2. Network Models
  2.1. Measurements
    2.1.1. Small World Effect
    2.1.2. Clustering
    2.1.3. Scale-Free Behavior
  2.2. Regular Lattices
  2.3. Erdős-Rényi Random Networks
  2.4. Watts-Strogatz Small-World Networks
  2.5. Barabási-Albert Network Model
  2.6. New Approaches
3. Empirical Collaboration Network
  3.1. Typology
    3.1.1. Citation Graph
    3.1.2. Collaboration Graph
  3.2. Building the net
    3.2.1. Proceeding
    3.2.2. Visualization
  3.3. Analysis
    3.3.1. Authors per paper
    3.3.2. Connections per author
    3.3.3. Double vs. unique links
    3.3.4. Cluster sizes
  3.4. Comparison
    3.4.1. Erdős-Rényi random graphs
    3.4.2. Watts-Strogatz small-world networks
    3.4.3. Barabási-Albert networks
4. Spin models
  4.1. Leadership effect
    4.1.1. Ising model
    4.1.2. Phase transition
    4.1.3. Degree distribution
    4.1.4. Spin flip model
  4.2. Cluster limited Ising models
    4.2.1. Proceeding
    4.2.2. Results
    4.2.3. Bias adjustification
    4.2.4. Linear Relationship
5. Barabási-Albert network models
  5.1. Modified Barabási-Albert model
  5.2. Simulation
    5.2.1. Isolated clusters
    5.2.2. Merging clusters
    5.2.3. Scale-free behavior
6. Conclusion
A. Acknowledgements
B. Source code
  B.1. Ising model
  B.2. Spin flip model
  B.3. Modified Barabási-Albert model
C. Figures
Bibliography
1. Introduction

1.1. Networks

Nowadays, the word "network" immediately evokes physically wired networks such as those formed by telephone lines or computer links. The concept of a network is, however, a good deal more general than that.

Mathematically speaking, networks are graphs, i.e. sets of nodes (of whatever kind) connected by edges (links, connections) between certain pairs. This abstraction has been known for a long time. Probably the first paper on graph theory was written by Euler [1], on the so-called bridge problem of Königsberg. Euler discusses whether or not it is possible to make a round walk passing each of Königsberg's seven bridges exactly once (figure 1.1).

Figure 1.1.: Königsberg bridge problem: Is it possible to make a round trip, passing each bridge exactly once? [1]

The concept of networks can be applied to many theoretical or experimental subjects [2-4]. Nodes can be people [5], Internet servers [6], scientists [7, 8] or others; links range from e-mails [9] and friendships [5] to citations [10, 11] and more. Thus, there are numerous different kinds of networks: physical ones (e.g. hard-wired), logical ones (e.g. dependencies) and social ones (e.g. contacts, friendships), extending to topics far from wired networks [9, 12]. The area is under vigorous research. Good reviews can be found in [2-4, 13, 14].

1.2. Six Degrees of Separation

From personal experience, nearly everybody has been confronted with what we call the small world effect. There are numerous examples. At a party, we find out that we know some stranger we have just started talking to via only a few intermediaries (or, technically speaking, we are separated from him only by a low degree). E.g., he could be our next-door neighbors' colleague's son. We hear: "My god, what a small world!"

Rumors are another example. We are astonished at the pace at which they spread. After a few hours, and thus only a few opportunities to pass the rumor on, the whole city seems to know.

Milgram [15] conducted an experiment on this. He instructed a set of people to try to send a letter to some stranger, using only personal contacts. He found that an astonishingly short chain of social links is needed for this task, a result that entered everyday language as "Six Degrees of Separation". Recently, this has been reviewed on a more popular basis by a German weekly newspaper [5].

1.3. Small World Effect

Six Degrees of Separation is only one manifestation of a more general principle: the small world effect [16-18].

Observations of many real-world networks in computer science, biology, chemistry, linguistics, sociology, etc. have revealed a crucial difference from regular lattices. For average (or sometimes maximum) path lengths on such networks, we would expect a linear increase with the number of nodes. Instead, we observe distances growing logarithmically with system size. This behavior is not only an amusing effect but has far-reaching consequences [19]. Prominent examples are the Internet's stability against attacks [20], disease spreading [21, 22] and path finding strategies [23, 24].

1.4. Science Collaboration Networks

In the context of science, the network formed by scientists as nodes of the graph is of particular interest. This network belongs to the group of social networks, with humans as nodes. Unlike many other forms of social relationships, which are mostly quite difficult to capture objectively, the field of published papers is very widely covered by the Science Citation Index [25] and is thus easily available for research.
2. Network Models

There are many types of networks competing to describe observations made in socio-physics. After discussing which measurements describe a given network's structure, we give a short overview of the models we consider most important and discuss their advantages and possible disadvantages.

2.1. Measurements

2.1.1. Small World Effect

As illustrated in the introduction, we are interested in the relation between network size and average (or maximum) path length. We investigate whether the behavior is linear, logarithmic or of another kind. In the logarithmic case, the net is said to show the Small World Effect.

2.1.2. Clustering

In a friendship network, we often find the friends of one person to be friends themselves. This is true for most social networks and even for other ones. Links are not spread randomly but arranged in clusters.

To describe this mathematically, we introduce a clustering coefficient $C_i$ of a node $i$. It is the ratio of the number $m$ of established links among the $k_i$ nearest neighbors of $i$ to the maximum possible number $M = \binom{k_i}{2}$, i.e.

$$C_i = \frac{m}{M}.$$

This value is averaged over all vertices to give a clustering coefficient $C$ for the whole network. Typical values observed are far above the results expected for random networks [2, p. 50].

2.1.3. Scale-Free Behavior

A third observation regarding social networks concerns their degree distribution. Looking at the frequency of vertices with a given coordination number, we do not find an exponential but a power law [26].

In all, we have three possible means to classify networks. Many social graphs show small path lengths, high clustering and scale-free behavior.

2.2. Regular Lattices

The simplest form of a lattice is a symmetrical arrangement of nodes connected by edges between all pairs of nodes (or all pairs of adjacent nodes), as shown in figure 2.1a,b. One reason to choose this linking is, e.g., to simulate neighborhood relations in a town.

Such networks show a high degree of clustering, as desired. The average path lengths are very long, though, and scale with system size. So small world behavior cannot be found, which makes the model inappropriate for our needs.
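The two measurements that matter for regular lattices can be checked numerically. The following is a minimal sketch, not taken from the thesis: it builds a ring lattice in which every node is linked to its `k` nearest neighbors (the helper names `ring_lattice`, `clustering` and `average_path_length` are illustrative assumptions) and evaluates the clustering coefficient of section 2.1.2 and the average shortest-path length.

```python
from collections import deque
from itertools import combinations


def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbors (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def clustering(adj):
    """Average of C_i = m / M over all nodes with at least two neighbors."""
    values = []
    for nbrs in adj.values():
        k_i = len(nbrs)
        if k_i < 2:
            continue
        # m: links that actually exist among the neighbors of this node
        m = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        values.append(m / (k_i * (k_i - 1) / 2))
    return sum(values) / len(values)


def average_path_length(adj):
    """Mean shortest-path length over all connected ordered pairs (BFS)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


small, large = ring_lattice(20, 4), ring_lattice(40, 4)
print(clustering(small), average_path_length(small), average_path_length(large))
```

In this sketch the clustering coefficient stays at 0.5 regardless of size, while doubling the number of nodes roughly doubles the average path length, which is the linear scaling described above.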
Figure 2.1.: network types: a, b regular lattices, c random network, d scale-free graph [14]

2.3. Erdős-Rényi Random Networks

Random networks are the extreme at the other end of the spectrum. A number of nodes is wired by pure chance, i.e. we throw dice to select two nodes and place an edge between them. Such graphs were first proposed by Solomonoff and Rapoport [27] and have been studied extensively by Erdős and Rényi [28]. Recent results have been reviewed in [29].

A typical result can be seen in figure 2.1c. We find that average path lengths behave logarithmically with network size. While this is as desired for small world simulation, there is obviously no clustering.

2.4. Watts-Strogatz Small-World Networks

It is a plausible idea to combine the two models presented so far in order to add up their respective advantages. A big step towards this goal was made by Watts and Strogatz [16].

Their model starts with a circular graph that is regularly wired (figure 2.2). Step by step, edges are chosen by chance and rewired to an arbitrary destination node. Thus, a small fraction of the links become long-range ones. To illustrate, think of the inhabitants of a street of neighbors who have relationships with far-away relatives.

Figure 2.2.: Watts-Strogatz network model: We start with a regular lattice (a) formed to a ring (b) and re-wire a small fraction of links to random destinations (c) [18]

At first, the model seems to fulfill our desires. It shows small average path lengths as well as high clustering. Looking at the degree distribution, i.e. the frequency of nodes of a certain degree, we find strong differences from real-world data, as there is no scale-free behavior.

2.5. Barabási-Albert Network Model

Figure 2.3.: Barabási and Albert [30]

Barabási and Albert [30] started a new idea. Their model consists of two ingredients: growth and preferential attachment.

We start with a graph of $m_0 = 3$ vertices, each one connected to each other. Now, in each time step, we add a node that is connected to others by $m = 3$ links. The new node is one end of each link; the other end is chosen at random from the existing network. The probability of a vertex being selected is proportional to the number of links already attached to it. To stay with the image: if you already have lots of friends, you are more likely to get new ones. The rich get richer. (Footnote 1: "Whoever has will be given more, and he will have an abundance." [31])

These rules result in a network (figure 2.1d) that is capable of reproducing small-world behavior as well as being scale-free. Research has found good agreement with empirical networks, including e.g. the world wide web [6]. Good introductions can be found in [32, 33].

Clustering is present to a certain degree, but is still much too small compared with experimental values.

2.6. New Approaches

Recently, new network models have been developed to cope with the shortcomings of the present ones.

Ravasz and Barabási [35] examined networks with a self-similar structure, imitating the idea of hierarchical organization in sociology. Their model combines high clustering and scale-free behavior but does not show short path lengths.

Klemm and Eguíluz [34] (footnote 2: cf. also [36]) developed a promising model that meets all three demands in one network. The authors present a generalization of the Barabási-Albert model, adding aging of nodes and some random behavior.

Further research on this model is needed to verify whether it agrees with reality.

An overview of all models can be found in figure 2.4.

Figure 2.4.: overview of recent network models, arranged by the properties high clustering, short average path lengths and scale-free behavior (Erdős and Rényi [28], Watts and Strogatz [16], Barabási and Albert [30], Klemm and Eguíluz [34], Ravasz and Barabási [35])
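The growth rule of the Barabási-Albert model from section 2.5 can be sketched in a few lines. This is an illustrative sketch only and not the thesis code from appendix B: the function name `barabasi_albert`, the `seed` parameter and the choice to give each new node `m` distinct targets (no double links) are assumptions.

```python
import random
from collections import Counter


def barabasi_albert(n, m0=3, m=3, seed=None):
    """Grow a graph to n nodes by preferential attachment; returns an edge list."""
    rng = random.Random(seed)
    # Start with m0 vertices, each one connected to each other.
    edges = [(i, j) for i in range(m0) for j in range(i + 1, m0)]
    # Every node appears in 'stubs' once per attached link, so a uniform
    # draw from 'stubs' selects a node with probability proportional to its degree.
    stubs = [v for edge in edges for v in edge]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:  # assumption: m distinct targets per new node
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges


edges = barabasi_albert(1000, seed=42)
degree = Counter(v for edge in edges for v in edge)
histogram = Counter(degree.values())
for k in sorted(histogram)[:10]:
    print(k, histogram[k])
```

Plotting the histogram on doubly logarithmic axes for larger `n` is one way to look for the power-law tail; the sketch itself makes no claim about the fitted exponent.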
3. Empirical Collaboration Network

In the context of science, the network formed by scientists as nodes of the graph is of particular interest.

3.1. Typology

First, we deal with the definition of a collaboration graph.

As to the nodes, we can identify each vertex either with an author or with a paper. The second possibility is also an area of research [37], but we think that studying the relationships between scientists as the papers' authors offers more insight into how research works. So we make each scientist a node of our network.

As for the edges, there are basically two possible choices, both covered equally by the database used [25].

3.1.1. Citation Graph

We might choose to consider citations from one author to another as links [10], which results in a directed graph.

Starting at a given paper, we can enlarge our network by following links recursively up to a certain depth, e.g. by depth-first algorithms. Each new work cites several or many papers not yet included. Roughly speaking, the number of publications to include rises exponentially with the maximum depth chosen. We quickly arrive at huge amounts of data.

Additionally, there is no canonical end to the hunt for new links. In the extreme case we could be caught in a giant cluster containing all or nearly all of the papers ever published. We see no possibility to narrow this down in a reasonable way without risking the introduction of arbitrary boundary conditions.

3.1.2. Collaboration Graph

The second possibility to define the edges of a graph is to create links by co-authorship of one or several papers [7, 8]. If $n$ scientists publish a paper together, they are connected to each other by $\binom{n}{2}$ edges.

As an additional advantage, we can start with an arbitrary set of authors and establish links between them by looking at all papers they are involved in. This results in a graph of limited size.

Of course, we should think carefully about a reasonable selection, to avoid edge effects. We discuss this in the next section.

3.2. Building the net

3.2.1. Proceeding

As a solution, we choose the following procedure. We start with one paper. As one part of our work will be the comparison of real-world data to Barabási-Albert networks, we take the corresponding paper [30] as the center of investigation.
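The co-authorship rule of section 3.1.2 translates directly into code. The sketch below is an illustration under stated assumptions, not the procedure actually used for the thesis: papers are given as plain lists of author names, and the function name `collaboration_graph` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def collaboration_graph(papers):
    """Return (authors, weight) for a list of author lists.

    weight maps each unordered author pair to the number of papers
    the two authors published together.
    """
    authors = set()
    weight = Counter()
    for author_list in papers:
        unique_authors = sorted(set(author_list))
        authors.update(unique_authors)
        # n authors of one paper are connected by n*(n-1)/2 edges
        for a, b in combinations(unique_authors, 2):
            weight[(a, b)] += 1
    return authors, weight


# Hypothetical toy data: three papers, four authors.
papers = [["A", "B", "C"], ["A", "B"], ["D"]]
authors, weight = collaboration_graph(papers)
print(sorted(authors))  # single-author papers add a node but no edge
print(dict(weight))
```

Keeping the pair counts rather than a plain edge set is a design choice of this sketch: it leaves open whether repeated collaborations are later counted once or several times.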
In order to determine the set of authors we want to deal with, we select all 185 papers that cite this paper. (Footnote 1: We have to be careful not to mix citation data from different dates, as new papers are continuously added to the database. The basis of our investigation is October 21st, 2002.)

Secondly, we construct a list of unique authors from all these papers. A first approach delivers 559 scientists, some of whom turned out to be identical but appeared in particular papers with typos. We finish with a set of 555 authors, to whom we assign consecutive numbers.

The last step of the network creation process consists in establishing links between all these authors. This is done by selecting one paper after the other and introducing connections between each possible pair of this paper's authors (i.e. $\binom{n}{2}$ links for $n$ authors).

Eventually, this gives us a graph of 555 nodes representing scientific collaboration in the area of Barabási-Albert networks.

The network size is relatively small compared to all data in the Science Citation Index [25] (approx. $10^7$ papers). By studying properties of this subnet, we hope to gain insight into what leads to the structure observed. Verification with bigger networks remains a task for the future.

3.2.2. Visualization

To get an idea of what we are dealing with, we visualize the graph using a spring model [38, 39].

In order to obtain manageable results we remove a paper on the Human Genome Project [40] with 274 authors. A brief examination shows that this does no harm, as the scientists participating in this work did not cooperate with others in our graph and form a big cluster on their own.

The result is shown in figure 3.1.

Figure 3.1.: collaboration network with 555 nodes (plotted using GraphViz package [38, 39])

3.3. Analysis

3.3.1. Authors per paper

authors  frequency
1        37
2        69
3        47
4        21
5        6
6        4
7        1

Table 3.1.: Authors per paper

The first thing we are interested in is the frequency distribution of the number of authors per paper. We expect to see many papers with few authors and vice versa (table 3.1).

Figure 3.2.: frequency distribution of the number of authors per paper (axes: number of authors, frequency; fitted curves labeled ~x^3.58*exp(-x/0.54) and ~x^-3.3)
  • 49. 3. Empirical Collaboration Network Indeed, the considered graph (
  • 50. gure 3.2) shows this behavior with one remarkable ex- ception. There are much too few papers written by only one author. This could be due to the fact that collaboration helps in science, but the more (scienti
  • 51. c!) partners you have, the slower gets your scienti
  • 52. c out- put as communication overhead increases. In other words: establishing scienti
  • 53. c re- lationships with other authors is not easy. You have to agree on the
  • 54. eld of research, coordinate your eorts etc. Postulating that cooperation with more scientists is al- ways favorable, we can explain the statistics by diÆculty of
  • 55. nding new partners. This even increases corresponding to the num- ber of co-workers you already have, as ad- ditional coordination is needed. The risk of research overlap raises, too. The power law predicted by Lotka [11] with an exponent of 2 cannot be con-
  • 56. rmed. This could be due to insuÆcient statistics for this test. Other recent stud- ies of collaboration networks found an ex- ponent of 2:1 or 2:4 [41] which is another indication for statistical errors predominat- ing our results of study. 3.3.2. Connections per author Next, we study the number of connections per author, which is the number of other scientists an author ever published papers with. This number is weighted by the num- ber of papers, i.e. a coauthor with whom a scientist published n papers contributes n connections (table 3.2). Again, we expect to see a frequency de- crease with increasing number of connec- tions. The experimental data (
  • 57. gure 3.3) links weighted unique 0 16 16 1 51 67 2 75 69 3 61 67 4 24 27 5 21 22 6 10 2 7 5 6 8 4 3 9 2 10 3 11 1 13 2 1 14 1 15 1 17 1 1 20 2 29 1 273 274 274 Table 3.2.: Connections per author 100 10 1 0.1 0.01 1 10 frequency x^-2.85 x^-3.53 number of connections Figure 3.3.: frequency distribution of the num- ber of connections per author (black:weighted|grey:unique) shows smaller frequencies for isolated au- 18
  • 58. 3.3. Analysis thors that never publish with others as well as for authors with only one coauthor. This is comparable to the eect observed in the last graph. The most productive seem to be authors with two or three colleagues they are working with. Statistical data in the area of highly con- nected authors shows a truncated power law. The exponent of approximately 2:85 falls well in the region bounded by analysis of other scale-free networks (www: around 2:3 [6, 26, 42]). The sharp or exponen- tial cuto at very high connection numbers has been reported for other networks, too [7, 43]. Mossa et al. [44] oer an expla- nation using a model with limited (local) information on the network. Surely, no sci- entist knows all others, so this could lead to the observed eect. 3.3.3. Double vs. unique links We are interested how things change when we cease weighting connections by num- ber of papers published together, i.e. we only take into account how many other unique scientists a researcher published pa- pers with. Results can be found in table 3.2. We see that despite the dierent num- bers, results are qualitatively the same. Sci- entists working together with two other au- thors are the most productive. Depending on whether your glass is half full or half empty there are two contrary explanations: 1. It is common practice that you name persons as authors of your work that did not contribute to it, out of a feeling of debt, may it be sponsors or others. 2. Science lives from cooperation. Work- ing together on one subject increases scienti
  • 59. cal output whilst reducing er- rors. The author of this paper will not judge. 3.3.4. Cluster sizes size frequency 1 16 2 28 3 15 4 13 5 2 6 3 7 2 8 2 9 2 10 1 26 1 274 1 Table 3.3.: Cluster sizes Our last focus is on subnets of science that exist in our net of collaboration. Au- thors group into several clusters by connec- tions established between them. We inves- tigate the frequency of clusters of a given size. Our expectation is getting a frequency increase for growing cluster sizes up to a peak, and then a decay as clusters grow very big, comparable to the statistics we saw already. The experimental data (table 3.3,
  • 60. g- ure 3.5) shows this behavior, but with one surprise: although the most frequent cluster size is 2 due to a big number of publications 19
  • 61. 3. Empirical Collaboration Network 429 283 105 322 217 316 466 151 380 85 389 430 435 a b 122 130 a b 255 187 237 226 222 134 336 27 338 425 359 398 453 482 492 529 251 181 530 80 199 202 7 223 224 477 371 41 102 296 1 421 384 15 124 259 143 457 c 30 345 422 502 391 373 377 302 370 8 341 491 381 171 285 486 536 458 249 8 201 470 40 6 274 272 245 416 10 160 280 257 299 390 309 395 213 142 406 153 120 219 472 512 436 138 317 2 3 4 5 26 a b a b b a a b 6 385 b a 117 77 375 498 524 230 423 493 167 86 253 254 282 163 252 234 144 246 343 374 58 204 258 365 461 504 111 156 9 166 57 62 315 18 107 154 434 220 61 452 339 84 128 554 13 312 334 177 9 383 401 480 235 278 69 215 332 478 4 209 307 393 74 411 169 320 101 179 372 26 319 301 239 49 145 232 233 123 344 451 218 269 450 449 12 271 463 241 263 515 68 79 404 212 542 147 440 485 55 87 270 376 469 506 50 95 533 17 361 52 185 494 386 273 43 39 318 265 444 82 306 125 197 110 7 129 113 292 431 243 378 488 38 267 408 290 400 94 136 264 479 552 96 275 346 555 47 405 454 484 24 308 350 99 314 81 92 279 51 205 116 165 329 418 20 37 164 433 Figure 3.4.: Collaboration clusters, ordered by size of participants. The shaded box no. 458 represents my professor, D. Stauer. 20
  • 62. with two authors, most scientists maintain collaboration with three others. This can only be due to scientists being members of research groups working on different themes, thus connecting different clusters formed by single papers. This can be verified directly from the graphical representation (figure 3.4) of the clusters, ordered by size.
    [Figure 3.5: Frequency distribution of the cluster size (frequency vs. cluster size, log-log axes), with the curve $x^{-2.61}$.]
    3.4. Comparison
    We have collected some statistical figures that express the structure of the network under consideration. Now we want to find out whether classical or current network models yield similar results.
    3.4.1. Erdős-Rényi random graphs
    Connections per author: In a random graph the degree distribution is binomial, i.e. the probability of a node having k connections in a net with N nodes is
    \[ P(k) \propto \binom{N-1}{k} p^k (1-p)^{N-1-k}. \]
    In the limit of large N this approaches a Poisson distribution around the expectation value $\langle k \rangle = pN$:
    \[ P(k) \propto e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}. \]
    This is contrary to the power-law statistics observed in the collaboration network.
    Cluster sizes: For random graphs, percolation theory predicts that the cluster size distribution decays exponentially for large cluster sizes [2, 45]. Again, the network under consideration shows a power-law decay rather than an exponential one. Unsurprisingly, the structure of scientific collaboration differs fundamentally from that of a random network.
    3.4.2. Watts-Strogatz small-world networks
    Connections per author: The degree distribution of Watts-Strogatz small-world networks is similar to that of a random graph [2]. It has a peak and decays exponentially for large connection numbers, contrary to the collaboration network.
    Cluster sizes: The usual case in Watts-Strogatz networks is re-wiring of only a small portion of the links. Thus, the network stays well connected, mostly forming one giant cluster. This model does not describe scientific collaboration either.
    3.4.3. Barabási-Albert networks
    Connections per author: Barabási-Albert networks show a vertex degree distribution $P(k) \propto k^{-3}$ [30, 46]. In our science collaboration network we found exponents of $-2.85$ and $-3.53$, respectively, which is only a slight deviation.
    Cluster sizes: In Barabási-Albert networks, new sites are added with links to already existing nodes. Consequently, only one giant cluster forms. Obviously, this differs crucially from the net of co-authorship.
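The binomial-to-Poisson limit quoted in section 3.4.1 can be checked numerically. The following is a minimal sketch, not part of the thesis code: the function names `binomial_degree_pmf` and `poisson_degree_pmf` are hypothetical, the distributions are taken as normalized (equality instead of proportionality), and $0 < p < 1$ is assumed.

```cpp
#include <cmath>

// Binomial degree distribution of a random graph with N nodes (assumes 0 < p < 1):
// P(k) = C(N-1, k) p^k (1-p)^(N-1-k), evaluated in log space for stability.
double binomial_degree_pmf(int N, double p, int k) {
    const int n = N - 1;                       // possible neighbours of one node
    if (k < 0 || k > n) return 0.0;
    const double log_choose = std::lgamma(n + 1.0) - std::lgamma(k + 1.0)
                            - std::lgamma(n - k + 1.0);
    return std::exp(log_choose + k * std::log(p) + (n - k) * std::log1p(-p));
}

// Poisson limit with mean <k> = pN: P(k) = exp(-<k>) <k>^k / k!
double poisson_degree_pmf(double mean_k, int k) {
    if (k < 0) return 0.0;
    return std::exp(-mean_k + k * std::log(mean_k) - std::lgamma(k + 1.0));
}
```

For example, with N = 10000 and p = 0.0003 (so that $\langle k \rangle = 3$) the two functions agree closely for small k. Both decay faster than any power law for large k, which is the point of the comparison with the collaboration network.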
  • 71. 4. Spin models
    4.1. Leadership effect
    4.1.1. Ising model
    In 1925, Ising [47] published a paper on a model of spin interaction that later became very famous. The idea had been suggested to him by his teacher Lenz [48], so it is sometimes referred to as the Lenz-Ising model.(1) The idea is to consider spins (e.g. on a square lattice) and an interaction Hamiltonian
    \[ H = -\sum_{i \neq j} J_{i,j} S_i S_j \]
    where $S_i$ are the spins and $J_{i,j}$ is a matrix describing the interaction strengths. Usually we consider the case
    \[ J_{i,j} = \begin{cases} J & i, j \text{ nearest neighbors} \\ 0 & \text{else,} \end{cases} \]
    i.e. we only allow equal interaction between nearest neighbors ($J > 0$ for ferromagnetic behavior). In the following, we investigate how such a model behaves on our constructed collaboration network.
    (1) A generalization of the Ising model is the Potts model [49, 50]. Instead of Ising spins with two possible states $+1$ and $-1$, Potts allows $k > 2$ different spin values. The Hamiltonian is $H = -J \sum_{i \neq j} \delta_{S_i,S_j}$, with i, j being nearest neighbors. Applying it to the scientific collaboration network, we find results very similar to those of Ising's model.
    [Figure 4.1: Ising [47]]
    We use a Metropolis [51] Ising model, i.e. the probability for a single spin flip is $p \propto e^{-\Delta E / k_B T}$ if $\Delta E > 0$ and 1 otherwise. To determine $\Delta E$ we sum up the spins of all vertices connected to a given node. In principle, we have the choice between two procedures:
    - consider only unique links between two nodes,
    - count a connection several times according to the number of links, i.e. the number of papers the corresponding authors published together.
    Both possibilities have been examined.(2)
    (2) Change the switch NODOUBLE in line 17 of the source (section B.1).
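The Metropolis rule described above can be written down in a few lines. This is a minimal sketch under stated assumptions, not the thesis program: the function names are hypothetical, spins take the values +1 and -1, and the Hamiltonian is taken as $H = -J \sum S_i S_j$, so that flipping a spin $S$ whose neighbours sum to $h$ costs $\Delta E = 2 J S h$.

```cpp
#include <cmath>

// Energy change for flipping `spin` (+1 or -1) when the spins of all
// connected vertices add up to `neighbour_sum`, for H = -J * sum S_i S_j.
double flip_energy(int spin, int neighbour_sum, double J) {
    return 2.0 * J * spin * neighbour_sum;
}

// Metropolis probability of accepting the flip at temperature kT (= k_B T):
// exp(-dE/kT) if the flip costs energy, 1 otherwise.
double metropolis_flip_probability(int spin, int neighbour_sum, double J, double kT) {
    const double dE = flip_energy(spin, neighbour_sum, J);
    return dE > 0.0 ? std::exp(-dE / kT) : 1.0;
}
```

The two procedures listed above differ only in how `neighbour_sum` is built: with unique links every co-author contributes once, with multiple links a co-author contributes once per joint paper.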
  • 74. 4.1.2. Phase transition
    The results of both experiments are qualitatively similar. We observe a rounded phase transition at about $k_B T/J = 0.8$, as opposed to a value of about 2.3 on the regular square lattice.
    A closer look reveals that the magnetization decays exponentially with rising temperature. This result agrees with research on Ising models on Barabási-Albert networks by Aleksiejuk, Hołyst, and Stauffer [52], who also found an exponential law.
    However, the critical temperatures found by me and by Aleksiejuk et al. [52] differ by more than one order of magnitude. This can easily be explained by the different coordination numbers in the two networks. The collaboration graph holds a maximum of 14 neighbors of a single vertex, whereas the graph of Aleksiejuk et al. [52] exceeds this by several orders of magnitude. This makes it far more difficult to break the ferromagnetic bonds, resulting in a higher critical temperature.
    4.1.3. Degree distribution
    Dorogovtsev et al. [53] studied random graphs with given degree distributions P(k) of a vertex of degree k. They deduced an estimate for the critical temperature of an Ising model on such networks as
    \[ \frac{J}{k_B T_c} = \frac{1}{2} \ln\left( \frac{\langle k^2 \rangle}{\langle k^2 \rangle - 2 \langle k \rangle} \right). \]
    Considering the collaboration network as a random graph with given degree distribution (table 4.1), we use their formula and code in section B.1 to get $T_c/J = 2.91$, counting only unique links between scientists. Using all links, we find $T_c/J = 5.46$.
    Table 4.1: Degree distribution
    degree  several  unique
    0       16       16
    1       51       67
    2       75       69
    3       61       67
    4       24       27
    5       21       22
    6       10       2
    7       5        6
    8       4        3
    9       2
    10      3
    11      1
    13      2        1
    14      1
    15      1
    17      1        1
    20      2
    29      1
    Both values are far from the critical temperatures observed in simulation. This strongly suggests that collaboration networks are crucially different from random networks, even with the same degree distribution.
    4.1.4. Spin flip model
    Following a suggestion of Hołyst (personal correspondence, cf. [52]), we can determine the importance of the most connected authors of our collaboration network by successive flipping of the most connected
  • 76. spins and pinning them in their new position.
    In other words: after some time of equilibration, we choose the author who has the most connections to others and change his/her spin permanently to a value of $-1$, opposite to all others (at T = 0, or nearly all others otherwise). Subsequently, we allow the system to relax for some time, after which we permanently flip the second most connected spin, and so on.
    [Figure 4.2: Ising model with successive spin flips (magnetization M vs. time t; curves: unique, multiple). After $10^5$ steps of equilibration, we flip the most connected spin and pin it to its new value. After a relaxation of $10^4$ steps, this step is repeated. Networks with multiple and with unique links were used. Averaged over 1000 runs. T = 0.2.]
    Results are shown in figure 4.2. We observe two things:
    1. Even after switching the 20 most connected spins, the system does not flip into the opposite state with all spins pointing down. In the simulations of Aleksiejuk et al. [52], fewer than 6 spins were enough to flip a whole network of 30,000 nodes. This is quite obviously due to the fact that we do not have a contiguous graph, but one consisting of different clusters. A spin flip in one cluster cannot affect spins in others. In the picture of spins representing opinions (yes/no, etc. [54]), this means that a few authors whose view differs from that of the broad mass of scientists are hardly capable of changing the global opinion, even if they are the most connected (best known) ones.
    2. Allowing multiple links in our net, we expect the magnetization to break down much faster, as a flipped spin is able to influence others more strongly. Yet, the simulation shows the contrary. The graph containing only unique links shows a much steeper decay of magnetization (figure 4.2). A possible explanation is that choosing the most connected spins in a network having only unique links picks authors with connections to many other authors, whereas in a network allowing multiple links there are also spins connected strongly to only a few others. It seems that in order to spread new opinions, it is more advantageous to have a small influence on many other people than a big impact on only a few.
    4.2. Cluster limited Ising models
    The network we are looking at consists of many distinct clusters of different sizes (figure 3.4). We may ask whether they differ in their properties or behave alike.
    4.2.1. Procedure
    We split up the network into sub-nets, each containing one cluster, numbered sequentially 2, 3a, 3b, ..., 26 (numbers and letters as in figure 3.4). On each net, we run an Ising model for temperatures from 0.1 to 6.9 in steps of 0.2. Each of these simulations runs for $10^6$ steps, with the magnetization measured every 100 steps, thus resulting in $10^4$ measurements per run, to give good statistics.
    4.2.2. Results
    [Figure 4.3: Ising model on the different clusters of a collaboration net (M/N vs. kT/J), averaged over $10^4$ measurements per temperature and net.]
    Examining the results (figure 4.3), we see different curves that all decay with rising temperature but show no apparent similarities. We wonder why the curves seemingly do not converge to zero but to finite values. Obviously, the network is so small that macroscopic magnetization flips occur frequently, even at moderate temperatures. That means that the expectation value of the magnetization(4) at high temperatures is not zero, but something around one!
    4.2.3. Bias adjustment
    To validate this hypothesis, we simulate the networks at very high temperature ($k_B T/J = 50$), in order to determine $M_\infty = M(T = \infty)$ (table 4.2).
    Table 4.2: Bias
    net   $M_\infty$
    2     1.03
    3a    1.71
    3b    1.51
    4a    1.59
    4b    1.54
    4c    1.55
    5a    2.02
    5b    1.96
    6a    2.00
    6b    1.94
    7a    2.27
    7b    2.23
    8a    2.26
    8b    2.29
    9a    2.55
    9b    2.55
    10    2.54
    26    4.25
    (4) Throughout this publication we consider the (unsigned) value $|M|$ as magnetization, not M! Using the latter leads to false results. E.g. at low temperatures, averaging M over very long times would give zero, as flips of the whole system occur (though with very low probability).
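Two of the numbers quoted above can be cross-checked with a short sketch. The first function evaluates the estimate of Dorogovtsev et al. from section 4.1.3 for a degree histogram. The second computes the expected $|M|$ of N independent random spins, which is what the bias $M_\infty$ should approach if the spins are fully uncorrelated at infinite temperature (an assumption made here, not a statement from the thesis). All function names are hypothetical, and the histograms in the usage note are the columns of table 4.1 as read from the garbled transcript.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <vector>

// counts[k] = number of vertices with degree k.
// Returns k_B T_c / J from  J/(k_B T_c) = 1/2 ln( <k^2> / (<k^2> - 2<k>) ),
// or -1 if the formula yields no finite positive estimate.
double critical_temperature(const std::vector<double>& counts) {
    double n = 0.0, k1 = 0.0, k2 = 0.0;
    for (std::size_t k = 0; k < counts.size(); ++k) {
        n  += counts[k];
        k1 += static_cast<double>(k) * counts[k];
        k2 += static_cast<double>(k) * static_cast<double>(k) * counts[k];
    }
    if (n <= 0.0) return -1.0;
    k1 /= n;                                   // <k>
    k2 /= n;                                   // <k^2>
    if (k2 <= 2.0 * k1) return -1.0;           // logarithm undefined
    return 2.0 / std::log(k2 / (k2 - 2.0 * k1));
}

// Expected |M| for n independent spins, each +1 or -1 with probability 1/2:
// sum over the number k of down spins of C(n,k)/2^n * |n - 2k|.
double expected_abs_magnetization(int n) {
    double prob = std::pow(0.5, n);            // probability of k = 0 down spins
    double sum = 0.0;
    for (int k = 0; k <= n; ++k) {
        sum += prob * std::abs(n - 2 * k);
        prob = prob * (n - k) / (k + 1);       // C(n,k+1) = C(n,k) (n-k)/(k+1)
    }
    return sum;
}
```

With the "unique" column of table 4.1, `critical_temperature` gives about 2.91, and with the "several" column about 5.45, close to the values quoted in section 4.1.3. `expected_abs_magnetization` gives 1.0, 1.5, 1.5 and 1.875 for N = 2, 3, 4, 5 and about 4.03 for N = 26, the same order as the entries of table 4.2, which were measured at the finite temperature $k_B T/J = 50$.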
  • 85. Using these values, we rescale our simulation results from figure 4.3 by use of the scaling function(5)
    \[ \frac{M}{N} \to \frac{M - M_\infty}{N - M_\infty}, \]
    where N is the total number of authors in the cluster (figure 4.4).
    [Figure 4.4: Same data as in figure 4.3 ($(M - M_\infty)/(N - M_\infty)$ vs. kT/J; curves for the nets 2, 3a, 3b, 4a, 4b, 4c, 5a, 5b, 6a, 6b, 7a, 7b, 8a, 8b, 9a, 9b, 10, 26), but adjusted to a common scale from 0 to 1 by eliminating the bias from random fluctuations.]
    We find a much cleaner image. Apart from one exception (6b), all curves are parallel up to the M = 0.5 line, and even beyond it there are very few crossings.
    Each cluster can now be characterized by the temperature $\vartheta$ at which it reaches M = 0.5. This leads to a sort of melting temperature (table 4.3).
    Table 4.3: Melting temperatures
    net   $\vartheta$
    2     1.7
    3a    2.8
    3b    1.5
    4a    3.3
    4b    2.1
    4c    2.6
    5a    4.0
    5b    2.2
    6a    4.8
    6b    3.5
    7a    2.8
    7b    1.8
    8a    2.5
    8b    3.7
    9a    2.2
    9b    3.1
    10    1.9
    26    3.8
    (5) This function is a linear approximation that gives 1 for $M \to N$ and zero for $M \to M_\infty$.
    To get an idea of this temperature's meaning, we sort the clusters' graphical representations by $\vartheta$ (figure 4.5).
    It seems plausible that $\vartheta$ is a measure of the coherence or connectedness of a cluster. Single bonds lead to lower melting temperatures, fully connected subsets to higher ones.
    This offers a possible explanation of why 6b behaves differently from all others in figure 4.4: this cluster consists of two parts. One of them is a completely connected set of five nodes, the other one a single vertex. Both are linked by one single edge. Probably, this conflict of interest leads to the observed anomaly.
    Additionally, the connectedness described by $\vartheta$ seems to be crucially different from the classification given by the standard clustering coefficient. E.g. the net 3a, consisting of three completely connected vertices, yields a clustering coefficient of 1 but a low melting
  • 94. [Figure 4.5: Collaboration clusters from figure 3.4, ordered by $\vartheta$ (scale from 1.5 to 4.5).]
  • 96. temperature.
    4.2.4. Linear Relationship
    Surprisingly, we find a linear relationship between N and $E/\vartheta$ (figure 4.6), where E is the cluster's total number of edges. Thus, we postulate
    \[ \frac{E}{\vartheta} = aN - b \]
    and conclude a formula for $\vartheta$:
    \[ \vartheta_{calc}(E, N) = \frac{E}{aN - b}. \]
    Fitting the parameters to our measurements (yielding a = 0.72, b = 0.89) gives a good prediction of the melting temperatures. The results can be seen in figure 4.7, together with a diagram showing that the errors are below 10% in most cases.
    \[ \vartheta_{calc} = \frac{E}{aN - b} \xrightarrow{N \to \infty} \frac{\langle k \rangle}{2a}. \]
    We see that, in the limit of large N, the melting temperature $\vartheta$ is proportional to the average number of edges per site $\langle k \rangle = 2E/N$, a result known from mean-field theory.
  • 101. [Figure 4.6: $E/\vartheta$ vs. N shows a surprisingly linear correlation (plotted line: 0.76N - 1.0).]
    [Figure 4.7: $\vartheta$ determined by $\vartheta_{calc}(E, N) = E/(aN - b)$ vs. measurements, with a = 0.72, b = 0.89 (left: $\vartheta$ calculated vs. $\vartheta$ measured; right: $\Delta\vartheta$ vs. N). E is the cluster's total number of edges.]
  • 102. 5. Barabási-Albert network models
    5.1. Modified Barabási-Albert model
    The network model of Barabási and Albert [30] was introduced in section 2.5. We pointed out that it fits empirical networks rather well but lacks support for disjoint ones, as the algorithm only delivers one giant cluster.
    Thus, to cope with networks consisting of several components, we must modify the model. We chose a very simple approach: in each step of adding nodes, we start a new cluster of $m_0 = 3$ nodes with probability p. Vertices added in subsequent time steps can connect to any node in any component, respecting the same probability rule as in the standard model.
    - In the case m = 1, components can only grow (isolated clusters),
    - whereas in the case m > 1, new nodes are able to connect two or more existing components of the network (merging clusters).
    5.2. Simulation
    In order to compare simulation results with real-world data from a scientific collaboration network [55], we let the network grow up to a size of 555 nodes. For proper statistics, this is repeated $10^4$ times.
    5.2.1. Isolated clusters
    Scale-free behavior: In the case m = 1, i.e. considering isolated clusters, we can be sure to get scale-free behavior within the distinct clusters, as the probabilities for attachment of a new node to an existing one are the same as in a single Barabási-Albert network (modulo a constant factor due to a new node having the choice between different clusters to connect to). However, the complete network is a priori not necessarily scale-free, as the total statistics is made up of the sum of all scale-free sub-networks or clusters. So we have to return later to the question of whether scale-free behavior prevails.
    Cluster size distribution: Next, we examine the number of clusters of different sizes (figure 5.1). We find that a high probability of starting a new net leads to many smaller networks, whereas low values favor bigger networks. Yet, we make an interesting observation: low probabilities lead to a cluster-size distribution that is no longer monotonic but favors big networks. Looking at figure 5.1, which shows the number of nodes in clusters of a given size instead of the sheer cluster count, makes this more plausible. For p = 0, we will see a graph $\propto \delta(555)$, as there is only one giant cluster; for p = 1, a graph $\propto \delta(m_0 = 3)$, because there are only embryonic sub-nets. What we observe for 0 < p < 1 is the transition between both extremes.
    [Figure 5.1: Frequency of clusters (left) resp. number of nodes in clusters of a given size (right) vs. cluster size at different probabilities for a new net (p = 0.01, 0.04, 0.1, 0.4, 0.8). The simulation was run $10^4$ times with a network growing up to 555 nodes. The curve for p = 0.01 is the one with the rightmost peak; to the left follow the other p-values in descending order.]
    [Figure 5.2: Negative exponent of the power-law part of the curves in figure 5.1 vs. probability p for a new net. The line corresponds to exponent $= e^{2.25p}$.]
    Power law region: For all p, the distribution starts with a power-law region covering small and medium cluster sizes. The exponent varies with the network-birth probability p. The semi-logarithmic plot in figure 5.2 shows that the exponential relation $e^{2.25p}$ describes our data rather well. Of course, this formula cannot be true for general p, as for $p \to 1$ we expect $m \to \infty$!
    Regarding the empirical data from the collaboration graph of section 3.2, we find good overlap with the model (figure 5.3). Interestingly, the model is even able to explain facts formerly regarded as statistical anomalies, such as the observation of a giant cluster whose size largely exceeds that of all others in the network (section 3.2.2).
    [Figure 5.3: Semi-logarithmic plot comparing the simulation with p = 0.04 using the isolated clusters model of figure 5.1 and statistical data from a science collaboration network (section 3.2); frequency vs. total authors in clusters of given size.]
    5.2.2. Merging clusters
    Now, we modify the model by examining m > 1. In this case, newly added vertices develop several links to existing nodes (and thus existing clusters), being able to connect hitherto separated networks. In this paper, we limit our considerations to the standard Barabási-Albert case $m = m_0 = 3$.
    Using different p, we quickly recognize that low and medium probabilities make the simulations nearly always end up with a single giant cluster containing all vertices. The points of interest are higher p in the region of 60 to 90%.
    Cluster size distribution: Again, we plot the total number of nodes contained in clusters of a given size (figure 5.4). For small cluster sizes, we observe a non-uniform behavior of the graph. The explanation is as follows: newly born clusters have a size of $m_0 = 3$ and thus appear very often. Also, clusters of sizes 4 or 7 are very probable, whereas a cluster of size 5 is very rare, because it can only be formed by a new cluster to which two new nodes have connected without gluing it to a second cluster.
    In a semi-logarithmic plot (figure 5.4), we find a parabolic dependence for high cluster sizes (i.e. a Gaussian distribution around a mean depending on p). Apparently, the merging clusters model cannot cope with reality.
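The isolated-clusters variant (m = 1) of the modified model can be sketched as below. This is a minimal illustration, not the simulation program used for the figures. Several details are assumptions made here because the text does not fix them: the function name `grow_isolated_clusters`, the wiring of each seed cluster as a fully connected triangle, starting from one seed cluster, and attaching a single node whenever fewer than $m_0$ nodes remain before the target size is reached. The target size is assumed to be at least $m_0$.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Grows a network to `target` nodes and returns the size of every cluster.
// With probability p a step starts a new cluster of m0 = 3 nodes; otherwise one
// node is attached with m = 1 link, chosen proportionally to the degree of the
// existing nodes (preferential attachment).
std::vector<int> grow_isolated_clusters(int target, double p, unsigned seed) {
    const int m0 = 3;
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> coin(0.0, 1.0);
    std::vector<int> cluster_of;   // cluster index of every node
    std::vector<int> endpoints;    // one entry per edge end, so nodes appear once per degree
    std::vector<int> sizes;        // number of nodes per cluster

    auto new_cluster = [&]() {
        const int first = static_cast<int>(cluster_of.size());
        const int c = static_cast<int>(sizes.size());
        sizes.push_back(m0);
        for (int i = 0; i < m0; ++i) cluster_of.push_back(c);
        for (int i = 0; i < m0; ++i)           // assumed seed wiring: triangle
            for (int j = i + 1; j < m0; ++j) {
                endpoints.push_back(first + i);
                endpoints.push_back(first + j);
            }
    };

    new_cluster();                             // assumed initial condition
    while (static_cast<int>(cluster_of.size()) < target) {
        const int remaining = target - static_cast<int>(cluster_of.size());
        if (remaining >= m0 && coin(rng) < p) {
            new_cluster();
        } else {
            std::uniform_int_distribution<std::size_t> pick(0, endpoints.size() - 1);
            const int old_node = endpoints[pick(rng)];
            const int new_node = static_cast<int>(cluster_of.size());
            const int c = cluster_of[old_node];
            cluster_of.push_back(c);           // the new node joins the cluster of its target
            ++sizes[c];
            endpoints.push_back(old_node);
            endpoints.push_back(new_node);
        }
    }
    return sizes;
}
```

Calling `grow_isolated_clusters(555, 0.04, seed)` for many different seeds and histogramming the returned sizes would give a cluster size distribution of the kind discussed in section 5.2.1. The two extremes behave as described in the text: p = 0 yields a single cluster of 555 nodes, p = 1 yields only clusters of size 3.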
  • 119. [Figure 5.5: Frequency of nodes with a certain degree (frequency vs. degree). The simulation was run $10^4$ times with networks growing up to 555 nodes. Left plot is linear, right plot semi-logarithmic. M = 1 (isolated clusters).]
    [Figure 5.6: Same plots as figure 5.5 using M = 3 (merging clusters).]
    5.2.3. Scale-free behavior
    In figure 5.5 we can see that there is no pure scale-free behavior. There seems to be power-law behavior for small degrees and an exponential cutoff (figure 5.5) at higher values. Similar results have been observed by Newman [7] for collaboration networks.
    One could argue that this effect is due to the fact that we do not plot the degree distribution for single clusters, but for the whole set of them. This objection holds only at first sight, though. At p = 80% we have several small clusters but virtually only one giant cluster dominating the degree distribution for high degrees. So, the averaging over many clusters of different sizes should manifest itself mainly in the region of small degrees, contrary to our observations.
    Mossa et al. [44] offer a possible explanation for the exponential cutoff encountered. They use a model which attributes to each node only a restricted knowledge of the network, i.e. the vertex is not able to consider the whole graph's structure, but only a subset according to its limited view.
    [Figure 5.4: Number of nodes in clusters of a given size vs. cluster size at different probabilities for a new net (p = 70%, 80%, 90%). The simulation was run $10^5$ times with a network growing up to 555 nodes.]
    Table 5.1: Coefficients for the fits in figure 5.7
    M   p     $\ln(N_0)$   $k$    $\kappa$
    1   1%    16.8         2.27   60.
    1   40%   17.1         2.45   9.2
    3   60%   18.1         3.1    5.1
    3   80%   20.1         4.8    3.5
    [Figure 5.7: Plots of figure 5.5 and figure 5.6 (frequency vs. degree for M = 1, p = 1%; M = 1, p = 40%; M = 3, p = 60%; M = 3, p = 80%) show a good fit to exponentially truncated power laws $y = N_0 x^{-k} e^{-x/\kappa}$.]
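The fit function of figure 5.7 is easy to evaluate. The sketch below assumes the form $y = N_0 x^{-k} e^{-x/\kappa}$ with a power-law exponent k and a cutoff scale, here called `kappa`; the function name is hypothetical and the symbol for the cutoff scale is itself a reconstruction from the garbled transcript.

```cpp
#include <cmath>

// Exponentially truncated power law y = N0 * x^(-k) * exp(-x / kappa),
// parameterized by ln(N0) as in table 5.1. Requires x > 0 and kappa > 0.
double truncated_power_law(double x, double ln_N0, double k, double kappa) {
    return std::exp(ln_N0 - k * std::log(x) - x / kappa);
}
```

Reading the first row of table 5.1 as $\ln(N_0) = 16.8$, $k = 2.27$ and a cutoff scale of 60 (an interpretation of the column order), `truncated_power_law(x, 16.8, 2.27, 60.0)` follows a pure power law for degrees well below 60 and falls off exponentially above.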
  • 130. 6. Conclusion
    Dealing with real-world networks, scientists found three predominant properties:
    - short average path lengths (Small World Effect),
    - scale-free behavior,
    - high clustering.
    Different models were developed to cope with this challenge, each having different advantages and disadvantages. The model of Barabási and Albert [30] is a promising one, but lacks support for non-contiguous networks.
    We constructed a network of co-authorship with 555 authors. Only scientists who cite a specific paper [30] were chosen. The resulting net shows scale-free characteristics but differs substantially from the results of accepted computer models.
    Simulating Ising models on the network reveals strong robustness against disturbances (spin flip experiment/leadership effect) and shows coherence with mean-field theory: we find the critical temperature of subnets of our graph to be proportional to the average number of edges per site, in the limit of a high node count.
    In order to overcome the mentioned disadvantages of the Barabási-Albert model, we developed a modified version, allowing the formation of multiple clusters. We saw that the network structure depends strongly on the number of edges of a newly added node, separating two cases: isolated clusters and merging clusters. Only the first case leads to results fitting reality.
    Comparison with statistics from our collaboration net shows similar behavior and is even able to explain facts at first regarded as statistical anomalies, such as the observation of a giant cluster whose size largely exceeds that of all others in the network. Even the exponential cutoff of nodes with high degrees, as encountered empirically, is reproduced.
    Re-evaluating the model with a higher number of authors would lead to better statistics and greater reliability.
  • 139. A. Acknowledgements
    I would like to thank D. Stauffer(1) for giving me the idea of the subject and supporting me with comments and discussions during my research.
    Thanks to M. Abd-Elmeguid(2) for co-judging this work.
    My thanks for writing excellent software go to the authors of
    - LaTeX, BibTeX, pdfLaTeX, dvips
    - LaTeX packages: KOMA-Script, natbib, custom-bib, graphicx, listings, units, hyperref, hypernat, colortbl, color
    - SciTE
    - gnuplot
    - Ruby, C++, Perl, bash
    - dotty, neato [39]
    Thanks to A. Sindermann for supporting my work by addressing computer network problems.
    Special thanks to K. Godthardt for supporting me.
    (1) Institute of Theoretical Physics, University of Cologne
    (2) II. Institute of Experimental Physics, University of Cologne
  • 141. B. Source code
    B.1. Ising model
    This C++ program simulates an Ising model on a given graph. In section 4.1.1 it was used on our collaboration network.

    // This program reads network data from a file and simulates
    // an Ising model on this graph.
    // Metropolis probabilities are used.
    // Felix Puetsch <fxp@thp.uni-koeln.de>, 2003-01-22

    #include <iostream>
    #include <fstream>
    #include <stdio.h>
    #include <stdlib.h>   // exit, abs
    #include <math.h>     // exp
    #include <assert.h>

    #define MAX_INT 2147483647
    #define MAX_CONN 30
    #define MAX_NODE 600

    #define NODOUBLE 1    // 1: count only unique links, 0: count multiple links

    // #define NOASSERT

    using namespace std;

    // === random number generator ================================

    class Random {
    private:
        int state;
    public:
        Random(int seed);
        // multiplicative congruential generator (alternative multiplier: 16807);
        // the product is formed in unsigned arithmetic to avoid signed overflow
        int get() { return state = (int)((unsigned)state * 65539u); }
    };

    Random::Random(int seed) {
        assert(seed % 2 == 1);
        state = seed;
    }

    // === Vertex =================================================

    class Ising;

    class Vertex {
    private:
        int number, conn_count, spin;
        Vertex *neighbour[MAX_CONN];
        Ising *ising;
    public:
        Vertex(Ising *is, int nr);
        ~Vertex();
        int getSpin() { return spin; }
        void setSpin(int s) { spin = s; }
        void addConn(Vertex *to, int nodouble = 0);
        int getNumber() { return number; }
        int getConnCount() { return conn_count; }
        Vertex *getConn(int i);
        int simulateStep();
    };

    // === Ising ==================================================

    class Ising {
    public:
        Random *rnd;
        int enlimit[2 * MAX_CONN + 1];
    private:
        ifstream net;
        char buffer[80];
        int vcount;
        Vertex *vlist[MAX_NODE];
    public:
        Ising(char *fname);
        void build_net();
        void debug(int nr);
        void reset();
        void simulate(double kT, int maxtime = 1, int steptime = 0);
    };

    // === Vertex =================================================
    // ... Constructor and destructor .............................

    Vertex::Vertex(Ising *is, int nr) {
        conn_count = 0;
        ising = is;
        number = nr;
    }

    Vertex::~Vertex() {
        cout << "~Vertex" << endl;
    }

    void Vertex::addConn(Vertex *to, int nodouble) {
        if (nodouble)                          // skip links that already exist
            for (int i = 0; i < conn_count; i++)
                if (neighbour[i] == to) return;
        neighbour[conn_count++] = to;
        assert(conn_count < MAX_CONN);
    }

    Vertex *Vertex::getConn(int i) {
        assert(i < conn_count);
        return neighbour[i];
    }

    int Vertex::simulateStep() {
        int spinsum = 0;
        for (int i = 0; i < conn_count; i++)
            spinsum += neighbour[i]->getSpin();
        spinsum *= spin;                       // energy change of a flip is 2*J*spinsum
        if (ising->rnd->get() < ising->enlimit[spinsum + MAX_CONN])
            spin *= -1;
        return spin;
    }

    // === Ising ==================================================
    // ... Constructor ............................................

    Ising::Ising(char *fname) {
        cout << "Ising" << endl;
        rnd = new Random(1);
        net.open(fname);
        if (!net.is_open()) {
            cerr << "input file not found" << endl;
            exit(1);
        }
        build_net();
        for (float kT = 50; kT < 50.1; kT += 0.2) {
            cerr << "starting simulation with kT=" << kT << endl;
            reset();
            simulate(kT, 1000000, 100);
        }
    }

    // ... Methods ................................................

    void Ising::build_net() {
        cout << "build_net" << endl;
        for (int i = 0; i < MAX_NODE; i++) {
            vlist[i] = new Vertex(this, i);
            vlist[i]->setSpin(1);
        }
        int from, to;
        while (!net.eof()) {
            net.getline(buffer, 80);
            if (sscanf(buffer, "%i %i;", &from, &to) != 2) {
                cerr << "input line ignored: '" << buffer << "'" << endl;
                continue;
            }
            assert(from < MAX_NODE);
            assert(to < MAX_NODE);
            vlist[from]->addConn(vlist[to], NODOUBLE);
            vlist[to]->addConn(vlist[from], NODOUBLE);
        }
        net.close();
        vcount = 0;
        for (int i = 1; i < MAX_NODE; i++)
            if (vlist[i]->getConnCount() > 0) vcount = i;
            else vlist[i]->setSpin(0);         // unconnected nodes carry no spin
        cerr << vcount << " nodes" << endl;
    }

    void Ising::debug(int nr) {
        Vertex *node = vlist[nr];
        int nb = node->getConnCount();
        cout << "node " << nr << " has " << nb << " connections:" << endl;
        for (int i = 0; i < nb; i++)
            cout << node->getConn(i)->getNumber() << endl;
    }

    void Ising::reset() {
        for (int i = 0; i < MAX_NODE; i++)
            vlist[i]->setSpin(abs(vlist[i]->getSpin()));
    }

    // ... Simulation .............................................

    void Ising::simulate(double kT, int maxtime, int steptime) {
        if (!steptime) steptime = maxtime;
        // thresholds for the random numbers: a flip with probability p is
        // accepted if the random int is below MAX_INT * (2p - 1)
        for (int i = -MAX_CONN; i <= MAX_CONN; i++) {
            double p = exp(-2. * i / kT);
            if (p > 1.) p = 1.;                // clamp, keeps the int conversion in range
            enlimit[i + MAX_CONN] = (int)(MAX_INT * (2 * p - 1));
        }
        int mag, time = 0;
        while (time < maxtime) {
            for (int step = 0; step < steptime; step++) {
                mag = 0;
                for (int nr = 1; nr <= vcount; nr++) {
                    assert(vlist[nr] != NULL);
                    mag += vlist[nr]->simulateStep();
                }
            }
            time += steptime;
            cout << time << " " << kT << " " << abs(mag) << " " << mag << endl;
        }
    }

    // === main ===================================================

    int main(int argc, char **argv) {
        assert(cerr << "debug mode on" << endl);
        assert(argc == 2);
        Ising ising(argv[1]);
        return 0;
    }
  • 146. B.2. Spin flip model
    This C++ program simulates an Ising model on a given graph. In regular time intervals, the most connected spins (hard coded) are pinned in their flipped position (spin -1, cf. section 4.1.4). In section 4.1.4 this was used on our collaboration network.

    // This program reads network data from a file and simulates
    // an Ising model on this graph.
    // Additionally, every ... time steps the next most connected
    // spin is flipped permanently.
    // Metropolis probabilities are used.
    //
    // Felix Puetsch <fxp@thp.uni-koeln.de>, 2003-02-05

    #include <iostream>
    #include <fstream>
    #include <stdio.h>
    #include <stdlib.h>   // exit, abs
    #include <math.h>     // exp
    #include <assert.h>

    #define MAX_INT 2147483647

    #define MAX_CONN 40
    #define MAX_NODE 600

    #define NODOUBLE 0    // 1: count only unique links, 0: count multiple links

    // #define NOASSERT

    using namespace std;

    // === random number generator ================================

    class Random {
    private:
        int state;
    public:
        Random(int seed);
        // multiplicative congruential generator (alternative multiplier: 16807);
        // the product is formed in unsigned arithmetic to avoid signed overflow
        int get() { return state = (int)((unsigned)state * 65539u); }
    };

    Random::Random(int seed) {
        assert(seed % 2 == 1);
        state = seed;
    }

    // === Vertex =================================================

    class Ising;

    class Vertex {
    private:
        int number, conn_count, spin, sticky;
        Vertex *neighbour[MAX_CONN];
        Ising *ising;
    public:
        Vertex(Ising *is, int nr);
        ~Vertex();
        int getSpin() { return spin; }
        void stickIt();
        void setSpin(int s) { spin = s; }
        void addConn(Vertex *to, int nodouble = 0);
        int getNumber() { return number; }
        int getConnCount() { return conn_count; }
        Vertex *getConn(int i);
        int simulateStep();
    };

    // === Ising ==================================================

    class Ising {
    public:
        Random *rnd;
        int enlimit[2 * MAX_CONN + 1];
    private:
        ifstream net;
        char buffer[80];
        int vcount;
        Vertex *vlist[MAX_NODE];
    public:
        Ising(char *fname);
        void build_net();
        void debug(int nr);
        void reset();
        void simulate(double kT, int maxtime = 1, int steptime = 0, int starttime = 0);
    };

    // === Vertex =================================================
    // ... Constructor and destructor .............................

    Vertex::Vertex(Ising *is, int nr) {
        conn_count = 0;
        sticky = 0;
        ising = is;
        number = nr;
    }

    Vertex::~Vertex() {
        cout << "~Vertex" << endl;
    }

    void Vertex::addConn(Vertex *to, int nodouble) {
        if (nodouble)                          // skip links that already exist
            for (int i = 0; i < conn_count; i++)
                if (neighbour[i] == to) return;
        neighbour[conn_count++] = to;
        assert(conn_count < MAX_CONN);
    }

    Vertex *Vertex::getConn(int i) {
        assert(i < conn_count);
        return neighbour[i];
    }

    void Vertex::stickIt() {
        sticky = 1;                            // pinned spins are never updated again
        spin = -1;
        cerr << number << " sticked." << endl;
    }

    int Vertex::simulateStep() {
        if (sticky != 0) return spin;
        int spinsum = 0;
        for (int i = 0; i < conn_count; i++)
            spinsum += neighbour[i]->getSpin();
        spinsum *= spin;                       // energy change of a flip is 2*J*spinsum
        if (ising->rnd->get() < ising->enlimit[spinsum + MAX_CONN])
            spin *= -1;
        return spin;
    }

    // === Ising ==================================================
    // ... Constructor ............................................

    Ising::Ising(char *fname) {
        cout << "Ising" << endl;
        rnd = new Random(1);
        net.open(fname);
        if (!net.is_open()) {
            cerr << "input file not found" << endl;
            exit(1);
        }
        build_net();
        // hard-coded list of the most connected nodes, in the order in which they are pinned
        static const int hubs[20] = {27, 329, 116, 223, 251, 237, 491, 365, 7, 418,
                                     381, 199, 398, 15, 492, 461, 371, 249, 80, 486};
        double kT = 0.2;
        for (int i = 1; i <= 100; i++) {
            cerr << "starting simulation with kT=" << kT << endl;
            reset();
            simulate(kT, 100000, 1000);        // equilibration
            for (int k = 0; k < 20; k++) {
                vlist[hubs[k]]->stickIt();
                simulate(kT, 10000, 1000, 100000 + 10000 * k);   // relaxation
                assert(vlist[hubs[k]]->getSpin() == -1);
            }
        }
    }

    // ... Methods ................................................

    void Ising::build_net() {
        cout << "build_net" << endl;
        for (int i = 0; i < MAX_NODE; i++) {
            vlist[i] = new Vertex(this, i);
            vlist[i]->setSpin(1);
        }
        int from, to;
        while (!net.eof()) {
            net.getline(buffer, 80);
            if (sscanf(buffer, "%i %i;", &from, &to) != 2) {
                cerr << "input line ignored: '" << buffer << "'" << endl;
                continue;
            }
            assert(from < MAX_NODE);
            assert(to < MAX_NODE);
            vlist[from]->addConn(vlist[to], NODOUBLE);
            vlist[to]->addConn(vlist[from], NODOUBLE);
        }
        net.close();
        vcount = 0;
        for (int i = 1; i < MAX_NODE; i++)
  • 150. B. Source code i f ( v l i s t [ i ]getConnCount ( ) 0) v c o u n t=i ; 210 e l s e v l i s t [ i ]s e t S p i n ( 0 ) ; c e r r v c o u n t nodes e n d l ; g void I s i n g : : debug ( i n t nr ) f 215 Ve r t e x node = v l i s t [ nr ] ; i n t nb = nodegetConnCount ( ) ; cout node nr has nb c o n n e c t i o n s : e n d l ; f o r ( i n t i =0; inb ; i++) 220 cout nodegetConn ( i )getNumber ( ) e n d l ; g void I s i n g : : r e s e t ( ) f f o r ( i n t i =0; iMAX NODE; i++) 225 v l i s t [ i ]s e t S p i n ( abs ( v l i s t [ i ]g e tSp i n ( ) ) ) ; g // ... Simulation ............................................. 230 void I s i n g : : s imu l a t e ( double kT , i n t maxtime , i n t s t e p t ime , i n t s t a r t t ime ) f i f ( ! s t e p t ime ) s t e p t ime=maxtime ; f o r ( i n t i=MAX CONN; i=MAX CONN; i++) e n l imi t [ i+MAX CONN] = ( i n t ) (MAX INT ( 2 exp (2. i /kT)1) ) ; 235 i n t mag , t ime=s t a r t t ime ; whi le ( t imes t a r t t ime maxtime ) f f o r ( i n t s t e p =0; s t eps t e p t ime ; s t e p++) f mag = 0 ; f o r ( i n t nr =1; nr=v c o u n t ; nr++) f 240 a s s e r t ( v l i s t [ nr ] !=NULL) ; mag += v l i s t [ nr ]s imu l a t e S t e p ( ) ; g g t ime += s t e p t ime ; 245 cout t ime kT mag e n d l ; g g // === main =================================================== 250 i n t main ( i n t argc , char argv ) f a s s e r t ( c e r r debug mode on e n d l ) ; I s i n g i s i n g ( ne t . t x t ) ; r e turn 0 ; 255 g 50
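The threshold table that simulate() fills reads as the Metropolis rule: a spin whose product with the sum of its neighbours' spins is s flips with probability min(1, exp(-2s/kT)). The following is a minimal sketch of that table using plain floating-point probabilities instead of integer thresholds. It is written in Ruby, like the program in the next section, and assumes a coupling constant of 1; the names flip_probability and flip_table are hypothetical and do not appear in the thesis.

```ruby
# Metropolis acceptance probability for flipping a spin whose product
# (own spin) * (sum of neighbour spins) equals s, at temperature kt.
# Assumption: coupling constant J = 1, so a flip changes the energy by 2*s.
def flip_probability(s, kt)
  [1.0, Math.exp(-2.0 * s / kt)].min
end

# Precompute the probabilities once per temperature.
# The entry for product s is stored at index s + max_conn.
def flip_table(max_conn, kt)
  (-max_conn..max_conn).map { |s| flip_probability(s, kt) }
end

table = flip_table(3, 0.2)
table.each_with_index do |p, idx|
  printf("s = %2d  p = %.6f\n", idx - 3, p)
end
```

Flips that lower or keep the energy (s <= 0) are always accepted, while flips against the local majority become exponentially rare as kT drops.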
  • 153. Modified Barabási-Albert model
This Ruby program creates modified Barabási-Albert models with a given final size. In section 5 this was used on our collaboration network.

#!/home/fxp/bin/ruby -w

$P = 0.8
$RUNS = 1E0.to_i
$M = 1
$cluster_initial_size = 3
$MAXINT = 2147483647*2+1

class Random
  @@ibm = 1
  def rnd(max=nil)
    @@ibm *= 65539 # 16807 # 65539
    @@ibm &= $MAXINT
    max ? @@ibm*max/$MAXINT : @@ibm
  end
end

class Vertex
  attr_reader :connections
  attr_accessor :nr
  @@total_vertices = 0
  def initialize
    @connections = []
    @@total_vertices += 1
    @nr = @@total_vertices
  end
  def inspect
    "I'm node nr. #{@nr} connected to #{@connections.collect{|c| c.nr}.join(', ')}."
  end
  def add_link(partner)
    @connections << partner
  end
  def connect(partner)
    add_link(partner)
    partner.add_link(self)
  end
end

class BA_net
  def initialize(prob_new=0)
    @p_new = (prob_new*$MAXINT).to_i
    @nodes = []
    @kertesz = []
    @r = Random.new
    100.times { @r.rnd }
    start_new_net
  end
  def size
    @nodes.length
  end
  def start_new_net
    max = @nodes.length
    $cluster_initial_size.times { @nodes << Vertex.new }
    $cluster_initial_size.times { |i|
      orig = i + max
      dest = ((i+1) % $cluster_initial_size) + max
      @nodes[orig].connect(@nodes[dest])
      @kertesz << orig << dest
    }
  end
  def inspect
    @nodes.collect{|node| node.inspect}.join("\n") + "\n#{@kertesz.inspect}"
  end
  def add_node
    if @r.rnd < @p_new
      start_new_net
    else
      @nodes << (new_node = Vertex.new)
      l = @kertesz.length
      $M.times {
        dest = @kertesz[@r.rnd(l-1)]
        new_node.connect(@nodes[dest])
        @kertesz << @nodes.length-1 << dest
      }
    end
  end
  def check_subtree(node, subtree)
    return if !node.nr
    subtree << node.nr
    node.nr = nil
    node.connections.each {|n| check_subtree(n, subtree)}
  end
  def analysis
    @nodes.each { |node|
      next if !node.nr
      tree = []
      check_subtree(node, tree)
      # p tree.sort
      # [1] [2,3] [4,5,6,7] [8...] ...
      bucket = tree.length
      if !$statistik[bucket]
        $statistik[bucket] = 1
      else
        $statistik[bucket] += 1
      end
    }
  end
end

$statistik = {}
$RUNS.times { |i|
  $stderr.print "#{i}..." if i%100==0
  mynet = BA_net.new($P)
  begin
    mynet.add_node
  end while mynet.size < 555
  # p mynet
  mynet.analysis
}
$statistik.keys.sort.each {|l| printf "%3i %6.4f\n", l, $statistik[l] } # .to_f/$RUNS }
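The listing above appears to implement preferential attachment through the @kertesz array: every edge appends both of its endpoints, so a node of degree k occurs k times and a uniform draw from the array picks nodes in proportion to their degree. The following is a minimal standalone sketch of that idea, assuming one link per new node and a three-node starting cluster as in the listing. It uses Ruby's built-in Random instead of the thesis's own generator, and the name grow_network and the fixed seed are hypothetical.

```ruby
# Grow a network in which each step either starts a new three-node
# cluster (probability p_new) or attaches one new node to an existing
# node chosen in proportion to its degree.
# Returns the flat list of edge endpoints: entries 2k and 2k+1 form edge k.
def grow_network(final_size, p_new, rng = Random.new(42))
  endpoints = []
  n = 0                                   # number of nodes created so far
  start_cluster = lambda do
    # connect the three new nodes n, n+1, n+2 in a ring
    3.times { |i| endpoints << n + i << n + (i + 1) % 3 }
    n += 3
  end
  start_cluster.call
  while n < final_size
    if rng.rand < p_new
      start_cluster.call
    else
      # a uniform draw from the endpoint list is degree-proportional
      target = endpoints[rng.rand(endpoints.length)]
      endpoints << n << target
      n += 1
    end
  end
  endpoints
end

endpoints = grow_network(555, 0.8)
degree = Hash.new(0)
endpoints.each { |v| degree[v] += 1 }
puts "nodes: #{degree.size}, edges: #{endpoints.length / 2}"
```

Drawing from the endpoint list avoids recomputing degrees at every step, at the cost of storing two integers per edge. Because clusters arrive three nodes at a time, the final network can exceed final_size by up to two nodes.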
  • 161. All figures found in this publication were made by myself, except the following:
  • 162. figure 1.1 was found on http://www.math.colostate.edu/~betten/courses/M501/combi.html.
  • 163. figure 2.1 was found in [14].
  • 164. figure 2.3 was found on http://www.science.nd.edu/physics/Faculty/barabasi.html.
  • 165. figure 2.3 was found on http://www.phys.psu.edu/~ralbert/.
  • 166. figure 2.2 was found in [18].
  • 167. figure 4.1 was found on http://www.physik.tu-dresden.de/itp/members/kobe/isingphbl/.
  • 169. Bibliography
[1] L. Euler. Solutio problematis ad geometriam situs pertinentis. Commentarii Academiae Scientiarum Imperialis Petropolitanae 8 (1736).
[2] R. Albert and A.-L. Barabási. Statistical mechanics of complex networks. Rev. Mod. Phys. 74, 47–97 (2002).
[3] S. N. Dorogovtsev and J. F. F. Mendes. Evolution of networks. Adv. Phys. 51, 1079–1187 (2002).
[4] A.-L. Barabási. Linked. The New Science of Networks (Perseus, Cambridge, Massachusetts, 2002).
[5] K. Rieger. Sind Sie mit Marlon Brando befreundet? Die Zeit 44, BL21 (1999).
[6] A.-L. Barabási, R. Albert, and H. Jeong. Scale-free characteristics of random networks: the topology of the world-wide web. Physica A 281, 69–77 (2000).
[7] M. E. J. Newman. Scientific collaboration networks. Network construction and fundamental results. Phys. Rev. E 64, 016131 (2001).
[8] M. E. J. Newman. Scientific collaboration networks. Shortest paths, weighted networks, and centrality. Phys. Rev. E 64, 016132 (2001).
[9] H. Ebel, L.-I. Mielsch, and S. Bornholdt. Scale-free topology of e-mail networks. Phys. Rev. E 66, 035103 (2002).
[10] S. Redner. How popular is your paper? An empirical study of the citation distribution. Europ. Phys. J. B 4, 131–134 (1998).
[11] A. J. Lotka. The frequency distribution of scientific productivity. J. Wash. Acad. Sc. 16, 317–323 (1926).
[12] F. Liljeros, C. R. Edling, et al. The web of human sexual contacts. Nature 411, 907 (2001).
[13] S. N. Dorogovtsev and J. F. F. Mendes. Evolution of networks: From Biological Nets to the Internet and WWW (Oxford University Press, Oxford, 2003).
[14] S. H. Strogatz. Exploring complex networks. Nature 410, 268–276 (2001).
[15] S. Milgram. The small world problem. Psych. Today 61 (1967).
[16] D. J. Watts and S. H. Strogatz. Small world. Nature 393, 440–442 (1998).
[17] D. Watts. Small worlds: the dynamics of networks between order and randomness (Princeton University Press, Princeton, New Jersey, 1999).
[18] M. E. J. Newman. Models of the small world: a review (2000). ArXiv:cond-mat/0001118.
[19] M. Gladwell. The Tipping Point. How little things can make a big difference. (Little Brown and Company, Boston, Massachusetts, 2000).
[20] R. Cohen, K. Erez, D. ben Avraham, and S. Havlin. Resilience of the internet to random breakdowns. Phys. Rev. Lett. 85, 21, 4626–4628 (2000).
[21] A. L. Lloyd and R. M. May. How viruses spread among computers and people. Science 292, 1316–1317 (2001).
[22] M. E. J. Newman, S. Forrest, and J. Balthrop. Email networks and the spread of computer viruses. Phys. Rev. E 66, 035101 (2002).
[23] B. J. Kim, C. N. Yoon, S. K. Han, and H. Jeong. Path finding strategies in scale-free networks. Phys. Rev. E 65, 027103 (2002).
[24] J. M. Kleinberg. Navigation in a small world. Nature 406, 845 (2000).
[25] ISI Web of Science (2002). http://www.isinet.com/isi/products/citation/wos/index.html.
[26] R. Albert, H. Jeong, and A.-L. Barabási. Diameter of the world-wide web. Nature 401, 130 (1999).
[27] R. Solomonoff and A. Rapoport. Connectivity of random nets. Bull. Math. Biophys. 13, 107–227 (1951).
[28] P. Erdős and A. Rényi. On random graphs. Pub. Mathem. 6, 290–297 (1959).
[29] M. E. J. Newman. Random graphs as models of networks (2002). ArXiv:cond-mat/0202208.
[30] A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science 286, 509–512 (1999).
[31] St. Matthew. New Testament. In The Bible, chap. 12, 13 (God, ca. 10AD).
[32] D. Cohen. All the world's a net. New Scientist 174, 24–29 (2002).
[33] A.-L. Barabási and E. Bonabeau. Scale-free networks. Sci. Amer. 3, 50–59 (2003).
[34] K. Klemm and V. M. Eguíluz. Growing scale-free networks with small-world behavior. Phys. Rev. E 65, 057102 (2002).
[35] E. Ravasz and A.-L. Barabási. Hierarchical organization in complex networks. Phys. Rev. E 67, 026112 (2003).
[36] K. Klemm and V. M. Eguíluz. Highly clustered scale-free networks. Phys. Rev. E 65, 036123 (2002).
[37] A. F. J. van Raan. Fractal dimension of co-citations. Nature 347, 626 (1990).
[38] AT&T. Graphviz: open source graph drawing software (2002). http://www.research.att.com/sw/tools/graphviz/.
[39] S. C. North. Drawing graphs with neato. AT&T (2002).
[40] J. C. Venter, M. D. Adams, E. W. Myers, et al. The sequence of the human genome. Science 291, 1304–1351 (2001).
[41] A.-L. Barabási, H. Jeong, Z. Néda, E. Ravasz, A. Schubert, and T. Vicsek. Evolution of the social network of scientific collaborations. Physica A 311, 590–614 (2002).
[42] A.-L. Barabási. The physics of the web (2001). http://www.physicsweb.org/article/world/14/7/09.
[43] L. A. N. Amaral, A. Scala, M. Barthelemy, and H. E. Stanley. Classes of small-world networks. Proc. Nat. Acad. Sci. USA 97, 21, 509–512 (2000).
[44] S. Mossa, M. Barthelemy, H. E. Stanley, and L. A. N. Amaral. Truncation of power law behaviour in scale-free network models due to information filtering. Phys. Rev. Lett. 89, 208701 (2002).
[45] D. Stauffer and A. Aharony. Introduction to Percolation Theory (Taylor and Francis, London, 1994), second ed.
[46] A.-L. Barabási, R. Albert, and H. Jeong. Mean field theory for scale-free random networks. Physica A 272, 173–187 (1999).
[47] E. Ising. Beitrag zur Theorie des Ferromagnetismus. Zeitschr. f. Phys. 31, 253–258 (1925).
[48] W. Lenz. Beitrag zum Verständnis der magnetischen Erscheinungen in festen Körpern. Phys. Zeitschr. 21, 613–615 (1920).
[49] R. B. Potts. Some generalized order-disorder transformations. Proc. Cambridge Phil. Soc. 48, 106 (1952).
[50] F. Y. Wu. The Potts model. Rev. Mod. Phys. 54, 235–268 (1982).
[51] N. Metropolis, A. W. Rosenbluth, and A. H. Teller. Equation of state calculations by fast computing machines. J. Chem. Phys. 21, 6, 1087–1092 (1953).
[52] A. Aleksiejuk, J. A. Hołyst, and D. Stauffer. Ferromagnetic phase transition in Barabási-Albert networks. Physica A 310, 260–266 (2002).
[53] S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes. Ising model on networks with an arbitrary distribution of connections. Phys. Rev. E 66, 016104 (2002).
[54] K. Kacperski and J. A. Hołyst. Phase transitions as a persistent feature of groups with leaders in models of opinion formation. Physica A 287, 631–643 (2000).
[55] F. Putsch. Analysis and modeling of science collaboration networks. Adv. Compl. Sys. (2003). Submitted on May 6th.