This document is Felix Putsch's 2003 diploma thesis submitted to the Institute of Theoretical Physics at the University of Cologne under the supervision of Prof. D. Stauffer. The thesis analyzes and simulates scientific collaboration networks using network models such as regular lattices, Erdős-Rényi random networks, Watts-Strogatz small-world networks, and Barabási-Albert scale-free networks. It builds an empirical collaboration network from bibliographic data and compares its properties to these theoretical network models. The thesis also uses spin models on the empirical network to study phenomena like leadership effects and phase transitions.
I affirm (according to the Prüfungsordnung of July 12, 1996, §20(5)) having composed this diploma thesis alone, using no sources and tools other than those mentioned. Citations have been marked.
Hiermit versichere ich gemäß §20(5) der Prüfungsordnung vom 12. Juli 1996, dass ich diese Diplomarbeit alleine erstellt und keine anderen als die angegebenen Quellen und Hilfsmittel verwendet habe. Zitate wurden kenntlich gemacht.
* fxp@thp.uni-koeln.de
Though the mountains divide
And the oceans are wide
It's a small world after all
R. M. and R. B. Sherman
1. Introduction
1.1. Networks
In recent times, hearing the word network immediately evokes the idea of physically wired networks such as those formed by telephone lines or computer links. Yet a network is a concept a good deal more general than this.

Mathematically speaking, networks are graphs, i.e. a set of nodes (of whatever kind) connected by edges (links, connections) between certain pairs.
This abstraction has been known for a long time. Probably the first paper of graph theory was written by Euler [1] on the so-called bridge problem of Königsberg. Euler discusses whether or not it is possible to make a round walk, passing each of Königsberg's seven bridges exactly once (figure 1.1).
The concept of networks can be applied to many theoretical or experimental subjects [2-4], nodes being people [5], Internet servers [6], scientists [7, 8] or others; the range of links comprises e-mails [9], friendships [5], citations [10, 11] and more.

Thus, there are innumerable different kinds of networks: physical ones (e.g. hard-wired) as well as logical (e.g. dependencies) or social ones (e.g. contacts, friendships), stretching out to topics far from wired networks [9, 12]. The area is under vigorous research. Good reviews can be found in [2-4, 13, 14].
1.2. Six Degrees of Separation

From personal experience, nearly everybody has been confronted with what we call the small world effect. There are numerous examples:

At a party, we find out that we know the stranger we just started talking to through only a few intermediaries (or, technically speaking, we are separated from him by only a low degree). E.g., he could be our street neighbor's colleague's son. We hear: "My god, how small the world is."

Rumors are another example. We are astonished to experience the pace at which they spread. After a few hours, and thus only a few opportunities of telling rumors to others, the whole city seems to know.
Milgram [15] made an experiment on this. He instructed a set of people to try to send a letter to some stranger, only by using personal contacts. He found that an astonishingly short chain of social links is needed for this task, which entered everyday language as "Six Degrees of Separation". Recently, this has been reviewed on a more popular basis by a German weekly newspaper [5].
1.3. Small World Effect

Six Degrees of Separation is only one manifestation of a more general principle: the small world effect [16-18].
Figure 1.1.: Königsberg bridge problem: Is it possible to make a round trip, passing each bridge exactly once? [1]

Observations of many real-world networks in computer science, biology, chemistry, linguistics, sociology, etc. have revealed a crucial difference from regular lattices.
Regarding average (or sometimes maximum) path lengths on such networks, we would expect to see a linear increase with the number of nodes. Instead, we observe distances growing logarithmically with system size.

This behavior is not only an amusing effect but has far-reaching consequences [19]. Prominent examples are the Internet's stability against attacks [20], disease spreading [21, 22] or path finding strategies [23, 24].
1.4. Science Collaboration Networks

In the context of science, the network with scientists as the nodes of the graph is of particular interest. This network belongs to the group of social ones, with humans as nodes. Unlike many other forms of social relationships, which are mostly quite difficult to capture objectively, the field of published papers is covered very widely by the Science Citation Index [25] and so is easily available for research.
2. Network Models
There are many types of networks competing to describe observations made in socio-physics. After discussing which measurements describe a given network's structure, we will give a short overview of what we consider the most important models and discuss their advantages and possible disadvantages.
2.1. Measurements

2.1.1. Small World Effect

As illustrated in the introduction, we are interested in the correlation of network size and average (maximum) path lengths. We investigate whether the behavior is linear, logarithmic or otherwise. In the logarithmic case, the net is said to show the small world effect.
2.1.2. Clustering

In a friendship network, we often find friends of one person to be friends themselves. This is true for most social networks and even other ones. Links are not spread randomly but arranged in clusters.

To describe this mathematically, we introduce a clustering coefficient Ci of a node i, describing the portion of m established links between all ki next neighbors compared to the maximum possible number of M = ki(ki - 1)/2, i.e.

    Ci = m / M.

This value is averaged over all vertices to give a clustering coefficient C for the whole network.

Typical values observed are far above the results expected for random networks [2, p. 50].
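The definition above can be sketched in a few lines of Python; the function names and the adjacency-dict representation are our own illustrative choices, not part of the thesis.

```python
from itertools import combinations

def clustering_coefficient(adj, i):
    """C_i: fraction of the M = k(k-1)/2 possible links among the
    neighbors of node i that are actually present.

    adj: dict mapping node -> set of neighbor nodes.
    Returns 0.0 for nodes with fewer than two neighbors.
    """
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0
    max_links = k * (k - 1) / 2
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return links / max_links

def network_clustering(adj):
    """C: the average of C_i over all nodes."""
    return sum(clustering_coefficient(adj, i) for i in adj) / len(adj)
```

For a triangle (three mutually connected nodes) this gives C = 1; for a chain of three nodes it gives C = 0, as the two neighbors of the middle node are not linked.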
2.1.3. Scale-Free Behavior

A third observation regarding social networks is their distribution of degrees. Regarding the frequency of vertices of given coordination numbers, we do not find an exponential law but a power law [26].

In all, we have three possible means to classify networks. Many social graphs show small path lengths, high clustering and scale-free behavior.
2.2. Regular Lattices

The simplest form of a lattice is a symmetrical formation of nodes connected by edges between all pairs (or all pairs of adjacent nodes), as shown in figure 2.1a,b. A reason to choose this linking is e.g. to simulate neighborship in a town.

Such networks show a high degree of clustering, as desired. The average path lengths are very long, though, and scale with system size. So, small world behavior cannot be found, which makes the model inappropriate for our needs.
Figure 2.1.: network types: a, b regular lattices, c random network, d scale-free graph [14]
2.3. Erdős-Rényi Random Networks

Random networks are the extremum at the other side of the spectrum. A number of nodes is wired by pure chance, i.e. we roll the dice to select two nodes and place an edge between them.

Such graphs were first proposed by Solomonoff and Rapoport [27] and have been extensively studied by Erdős and Rényi [28]. Recent results have been reviewed in [29]. A typical result can be seen in figure 2.1c. We find that average path lengths behave logarithmically with network size. While this is as desired for small world simulation, there obviously is no clustering.
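The construction and the path-length measurement can be sketched as follows; `erdos_renyi` and `average_path_length` are illustrative names of our own, and the breadth-first search is one standard way to obtain shortest paths on an unweighted graph.

```python
import random
from collections import deque

def erdos_renyi(n, p, seed=0):
    """G(n, p): each of the n(n-1)/2 possible edges exists with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def average_path_length(adj):
    """Mean shortest-path length over all reachable pairs (BFS from every node)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(d for d in dist.values() if d > 0)
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")
```

Measuring `average_path_length` for growing n at fixed ⟨k⟩ = pN exhibits the logarithmic growth discussed above.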
2.4. Watts-Strogatz Small-World Networks

It is a plausible idea to try combining both presented models to sum up their corresponding advantages. A big step towards this goal was taken by Watts and Strogatz [16].

Their model starts with a circular graph that is regularly wired (figure 2.2). Step by step, edges are chosen by chance and rewired to an arbitrary destination node. Thus, a small fraction of links are long-range ones. To illustrate, this could be inhabitants of a street of neighbors having relationships with far-away relatives.

Figure 2.3.: Barabási and Albert [30]

At first glance, this model seems to fulfill our desires. It shows small average path lengths as well as high clustering. Looking at the degree distribution, i.e. the frequency of nodes of a certain degree, we find strong differences from real-world data, as there is no scale-free behavior.
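The ring-plus-rewiring construction can be sketched as below; this is a minimal illustration under our own naming, with a simple duplicate-avoidance rule rather than the exact procedure used in [16].

```python
import random

def watts_strogatz(n, k, p_rewire, seed=0):
    """Ring of n nodes, each linked to its k nearest neighbors on either side;
    each edge is then rewired to a random destination with probability p_rewire."""
    rng = random.Random(seed)
    edges = set()
    for i in range(n):
        for offset in range(1, k + 1):
            edges.add((i, (i + offset) % n))
    rewired = set()
    for (a, b) in edges:
        if rng.random() < p_rewire:
            # pick a new destination, avoiding self-loops and already-rewired pairs
            candidates = [c for c in range(n)
                          if c != a and (a, c) not in rewired and (c, a) not in rewired]
            b = rng.choice(candidates)
        rewired.add((a, b))
    adj = {i: set() for i in range(n)}
    for (a, b) in rewired:
        adj[a].add(b)
        adj[b].add(a)
    return adj
```

With `p_rewire = 0` this reproduces the regular ring (every node has degree 2k); small positive values introduce the long-range shortcuts that bring the average path length down.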
2.5. Barabási-Albert Network Model

Barabási and Albert [30] introduced a new idea. Their model consists of two ingredients: growth and preferential attachment.

We start with a graph of m0 = 3 vertices, each one connected to every other. Now, in each time step, we add a node that is connected to others by m = 3 links. The new node being one side of the links, the other one is chosen at random from the existing network. The probability of a vertex being selected is proportional to the number of links already attached to it.

To stay in the image: if you already have lots of friends, you are more likely to get new ones. The rich get richer.1

These rules result in a network (figure 2.1d) that is capable of reproducing small-world behavior, as well as being scale-free. Research has found good agreement with

1 "Whoever has will be given more, and he will have an abundance." [31]
Figure 2.2.: Watts-Strogatz network model: We start with a regular lattice (a) formed into a ring (b) and re-wire a small fraction of links to random destinations (c) [18]
empirical networks, including e.g. the world wide web [6]. Good introductions can be found in [32, 33].

Clustering is present to a certain degree, but still much too small compared with experimental values.
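The growth rule above can be sketched compactly; the repeated-endpoint list is a common trick for degree-proportional sampling, and the function name is ours, not from the thesis.

```python
import random

def barabasi_albert(n, m=3, seed=0):
    """Grow a Barabasi-Albert graph: start from m fully connected nodes, then
    attach each new node with m links to targets chosen with probability
    proportional to their current degree."""
    rng = random.Random(seed)
    adj = {i: set(range(m)) - {i} for i in range(m)}
    # each node appears in this list once per incident link, so a uniform
    # choice from it is automatically degree-proportional
    endpoints = [i for i in range(m) for _ in range(m - 1)]
    for new in range(m, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        adj[new] = targets
        for t in targets:
            adj[t].add(new)
            endpoints.extend([t, new])
    return adj
```

Each new node contributes exactly m edges, so a graph of n nodes ends up with m(m-1)/2 + (n - m) m links, all in one connected component.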
2.6. New Approaches

Recently, new network models have been developed to cope with shortcomings encountered with present ones.

Ravasz and Barabási [35] examined networks of a self-similar structure imitating the idea of hierarchical organization in sociology. While combining high clustering and scale-free behavior, their model does not show short path lengths, though.

Klemm and Eguíluz [34]2 developed an auspicious model joining all three demands in one network. The authors present a generalization of the Barabási-Albert model, adding aging of nodes and some random behavior.

Further research remains to be done on this model to verify whether it copes with reality.

2 cf. also [36]
An overview of all models can be found in figure 2.4.

Figure 2.4.: overview of recent network models, classified by high clustering, short average path lengths (small world) and scale-free behavior: regular lattice, random network (Erdős and Rényi [28]), Watts and Strogatz [16], Barabási and Albert [30], Klemm and Eguíluz [34], hierarchical model (Ravasz and Barabási [35])
3. Empirical Collaboration Network
In the context of science, the network with scientists as the nodes of the graph is of particular interest.
3.1. Typology

First, we want to deal with the definition of a collaboration graph. As to the nodes, we have the choice to identify each vertex either with an author or with a paper.

The second possibility is also an area of research [37], but we think studying the relationships of scientists as the papers' authors offers more insight into how research works. So we will make each scientist a node of our network.

As concerns the edges, there are basically two possible choices, both covered equally by the database used [25].
3.1.1. Citation Graph

We might choose to consider citations from one author to another as links [10], thus resulting in a directed graph.

Starting at a given paper, we can enlarge our network by following links recursively up to a certain depth, e.g. by depth-first algorithms. Each new work will cite several to many papers not yet included. Roughly speaking, the number of publications to include will rise exponentially with the maximum depth chosen.

Quickly, we arrive at huge amounts of data. Additionally, there is no canonical end to the hunt for new links. In the extreme case we could be caught in a giant cluster containing all or nearly all of the papers ever published. We see no possibility to narrow this down in a reasonable way without fear of introducing arbitrary boundary conditions.
3.1.2. Collaboration Graph

The second possibility to define the edges of a graph is creating links by co-authorship in one or several papers [7, 8]. If n scientists publish a paper together, they are connected to each other by n(n - 1)/2 edges.

As an additional advantage, we have the choice to start with an arbitrary set of authors, establishing links between them by looking at all papers they are involved in. This will result in a graph of limited size.

Of course, we should think carefully about a reasonable selection, to avoid edge effects. We will discuss this in the next section.
3.2. Building the net

3.2.1. Proceeding

As a solution, we choose the following proceeding: We start with one paper. As one part of our work will be the comparison of real-world data to Barabási-Albert networks, we take the corresponding paper [30] as the center of investigation.
In order to determine the set of authors we want to deal with, we select all 185 papers that cite this paper.1 Secondly, we construct a list of unique authors from all these papers. A first approach delivers 559 scientists, of whom some turned out to be identical but appearing in particular papers with typos. We finish with a set of 555 authors, to whom we attribute consecutive numbers.

The last step of the network creation process consists in establishing links between all these authors. This is done by selecting one paper after the other and introducing connections between each possible pair of this paper's authors (i.e. n(n - 1)/2 links for n authors).

Eventually, this gives us a graph of 555 nodes representing scientific collaboration in the area of Barabási-Albert networks. The network size is relatively small compared to all data in the Science Citation Index [25] (approx. 10^7 papers). Studying properties of this subnet, we hope to get an insight into what leads to the structure observed. Verification with bigger networks remains a task for the future.
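The construction step (one link per co-author pair per paper) can be sketched as below; the input format and function name are our own illustration, not the thesis's actual code.

```python
from itertools import combinations
from collections import defaultdict

def build_collaboration_graph(papers):
    """papers: list of author-name lists, one list per paper.
    Returns (nodes, weights), where weights maps a sorted author pair
    (a, b) to the number of joint papers, i.e. each paper with n authors
    contributes n(n-1)/2 pair increments."""
    weights = defaultdict(int)
    nodes = set()
    for authors in papers:
        unique = sorted(set(authors))   # guard against a name repeated on one paper
        nodes.update(unique)
        for a, b in combinations(unique, 2):
            weights[(a, b)] += 1
    return nodes, dict(weights)
```

Keeping the multiplicity in `weights` preserves the "weighted" view used below, while `len(weights)` counts unique links only.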
3.2.2. Visualization

To get an idea of what we are dealing with, we visualize the graph using a spring model [38, 39]. In order to give manageable results we remove a paper on the Human Genome Project [40] with 274 authors. Brief examination shows that this does no harm, as the scientists participating in that work did not cooperate with others in our graph, and form a big cluster of their own. The result is shown in figure 3.1.

1 We have to be careful not to mix citation data from different dates, as new papers are continuously added to the database. The base of our investigation is October 21st, 2002.
3.3. Analysis

3.3.1. Authors per paper

The first thing we are interested in is the frequency distribution of the number of authors per paper. We expect to see many papers with few authors and vice versa (table 3.1).

    authors  frequency
       1        37
       2        69
       3        47
       4        21
       5         6
       6         4
       7         1

    Table 3.1.: Authors per paper
Figure 3.2.: frequency distribution of the number of authors per paper, with fitted curves ~x^3.58 exp(-x/0.54) and ~x^-3.3

The experimental data (figure 3.2) shows this behavior with one remarkable exception. There are far too few papers written by only one author. This could be due to the fact that collaboration helps in science, though the more authors are involved, the smaller the additional scientific output, as communication overhead increases.

In other words: establishing scientific relationships with other authors is not easy. You have to agree on the field of research, coordinate your efforts etc. Postulating that cooperation with more scientists is always favorable, we can explain the statistics by the difficulty of finding new partners. This difficulty even increases with the number of co-workers you already have, as additional coordination is needed. The risk of research overlap rises, too.
The power law predicted by Lotka [11] with an exponent of 2 cannot be confirmed. This could be due to insufficient statistics for this test. Other recent studies of collaboration networks found an exponent of 2.1 or 2.4 [41], which is another indication of statistical errors predominating in our results.
3.3.2. Connections per author

Next, we study the number of connections per author, which is the number of other scientists an author ever published papers with. This number is weighted by the number of papers, i.e. a coauthor with whom a scientist published n papers contributes n connections (table 3.2).

    links  weighted  unique
      0       16       16
      1       51       67
      2       75       69
      3       61       67
      4       24       27
      5       21       22
      6       10        2
      7        5        6
      8        4        3
      9        2
     10        3
     11        1
     13        2        1
     14        1
     15        1
     17        1        1
     20        2
     29        1
    273      274      274

    Table 3.2.: Connections per author

Figure 3.3.: frequency distribution of the number of connections per author (black: weighted, grey: unique), with fitted power laws x^-2.85 and x^-3.53

Again, we expect to see a frequency decrease with increasing number of connections. The experimental data (figure 3.3) shows smaller frequencies for isolated authors that never publish with others, as well as for authors with only one coauthor. This is comparable to the effect observed in the last graph. The most productive seem to be authors with two or three colleagues they are working with.
Statistical data in the region of highly connected authors shows a truncated power law. The exponent of approximately 2.85 falls well within the region bounded by analyses of other scale-free networks (www: around 2.3 [6, 26, 42]). The sharp or exponential cutoff at very high connection numbers has been reported for other networks, too [7, 43]. Mossa et al. [44] offer an explanation using a model with limited (local) information on the network. Surely, no scientist knows all others, so this could lead to the observed effect.
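An exponent such as the 2.85 quoted above is typically estimated from the slope of the distribution on a log-log plot. A minimal sketch of such a fit (plain least squares on the logarithms; the thesis does not specify its fitting procedure, so this is our own illustration):

```python
import math

def power_law_exponent(frequencies):
    """Least-squares slope of log(frequency) vs. log(k): for P(k) ~ k^(-gamma)
    the fitted slope is -gamma, so we return its negative.
    frequencies: dict degree -> count; zero entries are skipped."""
    pts = [(math.log(k), math.log(f))
           for k, f in frequencies.items() if k > 0 and f > 0]
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return -slope
```

On noisy empirical counts one would restrict the fit to the power-law region and exclude the exponential cutoff at high k.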
3.3.3. Double vs. unique links

We are interested in how things change when we cease weighting connections by the number of papers published together, i.e. we only take into account how many other unique scientists a researcher published papers with. Results can be found in table 3.2.

We see that despite the different numbers, the results are qualitatively the same. Scientists working together with two other authors are the most productive.

Depending on whether your glass is half full or half empty, there are two contrary explanations:

1. It is common practice to name persons as authors of your work who did not contribute to it, out of a feeling of debt, be they sponsors or others.

2. Science lives from cooperation. Working together on one subject increases scientific output whilst reducing errors.

The author of this paper will not judge.
3.3.4. Cluster sizes

    size  frequency
      1      16
      2      28
      3      15
      4      13
      5       2
      6       3
      7       2
      8       2
      9       2
     10       1
     26       1
    274       1

    Table 3.3.: Cluster sizes

Our last focus is on the subnets of science that exist in our net of collaboration. Authors group into several clusters by the connections established between them. We investigate the frequency of clusters of a given size. Our expectation is a frequency increase for growing cluster sizes up to a peak, and then a decay as clusters grow very big, comparable to the statistics we saw already.
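The cluster sizes themselves are the sizes of the graph's connected components, which a breadth-first traversal yields directly. A minimal sketch under our own naming (the thesis does not state which algorithm it used):

```python
from collections import deque

def cluster_sizes(adj):
    """Sizes of the connected components of an undirected graph, found by
    BFS from each unvisited node; isolated nodes count as size 1."""
    seen = set()
    sizes = []
    for start in adj:
        if start in seen:
            continue
        queue = deque([start])
        seen.add(start)
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        sizes.append(size)
    return sorted(sizes)
```

Counting how often each size occurs in the returned list gives exactly the frequency column of table 3.3.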
The experimental data (table 3.3, figure 3.5) shows this behavior, but with one surprise: although the most frequent cluster size is 2, due to a big number of publications with two authors, most scientists maintain collaborations with three others.

Figure 3.5.: frequency distribution of the cluster size, with fitted power law x^-2.61
This can only be due to scientists being members of research groups involved in different themes, thus connecting different clusters formed by single papers. This can directly be verified in the graphical representation (figure 3.4) of the clusters, ordered by size.
3.4. Comparison

We have collected some statistical figures to express the structure of the network in question. Now we want to find out whether classical or current network models bear similar results.

3.4.1. Erdős-Rényi random graphs

Connections per author. In a random graph the distribution is a binomial one, i.e. the probability of a node having k connections in a net with N nodes is

    P(k) = (N-1 choose k) p^k (1-p)^(N-1-k).

In the limit of large N this approaches a Poisson distribution around the expectation value ⟨k⟩ = pN:

    P(k) = e^(-⟨k⟩) ⟨k⟩^k / k!.
This is contrary to the power-law statistics observed in the collaboration network.

Cluster sizes. For random graphs, percolation theory predicts that the cluster size distribution shows an exponential decay for big cluster sizes [2, 45].

Again, the considered network shows a power-law decay rather than an exponential one.

Unsurprisingly, the structure of scientific collaboration differs fundamentally from that of a random network.
3.4.2. Watts-Strogatz small-world networks

Connections per author. The degree distribution of Watts-Strogatz small-world networks is similar to that of a random graph [2]. It has a peak and decays exponentially for large connection numbers, contrary to the collaboration network.

Cluster sizes. The usual case in Watts-Strogatz networks is re-wiring of only a small portion of links. Thus, the network stays well connected, mostly forming one giant cluster. This differs from the network of scientific collaboration as well.
3.4.3. Barabási-Albert networks

Connections per author. Barabási-Albert networks show a vertex degree distribution P(k) ~ k^-3 [30, 46]. In our science collaboration network we found exponents of 2.85 and 3.53 respectively, which is only a slight deviation.

Cluster sizes. In Barabási-Albert networks, new sites are added with links to already existing nodes. Consequently, only one giant cluster forms. Obviously this differs crucially from the net of co-authorship.
4. Spin models
4.1. Leadership effect

4.1.1. Ising model

In 1925, Ising [47] published a paper on a model of spin interaction that later became very famous. The idea for it had been given to him by his teacher Lenz [48], so it is sometimes referred to as the Lenz-Ising model.1

The idea is to consider spins (e.g. on a square lattice) and an interaction Hamiltonian

    H = - sum_{i != j} J_{i,j} S_i S_j,

where the S_i are the spins and J_{i,j} is a matrix describing the interaction forces. Usually we consider the case

    J_{i,j} = J if i, j are nearest neighbors, 0 else,

i.e. we only allow equal interaction between nearest neighbors (J >= 0 for ferromagnetic behavior).

In the following chapter, we investigate how such a model behaves on our constructed collaboration network.

1 A generalization of the Ising model is the Potts model [49, 50]. Instead of Ising spins with two possible states +1 and -1, Potts allows k >= 2 different spin values. The Hamiltonian is

    H = -J sum_{i,j} delta(sigma_i, sigma_j),

i, j being nearest neighbors.
Applying it to the scientific collaboration network, we find results very similar to those of Ising's model.

Figure 4.1.: Ising [47]

We use a Metropolis [51] Ising model, i.e. probabilities for a single spin flip of p = e^(-ΔE/kBT) if ΔE > 0, and 1 otherwise. To determine ΔE we sum up the spins of all vertices connected to a given node. In principle, we have the choice between two proceedings:

- consider only unique links between two nodes
- count a connection several times according to the number of links, i.e. the number of papers the corresponding authors published together.

Both possibilities have been examined.2

2 Change the switch NODOUBLE in line 17 of the source code.
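One Metropolis sweep over an arbitrary graph can be sketched as follows (a minimal illustration with our own function name, in units with kB = 1; the unique-link variant simply uses an adjacency structure without multiplicities).

```python
import math
import random

def metropolis_sweep(adj, spins, T, rng, J=1.0):
    """One Metropolis sweep: attempt to flip every spin once.
    A flip is accepted with probability exp(-dE/T) if dE > 0,
    and always accepted otherwise (units with k_B = 1)."""
    for i in adj:
        field = sum(spins[j] for j in adj[i])   # sum over connected spins
        dE = 2.0 * J * spins[i] * field          # energy cost of flipping spin i
        if dE <= 0 or rng.random() < math.exp(-dE / T):
            spins[i] = -spins[i]
```

At low temperature a fully aligned cluster stays aligned, while a single opposing spin is flipped back immediately, which is the behavior underlying the phase-transition measurements below.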
4.1.2. Phase transition

The results of both experiments show qualitative similarity. We observe a rounded phase transition at about kBT/J = 0.8, opposed to a value of about 2.3 on the regular square lattice.

A closer look reveals that the decay of magnetization with rising temperature is exponential. This result corresponds to research on Ising models on Barabási-Albert networks by Aleksiejuk, Hołyst, and Stauffer [52], who also found an exponential law.

Anyhow, the critical temperatures found by me and by Aleksiejuk et al. [52] differ by more than one order of magnitude. This can easily be explained by the different coordination numbers in the two networks. The collaboration graph holds a maximum of 14 neighbors of a single vertex; the graph of Aleksiejuk et al. [52] exceeds this by several orders of magnitude, making it far more difficult to break the ferromagnetic bonds and resulting in a higher critical temperature.
4.1.3. Degree distribution

Dorogovtsev et al. [53] studied random graphs with a given degree distribution P(k) of a vertex of degree k. They deduced an estimate for the critical temperature of an Ising model on such networks as

    J / (kB Tc) = (1/2) ln( ⟨k²⟩ / (⟨k²⟩ - 2⟨k⟩) ).

Considering the collaboration network as a random graph with the given degree distribution (table 4.1), we use their formula and the code in section B.1 and get Tc/J = 2.91, counting only unique links between scientists. Using all links, we find Tc/J = 5.46.

    degree  several  unique
       0       16      16
       1       51      67
       2       75      69
       3       61      67
       4       24      27
       5       21      22
       6       10       2
       7        5       6
       8        4       3
       9        2
      10        3
      11        1
      13        2       1
      14        1
      15        1
      17        1       1
      20        2
      29        1

    Table 4.1.: Degree distribution

Both values are far from the critical temperatures observed in simulation. This is a strong clue towards the statement that collaboration networks are crucially different from random networks, even with the same degree distribution.
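The formula translates directly into code; the following is a possible implementation in the spirit of section B.1 (function name and input format are our own), computing the two moments from a degree histogram.

```python
import math

def critical_temperature(degree_counts):
    """Tc/J from the Dorogovtsev et al. estimate
        J/(kB Tc) = (1/2) ln( <k^2> / (<k^2> - 2<k>) )
    for a random graph with the given degree distribution.
    degree_counts: dict degree -> number of nodes with that degree."""
    n = sum(degree_counts.values())
    k1 = sum(k * c for k, c in degree_counts.items()) / n        # <k>
    k2 = sum(k * k * c for k, c in degree_counts.items()) / n    # <k^2>
    return 2.0 / math.log(k2 / (k2 - 2.0 * k1))
```

For a graph where every node has degree 4, for instance, the estimate reduces to Tc/J = 2/ln(2) ≈ 2.885, close to the mean-field value for that coordination number.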
4.1.4. Spin flip model

Following a suggestion of Hołyst3, we can determine the importance of the most connected authors of our collaboration network by successively flipping the most connected spins and pinning them in their new position.

In other words: After some time of equilibration, we choose the author who has the most connections to others, and change his/her spin permanently to a value of -1, opposite to all others (at T = 0, or to nearly all others otherwise). Subsequently, we allow the system to relax for some time, after which we permanently flip the second most connected spin, and so on.

3 personal correspondence, cf. [52]
Figure 4.2.: Ising model with successive spin flips. After 10^5 steps of equilibration, we flip the most connected spin and stick it to its new value. After a relaxation of 10^4 steps, this step is repeated. Both the network with multiple links and the network with unique links are used. Averaged over 1000 runs. T = 0.2.
Results are shown in figure 4.2. We observe two things:

1. Even after switching the 20 most connected spins, the system does not flip into the opposite state with all spins pointing down. In simulations of Aleksiejuk et al. [52], fewer than 6 spins were enough to flip a whole network of 30,000 nodes.

   This is quite obviously due to the fact that we don't have a contiguous graph, but one consisting of different clusters. A spin flip in one cluster is not able to affect spins in others.

   In the picture of spins representing opinions (yes/no, etc. [54]), this means a few authors with views differing from the broad mass of scientists are hardly capable of changing the global opinion, may they even be the most connected (known) ones.

2. Allowing multiple links in our net, we expect the magnetization to break down much faster, as the flip of a spin is able to influence others in a stronger way.

   Yet, the simulation shows contrary results. The graph containing only unique links shows a much steeper decay of magnetization (figure 4.2).

   A possible explanation is the fact that choosing the most connected spins in a network having only unique links picks authors with connections to many other authors, whereas in a network allowing multiple links, there are also spins strongly connected to only a few others.

   It seems that in order to spread new opinions, it is more advantageous to have a small influence on many other people than a big impact on only a few.
4.2. Cluster limited Ising models

The network we are looking at consists of many distinct clusters of different sizes (figure 3.4). We may ask if they differ regarding their properties, or if they behave alike.

4.2.1. Proceeding

We split the network up into sub-nets, each containing one cluster, numbered sequentially 2, 3a, 3b, ... 26 (numbers and letters as in figure 3.4).

On each net, we run an Ising model for temperatures from 0.1 to 6.9 in steps of 0.2. Each of these simulations runs for 10^6 steps, with the magnetization measured every 100 steps, thus resulting in 10^4 measurements per run, to give good statistics.
4.2.2. Results

Figure 4.3.: Ising model on the different clusters of a collaboration net (M/N vs. kT/J), averaged over 10^4 measurements per temperature and net.
Examining the results (figure 4.3), we see different curves that all decay with rising temperature, but show no apparent similarities. We wonder why the curves seemingly do not converge to zero but to finite values.

Obviously, the network is so small that macroscopic magnetization flips occur frequently, even at moderate temperatures. That means the expectation value of the magnetization4 at high temperatures is not zero, but something around one!
4.2.3. Bias adjustment

To validate this hypothesis, we simulate the networks at very high temperature (kBT/J = 50), in order to determine M∞ = M(T = ∞) (table 4.2).

    net   M∞
    2    1.03
    3a   1.71
    3b   1.51
    4a   1.59
    4b   1.54
    4c   1.55
    5a   2.02
    5b   1.96
    6a   2.00
    6b   1.94
    7a   2.27
    7b   2.23
    8a   2.26
    8b   2.29
    9a   2.55
    9b   2.55
    10   2.54
    26   4.25

    Table 4.2.: Bias

4 Throughout this publication we consider the (unsigned) value |M| as the magnetization, not M! Doing the latter leads to false results. E.g. at low temperatures, averaging M over very long times would give zero, as flips of the whole system occur (though at very low probabilities).
Using these values, we rescale our simulation results from figure 4.3 by use of the scaling function5

    M/N  →  (M - M∞) / (N - M∞),

where N is the total number of authors in the cluster (figure 4.4).
Figure 4.4.: Same data as in figure 4.3, i.e. (M - M∞)/(N - M∞) vs. kT/J with one curve per cluster, but adjusted to a common scale from 0 to 1 by eliminating the bias from random fluctuations.
We find a much cleaner image. Apart from one exception (6b), all curves are parallel up to the M = 0.5 line, and even beyond there are very few crossings.

Each cluster can now be characterized by the temperature ϑ at which it reaches M = 0.5. This leads to a sort of melting temperature (table 4.3).

    net   ϑ
    2    1.7
    3a   2.8
    3b   1.5
    4a   3.3
    4b   2.1
    4c   2.6
    5a   4.0
    5b   2.2
    6a   4.8
    6b   3.5
    7a   2.8
    7b   1.8
    8a   2.5
    8b   3.7
    9a   2.2
    9b   3.1
    10   1.9
    26   3.8

    Table 4.3.: Melting temperatures

To get an idea of this temperature's meaning, we sort the clusters' graphical representations by ϑ (figure 4.5).

It seems plausible that ϑ is a measure of the coherence or connectedness of a cluster. Single bonds lead to lower melting temperatures, fully connected subsets to higher ones.

This offers a possible explanation why 6b shows a behavior different from all others in figure 4.4: this cluster consists of two parts. One of them is a completely connected set of five nodes, the other one a single vertex. Both are linked by one single edge. Probably, this conflict of interest leads to the observed anomaly.

Additionally, the connectedness described by ϑ seems to be crucially different from the classification given by the standard clustering coefficient. E.g. the net 3a, consisting of three completely connected vertices, yields a clustering coefficient of 1 but a low melting temperature.

5 This function is a linear approximation that gives 1 for M → N and zero for M → M∞.
4.2.4. Linear Relationship

Surprisingly, we find a linear relationship between N and E/ϑ (figure 4.6). Thus, we postulate

    E/ϑ = aN − b

and conclude a formula for ϑ:

    ϑ_calc(E, N) = E / (aN − b).

Fitting the parameters to our measurements (yielding a = 0.72, b = 0.89), this gives a good prediction of the melting temperatures. The results can be seen in figure 4.7, together with a diagram showing that the errors stay below 10% in most cases.
    ϑ_calc = E / (aN − b)  →  ⟨k⟩ / (2a)   for N → ∞.

We see that, in the limit of high N, the melting temperature ϑ is proportional to the average number of edges per site ⟨k⟩ = 2E/N, a result known from mean-field theory.
5. Barabási-Albert network models

5.1. Modified Barabási-Albert model

The network model of Barabási and Albert [30] was introduced in section 2.5. We pointed out that it fits empirical networks rather well, but lacks support for disjoint ones, as the algorithm only delivers one giant cluster.
Thus, to cope with networks consisting of several components, we must modify the model. We choose a very simple approach: in each step of adding nodes, we start a new cluster of m0 = 3 nodes with probability p. Vertices added in consecutive time steps can connect to any node in any component, respecting the same probability rule as in the standard model.

In the case m = 1, components can only grow (isolated clusters), whereas in the case m > 1, new nodes are able to connect two or more existing components of the network (merging clusters).
5.2. Simulation

In order to compare simulation results with real-world data from a scientific collaboration network [55], we let the network grow up to the size of 555 nodes. For proper statistics, this is repeated 10^4 times.
5.2.1. Isolated clusters

scale-free behavior In the case m = 1, i.e. considering isolated clusters, we can be sure to get scale-free behavior within the distinct clusters, as the probabilities for attachment of a new node to an existing one are the same as in a single Barabási-Albert network (modulo a constant factor due to a new node having the choice between different clusters to connect to).

However, the complete network is a priori not necessarily scale-free, as the total statistics is the sum over all scale-free subnetworks or clusters. So, we have to focus later on the question whether scale-free behavior prevails.
cluster size distribution Next, we examine the number of clusters of different sizes (figure 5.1). We find that a high probability of starting a new net leads to many smaller networks, whereas low values privilege bigger networks. Yet, we make an interesting observation: low probabilities lead to a cluster-size distribution that is no longer monotonic, but favors big networks.

Looking at figure 5.1, which shows the number of nodes in clusters of a given size instead of the sheer cluster count, makes this more plausible.

For p = 0 we will see a graph ∝ δ(555), as there is only one giant cluster;
[Figure 5.1: two log-log plots, frequency resp. node count vs. cluster size, for p = 0.01, 0.04, 0.1, 0.4, 0.8.]
Figure 5.1.: Frequency of clusters (left) resp. number of nodes in clusters of a given size (right) vs. cluster size at different probabilities for a new net. The simulation was run 10^4 times with a network growing up to 555 nodes. The curve for p = 0.01 is the one with the rightmost peak; to the left follow the other p-values in descending order.
for p = 1 a graph ∝ δ(m0 = 3), because there are only embryonic subnets. What we observe for 0 < p < 1 is the transition between both extremes.
[Figure 5.2: semi-logarithmic plot, exponent of the power-law region vs. probability for a new net, with the fitted line e^{2.25p}.]
Figure 5.2.: Negative exponent of the power-law part of the curves in figure 5.1 vs. probability p for a new net. The line corresponds to exponent = e^{2.25p}.
power law region For all p, we start with a power-law region regarding the distribution of small and medium cluster sizes. The exponent varies with the network-birth probability p. The semi-logarithmic plot in figure 5.2 shows that the exponential relation exponent = e^{2.25p} describes our data rather well. Of course, this formula cannot be true for general p, as for p → 1 we expect the exponent to diverge!
Regarding the empirical data from section 3.2's collaboration graph, we find good agreement (figure 5.3). Interestingly, the model is even able to explain facts formerly regarded as statistical anomalies, such as the observation of a giant cluster of a size largely exceeding all others in the network (section 3.2.2).
5.2.2. Merging clusters

Now, we modify the model by examining m > 1. In this case, newly added vertices develop several links to existing nodes (and thus to existing clusters), being able to connect hitherto separated networks. In this paper, we limit our considerations to the standard Barabási-Albert case m = m0 = 3.

[Figure 5.3: frequency vs. total authors in clusters of a given size, comparing simulation (p = 4%) and the collaboration network.]
Figure 5.3.: Semi-logarithmic plot comparing the simulation with p = 0.04 using the isolated clusters model of figure 5.1 and statistical data from a science collaboration network (section 3.2).

Using different p, we quickly recognize that low and medium probabilities make the simulations nearly always end up with a single giant cluster containing all vertices. Points of interest are higher p in the region of 60–90%.
cluster size distribution Again, we plot the total number of nodes contained in clusters of a given size (figure 5.4). For small cluster sizes, we observe a non-uniform behavior of the graph. The explanation is as follows: newly born clusters have a size of m0 = 3 and thus appear very often. Clusters of size 4 or 7 are also very probable, whereas a cluster of size 5 is very rare, because it can only be formed by a new cluster to which two new nodes have connected without gluing it to a second cluster.

In a semi-logarithmic plot (figure 5.4) we find a parabolic dependence for high cluster sizes (i.e. a Gaussian distribution around a mean depending on p). Apparently, the merging clusters model cannot cope with reality.
[Figure 5.5: two plots of frequency vs. degree for p = 60% and p = 80%.]
Figure 5.5.: Frequency of nodes with a certain degree. The simulation was run 10^4 times with networks growing up to 555 nodes. The left plot is doubly logarithmic, the right plot semi-logarithmic. m = 1 (isolated clusters).
[Figure 5.6: two plots of frequency vs. degree for p = 1% and p = 40%.]
Figure 5.6.: Same plots as figure 5.5 using m = 3 (merging clusters).
5.2.3. Scale-free behavior

In figure 5.5 we can see that there is no pure scale-free behavior. There seems to be power-law behavior for small degrees and an exponential cutoff at higher values (figure 5.5). Similar results have been observed by Newman [7] for collaboration networks.

One could argue that this effect is due to the fact that we do not plot the degree distribution for single clusters, but for the whole set of them. This objection only counts at first sight, though: at p = 80% we have several small clusters but virtually only one giant cluster dominating the degree distribution at high degrees. So the averaging over many differently sized clusters should manifest mainly in the area of small degrees, contrary to our observations.
Mossa et al. [44] offer a possible explanation for the exponential cutoff encountered. They use a model which attributes to each node only a restricted knowledge of the network, i.e. a vertex is not able to consider the whole graph's structure, but only a subset according to its limited view.

[Figure 5.4: node count vs. cluster size for p = 70%, 80%, 90%.]
Figure 5.4.: Number of nodes in clusters of a given size vs. cluster size at different probabilities for a new net. The simulation was run 10^5 times with a network growing up to 555 nodes.
m    p     ln(N0)   k
1    1%    16.8     2.27   60.
1    40%   17.1     2.45    9.2
3    60%   18.1     3.1     5.1
3    80%   20.1     4.8     3.5

Table 5.1.: Coefficients for the fits
6. Conclusion

Dealing with real-world networks, scientists found three properties predominating:

- short average path lengths (Small World Effect),
- scale-free behavior,
- high clustering.

Different models were developed to cope with this challenge, each having different advantages and disadvantages. The model of Barabási and Albert [30] is a promising one, but lacks support for discontiguous networks.

We constructed a network of co-authorship with 555 authors. Only scientists were chosen that cite a specific paper [30]. The resulting net shows scale-free characteristics but differs substantially from accepted computer models' results.
Simulating Ising models on the network reveals strong robustness against disturbances (spin flip experiment/leadership effect) and shows coherence with mean-field theory. We find the critical temperature of subnets of our graph to be proportional to the average number of edges per site, in the limit of a high node count.
In order to overcome the mentioned disadvantages of the Barabási-Albert model, we developed a modified version, allowing the formation of multiple clusters. We saw a strong dependence of a node's edge count on the network structure, separating two cases: isolated clusters and merging clusters.

Only the isolated clusters variant proves capable of fitting reality. Comparison with statistics from our collaboration net shows similar behavior and is even able to explain facts at first regarded as statistical anomalies, such as the observation of a giant cluster of a size largely exceeding all others in the network. Even the exponential cutoff of nodes with high degrees, as encountered empirically, is reproduced.

Re-evaluating the model with a higher number of authors would lead to better statistics and greater reliability.
A. Acknowledgements

I would like to thank D. Stauffer¹ for giving me the idea of the subject and supporting me by comments and discussions during my research.

Thanks to M. Abd-Elmeguid² for co-judging this work.

My thanks for writing excellent software goes to the authors of

- LaTeX, BibTeX, pdfLaTeX, dvips
- LaTeX packages: KOMA-Script, natbib, custom-bib, graphicx, listings, units, hyperref, hypernat, colortbl, color
- SciTE
- gnuplot
- Ruby, C++, Perl, bash
- dotty, neato [39]

Thanks to A. Sindermann for supporting my work by addressing computer network problems.

Special thanks to K. Godthardt for supporting me.

¹Institute of Theoretical Physics, University of Cologne
²II. Institute of Experimental Physics, University of Cologne
B. Source code

B.1. Ising model

This C++ program simulates an Ising model on a given graph. In section 4.1.1 it was used on our collaboration network.
// This program reads network data from a file and simulates
// an Ising model on this graph.
// Metropolis probabilities are used.
//
// Felix Puetsch <fxp@thp.uni-koeln.de>, 2003-01-22

#include <iostream>
#include <fstream>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <assert.h>

#define MAX_INT  2147483647
#define MAX_CONN 30
#define MAX_NODE 600
#define NODOUBLE 1
// #define NOASSERT

using namespace std;

// === random number generator ================================

class Random {
 private:
  int state;
 public:
  Random(int seed);
  int get() { return state *= 65539; } // 16807
};

Random::Random(int seed) {
  assert(seed % 2 == 1);
  state = seed;
}

// === Vertex =================================================

class Ising;

class Vertex {
 private:
  int number, conn_count, spin;
  Vertex *neighbour[MAX_CONN];
  Ising *ising;
 public:
  Vertex(Ising *is, int nr);
  ~Vertex();
  int  getSpin() { return spin; }
  void setSpin(int s) { spin = s; }
  void addConn(Vertex *to, int nodouble = 0);
  int  getNumber() { return number; }
  int  getConnCount() { return conn_count; }
  Vertex *getConn(int i);
  int  simulateStep();
};

// === Ising ==================================================

class Ising {
 public:
  Random *rnd;
  int en_limit[2*MAX_CONN+1];
 private:
  ifstream net;
  char buffer[80];
  int v_count;
  Vertex *v_list[MAX_NODE];
 public:
  Ising(char *fname);
  void build_net();
  void debug(int nr);
  void reset();
  void simulate(double kT, int maxtime=1, int steptime=0);
};

// === Vertex =================================================
// ... Struktors .............................................

Vertex::Vertex(Ising *is, int nr) {
  conn_count = 0;
  ising = is;
  number = nr;
}

Vertex::~Vertex() {
  cout << "~Vertex" << endl;
}

void Vertex::addConn(Vertex *to, int nodouble) {
  if (nodouble)
    for (int i = 0; i < conn_count; i++)
      if (neighbour[i] == to) return;
  neighbour[conn_count++] = to;
  assert(conn_count < MAX_CONN);
}

Vertex *Vertex::getConn(int i) {
  assert(i < conn_count);
  return neighbour[i];
}

int Vertex::simulateStep() {
  int spinsum = 0;
  for (int i = 0; i < conn_count; i++)
    spinsum += neighbour[i]->getSpin();
  spinsum *= spin;
  if (ising->rnd->get() < ising->en_limit[spinsum+MAX_CONN])
    spin *= -1;
  return spin;
}

// === Ising ==================================================
// ... Struktors .............................................

Ising::Ising(char *fname) {
  cout << "Ising" << endl;
  rnd = new Random(1);
  net.open(fname);
  if (!net.is_open()) {
    cerr << "input file not found" << endl;
    exit(1);
  }
  build_net();
  for (float kT = 5.0; kT >= 0.1; kT -= 0.2) {
    cerr << "starting simulation with kT=" << kT << endl;
    reset();
    simulate(kT, 1000000, 100);
  }
}

// ... Methods ...............................................

void Ising::build_net() {
  cout << "build_net" << endl;
  for (int i = 0; i < MAX_NODE; i++) {
    v_list[i] = new Vertex(this, i);
    v_list[i]->setSpin(1);
  }
  int from, to;
  while (!net.eof()) {
    net.getline(buffer, 80);
    if (sscanf(buffer, "%i %i ;", &from, &to) != 2) {
      cerr << "input line ignored: '" << buffer << "'" << endl;
      continue;
    }
    assert(from < MAX_NODE); assert(to < MAX_NODE);
    v_list[from]->addConn(v_list[to], NODOUBLE);
    v_list[to]->addConn(v_list[from], NODOUBLE);
  }
  net.close();
  v_count = 0;
  for (int i = 1; i < MAX_NODE; i++)
    if (v_list[i]->getConnCount() > 0) v_count = i;
    else v_list[i]->setSpin(0);
  cerr << v_count << " nodes" << endl;
}

void Ising::debug(int nr) {
  Vertex *node = v_list[nr];
  int nb = node->getConnCount();
  cout << "node " << nr << " has " << nb
       << " connections:" << endl;
  for (int i = 0; i < nb; i++)
    cout << node->getConn(i)->getNumber() << endl;
}

void Ising::reset() {
  for (int i = 0; i < MAX_NODE; i++)
    v_list[i]->setSpin(abs(v_list[i]->getSpin()));
}

// ... Simulation .............................................

void Ising::simulate(double kT, int maxtime, int steptime) {
  if (!steptime) steptime = maxtime;
  for (int i = -MAX_CONN; i <= MAX_CONN; i++)
    en_limit[i+MAX_CONN] =
      (int) (MAX_INT * (2*exp(-2.*i/kT) - 1));
  int mag, time = 0;
  while (time < maxtime) {
    for (int step = 0; step < steptime; step++) {
      mag = 0;
      for (int nr = 1; nr <= v_count; nr++) {
        assert(v_list[nr] != NULL);
        mag += v_list[nr]->simulateStep();
      }
    }
    time += steptime;
    cout << time << " " << kT << " " << abs(mag) << " " << mag << endl;
  }
}

// === main ===================================================

int main(int argc, char *argv[]) {
  assert(cerr << "debug mode on" << endl);
  assert(argc == 2);
  Ising ising(argv[1]);
  return 0;
}
B.2. Spin flip model

This C++ program simulates an Ising model on a given graph. In regular time intervals, the most connected spins (hard coded) are pinned to an up position. In section 4.1.4 this was used on our collaboration network.
// This program reads network data from a file and simulates
// an Ising model on this graph.
// Additionally, every ... time steps the next most connected
// spin is flipped permanently.
// Metropolis probabilities are used.
//
// Felix Puetsch <fxp@thp.uni-koeln.de>, 2003-02-05

#include <iostream>
#include <fstream>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <assert.h>

#define MAX_INT  2147483647
#define MAX_CONN 40
#define MAX_NODE 600
#define NODOUBLE 0
// #define NOASSERT

using namespace std;

// === random number generator ================================

class Random {
 private:
  int state;
 public:
  Random(int seed);
  int get() { return state *= 65539; } // 16807
};

Random::Random(int seed) {
  assert(seed % 2 == 1);
  state = seed;
}

// === Vertex =================================================

class Ising;

class Vertex {
 private:
  int number, conn_count, spin, sticky;
  Vertex *neighbour[MAX_CONN];
  Ising *ising;
 public:
  Vertex(Ising *is, int nr);
  ~Vertex();
  int  getSpin() { return spin; }
  void stickIt();
  void setSpin(int s) { spin = s; }
  void addConn(Vertex *to, int nodouble = 0);
  int  getNumber() { return number; }
  int  getConnCount() { return conn_count; }
  Vertex *getConn(int i);
  int  simulateStep();
};

// === Ising ==================================================

class Ising {
 public:
  Random *rnd;
  int en_limit[2*MAX_CONN+1];
 private:
  ifstream net;
  char buffer[80];
  int v_count;
  Vertex *v_list[MAX_NODE];
 public:
  Ising(char *fname);
  void build_net();
  void debug(int nr);
  void reset();
  void simulate(double kT, int maxtime=1, int steptime=0, int starttime=0);
};

// === Vertex =================================================
// ... Struktors .............................................

Vertex::Vertex(Ising *is, int nr) {
  conn_count = 0;
  sticky = 0;
  ising = is;
  number = nr;
}

Vertex::~Vertex() {
  cout << "~Vertex" << endl;
}

void Vertex::addConn(Vertex *to, int nodouble) {
  if (nodouble)
    for (int i = 0; i < conn_count; i++)
      if (neighbour[i] == to) return;
  neighbour[conn_count++] = to;
  assert(conn_count < MAX_CONN);
}

Vertex *Vertex::getConn(int i) {
  assert(i < conn_count);
  return neighbour[i];
}

void Vertex::stickIt() {
  sticky = 1;
  spin = 1;
  cerr << number << " sticked." << endl;
}

int Vertex::simulateStep() {
  if (sticky != 0) return spin;
  int spinsum = 0;
  for (int i = 0; i < conn_count; i++)
    spinsum += neighbour[i]->getSpin();
  spinsum *= spin;
  if (ising->rnd->get() < ising->en_limit[spinsum+MAX_CONN])
    spin *= -1;
  return spin;
}

// === Ising ==================================================
// ... Struktors .............................................

Ising::Ising(char *fname) {
  cout << "Ising" << endl;
  rnd = new Random(1);
  net.open(fname);
  if (!net.is_open()) {
    cerr << "input file not found" << endl;
    exit(1);
  }
  build_net();
  double kT = 0.2;
  for (int i = 1; i <= 100; i++) {
    cerr << "starting simulation with kT=" << kT << endl;
    reset();
    // the following is not beautiful, but quick :-)
    simulate(kT, 100000, 1000);
    v_list[27]->stickIt();
    simulate(kT, 10000, 1000, 100000);
    assert(v_list[27]->getSpin() == 1);
    v_list[329]->stickIt();
    simulate(kT, 10000, 1000, 110000);
    v_list[116]->stickIt();
    simulate(kT, 10000, 1000, 120000);
    v_list[223]->stickIt();
    simulate(kT, 10000, 1000, 130000);
    v_list[251]->stickIt();
    simulate(kT, 10000, 1000, 140000);
    v_list[237]->stickIt();
    simulate(kT, 10000, 1000, 150000);
    v_list[491]->stickIt();
    simulate(kT, 10000, 1000, 160000);
    v_list[365]->stickIt();
    simulate(kT, 10000, 1000, 170000);
    v_list[7]->stickIt();
    simulate(kT, 10000, 1000, 180000);
    v_list[418]->stickIt();
    simulate(kT, 10000, 1000, 190000);
    v_list[381]->stickIt();
    simulate(kT, 10000, 1000, 200000);
    v_list[199]->stickIt();
    simulate(kT, 10000, 1000, 210000);
    v_list[398]->stickIt();
    simulate(kT, 10000, 1000, 220000);
    v_list[15]->stickIt();
    simulate(kT, 10000, 1000, 230000);
    v_list[492]->stickIt();
    simulate(kT, 10000, 1000, 240000);
    v_list[461]->stickIt();
    simulate(kT, 10000, 1000, 250000);
    v_list[371]->stickIt();
    simulate(kT, 10000, 1000, 260000);
    v_list[249]->stickIt();
    simulate(kT, 10000, 1000, 270000);
    v_list[80]->stickIt();
    simulate(kT, 10000, 1000, 280000);
    v_list[486]->stickIt();
    simulate(kT, 10000, 1000, 290000);
  }
}

// ... Methoden ...............................................

void Ising::build_net() {
  cout << "build_net" << endl;
  for (int i = 0; i < MAX_NODE; i++) {
    v_list[i] = new Vertex(this, i);
    v_list[i]->setSpin(1);
  }
  int from, to;
  while (!net.eof()) {
    net.getline(buffer, 80);
    if (sscanf(buffer, "%i %i ;", &from, &to) != 2) {
      cerr << "input line ignored: '" << buffer << "'" << endl;
      continue;
    }
    assert(from < MAX_NODE); assert(to < MAX_NODE);
    v_list[from]->addConn(v_list[to], NODOUBLE);
    v_list[to]->addConn(v_list[from], NODOUBLE);
  }
  net.close();
  v_count = 0;
  for (int i = 1; i < MAX_NODE; i++)
    if (v_list[i]->getConnCount() > 0) v_count = i;
    else v_list[i]->setSpin(0);
  cerr << v_count << " nodes" << endl;
}

void Ising::debug(int nr) {
  Vertex *node = v_list[nr];
  int nb = node->getConnCount();
  cout << "node " << nr << " has " << nb
       << " connections:" << endl;
  for (int i = 0; i < nb; i++)
    cout << node->getConn(i)->getNumber() << endl;
}

void Ising::reset() {
  for (int i = 0; i < MAX_NODE; i++)
    v_list[i]->setSpin(abs(v_list[i]->getSpin()));
}

// ... Simulation .............................................

void Ising::simulate(double kT, int maxtime, int steptime, int starttime) {
  if (!steptime) steptime = maxtime;
  for (int i = -MAX_CONN; i <= MAX_CONN; i++)
    en_limit[i+MAX_CONN] =
      (int) (MAX_INT * (2*exp(-2.*i/kT) - 1));
  int mag, time = starttime;
  while (time < starttime + maxtime) {
    for (int step = 0; step < steptime; step++) {
      mag = 0;
      for (int nr = 1; nr <= v_count; nr++) {
        assert(v_list[nr] != NULL);
        mag += v_list[nr]->simulateStep();
      }
    }
    time += steptime;
    cout << time << " " << kT << " " << mag << endl;
  }
}

// === main ===================================================

int main(int argc, char *argv[]) {
  assert(cerr << "debug mode on" << endl);
  Ising ising("net.txt");
  return 0;
}
B.3. Modified Barabási-Albert model

This Ruby program grows a modified Barabási-Albert network up to its final size. In section 5 this was used on our collaboration network.
#!/home/fxp/bin/ruby -w

$P = 0.8
$RUNS = 1E0.to_i
$M = 1
$cluster_initial_size = 3

$MAXINT = 2147483647*2+1

class Random
  @@ibm = 1
  def rnd(max=nil)
    @@ibm *= 65539  # 16807 # 65539
    @@ibm &= $MAXINT
    max ? @@ibm*max/$MAXINT : @@ibm
  end
end

class Vertex
  attr_reader :connections
  attr_accessor :nr
  @@total_vertices = 0
  def initialize
    @connections = []
    @@total_vertices += 1
    @nr = @@total_vertices
  end
  def inspect
    "I'm node nr. #{@nr} connected to #{@connections.collect{|c| c.nr}.join(',')}."
  end
  def add_link(partner)
    @connections << partner
  end
  def connect(partner)
    add_link(partner)
    partner.add_link(self)
  end
end

class BA_net
  def initialize(prob_new=0)
    @p_new = (prob_new * $MAXINT).to_i
    @nodes = []
    @kertesz = []
    @r = Random.new
    100.times { @r.rnd }
    start_new_net
  end
  def size
    @nodes.length
  end
  def start_new_net
    max = @nodes.length
    $cluster_initial_size.times { @nodes << Vertex.new }
    $cluster_initial_size.times { |i|
      orig = i + max
      dest = ((i+1) % $cluster_initial_size) + max
      @nodes[orig].connect(@nodes[dest])
      @kertesz << orig << dest
    }
  end
  def inspect
    @nodes.collect{|node| node.inspect}.join("\n") + "\n#{@kertesz.inspect}"
  end
  def add_node
    if @r.rnd < @p_new
      start_new_net
    else
      @nodes << new_node = Vertex.new
      l = @kertesz.length
      $M.times {
        dest = @kertesz[@r.rnd(l-1)]
        new_node.connect(@nodes[dest])
        @kertesz << @nodes.length-1 << dest
      }
    end
  end
  def check_subtree(node, subtree)
    return if !node.nr
    subtree << node.nr
    node.nr = nil
    node.connections.each {|n| check_subtree(n, subtree)}
  end
  def analysis
    @nodes.each {|node|
      next if !node.nr
      tree = []
      check_subtree(node, tree)
      # p tree.sort
      # [1] [2,3] [4,5,6,7] [8...] ...
      bucket = tree.length
      if !$statistik[bucket]
        $statistik[bucket] = 1
      else
        $statistik[bucket] += 1
      end
    }
  end
end

$statistik = {}
$RUNS.times {|i|
  $stderr.print "#{i}..." if i%100==0
  mynet = BA_net.new($P)
  begin
    mynet.add_node
  end while mynet.size < 555
  # p mynet
  mynet.analysis
}
$statistik.keys.sort.each {|l| printf "%3i %6.4f\n", l, $statistik[l]} # .to_f/$RUNS
169. Bibliography
[1] L. Euler. Solutio problematis ad geometriam situs
pertinentis. Commetarii Academiae Scientiarum
Imperialis Petropolitanae 8 (1736).
[2] R. Albert and A.-L. Barabasi. Statistical me-chanics
of complex networks. Rev. Mod. Phys. 74,
47{97 (2002).
[3] S. N. Dorogovtsev and J. F. F. Mendes. Evo-lution
of networks. Adv. Phys. 51, 1079{1187
(2002).
[4] A.-L. Barabasi. Linked. The New Science of Net-
works (Perseus, Cambridge, Massachusetts, 2002).
[5] K. Rieger. Sind Sie mit Marlon Brando befreun-det?
Die Zeit 44, BL21 (1999).
[6] A.-L. Barabasi, R. Albert, and H. Jeong.
Scale-free characteristics of random networks: the
topology of the world-wide web. Physica A 281,
69{77 (2000).
[7] M. E. J. Newman. Scienti
170. c collaboration net-works.
Network construction and fundamental re-sults.
Phys. Rev. E 64, 01631 (2001).
[8] M. E. J. Newman. Scienti
171. c collaboration net-works.
Shortest paths, weighted networks, and cen-trality.
Phys. Rev. E 64, 01632 (2001).
[9] H. Ebel, L.-I. Mielsch, and S. Bornholdt.
Scale-free topology of e-mail networks. Phys. Rev.
E 66, 035103 (2002).
[10] S. Redner. How popular is your paper? an empir-ical
study of the citation distribution. Europ. Phys.
J. B 4, 131{134 (1998).
[11] A. J. Lotka. The frequency distribution of scien-ti
172. c productivity. J. Wash. Acad. Sc. 16, 317{323
(1926).
[12] F. Liljeros, C. R. Edling, et al. The web of
human sexual contacts. Nature 411, 907 (2001).
[13] S. N. Dorogovtsev and J. F. F. Mendes. Evo-
lution of networks: From Biological Nets to the
Internet and WWW (Oxford University Press, Ox-ford,
2003).
[14] S. H. Strogatz. Exploring complex networks. Na-
ture 410, 268{276 (2001).
[15] S. Milgram. The small world problem. Psych.
Today 61 (1967).
[16] D. J. Watts and S. H. Strogatz. Small world.
Nature 393, 440{442 (1998).
[17] D. Watts. Small worlds: the dynamics of net-
works between order and randomness (Princeton
University Press, Princeton, New Jersey, 1999).
[18] M. E. J. Newman. Models of the small world|a
review (2000). ArXiv:cond-mat/0001118.
[19] M. Gladwell. The Tipping Point. How little
things can make a big dierence. (Little Brown
and Company, Boston, Massachusetts, 2000).
[20] R. Cohen, K. Erez, D. ben Avraham, and
S. Havlin. Resilience of the internet to random
breakdowns. Phys. Rev. Lett. 85, 21, 4626{4628
(2000).
[21] A. L. Lloyd and R. M. May. How viruses spread
among computers and people. Science 292, 1316{
1317 (2001).
[22] M. E. J. Newman, S. Forrest, and
J. Balthrop. Email networks and the spread
of computer viruses. Phys. Rev. E 66, 035101
(2002).
[23] B. J. Kim, C. N. Yoon, S. K. Han, and H. Jeong.
Path
173. nding strategies in scale-free networks.
Phys. Rev. E 65, 027103 (2002).
[24] J. M. Kleinberg. Navigation in a small world.
Nature 406, 845 (2000).
[25] Isi web of science (2002). http://www.isinet.com/-
isi/products/citation/wos/index.html.
[26] R. Albert, H. Jeong, and A.-L. Barabasi. Di-ameter
of the world-wide web. Nature 401, 130
(1999).
[27] R. Solomonoff and A. Rapoport. Connectivity
of random nets. Bull. Math. Biophys. 13, 107{227
(1951).
[28] P. Erd}os and A. Renyi. On random graphs. Pub.
Mathem. 6, 290{297 (1959).
[29] M. E. J. Newman. Random graphs as models of
networks (2002). ArXiv:cond-mat/0202208.
57
174. Bibliography
[30] A.-L. Barabasi and R. Albert. Emergence of
scaling in random networks. Science 286, 509{512
(1999).
[31] St. Matthew. New Testament. In The Bible,
chap. 12, 13 (God, ca. 10AD).
[32] D. Cohen. All the world's a net. New Scientist
174, 24{29 (2002).
[33] A.-L. Barabasi and E. Bonabeau. Scale-free net-works.
Sci. Amer. 3, 50{59 (2003).
[34] K. Klemm and V. M. Eguiluz. Growing scale-free
networks with small-world behavior. Phys. Rev. E
65, 057102 (2002).
[35] E. Ravasz and A.-L. Barabasi. Hierarchical or-ganization
in complex networks. Phys. Rev. E 67,
026112 (2003).
[36] K. Klemm and V. M. Eguiluz. Highly clustered
scale-free networks. Phys. Rev. E 65, 036123
(2002).
[37] A. F. J. van Raan. Fractal dimension of co-citations.
Nature 347, 626 (1990).
[38] AT&T. Graphviz: open source graph drawing software (2002). http://www.research.att.com/sw/tools/graphviz/.
[39] S. C. North. Drawing graphs with neato. AT&T (2002).
[40] J. C. Venter, M. D. Adams, E. W. Myers, et al. The sequence of the human genome. Science 291, 1304–1351 (2001).
[41] A.-L. Barabási, H. Jeong, Z. Néda, E. Ravasz, A. Schubert, and T. Vicsek. Evolution of the social network of scientific collaborations. Physica A 311, 590–614 (2002).
[42] A.-L. Barabási. The physics of the web (2001). http://www.physicsweb.org/article/world/14/7/09.
[43] L. A. N. Amaral, A. Scala, M. Barthélemy, and H. E. Stanley. Classes of small-world networks. Proc. Nat. Acad. Sci. USA 97, 21, 11149–11152 (2000).
[44] S. Mossa, M. Barthélemy, H. E. Stanley, and L. A. N. Amaral. Truncation of power law behaviour in scale-free network models due to information filtering. Phys. Rev. Lett. 89, 208701 (2002).
[45] D. Stauffer and A. Aharony. Introduction to
Percolation Theory (Taylor and Francis, London,
1994), second ed.
[46] A.-L. Barabási, R. Albert, and H. Jeong. Mean field theory for scale-free random networks. Physica A 272, 173–187 (1999).
[47] E. Ising. Beitrag zur Theorie des Ferromagnetismus. Zeitschr. f. Phys. 31, 253–258 (1925).
[48] W. Lenz. Beitrag zum Verständnis der magnetischen Erscheinungen in festen Körpern. Phys. Zeitschr. 21, 613–615 (1920).
[49] R. B. Potts. Some generalized order-disorder
transformations. Proc. Cambridge Phil. Soc. 48,
106 (1952).
[50] F. Y. Wu. The Potts model. Rev. Mod. Phys. 54, 235–268 (1982).
[51] N. Metropolis, A. W. Rosenbluth, and A. H. Teller. Equation of state calculations by fast computing machines. J. Chem. Phys. 21, 6, 1087–1092 (1953).
[52] A. Aleksiejuk, J. A. Hołyst, and D. Stauffer. Ferromagnetic phase transition in Barabási-Albert networks. Physica A 310, 260–266 (2002).
[53] S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes. Ising model on networks with an arbitrary distribution of connections. Phys. Rev. E 66, 016104 (2002).
[54] K. Kacperski and J. A. Hołyst. Phase transitions as a persistent feature of groups with leaders in models of opinion formation. Physica A 287, 631–643 (2000).
[55] F. Putsch. Analysis and modeling of science collaboration networks. Adv. Compl. Sys. (2003). Submitted on May 6th.