Team Control Number
24147
Problem Chosen
C
2014
Mathematical Contest in Modeling (MCM/ICM) Summary Sheet
Summary
In this paper, we establish two types of network, directed and undirected, to deal with the following five tasks.
For task one, we study the largest connected component of the co-author network. The distributions of degree and distance reveal its 'small-world' and 'scale-free' properties. Then, based on information entropy theory, three indicators are constructed. Compared with the mean values of the corresponding indicators over 100 random networks of the same scale, our co-author network shows the properties of effectiveness, organization and cooperation.
For task two, four indicators are built: cooperation time, representing research longevity; degree, showing direct cooperation; quadratic correlation, reflecting indirect cooperation; and betweenness, indicating the role of a link in collaboration. Imitating the calculation of gross input in economics, we set up a computable MEI assessment model, which reveals that the top two authors are HARARY, FRANK* and SOS, VERA TURAN. We then test the results by the Monte Carlo method.
For task three, a directed quotation network with several types of correlation is built. We define the "weak correlation" that arises between papers in the same field. The PageRank values computed through a modified PageRank algorithm show that the paper entitled Collective dynamics of "small-world" networks ranks first. With some improvement, the model can also depict relationships among universities or departments.
For task four, the dolphin network shows the 'small-world' feature rather than the 'scale-free' feature. Its degree of order is higher than that of random networks. The most influential dolphin is called SN100. Comparing the two networks suggests that the degree of intelligence has some relevance to scale-free structure and organization.
For task five, the pros and cons as well as the sensitivity of the presented models are summarized. Some suggestions are proposed for college students eager to enhance their influence.
Keywords: Co-author network; MEI; Modified PageRank; Weak correlation
Contents
1. Introduction
2. Assumptions
3. The construction of co-author network and influence assessment
3.1 Data extraction and establishment of the co-author network
3.1.1 The original network
3.1.2 The erection procedure
3.2 Property analysis of the co-author network
3.2.1 Diagrammatic sketch and fairly interesting findings
3.2.2 Limitation of the size of our network by connected graph
3.2.3 The Degree and its Distribution
3.3 Analysis based on information entropy theory
3.3.1 The explanations of timeliness and time efficacy entropy in network structure
3.3.2 The explanations of quality and quality entropy in network structure
3.3.3 The results and comparison
3.4 Four fundamental indicators and Marshall Entropy Index (MEI)
3.4.1 Introduction
3.4.2 Calculating evaluating model
3.4.3 Testing of evaluating model
4. Construction of quotation network and modified PageRank algorithm
4.1 Conceptual framework
4.2 Construction of quotation network
4.2.1 Quotation network of direct correlation
4.2.2 Slightly complicated quotation network
4.2.3 The final quotation network
4.3 Modified PageRank algorithm and its application
4.3.1 The model corresponding PageRank algorithm
4.3.2 Modified PageRank algorithm
4.4 Results and further discussion
5. The dolphin network and the analysis
5.1 The erection of dolphin network and diagrammatic sketch
5.2 The value of indicators and further discussion
6. The assessment and popularization of our network
6.1 The assessment and popularization of co-author and dolphin network
6.2 The assessment and popularization of quotation network
6.3 Some suggestions
References
3. Team #24147 Page 1 of 20
1. Introduction
Real-world entities are often interconnected through explicit or implicit relationships and thereby form complex networks. Examples of complex networks abound in different areas, such as natural systems, engineering systems, economic systems and social systems. [1] Generally, each node in a complex network represents an individual in the real world, and each edge linking two nodes represents an interaction between the two individuals. For the co-author system presented in ICM 2014, studying its internal properties is of great value in evaluating a scientist's achievements. Inspired by empirical studies of networked systems, researchers have in recent years developed a variety of techniques and models (the small-world effect, degree distributions, random graph models and so on) to help understand or predict the behavior of these systems. [2] In this article, we take the view that the influence of scientists is closely related to scientific publication, which is their primary means of scholarly communication. [3] Moreover, after completing the analysis of the co-author network, we build a network to analyze the relationships among dolphins.
2. Assumptions
(1) The original data are reliable;
(2) There is no academic malpractice among the listed authors;
(3) There is no obvious bias in the statistical process;
(4) No article contradicts the articles it cites.
3. The construction of co-author network and influence assessment
A co-author network is a collection of authors, each of whom is acquainted with some subset of the others. Such a network can be represented as a set of nodes denoting authors, joined by edges denoting acquaintance. [4] This section describes the whole construction process of our co-author network in detail, including data extraction, modeling effects and the construction of four different types of indicator. Through statistical analysis of the results, we draw our conclusion based on a comprehensive indicator, the Marshall Entropy Index (MEI).
3.1 Data extraction and establishment of the co-author network
3.1.1 The original network
First, the data from the file on the given website are saved as a txt document and then imported into Matlab, a mathematical software package designed for numerical computation and data processing. Using a cell array, the direct co-authors (those with no blank before their names in the given file) can be extracted. There are 511 of them, in a specific order. An original co-author matrix with 511 rows and 511 columns is then set up.
3.1.2 The erection procedure
This part briefly describes how values are assigned. Firstly, all elements of the original matrix are set to zero, representing the initial state, for the convenience of the following steps. Secondly, for each direct co-author, we check whether any other direct co-author appears among his or her own co-authors (the indirect co-authors, that is, the co-authors of a co-author of Erdos). If so, the corresponding elements are set to 1. For example, ABBOTT, HARVEY LESLIE has a link with MEIR, AMRAM, so the two are co-authors of each other. ABBOTT, HARVEY LESLIE corresponds to row 1 and column 1, and MEIR, AMRAM corresponds to, say, row 232 and column 232. So the values of the elements A(1,232) and A(232,1) change from 0 to 1 when the program inspects ABBOTT, HARVEY LESLIE. After this procedure has been applied to all the data, the original co-author matrix becomes a symmetric matrix in which each element is either 0 or 1. In addition, if ABBOTT, HARVEY LESLIE does not have any link with ACZEL, JANOS D., the elements (1,2) and (2,1) of the matrix remain 0; otherwise, they become 1. Finally, by counting the non-zero values in each row, the number of co-authors of each author can be obtained.
In matrix form:
$$A = (A_{ij}), \quad 1 \le i \le 511,\ 1 \le j \le 511 \tag{1}$$
The first six authors are listed below:
Table 1 The number of co-authors connected directly
Name The number of co-authors
ABBOTT, HARVEY LESLIE 7
ACZEL, JANOS D. 2
AGOH, TAKASHI 1
AHARONI, RON 11
AIGNER, MARTIN S. 5
AJTAI, MIKLOS 9
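The construction of the 0/1 matrix and the co-author counts can be sketched as follows. The paper's computation was done in Matlab; Python is used here only for illustration. The input layout (a dictionary `coauthors` mapping each direct co-author to a list of that author's own co-authors), the function names and the toy data are all assumptions, not the authors' code.

```python
def build_adjacency(coauthors):
    """Return (names, A), where A is a symmetric 0/1 matrix.

    coauthors: hypothetical input, mapping each direct co-author of
    Erdos to the list of that author's own co-authors.
    """
    names = sorted(coauthors)
    index = {name: k for k, name in enumerate(names)}
    n = len(names)
    A = [[0] * n for _ in range(n)]          # all elements start at zero
    for name, partners in coauthors.items():
        i = index[name]
        for partner in partners:
            j = index.get(partner)           # None if not a direct co-author
            if j is not None and j != i:
                A[i][j] = 1                  # set both symmetric elements
                A[j][i] = 1
    return names, A


def degrees(A):
    """Count the non-zero values in each row."""
    return [sum(row) for row in A]


# Toy data (invented): "X" is not a direct co-author, so it is ignored.
toy = {"A": ["B", "C", "X"], "B": ["A"], "C": [], "D": []}
names, A = build_adjacency(toy)
deg = dict(zip(names, degrees(A)))
```

In this toy example author "D" ends up with no links, which corresponds to an isolated node in the network.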
3.2 Property analysis of the co-author network
The co-author network described above is represented by a symmetric matrix whose elements are all 0 or 1. We now use several tables and charts to illustrate the details of the network.
3.2.1 Diagrammatic sketch and fairly interesting findings
Fig. 1. Diagrammatic sketch of co-author network
Fig. 1 shows the cooperative relationships among the 511 authors. Though somewhat intricate, the figure gives a simple picture of the connections among them. A considerable number of nodes are isolated, such as node 7, node 83 and node 375, which means that the corresponding authors have no direct cooperation with the other authors listed in the network. To some extent, this affects the subsequent work, namely the calculation of betweenness and quadratic correlation, and hence the final score of each author when computing MEI. This situation therefore deserves more attention. We believe that one reason may be that an author who cooperated with Erdos only once works in a different area from the other co-authors and is thus isolated from them. Interdisciplinary research may be another reason.
We therefore conjecture that an isolated author might be one of the leaders of a specific field in mathematics, or might be devoted to interdisciplinary research across various subjects, such as music and economics. For example, the author ASHBACHER, CHARLES D. taught physics in college. However, he changed his direction from physics to the arts after graduating from college and composed a song with Wilson, Lewis, who is listed in the co-author file. As for his motivation for changing profession, there are numerous stories suggesting that it may have been interest or a love affair, but none of them is certain. The question is interesting; unfortunately, we are not able to settle it.
3.2.2 Limitation of the size of our network by connected graph
After this initial presentation, we narrow the network down to a connected graph, in which a path exists between any two nodes. By selecting the largest connected subgraph, which has 466 nodes, we restrict our attention from 511 nodes to 466 nodes and take these 466 authors as our study area.
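Selecting the largest connected subgraph can be sketched with a breadth-first search over the 0/1 matrix. This is a minimal illustration under the assumption that the network is stored as a symmetric list-of-lists matrix; the function name and the toy matrix are hypothetical.

```python
from collections import deque


def largest_component(A):
    """Return the sorted node indices of the largest connected component."""
    n = len(A)
    seen = [False] * n
    best = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        component = []
        while queue:                       # breadth-first search from start
            u = queue.popleft()
            component.append(u)
            for v in range(n):
                if A[u][v] and not seen[v]:
                    seen[v] = True
                    queue.append(v)
        if len(component) > len(best):
            best = component
    return sorted(best)


# Toy matrix (invented): nodes 0-1-2 form a path, nodes 3-4 form a pair.
A = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
keep = largest_component(A)
# Restrict the matrix to the retained nodes.
sub = [[A[i][j] for j in keep] for i in keep]
```

Applied to the 511-node matrix, the same restriction step would yield the 466-node matrix used in the rest of the analysis.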
3.2.3 The Degree and its Distribution
The degree and its distribution are significant features of any graph and are illustrated in this part. Before presenting the diagrams, we first define the notion of "degree".
Degree [5]: the degree of a node represents the number of collaborators that the corresponding author has. In graph theory, the degree of a node is the number of edges joined to that node. In matrix form:
$$D_i = \sum_{j=1}^{466} A_{ij}, \quad 1 \le i \le 466,\ 1 \le j \le 466 \tag{2}$$
Fig. 2. Frequency histogram of degrees
Figure 2 shows that as the degree increases, the number of co-authors with that degree declines gradually. For most co-authors, the number of cooperators is no more than 10, which is less than 3% of all co-authors.
Fig. 3. Frequency histogram of the length of the shortest path
The frequency of the shortest-path length between any two nodes can be calculated with a modified Floyd algorithm, a commonly used method for computing shortest paths in graph theory. The proportion of shortest paths whose length is lower than six is 0.9783, which is close to 1. The number six may seem puzzling, yet the famous "Six Degrees of Separation" [6], an idea first put forward by the Hungarian writer Frigyes Karinthy and later tested by the Harvard psychologist Stanley Milgram, holds that any two people are six or fewer steps apart. In other words, people can be connected to each other through at most five intermediaries. Our calculation is consistent with this well-known theory.
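The shortest-path statistic can be illustrated with the textbook Floyd-Warshall algorithm. The paper refers to a "modified Floyd algorithm" without giving details, so the plain version below is a stand-in, and the function names and the toy graph are assumptions.

```python
INF = float("inf")


def floyd_warshall(A):
    """All-pairs shortest-path lengths for an unweighted 0/1 matrix."""
    n = len(A)
    d = [[0 if i == j else (1 if A[i][j] else INF) for j in range(n)]
         for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def fraction_below(d, limit):
    """Share of node pairs whose shortest-path length is lower than limit."""
    n = len(d)
    lengths = [d[i][j] for i in range(n) for j in range(i + 1, n)]
    return sum(1 for x in lengths if x < limit) / len(lengths)


# Toy path graph 0-1-2-3 (invented data).
A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
d = floyd_warshall(A)
share = fraction_below(d, 3)   # 5 of the 6 pairs are closer than 3 steps
```

For the 466-node network, the analogous call would be `fraction_below(d, 6)`, matching the "lower than six" criterion in the text.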
Fig. 4. Frequency histogram of degrees and its fitting curve
After inspecting the frequency histogram of degrees, we fit it with three different model types (Poly3, Exp2 and Fourier3), which are presented in Fig. 4. Goodness-of-fit statistics such as SSE and R-square are worked out for each fit. By comparison, Exp2 gives the best fit: a few nodes possess a large number of edges while many nodes possess relatively few. More importantly, our co-author network can be regarded as a scale-free network [7], whose degree distribution follows a power law.
Table 2 Some Statistical Magnitudes of Distribution of Degree
Equation name SSE R-square Adjusted R-square RMSE
Poly3 0.01616 0.8219 0.8123 0.01699
Exp2 0.0009141 0.9899 0.9894 0.00404
Fourier3 0.004283 0.9528 0.9464 0.009075
3.3 Analysis based on information entropy theory
The degree of order from a strategic perspective may be defined by information entropy, which can be divided into timeliness entropy and quality entropy. [8] The degree of order of a network structure (R) is an index that considers both the timeliness and the quality of information, and it is expressed by equation (3),
, (3)
where R1 and R2 denote the timeliness and the quality of the network structure, respectively.
3.3.1 The explanations of timeliness and time efficacy entropy in network structure.
Timeliness in a network structure is defined as the degree of time used in transmitting information from one node to another, and time efficacy entropy is the degree of uncertainty in timeliness, given by the following formula,
, (4)
Suppose that a network structure has n nodes and that i and j denote any two of them. The timeliness (R1) and the time efficacy entropy (H1) can then be calculated in the following steps.
① Calculate Lij. The length between node i and node j (Lij) is defined as the length of the shortest path between the two nodes: the value is 1 if the two nodes are connected directly and increases by 1 for each transit node.
② Calculate A1. The total timeliness of the network structure (A1) is calculated by formula (5).
, (5)
③ Calculate P1(ij). The probability associated with the connection between two nodes (P1(ij)) is defined by equation (6),
, (6)
④ Calculate H1(ij). The time efficacy entropy between node i and node j (H1(ij)) is calculated as follows,
, (7)
⑤ Obtain H1 and H1M. The total time efficacy entropy of the network structure (H1) and the maximum of H1 (H1M) are given by equations (8) and (9).
, (8)
, (9)
⑥ Obtain R1. After the above steps, the timeliness of the network structure (R1) is given by equation (10).
. (10)
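The six steps can be sketched in Python. Equations (4) to (10) are not legible in this text, so the sketch assumes the definitions commonly used for structure entropy: A1 is the sum of all Lij, P1(ij) = Lij / A1, H1(ij) = -P1(ij) log P1(ij), H1 is the sum of all H1(ij), H1M = log A1, and R1 = 1 - H1 / H1M. These formulas, the choice to count each unordered pair once, and the function name are assumptions rather than the authors' stated method.

```python
import math


def timeliness(L):
    """Timeliness R1 from a symmetric matrix of shortest-path lengths.

    Assumes a connected network (all lengths finite) with at least
    three nodes, and the structure-entropy formulas named above.
    """
    n = len(L)
    pairs = [L[i][j] for i in range(n) for j in range(i + 1, n)]  # step 1
    A1 = sum(pairs)                                               # step 2
    probs = [l / A1 for l in pairs]                               # step 3
    H1 = -sum(p * math.log(p) for p in probs)                     # steps 4, 5
    H1M = math.log(A1)                                            # step 5
    return 1 - H1 / H1M                                           # step 6


# Toy examples (invented): a triangle and a three-node path.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
r_triangle = timeliness(triangle)   # all lengths equal, maximum entropy
r_path = timeliness(path)
```

Under these assumed formulas, a network in which all pairs are equally distant has maximum entropy and therefore zero timeliness, while unequal distances raise the value.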
3.3.2 The explanations of quality and quality entropy in network structure.
Quality in a network structure is defined as the degree of accuracy in transmitting information between one node and another, and quality entropy is the degree of uncertainty in quality. Quality and quality entropy are calculated in the same way as timeliness and time efficacy entropy, except for the quantities that play the roles of Lij and A1. Here the number of nodes directly connected to node i (Ki) and the total of Ki over the network structure (A2), $A_2 = \sum_{i=1}^{n} K_i$, replace the original Lij and A1. Apart from these two definitions, the remaining quantities, namely the probability of quality of node i (P2(i)), the quality entropy of node i (H2(i)), the total quality entropy of the network structure (H2), the maximum of H2 (H2M) and the quality of the network (R2), are calculated with the same equations as those used for timeliness and time efficacy entropy.
3.3.3 The results and comparison
The results of the three indicators (timeliness, quality and degree of order) are shown in Table 3:
Table 3 The results of information entropy
Network timeliness quality the degree of order
The co-author net 0.1022 0.2979 0.1745
The simulated net 0.0341 0.4743 0.1271
* The number of nodes and edges in the simulated net is the same as in the given data.
The simulated net is built with the same number of nodes and edges as our co-author network, except that the connections are made at random. Having simulated a hundred such nets, we calculate the average value of each indicator for comparison.
Table 3 reveals that although the quality of our co-author network is lower than that of the simulated networks, its timeliness and degree of order are higher. That is to say, our co-author network appears to be more effective and more ordered.
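The formula for R in equation (3) is not legible in this text. As a consistency check only, the values in Table 3 agree closely with a geometric mean, R = sqrt(R1 × R2). This is an observation about the reported numbers and should be read as an assumption, not as the authors' stated formula.

```python
import math

# Values copied from Table 3: (timeliness, quality, degree of order).
table3 = {
    "co-author": (0.1022, 0.2979, 0.1745),
    "simulated": (0.0341, 0.4743, 0.1271),
}

# Gap between the reported degree of order and sqrt(R1 * R2).
gaps = {
    name: abs(math.sqrt(r1 * r2) - r)
    for name, (r1, r2, r) in table3.items()
}
```

Both gaps are well below 0.001. The simulated row is an average over a hundred networks, so a small residual is expected there even if the geometric mean is the right formula.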
3.4 Four fundamental indicators and Marshall Entropy Index (MEI)
The construction of our co-author network has been described above. In this section, we combine four fundamental indicators (quadratic correlation, time, betweenness and degree) into a comprehensive indicator, the Marshall Entropy Index (MEI).
3.4.1 Introduction
Degree: The degree of a node represents the number of collaborators that the corresponding author has. In graph theory, the degree of a node is the number of edges joined to that node. In matrix form:
$$D_i = \sum_{j=1}^{466} A_{ij}, \quad 1 \le i \le 466,\ 1 \le j \le 466 \tag{11}$$
Betweenness: Betweenness is a global measure which reflects the influence of a node on the connections between other nodes. It is defined by equation (12):
$$B_i = \sum_{m \ne n} b_i(m,n) = \sum_{m \ne n} \frac{g_i(m,n)}{g(m,n)} \tag{12}$$
where $g(m,n)$ is the number of shortest paths between node m and node n, and $g_i(m,n)$ is the number of those shortest paths that pass through node i.
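A direct, unoptimized way to evaluate this definition is to count shortest paths with breadth-first search. The sketch below assumes the standard reading of betweenness (counts of shortest paths) and counts each unordered pair (m, n) once; whether the paper counts ordered or unordered pairs is not stated. The adjacency-list layout and function names are hypothetical.

```python
from collections import deque


def bfs_counts(adj, s):
    """Distances from s and the number of shortest paths to each node."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj):
    """B_i = sum over pairs m != n of g_i(m, n) / g(m, n)."""
    nodes = list(adj)
    info = {s: bfs_counts(adj, s) for s in nodes}
    B = {i: 0.0 for i in nodes}
    for a, m in enumerate(nodes):
        dist_m, sigma_m = info[m]
        for n in nodes[a + 1:]:
            if n not in dist_m:
                continue                      # m and n are not connected
            for i in nodes:
                if i == m or i == n or i not in dist_m:
                    continue
                dist_i, sigma_i = info[i]
                # i lies on a shortest m-n path if the distances add up.
                if n in dist_i and dist_m[i] + dist_i[n] == dist_m[n]:
                    B[i] += sigma_m[i] * sigma_i[n] / sigma_m[n]
    return B


# Toy graphs (invented): a three-node path and a four-node cycle.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
B_path = betweenness(path)
B_cycle = betweenness(cycle)
```

In the cycle, each opposite pair has two shortest paths, so each intermediate node receives one half from that pair.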
Time: In this article, we assume that the earlier an author cooperated with Erdos, the greater his or her academic influence. This assumption rests on two reasons. First, the earlier the cooperation, the more likely it is that Erdos recognized the author's academic potential. Second, authors who collaborated with Erdos at an earlier time may have made a greater contribution to the academic world. The indicator is defined by equation (13):
$$T_i = 2014 - t_i, \quad 1 \le i \le 466 \tag{13}$$
where $t_i$ is the year in which author i first cooperated with Erdos.
Quadratic correlation: Quadratic correlation is an index that measures the indirect correlation of a node. It is defined in Eq. (14), where $A_i$ is the set of all nodes connected directly to node i, $A_i = \{A_i(1), A_i(2), A_i(3), \ldots\}$; $n_{A_i}$ is the number of elements in $A_i$; and $E_i$ is the set of all nodes that can be reached from node i in two steps but not in one step. Furthermore, $E_i$ does not include node i itself:
$$E_i = \left(A_{A_i(1)} \cup A_{A_i(2)} \cup A_{A_i(3)} \cup \cdots\right) \setminus \left(A_i \cup \{i\}\right)$$
Similarly to $n_{A_i}$, $n_{E_i}$ is the number of elements in $E_i$.
$$Q_i = \frac{n_{E_i}}{n_{A_i}} \tag{14}$$
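The set construction behind this index can be sketched with Python sets. The adjacency-list layout, the function name and the toy graph are assumptions; the sketch also assumes every node has at least one neighbor, which holds inside the largest connected component.

```python
def quadratic_correlation(adj, i):
    """Q_i = |E_i| / |A_i| for node i of an undirected graph.

    adj: hypothetical adjacency lists, node -> list of neighbors.
    """
    direct = set(adj[i])                 # A_i: nodes one step away
    two_step = set()
    for a in direct:
        two_step |= set(adj[a])          # union of the neighbors' neighbors
    two_step -= direct                   # drop nodes reachable in one step
    two_step.discard(i)                  # drop node i itself
    return len(two_step) / len(direct)


# Toy star graph (invented): center "c" with leaves "x", "y", "z".
star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
q_leaf = quadratic_correlation(star, "x")     # E = {y, z}, A = {c}
q_center = quadratic_correlation(star, "c")   # E is empty
```

The toy example shows why the index favors authors with few direct but many indirect collaborators: a leaf of the star scores 2, while the highly connected center scores 0.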
Marshall Entropy Index: In economics, when calculating an input-output table for the agricultural industry, the types of input or output may be quite different and uncorrelated, or only subtly linked, yet the gross value of input or output can be obtained by multiplying the components together and then taking the logarithm of the product, with standardization applied when necessary. Inspired by this idea, we process the four indicators by standardization, continued multiplication and taking the logarithm. The result is a comprehensive numerical value, the Marshall Entropy Index (MEI), which indicates the overall influence of each co-author.
(15)
3.4.2 Calculating evaluating model
According to the principles elaborated above, the numerical values of each indicator can be obtained for all 466 co-authors. After sorting these values in descending order, the top ten for each indicator are listed in Table 4.
Table 4 Ranking in Accordance with Four Fundamental Indicators and MEI
Rank | Time | Degree
1 | SZEKERES, GEORGE* (80) | ALON, NOGA M. (52)
2 | TURAN, PAL* (80) | GRAHAM, RONALD LEWIS (44)
3 | DAVENPORT, HAROLD* (78) | HARARY, FRANK* (44)
4 | FELDHEIM, ERVIN* (78) | BOLLOBAS, BELA (43)
5 | GALLAI, TIBOR* (GRUNWALD, TIBOR) (78) | RODL, VOJTECH (43)
6 | VAZSONYI, ANDREW* (WEISZFELD, ENDRE) (78) | FUREDI, ZOLTAN (40)
7 | GILLIS, JOSEPH E.* (77) | TUZA, ZSOLT (40)
8 | JARNIK, VOJTECH* (77) | SOS, VERA TURAN (38)
9 | OBLATH, RICHARD* (77) | SPENCER, JOEL HAROLD (35)
10 | GRUNWALD, GEZA* (76) | GYARFAS, ANDRAS (32)
Rank | Quadratic correlation | Betweenness | Marshall entropy index
1 | BARAK, AMNON B. (52) | HARARY, FRANK* (9587.7) | HARARY, FRANK* (21.9)
2 | COPELAND, ARTHUR HERBERT, SR.* (44) | SOS, VERA TURAN (8912.2) | SOS, VERA TURAN (21.8)
3 | HARZHEIM, EGBERT (44) | RUBEL, LEE ALBERT* (8573.3) | BOLLOBAS, BELA (21.8)
4 | MINC, HENRYK (44) | STRAUS, ERNST GABOR* (8300.7) | GRAHAM, RONALD LEWIS (21.7)
5 | SARKAR, AMITES (43) | POMERANCE, CARL BERNARD (7434.7) | STRAUS, ERNST GABOR* (21.5)
6 | ANDRASFAI, BELA (38) | FUREDI, ZOLTAN (7410.1) | ALON, NOGA M. (21.4)
7 | ZAREMBA, STANISLAW KRYSTYN* (38) | ALON, NOGA M. (6871.9) | FUREDI, ZOLTAN (21.3)
8 | LEWIN, MORDECHAI (29) | GRAHAM, RONALD LEWIS (6817.2) | HAJNAL, ANDRAS (21.2)
9 | PENNEY, DAVID EMROY, II (29) | BOLLOBAS, BELA (6699.5) | PACH, JANOS (20.8)
10 | SCHMUTZ, ERIC J. (29) | PACH, JANOS (6073.6) | TUZA, ZSOLT (20.7)
* The value in brackets is the score the author obtained for the related indicator.
The table indicates that HARARY, FRANK; SOS, VERA TURAN; and BOLLOBAS, BELA are the top three in the final MEI score. Unfortunately, HARARY, FRANK has already died. All three are also listed in the top 10 for both the Degree indicator and the Betweenness indicator, which means that Degree and Betweenness are relatively important in MEI even after standardization. However, that does not mean that the Time and Quadratic correlation indicators are meaningless. On the contrary, they provide a different perspective from which to analyze the results of the evaluation. For example, the Time indicator suggests that the earlier one cooperated with Erdos, the more influential one tends to be, though those who rank among the top 10 in the Time indicator do not rank among the top 10 in MEI.
In conclusion, if we judge over the whole history of network science, the most influential author may be HARARY, FRANK, who passed away years ago. If we restrict the assessment to the present, the most important researchers among the 466 given authors are SOS, VERA TURAN and BOLLOBAS, BELA.
3.4.3 Testing of evaluating model
To test the accuracy of the results, we remove the top two authors in MEI, HARARY, FRANK* and SOS, VERA TURAN, from the co-author network and find that the largest connected set among the remaining nodes has 459 nodes. For comparison, we remove two random nodes instead, repeat this 2000 times, and record the size of the largest connected set in each trial, as shown in Figure 5.
Fig. 5. 2000 trials of the random nodes (x-axis: the maximum of connected nodes, 459 to 466; y-axis: frequencies, 0 to 1200)
Figure 5 shows that almost none of the random trials reduces the largest connected set to as few as 459 nodes. Therefore, HARARY, FRANK* and SOS, VERA TURAN are important to the connectivity of the co-author network, which supports the credibility of our evaluation model.
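The removal test can be sketched as a small Monte Carlo experiment: compare the largest connected set after removing chosen nodes with the sizes obtained after removing the same number of random nodes. The adjacency-list layout, the function names, the fixed seed and the toy star graph are assumptions made for illustration.

```python
import random


def largest_component_size(adj, removed=()):
    """Size of the largest connected set after deleting the removed nodes."""
    seen = set(removed)                  # removed nodes are never visited
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:                     # depth-first search
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


def random_removal_sizes(adj, k, trials, seed=0):
    """Largest-set sizes after removing k random nodes, repeated trials times."""
    rng = random.Random(seed)
    nodes = list(adj)
    return [largest_component_size(adj, rng.sample(nodes, k))
            for _ in range(trials)]


# Toy star graph (invented): hub "h" with six leaves.
adj = {"h": list("abcdef")}
adj.update({leaf: ["h"] for leaf in "abcdef"})
targeted = largest_component_size(adj, ["h"])      # remove the key node
sizes = random_removal_sizes(adj, 1, 200)          # remove one random node
share_as_bad = sum(s <= targeted for s in sizes) / len(sizes)
```

For the paper's setting, the call would use `k=2` and `trials=2000`, and `share_as_bad` would estimate how rarely a random removal damages the network as much as removing the two top-ranked authors.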
4. Construction of quotation network and modified PageRank algorithm
In this section, the concepts of direct correlation, indirect correlation and weak correlation are introduced. Then, the sixteen research papers in the field of network science from the given file are used to build our quotation network, which embodies these three concepts. After that, we combine the quotation network with the classical PageRank algorithm and make some corresponding improvements. Having computed the relative influence of each paper via the modified PageRank algorithm, we find that the paper entitled "Collective dynamics of 'small-world' networks", written by Watts and Strogatz, is the most influential of the sixteen papers in network science. Finally, our quotation network model and the modified PageRank algorithm are discussed further.
4.1 Conceptual framework
A quotation network cannot be established without a conceptual framework. Owing to the requirements of technicality and rigor, citation among papers inevitably appears; a paper and a paper it cites have a direct correlation with each other. Further, an indirect correlation exists when an article is related to a paper through the references of a paper it cites. In this paper, both two-step and three-step quotations are covered by this concept. In addition, there are some relationships among all of the articles published in the network field. For example, the sixteen papers in network science, which we discuss later, may be related simply because they concentrate on the same field, especially in their ideas, deductions and conclusions. We define this relevance as weak correlation. In terms of influence, direct correlation is clearly stronger than indirect correlation, while weak correlation is the weakest of the three. This completes the conceptual framework.
4.2 Construction of quotation network
The quotation network is built in three stages. The first stage builds a quotation network with only the direct correlations among the sixteen articles. The second stage builds a quotation network that contains both direct and indirect correlations. The final stage establishes the quotation network involving all three concepts. Below, we describe the procedure in detail and present a diagrammatic sketch for each stage.
4.2.1 Quotation network of direct correlation
Since sixteen papers are extracted from network science, a 16×16 matrix E is established, in which $e_{ij} = 1$ if node i can directly reach node j and $e_{ij} = 0$ otherwise. The matrix can be written as:
$$E = (e_{ij}), \quad 1 \le i \le 16,\ 1 \le j \le 16$$
Note that this matrix is quite different from $A_{ij}$, which we constructed in section 3, because it is no longer symmetric: citation is one-way. That is to say, only papers published later can cite papers already published. The network is presented in Fig. 6.
Fig. 6 Original quotation network
4.2.2 Slightly complicated quotation network
In this step, indirect quotation is added to the original quotation network. The matrix E is modified as follows:
$$\tilde{E} = E + \frac{E^2}{e^2} + \frac{E^3}{e^3} \tag{16}$$
where e is the base of the natural logarithm (approximately 2.718281828).
This modification is made because the influence generated by two-step and three-step quotation is smaller than the direct influence.
4.2.3 The final quotation network
Following the idea of 4.2.2, the variation could be extended to arbitrarily many higher orders. However, the workload would be onerous. As the saying goes, simplicity is the ultimate sophistication. So the higher-order variations are simplified by adding 1/16 to each element, which represents the weak influence. The result is presented in Fig. 7:
Fig. 7 Quotation network based on B~
We can see that Fig. 7 is far more complicated than Fig. 6.
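The three construction stages can be sketched in plain Python: start from the direct citation matrix E, add the damped two-step and three-step terms E^2/e^2 and E^3/e^3, and finally add 1/n to every element for the weak correlation (1/16 for the sixteen papers). The function names and the three-paper toy matrix are assumptions.

```python
import math


def matmul(X, Y):
    """Product of two square matrices stored as lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def modified_matrix(E):
    """E + E^2/e^2 + E^3/e^3 + 1/n, applied element by element."""
    n = len(E)
    E2 = matmul(E, E)        # two-step quotation
    E3 = matmul(E2, E)       # three-step quotation
    e = math.e
    return [[E[i][j] + E2[i][j] / e ** 2 + E3[i][j] / e ** 3 + 1.0 / n
             for j in range(n)] for i in range(n)]


# Toy chain (invented): paper 2 cites paper 1, and paper 1 cites paper 0.
E = [[0, 0, 0],
     [1, 0, 0],
     [0, 1, 0]]
M = modified_matrix(E)
```

In the toy chain, paper 2 reaches paper 0 only in two steps, so that element receives the damped weight 1/e^2 on top of the weak term 1/3.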
4.3 Modified PageRank algorithm and its application
4.3.1 The model corresponding PageRank algorithm
The PageRank algorithm [9], proposed by Sergey Brin and Lawrence Page, is widely applied in search engines for information collection. In our modified PageRank algorithm, the papers form the set $S = \{1, 2, \ldots, N\}$. The PageRank value of paper n is denoted by $r_n$, and $O_n$ is the out-degree of paper n, that is, the number of papers it cites. The PageRank value of paper n is then given by:
$$r_n = \sum_{m \in A_n} \frac{r_m}{O_m} \tag{17}$$
where $A_n$ denotes the set of papers that cite paper n.
According to this expression, the calculation of the PageRank value is illustrated schematically in Fig. 8.
Fig. 8 Diagram of computing the PageRank value (nodes e1 to e4, with flow values on the edges)
The numbers in the figure show the inflowing and outflowing capacity.
Then, by constructing quotation transference probability matrixC , where
i
ij
O
C
1
while
paper i is cited by paper j or 0ijC otherwise, we can get the model corresponding
PageRank algorithm:
0
1
R
eR
RGR
T
T
(18)
whose key feature is that the components of the eigenvector R sum to 1. The values of the
components essentially determine the ranking of the papers: the larger a paper's component,
the greater its weight, meaning that the paper is more important and ranks higher.
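The conditions of model (18) can be checked numerically on a small example. The sketch below rests on assumptions: it treats the matrix G of (18) as the transition matrix C defined above, a pair (i, j) in `links` stands for whichever relation makes C_ij non-zero, and the three-paper link list, the candidate vector and the function names are all hypothetical.

```python
def transition_matrix(links, n):
    """Build C with C[i][j] = 1/O_i for each link (i, j), else 0.

    O_i is the out-degree of node i; the links are assumed to be distinct.
    """
    out_degree = [0] * n
    for i, _ in links:
        out_degree[i] += 1
    C = [[0.0] * n for _ in range(n)]
    for i, j in links:
        C[i][j] = 1.0 / out_degree[i]
    return C

def satisfies_model(C, R, tol=1e-9):
    """Check R = C^T R, sum(R) = 1 and R >= 0, each up to the tolerance tol."""
    n = len(R)
    CtR = [sum(C[i][j] * R[i] for i in range(n)) for j in range(n)]
    fixed_point = all(abs(CtR[j] - R[j]) < tol for j in range(n))
    normalised = abs(sum(R) - 1.0) < tol
    non_negative = all(r >= 0 for r in R)
    return fixed_point and normalised and non_negative

# Hypothetical links among three papers (not data from the paper).
links = [(0, 1), (0, 2), (1, 2), (2, 0)]
C = transition_matrix(links, 3)
R = [0.4, 0.2, 0.4]          # candidate ranking vector
ok = satisfies_model(C, R)   # True for this toy example
```

Here papers 0 and 2 share the highest value, while the uniform vector (1/3, 1/3, 1/3) fails the first condition, so it is not a solution of the model for this toy network.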
4.3.2 Modified PageRank algorithm
The flow chart in Fig. 9 demonstrates our modified PageRank algorithm.
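[Flow chart; the steps recovered from the figure are:]
1. Commence. Input the adjacency matrix E, the size n and the convergence threshold sigma.
2. E = E + E^2/e^2 + E^3/e^3 + 1/n.
3. Calculate the transition probability matrix C.
4. Generate the vector R randomly as the initial value of M-PR.
5. X = PR.
6. If max(X-R) > sigma is TRUE, set R = X and return to step 5.
7. If it is FALSE, set R = X/sum(R), output R and finish.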
Fig. 9. The flow chart of modified PageRank algorithm
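The flow chart can be sketched in code as follows. This is a minimal sketch under assumptions, not the authors' program: the transition matrix is taken to be the row-normalised modified matrix, the matrix P of the update X = PR is taken to be its transpose, the random start is seeded for reproducibility, an iteration cap `max_iter` is added as a safeguard, and the four-paper matrix is hypothetical.

```python
import math
import random

def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def modified_pagerank(E, sigma=1e-10, seed=0, max_iter=10000):
    """Iterate X = P R until max|X - R| <= sigma, then normalise to sum 1."""
    n = len(E)
    E2 = mat_mul(E, E)
    E3 = mat_mul(E2, E)
    # Modified matrix: E + E^2/e^2 + E^3/e^3 + 1/n, applied element-wise.
    W = [[E[i][j] + E2[i][j] / math.e ** 2 + E3[i][j] / math.e ** 3 + 1.0 / n
          for j in range(n)] for i in range(n)]
    # Assumption: transition probabilities are the row-normalised weights.
    C = [[w / sum(row) for w in row] for row in W]
    # Random positive start vector, scaled so that its components sum to 1.
    rng = random.Random(seed)
    R = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(R)
    R = [r / total for r in R]
    for _ in range(max_iter):
        # Assumption: P = C^T, so each node passes its value along its links.
        X = [sum(C[i][j] * R[i] for i in range(n)) for j in range(n)]
        converged = max(abs(x - r) for x, r in zip(X, R)) <= sigma
        R = X
        if converged:
            break
    total = sum(R)
    return [r / total for r in R]

# Hypothetical four-paper example; E[i][j] = 1 marks a link i -> j.
E = [[0, 1, 1, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 1, 0]]
ranks = modified_pagerank(E)
```

Because 1/n is added to every element, every entry of the transition matrix is positive, so the iteration settles on a single ranking vector even when, as for paper 2 here, a node has no outgoing link. In this toy example paper 2, which receives the most links, obtains the largest value.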
4.4 Results and further discussion
Having calculated the modified PageRank values with the modified PageRank algorithm, we
obtain the results shown in Table 5:
Table 5 The rank of the papers in the value of M-PageRank
Rank | Value of M-PageRank | I.D. of paper | Name of paper
1 | 0.2131 | 14 | Collective dynamics of `small-world' networks
2 | 0.1682 | 8 | Navigation in a small world.
3 | 0.0839 | 3 | Power and Centrality: A family of measures
4 | 0.0799 | 4 | Emergence of scaling in random networks
5 | 0.0568 | 6 | Models of core/periphery structures
6 | 0.0423 | 13 | Identity and search in social networks
7 | 0.0420 | 1 | On Random Graphs
7 | 0.0420 | 10 | The structure of scientific collaboration networks
9 | 0.0383 | 11 | The structure and function of complex networks
10 | 0.0361 | 2 | Statistical mechanics of complex networks
10 | 0.0361 | 9 | Scientific collaboration networks
12 | 0.0321 | 5 | Identifying sets of key players in a network.
12 | 0.0321 | 7 | On properties of a well-known graph.
12 | 0.0321 | 12 | Networks, influence, and public opinion formation
12 | 0.0321 | 15 | Statistical models for social networks
12 | 0.0321 | 16 | Social network thresholds in the diffusion of innovations
As the table shows, Collective dynamics of 'small-world' networks is in first place, followed
by Navigation in a small world and Power and Centrality: A family of measures. That is to say,
by this measure, Collective dynamics of 'small-world' networks, written by Watts, D. and
Strogatz, S., is the most influential of the 16 network science papers considered.
Can a similar method determine the role or influence of an individual network researcher? The
answer is yes. The fruits of collaboration between researchers are the papers they write
together, although the papers themselves give no clear indication of who directed the work. A
realistic assumption is that one researcher acts as the 'leader' and the other as the 'helper'.
When they finish a paper together, the leader gains relatively more influence and the helper
relatively less. In terms of our quotation network matrix, this amounts to altering the relative
values of the corresponding entries. In this way the influence of an individual network
researcher can be obtained.
Our quotation network is also useful for measuring the role, influence or impact of a specific
university, department or journal in network science. A university or a department is a part of
network science and can be represented as a node in graph theory. A university, for instance,
may collaborate with others to some extent, although the collaboration is not intense; we define
this as a kind of weak correlation. In terms of our quotation network matrix, this is handled by
adding a small number to each element.
From the discussion above, we conclude that measuring the role, influence or impact of a
specific university, department or journal in network science requires a more detailed
methodology. Among all the alterations, the weak correlation, whose magnitude is mainly
determined by the number of nodes, should be emphasized. The data that need to be collected
therefore include the cooperation programs between authors and their respective workloads, the
number of publications and journals in network science, and the number of universities or
departments researching network science.
5. The dolphin network and the analysis
In this section, the dolphin network [10], which contains 62 nodes and the edges between them,
is constructed. Then the key indicators, namely degree, quadratic correlation and betweenness,
as well as timeliness entropy, quality entropy and the degree of order, are calculated; they
indicate that the dolphin named SN100 ranks first. Finally, further discussion is given.
5.1 The construction of the dolphin network and diagrammatic sketch
Following the construction steps introduced in section 2, a 62 × 62 matrix F has been built.
The diagrammatic sketch is shown in Fig. 10:
Fig. 10 Diagrammatic sketch of the dolphin network
Further study shows that the property of Six Degrees of Separation is still evident in the
dolphin network, while a power-law degree distribution is no longer apparent.
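Both properties can be examined with simple computations: a short average shortest-path length points to the 'small world' property, and the degree counts show whether the degree distribution resembles a power law. The sketch below is only illustrative; the function names are hypothetical and the four-node graph is a toy example, not the dolphin data.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs

def degree_counts(adj):
    """Map each degree value to the number of nodes having that degree."""
    counts = {}
    for u in adj:
        k = len(adj[u])
        counts[k] = counts.get(k, 0) + 1
    return counts

# Hypothetical undirected toy graph: the path 0-1-2-3 plus the edge 1-3.
adj = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
avg_len = average_path_length(adj)
deg = degree_counts(adj)
```

Applied to the 62 × 62 matrix F, after converting it to adjacency lists, the same two functions would give the average distance and the degree counts on which the two observations above rest.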
5.2 The value of indicators and further discussion
According to the calculation principles described in section 2, the key indicators are computed
and shown in Table 6 and Table 7:
Table 6 The evaluation of dolphins about their influence
Rank | Degree | Quadratic correlation | Betweenness | Marshall entropy index
1 | Grin (22) | Cross (10) | SN100 (454.27) | SN100 (9.48)
2 | SN4 (18) | Five (10) | Beescratch (390.38) | SN9 (8.90)
3 | Topless (11) | Fork (10) | SN9 (261.96) | Beescratch (8.86)
4 | Scabs (17) | MN23 (9) | SN4 (253.58) | SN4 (8.71)
5 | Trigger (10) | Quasi (9) | DN63 (216.38) | Kringel (8.45)
6 | Jet (8) | SMN (9) | Jet (209.17) | DN63 (8.32)
7 | Kringel (13) | TR82 (9) | Kringel (187.84) | Jet (7.91)
8 | Patchback (19) | Whiteti (8) | Upbang (181.39) | Stripes (7.88)
9 | Web (22) | SN89 (7.5) | Trigger (154.96) | Oscar (7.85)
10 | Beescratch (11) | Vau (6.5) | Web (154.09) | Upbang (7.84)
* The value in brackets is the score that the dolphin obtained on the corresponding index.
As Table 6 demonstrates, SN100 comes in first place, indicating that it is the most influential
dolphin of all. The top four dolphins by betweenness also occupy the first four places in the
last column, so the importance of the betweenness indicator shows up again, which is
consistent with the results presented previously.
Table 7 The result of information entropy of the dolphin network
Network | Timeliness | Quality | Degree of order
The dolphin network | 0.1379 | 0.3143 | 0.2081
The simulated net | 0.1109 | 0.3002 | 0.1824
Table 7 reports the information entropy results for the dolphin network and for our simulated
network. All three indicators are higher for the dolphin network than for the simulated
network, which suggests that the dolphin network, being more relevant, certain and
well-organized, differs slightly from a random network.
Further, by comparison, the dolphin network differs from the co-author network to some
extent. We venture to guess that the 'degree of intelligence' may be a contributing factor.
6. The assessment and popularization of our network
Two types of network have been constructed in this article: undirected networks, namely the
co-author network and the dolphin network, and a directed network, namely the quotation
network. It is therefore reasonable to discuss them separately.
6.1 The assessment and popularization of co-author and dolphin network
Strengths: 1. The author or the dolphin can be abstracted as a node representing
its role in the network, an idea that extends to many settings, such as
student unions and business dealings.
2. The edge represents the relationship between two authors, i.e. whether or not
they have cooperated, or between two dolphins, i.e. whether or not they have
some (somewhat mysterious) relationship.
3. The sensitivity of our methodology is relatively low, which means that the
stability of our network is basically ensured.
Weaknesses: 1. The network cannot handle problems involving directed graphs.
2. If there are too many nodes and edges, the computational workload becomes
immense.
3. The identity of each node may not be taken into account.
The power of the network: Our network may be widely used to study social relationships,
advanced management and individual choice, in which the correlation is
undirected or two-way.
6.2 The assessment and popularization of quotation network
Strengths: 1. The network reduces the computational load on the computer, and so decreases
the time spent waiting for results.
2. The authority of each paper is taken into account.
3. The sensitivity of our methodology is relatively low, which means that the
stability of our algorithm is basically ensured.
Weakness: A more detailed study cannot be carried out. For example, the influence of
weak correlation is difficult to measure.
The power of our network: Our network may be widely used to study enterprise
organization, self-promotion and capital accumulation, because these areas
involve change or flow, whether from one person to another, from one position to
another, or from the past to the present for the same person.
6.3 Some suggestions
Further, based on the MEI model, we offer some suggestions for college students eager to
enhance their influence:
1. do one's best to join a research team and seek opportunities for cooperation;
2. where possible, join a research team with more international exchanges;
3. choose a newly emerging subject or interdisciplinary field in which to begin research.
References
[1] http://www.sitis-conf.org/en/workshop-on-complex-networks-and-their-applications-complex-networks-2012.php?Preview=ok.
[2] M. E. J. Newman, The structure and function of complex networks, SIAM Review, 45,
167–256 (2003)
[3] Redner S., How popular is your paper? An empirical study of the citation distribution, Eur
Phys J B, 4, 131-134 (1998)
[4] Newman, M. The structure of scientific collaboration networks. Proc. Natl.Acad. Sci.
USA, 98: 404-409, January 2001.
[5] Xu Ling, Research on co-author network based on SCIENCE [D], Shanghai Jiao Tong
University, 2009.
[6] Zhao Yan, Research on routing protocols of opportunistic networks based six degrees of
separation [D], Qiqihar University, 2012.
[7] Xiang Linying, Chen Xiangqiang, Review on modeling, analysis and control of complex
dynamic network, China academic journal electronic publishing house, 16:1543-1551, Nov.
2006.
[8] Delu Wang; Ziwei Li, Web-based organizational structure information entropy theory
analysis, Modern Management Science, (1):65-66, 2007.
[9] Yue Xie, Research on PageRank algorithm and HITS algorithms in webpage sort [D],
University of Electric Science and Technology of China, 2012.
[10] The data of dolphins comes from http://www.datatang.com/data/769/.