Team Control Number
24147
Problem Chosen
C
2014
Mathematical Contest in Modeling (MCM/ICM) Summary Sheet
Summary
In this paper, we establish two types of network, directed and undirected, to handle the following five tasks.
For task one, we study the largest connected subgraph of the co-author network. The distributions of degree and distance reveal its 'small-world' and 'scale-free' properties. Then, based on information entropy theory, three indicators are constructed. Compared with the mean values of the corresponding indicators over 100 networks of the same scale, our co-author network shows the properties of effectiveness, organization and cooperation.
For task two, four indicators are built: cooperation time, representing research longevity; degree, showing direct cooperation; quadratic correlation, reflecting indirect cooperation; and betweenness, indicating the role of an author as a link in collaboration. Imitating the calculation of gross input in economics, we set up a computable MEI assessment model, which ranks HARARY, FRANK* and SOS, VERA TURAN as the top two. We then test the results by the Monte Carlo method.
For task three, a directed quotation network with several types of correlation is constructed. We define a "weak correlation" among papers in the same field. The PageRank values computed with a modified PageRank algorithm show that the paper entitled Collective dynamics of "small-world" networks ranks first. With further improvement, the model could also describe relationships among universities or departments.
For task four, the dolphin network shows the 'small-world' feature but not the 'scale-free' one. Its degree of order is higher than that of random networks. The most influential dolphin is SN100. Comparing the two networks suggests that the degree of intelligence has some relevance to the scale-free property and to organization.
For task five, the pros and cons as well as the sensitivity of the presented model are summarized. Some suggestions are proposed for college students eager to enhance their influence.
Keywords: Co-author network; MEI; Modified PageRank; Weak correlation
Contents
1. Introduction
2. Assumptions
3. The construction of co-author network and influence assessment
3.1 Data extraction and establishment of the co-author network
3.1.1 The original network
3.1.2 The erection procedure
3.2 Property analysis of the co-author network
3.2.1 Diagrammatic sketch and fairly interesting findings
3.2.2 Limitation of the size of our network by connected graph
3.2.3 The Degree and its Distribution
3.3 Analysis based on information entropy theory
3.3.1 The explanations of timeliness and time efficacy entropy in network structure
3.3.2 The explanations of quality and quality entropy in network structure
3.3.3 The results and comparison
3.4 Four fundamental indicators and Marshall Entropy Index (MEI)
3.4.1 Introduction
3.4.2 Calculating evaluating model
3.4.3 Testing of evaluating model
4. Construction of quotation network and modified PageRank algorithm
4.1 Conceptual framework
4.2 Construction of quotation network
4.2.1 Quotation network of direct correlation
4.2.2 Slightly complicated quotation network
4.2.3 The final quotation network
4.3 Modified PageRank algorithm and its application
4.3.1 The model corresponding PageRank algorithm
4.3.2 Modified PageRank algorithm
4.4 Results and further discussion
5. The dolphin network and the analysis
5.1 The erection of dolphin network and diagrammatic sketch
5.2 The value of indicators and further discussion
6. The assessment and popularization of our network
6.1 The assessment and popularization of co-author and dolphin network
6.2 The assessment and popularization of quotation network
6.3 Some suggestions
References
Team #24147 Page 1 of 20
1. Introduction
Real-world entities are often interconnected through explicit or implicit relationships and thereby form complex networks. Examples of complex networks can be found in many areas, such as natural, engineering, economic and social systems. [1] Generally, each node in a complex network represents an individual in the real world, and each edge linking two nodes represents an interaction between the two individuals. For the co-author systems presented in ICM 2014, studying their internal properties is of great value for evaluating a scientist's achievements. Inspired by empirical studies of networked systems, researchers have in recent years developed a variety of techniques and models (the small-world effect, degree distributions, random graph models and so on) to help us understand or predict the behavior of these systems. [2] In this article, we take the view that the influence of scientists is closely related to the community of those for whom scientific publication is a primary means of scholarly communication. [3] Moreover, after completing the analysis of the co-author network, we build a network to analyze the relationships among dolphins.
2. Assumptions
(1) The original data are reliable;
(2) There is no academic malpractice among the listed authors;
(3) There is no obvious bias in the statistical process;
(4) No article contradicts the articles it cites.
3. The construction of co-author network and influence assessment
A co-author network is a collection of authors, each of whom is acquainted with some subset of the others. Such a network can be represented as a set of nodes denoting authors, joined by edges denoting acquaintance. [4] In this section, we describe the construction of our co-author network in detail, including data extraction, modeling, and the construction of four different types of indicator. Through statistical analysis of the results, we draw our conclusions based on a comprehensive indicator, the Marshall Entropy Index (MEI).
3.1 Data extraction and establishment of the co-author network
3.1.1 The original network
Initially, the data file from the given website is saved as a .txt document and imported into Matlab, a numerical computing environment suited to data processing. Using cell arrays, the direct co-authors (those with no blank before their names in the given file) can be extracted. There are 511 of them, kept in a fixed order. Subsequently, an original co-author network is set up as a matrix with 511 rows and 511 columns.
3.1.2 The erection procedure
This part briefly describes how values are assigned. Firstly, all elements of the original matrix are set to zero, representing the initial state, which simplifies the subsequent processing. Secondly, for each direct co-author we check whether any of his or her own co-authors (the indirect co-authors, that is, co-authors of a co-author of Erdos) is also a direct co-author. If so, the corresponding elements are set to 1. For example, ABBOTT, HARVEY LESLIE is linked with MEIR, AMRAM, and both are direct co-authors. Suppose ABBOTT, HARVEY LESLIE corresponds to row 1 and column 1, and MEIR, AMRAM to row 232 and column 232. Then the values of the elements A(1,232) and A(232,1) change from 0 to 1 when the program processes ABBOTT, HARVEY LESLIE. After the whole data set has been processed in this way, the original co-author network becomes a symmetric matrix whose elements are all 0 or 1. Likewise, if ABBOTT, HARVEY LESLIE has no link with ACZEL, JANOS D., the elements (1,2) and (2,1) of the co-author matrix remain 0; otherwise they become 1. Finally, counting the non-zero values in each row gives the number of co-authors of each author.
In matrix form:
$A_{ij} \quad (1 \le i \le 511,\ 1 \le j \le 511)$, (1)
The first six authors are listed as follows:
Table 1 The number of co-authors connected directly
Name                      The number of co-authors
ABBOTT, HARVEY LESLIE     7
ACZEL, JANOS D.           2
AGOH, TAKASHI             1
AHARONI, RON              11
AIGNER, MARTIN S.         5
AJTAI, MIKLOS             9
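The filling procedure of section 3.1.2 can be sketched in a few lines. The paper used Matlab; the sketch below uses Python instead, and the function name `build_coauthor_matrix` and the three-author toy input are assumptions made for illustration, not the authors' code.

```python
def build_coauthor_matrix(names, pairs):
    """Return a symmetric 0/1 matrix and the number of co-authors per author.

    names: ordered list of direct co-authors (hypothetical input format).
    pairs: iterable of (name_a, name_b) collaboration links.
    """
    index = {name: k for k, name in enumerate(names)}
    n = len(names)
    A = [[0] * n for _ in range(n)]           # every element starts at 0
    for a, b in pairs:
        # Only links between two listed authors are recorded.
        if a in index and b in index and a != b:
            i, j = index[a], index[b]
            A[i][j] = 1                       # mark the link in both directions,
            A[j][i] = 1                       # so the matrix stays symmetric
    counts = [sum(row) for row in A]          # non-zero entries in each row
    return A, counts


# Toy data: three names from the text and the single link it mentions.
names = ["ABBOTT, HARVEY LESLIE", "ACZEL, JANOS D.", "MEIR, AMRAM"]
pairs = [("ABBOTT, HARVEY LESLIE", "MEIR, AMRAM")]
A, counts = build_coauthor_matrix(names, pairs)
print(counts)
```

With the full 511-author list, `counts` would play the role of the second column of Table 1.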
3.2 Property analysis of the co-author network
The co-author network described above is a symmetric matrix containing only 0s and 1s. We now use several tables and charts to illustrate its details.
3.2.1 Diagrammatic sketch and fairly interesting findings
Fig. 1. Diagrammatic sketch of co-author network
Fig. 1 shows the cooperative relationships among the 511 authors. Though somewhat intricate, the figure gives a simple picture of the connections among them. Notably, a considerable number of nodes are isolated, such as node 7, node 83 and node 375, which means that the corresponding authors have no direct cooperation with the other authors listed in the network. To some extent this affects the subsequent work, namely the calculation of betweenness and quadratic correlation, and thus the final score of each author when computing MEI. This situation is puzzling and deserves more attention. We believe one reason may be that an author who cooperated with Erdos only once is isolated from the other co-authors because he or she works in a different area. Interdisciplinary research may be another reason.
We therefore venture the guess that an isolated author might be a leader of a specific field in mathematics, or might be devoted to interdisciplinary research across various subjects, such as music and economics. For example, ASHBACHER, CHARLES D. taught physics in college. However, after graduating from college he changed direction from physics to the arts and composed a song with Wilson, Lewis, who is listed among the co-authors. As for why he changed profession, various stories attribute it to personal interest or to a love affair, but none of them can be confirmed. It is an interesting question; unfortunately, we were not able to find the answer.
3.2.2 Limitation of the size of our network by connected graph
After this initial presentation, we restrict the network to a connected graph, that is, a graph in which any two nodes are joined by a path. Selecting the largest connected subgraph, which has 466 nodes and numerous edges, we narrow our attention from 511 nodes to these 466 nodes and take the corresponding 466 authors as our object of study.
3.2.3 The Degree and its Distribution
The degree and its distribution, which are significant features of any graph, are illustrated in this part. Before presenting the diagrams, we first define the notion of "degree".
Degree [5]: the degree of a node represents the number of collaborators that the corresponding author has. In graph theory, the degree of a node is the number of edges incident to the node. In matrix form:
$D_i = \sum_{j=1}^{466} A_{ij} \quad (1 \le i \le 466,\ 1 \le j \le 466)$, (2)
Fig. 2. Frequency histogram of degrees
Figure 2 shows that as the degree increases, the number of co-authors having that degree declines gradually. Most co-authors have no more than 10 collaborators, which is less than 3% of all co-authors.
Fig. 3. Frequency histogram of the length of the shortest path
The frequency of shortest-path lengths between pairs of nodes can be calculated with a modified Floyd algorithm, a commonly used method for computing shortest paths in graph theory. The proportion of shortest paths shorter than six is 0.9783, close to 1. The number "six" may seem puzzling, yet the famous "Six Degrees of Separation" [6], an idea first put forward by the Hungarian writer Frigyes Karinthy, holds that everyone is six or fewer steps away from anyone else. In other words, any two people can be connected through at most five intermediaries. Our calculation is thus consistent with this well-known theory.
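The shortest-path statistics behind Fig. 3 can be sketched as follows. This is the plain Floyd-Warshall algorithm, not the authors' modified variant (whose details are not given), and the function names and the four-node toy graph are assumptions for illustration.

```python
from collections import Counter

INF = float("inf")


def floyd_warshall(A):
    """All-pairs shortest path lengths for a 0/1 adjacency matrix A."""
    n = len(A)
    d = [[0 if i == j else (1 if A[i][j] else INF) for j in range(n)]
         for i in range(n)]
    for k in range(n):
        dk = d[k]
        for i in range(n):
            dik = d[i][k]
            if dik == INF:
                continue                      # no path from i to k, nothing to relax
            di = d[i]
            for j in range(n):
                if dik + dk[j] < di[j]:
                    di[j] = dik + dk[j]
    return d


def path_length_frequencies(d):
    """Relative frequency of each finite shortest-path length over pairs i < j."""
    n = len(d)
    lengths = [d[i][j] for i in range(n) for j in range(i + 1, n)
               if d[i][j] != INF]
    total = len(lengths)
    return {length: c / total for length, c in sorted(Counter(lengths).items())}


# Toy graph: a path 0 - 1 - 2 - 3.
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
d = floyd_warshall(A)
freq = path_length_frequencies(d)
print(freq)
# Share of pairs whose shortest path is shorter than six.
print(sum(f for length, f in freq.items() if length < 6))
```

The triple loop costs on the order of n^3 steps, which is still manageable for the 466-node network.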
Fig. 4. Frequency histogram of degrees and its fitting curve
After inspecting the frequency histogram of degrees, we fit it with three different curve types (Poly3, Exp2 and Fourier3), as presented in Fig. 4. Some goodness-of-fit statistics, e.g. SSE and R-square, are computed for each. By comparison, Exp2 fits best: a few nodes possess a large number of edges while many nodes possess relatively few. More importantly, our co-author network can be regarded as a scale-free network [7], in which the degree distribution follows a power law.
Table 2 Some Statistical Magnitudes of Distribution of Degree
Equation name SSE R-square Adjusted R-square RMSE
Poly3 0.01616 0.8219 0.8123 0.01699
Exp2 0.0009141 0.9899 0.9894 0.00404
Fourier3 0.004283 0.9528 0.9464 0.009075
3.3 Analysis based on information entropy theory
The degree of order from a strategic perspective may be defined by information entropy, which can be divided into the timeliness entropy and the quality entropy of information. [8] The degree of order in network structure (R) is an index that considers both the timeliness and the quality of information, and can be expressed as equation 3,
, (3)
where R1 and R2 denote the timeliness and the quality of information, respectively.
3.3.1 The explanations of timeliness and time efficacy entropy in network structure.
Timeliness in network structure is defined as the degree of time used in transmitting information from one node to another, and time efficacy entropy is the degree of uncertainty in timeliness, given by the following formula,
, (4)
Suppose a network structure has n nodes, and let i and j be two arbitrary nodes. The value of timeliness (R1) and the time efficacy entropy (H1) can be calculated in the following steps.
① Calculate Lij. The length between node i and node j (Lij) is defined as the length of the shortest path between the two nodes: it is 1 if the two nodes are connected directly, and increases by 1 for each transit.
② Calculate A1. The total timeliness of the network structure (A1) can be calculated by formula 5.
, (5)
③ Calculate P1(ij). The probability that two nodes are connected (P1(ij)) is defined by eq. 6,
, (6)
④ Calculate H1(ij). The time efficacy entropy between node i and node j (H1(ij)) can be calculated in the following way,
, (7)
⑤ Obtain H1 and H1M. The total time efficacy entropy of the network structure (H1) and the maximum of H1 (H1M) are given by equation 8 and equation 9.
, (8)
, (9)
⑥ Obtain R1. After the above steps, the timeliness of the network structure (R1) is given by equation 10.
. (10)
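The six steps can be sketched in code. The formulas for equations 4 to 10 are not reproduced in the text above, so the sketch ASSUMES a common structure-entropy formulation: P1(ij) = Lij / A1, H1(ij) = -P1(ij) ln P1(ij), H1 = sum of H1(ij), H1M = ln A1, and R1 = 1 - H1 / H1M. These choices, the function names and the toy graphs are assumptions and may differ from the authors' definitions.

```python
import math
from collections import deque


def bfs_lengths(adj, src):
    """Shortest-path lengths (in edges) from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def timeliness(adj):
    """R1 under the ASSUMED formulas named in the lead-in."""
    nodes = sorted(adj)
    lengths = []                               # step 1: Lij for connected pairs i < j
    for i in nodes:
        dist = bfs_lengths(adj, i)
        lengths.extend(dist[j] for j in nodes if j > i and j in dist)
    A1 = sum(lengths)                          # step 2: total timeliness
    if A1 <= 1:
        raise ValueError("need at least two connected pairs")
    probs = [l / A1 for l in lengths]          # step 3: P1(ij) (assumed form)
    H1 = -sum(p * math.log(p) for p in probs)  # steps 4-5: total entropy
    H1M = math.log(A1)                         # step 5: maximum entropy (assumed)
    return 1 - H1 / H1M                        # step 6: R1 (assumed form)


# Toy graphs: a triangle and a three-node path.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
print(timeliness(triangle), timeliness(path))
```

Under these assumptions, the quality index R2 of section 3.3.2 would follow the same pattern with the node degrees Ki in place of Lij.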
3.3.2 The explanations of quality and quality entropy in network structure.
Quality in network structure is defined as the degree of accuracy in transmitting information between one node and another, and quality entropy is the degree of uncertainty in quality. The calculation of quality and quality entropy differs from that of timeliness only in the definitions that replace Lij and A1.
The number of nodes directly connected to node i (Ki) and the total of Ki over the network structure (A2) replace the original Lij and A1. Apart from these two definitions, the remaining quantities, namely the probability of quality of node i (P2(i)), the quality entropy of node i (H2(i)), the total quality entropy of the network structure (H2), the maximum of H2 (H2M) and the quality of the network (R2), are calculated in the same way as for timeliness and time efficacy entropy.
3.3.3 The results and comparison
The results for the three indicators (timeliness, quality and degree of order) are shown in Table 3:
Table 3 The results of information entropy
                     Timeliness  Quality  Degree of order
The co-author net    0.1022      0.2979   0.1745
The simulated net    0.0341      0.4743   0.1271
* The simulated net has the same number of nodes and edges as the given data.
The simulated net has the same size as our co-author network, except that its connections are made at random. Having simulated a hundred such nets, we take the average value of each indicator for comparison.
Table 3 reveals that although the quality of the co-author network is lower than that of the simulated network, its timeliness and degree of order are both higher, which cannot be ignored. That is to say, our co-author network appears to be more relevant and ordered.
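The formula of equation 3 is not reproduced in the text, but the figures in Table 3 allow a quick arithmetic check. The sketch below tests the hypothesis, which is an assumption and not a statement from the paper, that the degree of order is the geometric mean of timeliness and quality.

```python
import math

# (timeliness R1, quality R2, degree of order R) as printed in Table 3.
rows = {
    "co-author net": (0.1022, 0.2979, 0.1745),
    "simulated net": (0.0341, 0.4743, 0.1271),
}


def geometric_mean(r1, r2):
    """Candidate form for R, ASSUMED for this check: sqrt(R1 * R2)."""
    return math.sqrt(r1 * r2)


for name, (r1, r2, r) in rows.items():
    print(name, "candidate:", geometric_mean(r1, r2), "tabulated:", r)
```

For both rows the candidate value agrees with the tabulated degree of order to within about 1e-4, which is compatible with rounding. The simulated row is an average over a hundred nets, so exact agreement is not expected there.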
3.4 Four fundamental indicators and Marshall Entropy Index (MEI)
The construction of our co-author network has been described above. In this section, we combine four fundamental indicators (quadratic correlation, time, betweenness and degree) into a comprehensive indicator, the Marshall Entropy Index (MEI).
3.4.1 Introduction
Degree: The degree of a node represents the number of collaborators that the corresponding author has. In graph theory, the degree of a node is the number of edges incident to the node. In matrix form:
$D_i = \sum_{j=1}^{466} A_{ij} \quad (1 \le i \le 466,\ 1 \le j \le 466)$, (11)
Betweenness: Betweenness is a global quantity which reflects the impact and influence of a node on the connections among the other nodes. It can be defined as equation (12):
$B_i = \sum_{m,n} b_i(m,n) = \sum_{m,n} \frac{g_i(m,n)}{g(m,n)}$, (12)
where $g(m,n)$ is the number of shortest paths between node m and node n, and $g_i(m,n)$ is the number of those shortest paths that pass through node i.
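Equation (12) can be evaluated directly by counting shortest paths. The sketch below is a straightforward, unoptimized implementation that sums over unordered pairs m < n; whether the authors counted ordered or unordered pairs is not stated, so that choice and the function names are assumptions.

```python
from collections import deque


def bfs_counts(adj, src):
    """Return (dist, sigma): shortest-path lengths and the number of
    shortest paths from src to each reachable node."""
    dist = {src: 0}
    sigma = {src: 1}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]          # extend every shortest path ending at u
    return dist, sigma


def betweenness(adj):
    """B_i = sum over pairs m < n of g_i(m, n) / g(m, n)."""
    nodes = list(adj)
    info = {s: bfs_counts(adj, s) for s in nodes}
    B = {i: 0.0 for i in nodes}
    for a, m in enumerate(nodes):
        dist_m, sigma_m = info[m]
        for n in nodes[a + 1:]:
            if n not in dist_m:
                continue                      # m and n are not connected
            for i in nodes:
                if i == m or i == n or i not in dist_m:
                    continue
                dist_i, sigma_i = info[i]
                # i lies on a shortest m-n path iff the two legs add up exactly.
                if n in dist_i and dist_m[i] + dist_i[n] == dist_m[n]:
                    B[i] += sigma_m[i] * sigma_i[n] / sigma_m[n]
    return B


# Toy graphs: a three-node path and a four-node cycle.
path = {0: [1], 1: [0, 2], 2: [1]}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(betweenness(path))
print(betweenness(cycle))
```

In the cycle, each pair of opposite nodes has two shortest paths, so each intermediate node receives one half from that pair. This is why betweenness scores are generally not integers.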
Time: In this article, we assume that the earlier an author cooperated with Erdos, the greater his or her academic influence. This assumption rests on two reasons. First, the earlier the cooperation, the more likely it is that Erdos recognized the author's academic potential. Second, authors who collaborated with Erdos earlier may have made a greater contribution to the academic world. It is defined by equation (13):
$T_i = 2014 - t_i \quad (1 \le i \le 466)$, (13)
where $t_i$ denotes the year in which author i first cooperated with Erdos.
Quadratic correlation: Quadratic correlation is an index measuring the indirect correlation of a node. We define it as in Eq. (14), in which $A_i$ is the set of all nodes directly connected to node i, $A_i = \{A_i(1), A_i(2), A_i(3), \ldots\}$; $n(A_i)$ is the number of elements in $A_i$; and $E_i$ is the set of all nodes that can be reached from node i in two steps but not in one. Furthermore, $E_i$ does not include node i itself, $E_i = A_{A_i(1)} \cup A_{A_i(2)} \cup A_{A_i(3)} \cup \cdots - A_i - \{i\}$; $n(E_i)$ is defined like $n(A_i)$, as the number of elements in $E_i$.
$Q_i = \frac{n(E_i)}{n(A_i)}$, (14)
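The set operations in Eq. (14) translate almost literally into code. The following is a minimal sketch; the function name and the convention of returning 0 for an isolated node (where $n(A_i) = 0$) are assumptions, since the text does not say how that case is handled.

```python
def quadratic_correlation(adj):
    """Q_i = n(E_i) / n(A_i) for every node of an undirected graph.

    adj maps each node to an iterable of its direct neighbours.
    """
    Q = {}
    for i, neighbours in adj.items():
        A_i = set(neighbours)                 # nodes one step away from i
        if not A_i:
            Q[i] = 0.0                        # isolated node: assumed convention
            continue
        E_i = set()
        for a in A_i:
            E_i |= set(adj[a])                # union of the neighbours' neighbourhoods
        E_i -= A_i                            # drop nodes reachable in one step
        E_i.discard(i)                        # drop node i itself
        Q[i] = len(E_i) / len(A_i)
    return Q


# Toy graph: a star with centre 0 and leaves 1, 2, 3.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(quadratic_correlation(star))
```

In the star, the centre has no two-step neighbours, so its score is 0, while each leaf reaches the other two leaves through its single neighbour, giving a score of 2.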
Marshall Entropy Index: In economics, when calculating input-output tables for agriculture, the types of input or output may be quite different and uncorrelated, or only subtly linked, yet a gross value of input or output can be obtained by multiplying the quantities together and then taking the logarithm of the product, standardizing first where necessary. Inspired by this idea, we process the four indicators by standardization, continued multiplication and taking the logarithm. The result is a comprehensive numerical value, the Marshall Entropy Index (MEI), which measures the overall influence of each co-author.
(15)
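The three processing steps named above (standardization, continued multiplication, logarithm) can be sketched as follows. The standardization scheme is not specified in the text, so dividing each indicator by its mean over all authors is an assumption, as are the function name and the toy values. All indicator values must be positive for the logarithm to exist.

```python
import math


def mei_scores(indicators):
    """Log of the product of standardized indicators for each author.

    indicators maps an author name to a list of positive indicator values
    (for example time, degree, quadratic correlation, betweenness).
    Standardization by the column mean is an ASSUMPTION of this sketch.
    """
    names = list(indicators)
    k = len(indicators[names[0]])
    means = [sum(indicators[a][c] for a in names) / len(names) for c in range(k)]
    scores = {}
    for a in names:
        product = 1.0
        for c in range(k):
            product *= indicators[a][c] / means[c]   # standardize, then multiply
        scores[a] = math.log(product)                # take the logarithm
    return scores


# Hypothetical toy values, not taken from the data set.
toy = {"author A": [2.0, 2.0, 2.0, 2.0], "author B": [1.0, 1.0, 1.0, 1.0]}
print(mei_scores(toy))
```

Because the logarithm turns the product into a sum, any rescaling of an indicator by a constant shifts all scores equally and leaves the ranking unchanged.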
3.4.2 Calculating evaluating model
According to the principles elaborated above, the value of each indicator can be obtained for each of the 466 co-authors. After sorting these values in descending order, the top ten for each indicator are listed in Table 4.
Table 4 Ranking in Accordance with Four Fundamental Indicators and MEI
Rank  Time                                       Degree
1     SZEKERES, GEORGE* (80)                     ALON, NOGA M. (52)
2     TURAN, PAL* (80)                           GRAHAM, RONALD LEWIS (44)
3     DAVENPORT, HAROLD* (78)                    HARARY, FRANK* (44)
4     FELDHEIM, ERVIN* (78)                      BOLLOBAS, BELA (43)
5     GALLAI, TIBOR* (GRUNWALD, TIBOR) (78)      RODL, VOJTECH (43)
6     VAZSONYI, ANDREW* (WEISZFELD, ENDRE) (78)  FUREDI, ZOLTAN (40)
7     GILLIS, JOSEPH E.* (77)                    TUZA, ZSOLT (40)
8     JARNIK, VOJTECH* (77)                      SOS, VERA TURAN (38)
9     OBLATH, RICHARD* (77)                      SPENCER, JOEL HAROLD (35)
10    GRUNWALD, GEZA* (76)                       GYARFAS, ANDRAS (32)

Rank  Quadratic correlation                Betweenness                       Marshall entropy index
1     BARAK, AMNON B. (52)                 HARARY, FRANK* (9587.7)           HARARY, FRANK* (21.9)
2     COPELAND, ARTHUR HERBERT, SR.* (44)  SOS, VERA TURAN (8912.2)          SOS, VERA TURAN (21.8)
3     HARZHEIM, EGBERT (44)                RUBEL, LEE ALBERT* (8573.3)       BOLLOBAS, BELA (21.8)
4     MINC, HENRYK (44)                    STRAUS, ERNST GABOR* (8300.7)     GRAHAM, RONALD LEWIS (21.7)
5     SARKAR, AMITES (43)                  POMERANCE, CARL BERNARD (7434.7)  STRAUS, ERNST GABOR* (21.5)
6     ANDRASFAI, BELA (38)                 FUREDI, ZOLTAN (7410.1)           ALON, NOGA M. (21.4)
7     ZAREMBA, STANISLAW KRYSTYN* (38)     ALON, NOGA M. (6871.9)            FUREDI, ZOLTAN (21.3)
8     LEWIN, MORDECHAI (29)                GRAHAM, RONALD LEWIS (6817.2)     HAJNAL, ANDRAS (21.2)
9     PENNEY, DAVID EMROY, II (29)         BOLLOBAS, BELA (6699.5)           PACH, JANOS (20.8)
10    SCHMUTZ, ERIC J. (29)                PACH, JANOS (6073.6)              TUZA, ZSOLT (20.7)
Note: the value in brackets is the author's score on the corresponding indicator.
The table indicates that HARARY, FRANK, SOS, VERA TURAN and BOLLOBAS, BELA are the top three in the final MEI score. HARARY, FRANK is no longer living. All three also appear in the top 10 for both the Degree and Betweenness indicators, which means that Degree and Betweenness carry considerable weight in MEI even after standardization. However, that does not mean that Time and Quadratic correlation are meaningless. On the contrary, they provide a different perspective from which to analyze the evaluation results. For example, the Time indicator suggests that the earlier one cooperated with Erdos, the more influential one tends to be, even though those who rank in the top 10 for Time do not rank in the top 10 for MEI.
In conclusion, if we look back over the whole history of network science to judge who is the most influential, the answer may be HARARY, FRANK, who passed away years ago. If we make the assessment for the present day, the most important researchers among the 466 given authors are SOS, VERA TURAN and BOLLOBAS, BELA.
3.4.3 Testing of evaluating model
To test the accuracy of the results, we remove the top two authors by MEI, HARARY, FRANK* and SOS, VERA TURAN, from the 511 co-authors and find the size of the largest connected component of the remaining network, which is 459. For comparison, we remove two randomly chosen nodes and record the size of the largest connected component, repeating the trial 2000 times, as shown in Figure 5.
[Figure 5: histogram; horizontal axis: the maximum of connected nodes (459 to 466); vertical axis: frequencies (0 to 1200)]
Fig. 5. 2000 trials of the random nodes
Figure 5 shows that the number of random trials in which the largest connected component has fewer than 459 nodes is close to 0. Random removal thus almost never damages the network more than removing the top two authors does. Therefore HARARY, FRANK* and SOS, VERA TURAN are clearly important to the connectivity of the co-author network, which supports the credibility of our evaluation model.
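The random-removal test can be sketched as follows. The function names, the fixed seed and the star-shaped toy graph are assumptions for illustration. On the real data, `adj` would hold the co-author network and the histogram of the returned counts would correspond to Fig. 5.

```python
import random
from collections import Counter


def largest_component_size(adj, removed=()):
    """Size of the largest connected component after deleting `removed`."""
    removed = set(removed)
    seen = set()
    best = 0
    for start in adj:
        if start in removed or start in seen:
            continue
        stack = [start]
        seen.add(start)
        size = 0
        while stack:                          # depth-first search of one component
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in removed and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


def random_removal_trials(adj, k=2, trials=2000, seed=0):
    """Remove k random nodes per trial; count the resulting largest sizes."""
    rng = random.Random(seed)
    nodes = list(adj)
    return Counter(largest_component_size(adj, rng.sample(nodes, k))
                   for _ in range(trials))


# Toy graph: a hub (node 0) linked to five leaves.
star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
print(largest_component_size(star, removed=[0]))     # hub removed
print(largest_component_size(star, removed=[1, 2]))  # two leaves removed
print(random_removal_trials(star, k=2, trials=100))
```

The toy graph mirrors the argument of the text: removing the hub shatters the network, while removing two arbitrary leaves barely changes the largest component.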
4. Construction of quotation network and modified PageRank algorithm
In this section, the concepts of direct correlation, indirect correlation and weak correlation are introduced. Then the sixteen research papers in network science listed in the given file are used to construct our quotation network, which embodies these three concepts. After that, we combine the quotation network with the classical PageRank algorithm and make some corresponding improvements. Having computed the relative influence of each paper via the modified PageRank algorithm, we find that the paper entitled "Collective dynamics of 'small-world' networks", written by Watts and Strogatz, is the most influential of the sixteen papers in network science. Finally, our quotation network model and the modified PageRank algorithm are discussed further.
4.1 Conceptual framework
Establishing a quotation network requires a conceptual framework. Because research papers must be technical and rigorous, citations between them inevitably appear; that is, two papers linked by a citation have direct correlation with each other. Indirect correlation exists when an article is related to a paper through the references of a paper it cites. In this paper, both second-order and third-order citations are covered by this concept. In addition, there are some relationships among all articles published in the network field. For example, the sixteen papers in network science, which we discuss later, may be related simply because they focus on the same field, especially in their ideas, deductions and conclusions. We define this kind of relevance as weak correlation. In terms of influence, direct correlation is clearly stronger than indirect correlation, while weak correlation is the weakest of the three. This completes the conceptual framework.
4.2 Construction of quotation network
Overall, our quotation network is constructed in three stages. The first stage builds a quotation network containing only the direct correlations among the sixteen articles. The second stage builds a quotation network containing both direct and indirect correlation. The final stage establishes the quotation network involving all three concepts. Below we describe the procedure in detail and present a diagrammatic sketch for each stage.
4.2.1 Quotation network of direct correlation
Since sixteen papers are extracted from network science, a 16 × 16 matrix E is established, in which $e_{ij} = 1$ if node i can directly reach node j and $e_{ij} = 0$ otherwise. The matrix can be written as:
$E_{ij} \quad (1 \le i \le 16,\ 1 \le j \le 16)$
Note that this matrix is quite different from $A_{ij}$, which we constructed in section 3, because it is no longer symmetric: citation is one-way. That is, only papers published later can cite papers published earlier. The network is presented in Fig. 6.
Fig. 6 Original quotation network
4.2.2 Slightly complicated quotation network
In this step, indirect quotation is added to the original quotation network. We modify the matrix E as follows:
$\tilde{E} = E + \frac{E^2}{e^2} + \frac{E^3}{e^3}$, (16)
where e is the base of the natural logarithm (approximately 2.718281828).
This modification reflects the fact that the influence of second-order and third-order citations is smaller than that of direct citation.
4.2.3 The final quotation network
Following the idea of 4.2.2, the modification could be continued to ever higher orders. However, the workload would be onerous. As the saying goes, simplicity is the ultimate sophistication. So the higher-order terms are replaced by simply adding 1/16 to each element to represent the weak influence. The result is presented in Fig. 7:
Fig. 7 Quotation network based on B~
Fig. 7 is far more complicated than Fig. 6.
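The three construction stages amount to one matrix expression: the direct matrix E, plus its square and cube damped by powers of e, plus a constant 1/n for the weak correlation (1/16 for the sixteen papers). The sketch below uses plain Python lists; the function names and the three-paper toy matrix are assumptions for illustration.

```python
import math


def matmul(X, Y):
    """Product of two square matrices given as lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def modified_matrix(E):
    """E + E^2/e^2 + E^3/e^3 + 1/n, applied element by element."""
    n = len(E)
    E2 = matmul(E, E)                         # second-order (indirect) links
    E3 = matmul(E2, E)                        # third-order (indirect) links
    return [[E[i][j] + E2[i][j] / math.e ** 2 + E3[i][j] / math.e ** 3 + 1.0 / n
             for j in range(n)]
            for i in range(n)]


# Toy chain of three papers: 0 -> 1 -> 2.
E = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
M = modified_matrix(E)
for row in M:
    print([round(x, 4) for x in row])
```

In the toy chain, the entry from paper 0 to paper 2 receives only the damped second-order term plus the weak term, so it is smaller than the direct entries but larger than entries with no citation path at all.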
4.3 Modified PageRank algorithm and its application
4.3.1 The model corresponding PageRank algorithm
The PageRank algorithm [9], proposed by Sergey Brin and Lawrence Page, is widely applied in search engines for information retrieval. In our modified PageRank algorithm, the papers form the set $S = \{1, 2, \ldots, N\}$. The PageRank value of paper n is denoted by $r_n$, and $O_n$ denotes the out-degree of paper n in the quotation network. The PageRank value of paper n can then be expressed as:
$r_n = \sum_{m \in A_n} \frac{r_m}{O_m}$, (17)
where $A_n$ is the set of papers with a link pointing to paper n. According to this expression, the calculation of the PageRank value is illustrated schematically in Fig. 8.
[Figure 8: four nodes e1, e2, e3 and e4 joined by edges labeled with the flow values 100, 50, 50, 50, 20, 10, 10 and 60]
Fig. 8 Diagrammatic of computing the PageRank value
The numbers in the figure show the capacity of inflowing and out flowing
Then, by constructing the quotation transfer probability matrix C, where $C_{ij} = 1/O_i$ when paper i is cited by paper j and $C_{ij} = 0$ otherwise, we obtain the model corresponding to the PageRank algorithm:

$$\begin{cases} R = G^{T} R \\ e^{T} R = 1 \\ R \ge 0 \end{cases} \tag{18}$$
where the notable feature is that the components of the eigenvector R sum to 1. The values of the components essentially determine the ranking of the papers: the larger a component is, the more weight the corresponding paper carries, that is, the more important the paper is and the higher it ranks.
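To make Eq. (17) concrete, the following minimal sketch iterates the expression on a small hypothetical citation graph until the rank vector settles. The function name `pagerank_basic`, the adjacency-list input and the toy graph are assumptions made for illustration. The toy graph is deliberately cyclic, because in a graph without cycles this plain iteration would lose all of its rank mass.

```python
def pagerank_basic(cites, iterations=100):
    """Iterate r_n = sum over citing papers m of r_m / O_m.

    cites[m] lists the papers cited by paper m, so O_m = len(cites[m]).
    """
    n = len(cites)
    r = [1.0 / n] * n
    for _ in range(iterations):
        new_r = [0.0] * n
        for m, cited in enumerate(cites):
            for target in cited:
                # paper m passes an equal share of its rank to each paper it cites
                new_r[target] += r[m] / len(cited)
        total = sum(new_r)
        if total == 0.0:
            raise ValueError("all rank mass was lost; no paper passes rank on")
        r = [x / total for x in new_r]  # keep the components summing to 1
    return r


# Hypothetical toy graph: paper 0 cites 1 and 2, paper 1 cites 2, paper 2 cites 0.
ranks = pagerank_basic([[1, 2], [2], [0]])
print([round(x, 4) for x in ranks])
```

For this toy graph the fixed point can be checked by hand: r_0 = r_2, r_1 = r_0/2 and r_2 = r_0/2 + r_1, which together with the sum constraint gives (0.4, 0.2, 0.4).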
4.3.2 Modified PageRank algorithm
In this part, the flow chart in Fig. 9 demonstrates our modified PageRank algorithm.
1. Commence: input the adjacency matrix E, the size n and the convergence threshold sigma.
2. E = E + E^2/e^2 + E^3/e^3 + 1/n.
3. Calculate the transition probability matrix C.
4. Generate the vector R randomly as the initial value of M-PR.
5. X = PR.
6. If max(X-R) > sigma is TRUE, set R = X and return to step 5.
7. If it is FALSE, set R = X/sum(R), output R and finish.
Fig. 9. The flow chart of modified PageRank algorithm
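The steps of the flow chart can be sketched in Python as follows. This is an illustrative reading of Fig. 9 under stated assumptions rather than the original program. In particular, `E[i][j] = 1` is taken to mean that paper i cites paper j, and the matrix P of the chart is assumed to be the augmented matrix with each citing paper's row normalised to sum to 1, so that rank flows from citing to cited papers. The function names, the default threshold and the fixed random seed are hypothetical choices.

```python
import math
import random


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def modified_pagerank(E, sigma=1e-12, max_iter=10000, seed=0):
    n = len(E)
    E2 = mat_mul(E, E)
    E3 = mat_mul(E2, E)
    # E = E + E^2/e^2 + E^3/e^3 + 1/n (indirect and weak influence)
    aug = [[E[i][j] + E2[i][j] / math.e ** 2 + E3[i][j] / math.e ** 3 + 1.0 / n
            for j in range(n)] for i in range(n)]
    # Assumed transition matrix: P[j][i] is the share of paper i's rank passed to paper j.
    row_sums = [sum(row) for row in aug]
    P = [[aug[i][j] / row_sums[i] for i in range(n)] for j in range(n)]
    rng = random.Random(seed)
    R = [rng.random() for _ in range(n)]  # random initial vector
    for _ in range(max_iter):
        X = [sum(P[j][i] * R[i] for i in range(n)) for j in range(n)]
        converged = max(abs(x - r) for x, r in zip(X, R)) <= sigma
        R = X
        if converged:
            break
    total = sum(R)
    return [x / total for x in R]  # final normalisation so the values sum to 1


# Hypothetical example: papers 1, 2 and 3 cite paper 0; paper 3 also cites paper 2.
E_example = [[0, 0, 0, 0],
             [1, 0, 0, 0],
             [1, 0, 0, 0],
             [1, 0, 1, 0]]
ranks_mpr = modified_pagerank(E_example)
print([round(x, 4) for x in ranks_mpr])
```

Because 1/n is added to every element, every row of the augmented matrix is strictly positive, so a paper that cites nothing still spreads its rank evenly and the iteration does not drain. In the example, paper 0 ranks first and paper 2, which is cited once, ranks above the uncited papers 1 and 3.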
4.4 Results and further discussion
Having calculated the M-PageRank values with the modified PageRank algorithm, we obtain the results in Table 5:
Table 5 The rank of the papers by M-PageRank value
Rank  M-PageRank value  Paper ID  Paper title
1     0.2131            14        Collective dynamics of `small-world' networks
2     0.1682            8         Navigation in a small world
3     0.0839            3         Power and Centrality: A family of measures
4     0.0799            4         Emergence of scaling in random networks
5     0.0568            6         Models of core/periphery structures
6     0.0423            13        Identity and search in social networks
7     0.0420            1         On Random Graphs
7     0.0420            10        The structure of scientific collaboration networks
9     0.0383            11        The structure and function of complex networks
10    0.0361            2         Statistical mechanics of complex networks
10    0.0361            9         Scientific collaboration networks
12    0.0321            5         Identifying sets of key players in a network
12    0.0321            7         On properties of a well-known graph
12    0.0321            12        Networks, influence, and public opinion formation
12    0.0321            15        Statistical models for social networks
12    0.0321            16        Social network thresholds in the diffusion of innovations
As the table shows, Collective dynamics of `small-world' networks is in first place, followed by Navigation in a small world and Power and Centrality: A family of measures. That is to say, by this measure, Collective dynamics of `small-world' networks, written by Watts, D. and Strogatz, S., is the most influential paper in network science among those considered.
As to whether a similar approach can determine the role or influence of an individual network researcher, the answer is yes. The fruits of collaboration between researchers are the theses they write together, although the direction of influence within a collaboration is not explicit. A realistic assumption is that one researcher is the 'leader' and the other is the 'helper'. When they finish a thesis in cooperation, the leader gains relatively more influence and the helper gains less. The quotation network matrix is adapted by altering the relative values of its entries accordingly. Thus, the influence of an individual network researcher can be obtained.
Our quotation network also comes in handy for measuring the role, influence, or impact of a specific university, department, or journal in network science. A university or a department is a part of network science and can be represented as a node, in the sense of graph theory. A university, for instance, may collaborate with others, though not intensively, which we define as a kind of weak correlation. The quotation network matrix is adapted by adding a small number to each of its elements.
From the discussion above, we conclude that a more detailed methodology is needed to measure the role, influence, or impact of a specific university, department, or journal in network science. Among all the alterations, the weak correlation, whose magnitude is mainly determined by the number of nodes, should be emphasized. The data to be collected therefore include the cooperation programs between authors and their respective workloads, the number of publications and journals in network science, and the number of universities or departments researching network science.
5. The dolphin network and the analysis
In this section, the dolphin network [10], which contains 62 nodes, is constructed. Then the key indicators, namely degree, quadratic correlation and betweenness, as well as timeliness entropy, quality entropy and the degree of order, are calculated; they indicate that the dolphin named SN100 ranks first. Finally, further discussion is given.
5.1 The construction of the dolphin network and its diagrammatic sketch
Following the construction steps introduced in Section 2, a 62*62 matrix F has been built. The diagrammatic sketch is shown in Fig. 10:
Fig. 10 Diagrammatic sketch of the dolphin network
Further study shows that the property of six degrees of separation is still notable, while the power-law degree distribution is no longer apparent.
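One way to examine the six-degrees property on an undirected network is to compute the average shortest-path length and the diameter by breadth-first search. The sketch below is a minimal illustration under assumptions: the adjacency-list format, the function names and the four-node toy graph are hypothetical, and the actual 62-node matrix F is not reproduced here.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_statistics(adj):
    """Average shortest-path length and diameter over all connected ordered pairs."""
    total, pairs, diameter = 0, 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    return total / pairs, diameter


# Hypothetical toy network: a path 0 - 1 - 2 - 3.
toy_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
avg_len, diam = path_statistics(toy_adj)
print(round(avg_len, 4), diam)
```

A small average path length relative to the number of nodes is what the six-degrees property refers to. The degree distribution can be read from the same structure as `len(adj[v])` for each node v.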
5.2 The value of indicators and further discussion
According to the calculation principles described in Section 2, the key indicators are computed and shown in Table 6 and Table 7:
Table 6 The evaluation of dolphins about their influence
Rank  Degree           Quadratic correlation  Betweenness          Marshall entropy index
1     Grin (22)        Cross (10)             SN100 (454.27)       SN100 (9.48)
2     SN4 (18)         Five (10)              Beescratch (390.38)  SN9 (8.90)
3     Topless (11)     Fork (10)              SN9 (261.96)         Beescratch (8.86)
4     Scabs (17)       MN23 (9)               SN4 (253.58)         SN4 (8.71)
5     Trigger (10)     Quasi (9)              DN63 (216.38)        Kringel (8.45)
6     Jet (8)          SMN (9)                Jet (209.17)         DN63 (8.32)
7     Kringel (13)     TR82 (9)               Kringel (187.84)     Jet (7.91)
8     Patchback (19)   Whiteti (8)            Upbang (181.39)      Stripes (7.88)
9     Web (22)         SN89 (7.5)             Trigger (154.96)     Oscar (7.85)
10    Beescratch (11)  Vau (6.5)              Web (154.09)         Upbang (7.84)
* The value in brackets is the score the dolphin obtained in the corresponding index.
As Table 6 demonstrates, SN100 comes first, indicating that it is the most influential dolphin of all. The top four dolphins by betweenness also occupy the first four places of the Marshall entropy index, so the importance of the betweenness indicator shows up again, which is consistent with the results presented previously.
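For readers who wish to reproduce indicators of this kind, the sketch below computes unnormalised shortest-path betweenness with Brandes' algorithm on an undirected, unweighted graph. It is an assumption that this is the same definition and normalisation as the betweenness reported in Table 6; the function name and the toy graph are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Unnormalised shortest-path betweenness (Brandes' algorithm)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}   # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}    # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # each undirected pair was counted once from each endpoint
    return {v: b / 2 for v, b in bc.items()}


# Hypothetical toy network: a path 0 - 1 - 2 - 3.
path_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
bc_toy = betweenness(path_adj)
degree_toy = {v: len(nbrs) for v, nbrs in path_adj.items()}
print(bc_toy, degree_toy)
```

On the path graph, the two inner nodes each lie on two shortest paths between other pairs, so their betweenness is 2 and that of the end nodes is 0.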
Table 7 The result of information entropy of the dolphin network
                     Timeliness  Quality  Degree of order
The dolphin network  0.1379      0.3143   0.2081
The simulated net    0.1109      0.3002   0.1824
Table 7 reports the information entropy results for the dolphin network and our simulated network. All three indicators are higher for the dolphin network than for the simulated network, suggesting that the dolphin network, being more relevant, certain and well-organized, differs slightly from the random net.
Further, by comparison, the dolphin network is different from the co-author network to some extent. We venture that the 'degree of intelligence' may be a contributing factor.
6. The assessment and popularization of our network
Two types of network have been constructed in this article: undirected networks, namely the co-author network and the dolphin network, and a directed network, namely the quotation network. So it is reasonable to discuss the networks separately.
6.1 The assessment and popularization of co-author and dolphin network
Strengths: 1. The author or the dolphin is abstracted as a node representing its role in the network, an idea that can be extended to many settings, such as student unions and business dealings.
2. The edge represents the relationship between two authors (whether they have cooperated or not) or between two dolphins (whether they have a somewhat mysterious relationship).
3. The sensitivity of our methodology is relatively low, which means that the stability of our network is basically ensured.
Weaknesses: 1. The network cannot handle problems on directed graphs.
2. If there are too many nodes and edges, the computational workload would be immense.
3. The identity of each node may not be taken into account.
The power of network: Our network may be widely used in studying social relationships, advanced management and individual choice, where the correlation is undirected or two-way.
6.2 The assessment and popularization of quotation network
Strengths: 1. The network reduces the computational load on the computer, so it decreases the time spent waiting for results.
2. The authority of each paper has been taken into account.
3. The sensitivity of our methodology is relatively low, which means that the stability of our algorithm is basically ensured.
Weakness: A more detailed study cannot be implemented. For example, the influence of weak correlation is difficult to measure.
The power of our network: Our network may be widely used in studying enterprise organization, self-promotion and capital accumulation, because these areas involve change or flow, whether from one person to another, from one position to another, or from the past to the present for the same person.
6.3 Some suggestions
Further, according to the MEI model, we propose some suggestions for college students eager to enhance their influence:
1. do one's best to join a research team and seek opportunities for cooperation;
2. join a research team with as many international exchanges as possible;
3. choose a newly emerging subject or an interdisciplinary field in which to commence research.
References
[1] http://www.sitis-conf.org/en/workshop-on-complex-networks-and-their-applications-complex-networks-2012.php?Preview=ok.
[2] M. E. J. Newman, The structure and function of complex networks, SIAM Review, 45, 167–256 (2003).
[3] Redner S., How popular is your paper? An empirical study of the citation distribution, Eur Phys J B, 4, 131-134 (1998).
[4] Newman, M., The structure of scientific collaboration networks, Proc. Natl. Acad. Sci. USA, 98: 404-409, January 2001.
[5] Xu Ling, Research on co-author network based on SCIENCE [D], Shanghai Jiao Tong University, 2009.
[6] Zhao Yan, Research on routing protocols of opportunistic networks based six degrees of separation [D], Qiqihar University, 2012.
[7] Xiang Linying, Chen xiangqiang, Review on modeling, analysis and control of complex dynamic network, China academic journal electronic publishing house, 16:1543-1551, Nov. 2006.
[8] Delu Wang, Ziwei Li, Web-based organizational structure information entropy theory analysis, Modern Management Science, (1):65-66, 2007.
[9] Yue Xie, Research on PageRank algorithm and HITS algorithms in webpage sort [D], University of Electric Science and Technology of China, 2012.
[10] The data of dolphins comes from http://www.datatang.com/data/769/.

More Related Content

What's hot

An Improved PageRank Algorithm for Multilayer Networks
An Improved PageRank Algorithm for Multilayer NetworksAn Improved PageRank Algorithm for Multilayer Networks
An Improved PageRank Algorithm for Multilayer NetworksSubhajit Sahu
 
IRJET- Missing Value Evaluation in SQL Queries: A Survey
IRJET- 	  Missing Value Evaluation in SQL Queries: A SurveyIRJET- 	  Missing Value Evaluation in SQL Queries: A Survey
IRJET- Missing Value Evaluation in SQL Queries: A SurveyIRJET Journal
 
Conceptual similarity measurement algorithm for domain specific ontology[
Conceptual similarity measurement algorithm for domain specific ontology[Conceptual similarity measurement algorithm for domain specific ontology[
Conceptual similarity measurement algorithm for domain specific ontology[Zac Darcy
 
DOMINANT FEATURES IDENTIFICATION FOR COVERT NODES IN 9/11 ATTACK USING THEIR ...
DOMINANT FEATURES IDENTIFICATION FOR COVERT NODES IN 9/11 ATTACK USING THEIR ...DOMINANT FEATURES IDENTIFICATION FOR COVERT NODES IN 9/11 ATTACK USING THEIR ...
DOMINANT FEATURES IDENTIFICATION FOR COVERT NODES IN 9/11 ATTACK USING THEIR ...IJNSA Journal
 
Software System Package Dependencies and Visualization of Internal Structure
Software System Package Dependencies and Visualization of Internal StructureSoftware System Package Dependencies and Visualization of Internal Structure
Software System Package Dependencies and Visualization of Internal StructureIJAAS Team
 
08 Exponential Random Graph Models (ERGM)
08 Exponential Random Graph Models (ERGM)08 Exponential Random Graph Models (ERGM)
08 Exponential Random Graph Models (ERGM)dnac
 
IRJET- Implementation of Automatic Question Paper Generator System
IRJET- Implementation of Automatic Question Paper Generator SystemIRJET- Implementation of Automatic Question Paper Generator System
IRJET- Implementation of Automatic Question Paper Generator SystemIRJET Journal
 
IRJET- Semantics based Document Clustering
IRJET- Semantics based Document ClusteringIRJET- Semantics based Document Clustering
IRJET- Semantics based Document ClusteringIRJET Journal
 
DESIGN METHODOLOGY FOR RELATIONAL DATABASES: ISSUES RELATED TO TERNARY RELATI...
DESIGN METHODOLOGY FOR RELATIONAL DATABASES: ISSUES RELATED TO TERNARY RELATI...DESIGN METHODOLOGY FOR RELATIONAL DATABASES: ISSUES RELATED TO TERNARY RELATI...
DESIGN METHODOLOGY FOR RELATIONAL DATABASES: ISSUES RELATED TO TERNARY RELATI...ijdms
 
A Competent and Empirical Model of Distributed Clustering
A Competent and Empirical Model of Distributed ClusteringA Competent and Empirical Model of Distributed Clustering
A Competent and Empirical Model of Distributed ClusteringIRJET Journal
 
Taxonomy and survey of community
Taxonomy and survey of communityTaxonomy and survey of community
Taxonomy and survey of communityIJCSES Journal
 
Data Mining In Social Networks Using K-Means Clustering Algorithm
Data Mining In Social Networks Using K-Means Clustering AlgorithmData Mining In Social Networks Using K-Means Clustering Algorithm
Data Mining In Social Networks Using K-Means Clustering Algorithmnishant24894
 
Current trends of opinion mining and sentiment analysis in social networks
Current trends of opinion mining and sentiment analysis in social networksCurrent trends of opinion mining and sentiment analysis in social networks
Current trends of opinion mining and sentiment analysis in social networkseSAT Publishing House
 
Structured system analysis and design
Structured system analysis and design Structured system analysis and design
Structured system analysis and design Jayant Dalvi
 
Improved Text Mining for Bulk Data Using Deep Learning Approach
Improved Text Mining for Bulk Data Using Deep Learning Approach Improved Text Mining for Bulk Data Using Deep Learning Approach
Improved Text Mining for Bulk Data Using Deep Learning Approach IJCSIS Research Publications
 
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...IOSR Journals
 

What's hot (20)

An Improved PageRank Algorithm for Multilayer Networks
An Improved PageRank Algorithm for Multilayer NetworksAn Improved PageRank Algorithm for Multilayer Networks
An Improved PageRank Algorithm for Multilayer Networks
 
IRJET- Missing Value Evaluation in SQL Queries: A Survey
IRJET- 	  Missing Value Evaluation in SQL Queries: A SurveyIRJET- 	  Missing Value Evaluation in SQL Queries: A Survey
IRJET- Missing Value Evaluation in SQL Queries: A Survey
 
Conceptual similarity measurement algorithm for domain specific ontology[
Conceptual similarity measurement algorithm for domain specific ontology[Conceptual similarity measurement algorithm for domain specific ontology[
Conceptual similarity measurement algorithm for domain specific ontology[
 
DOMINANT FEATURES IDENTIFICATION FOR COVERT NODES IN 9/11 ATTACK USING THEIR ...
DOMINANT FEATURES IDENTIFICATION FOR COVERT NODES IN 9/11 ATTACK USING THEIR ...DOMINANT FEATURES IDENTIFICATION FOR COVERT NODES IN 9/11 ATTACK USING THEIR ...
DOMINANT FEATURES IDENTIFICATION FOR COVERT NODES IN 9/11 ATTACK USING THEIR ...
 
Software System Package Dependencies and Visualization of Internal Structure
Software System Package Dependencies and Visualization of Internal StructureSoftware System Package Dependencies and Visualization of Internal Structure
Software System Package Dependencies and Visualization of Internal Structure
 
08 Exponential Random Graph Models (ERGM)
08 Exponential Random Graph Models (ERGM)08 Exponential Random Graph Models (ERGM)
08 Exponential Random Graph Models (ERGM)
 
A4 elanjceziyan
A4 elanjceziyanA4 elanjceziyan
A4 elanjceziyan
 
IRJET- Implementation of Automatic Question Paper Generator System
IRJET- Implementation of Automatic Question Paper Generator SystemIRJET- Implementation of Automatic Question Paper Generator System
IRJET- Implementation of Automatic Question Paper Generator System
 
17 Statistical Models for Networks
17 Statistical Models for Networks17 Statistical Models for Networks
17 Statistical Models for Networks
 
IRJET- Semantics based Document Clustering
IRJET- Semantics based Document ClusteringIRJET- Semantics based Document Clustering
IRJET- Semantics based Document Clustering
 
DESIGN METHODOLOGY FOR RELATIONAL DATABASES: ISSUES RELATED TO TERNARY RELATI...
DESIGN METHODOLOGY FOR RELATIONAL DATABASES: ISSUES RELATED TO TERNARY RELATI...DESIGN METHODOLOGY FOR RELATIONAL DATABASES: ISSUES RELATED TO TERNARY RELATI...
DESIGN METHODOLOGY FOR RELATIONAL DATABASES: ISSUES RELATED TO TERNARY RELATI...
 
A Competent and Empirical Model of Distributed Clustering
A Competent and Empirical Model of Distributed ClusteringA Competent and Empirical Model of Distributed Clustering
A Competent and Empirical Model of Distributed Clustering
 
Taxonomy and survey of community
Taxonomy and survey of communityTaxonomy and survey of community
Taxonomy and survey of community
 
Data Mining In Social Networks Using K-Means Clustering Algorithm
Data Mining In Social Networks Using K-Means Clustering AlgorithmData Mining In Social Networks Using K-Means Clustering Algorithm
Data Mining In Social Networks Using K-Means Clustering Algorithm
 
Current trends of opinion mining and sentiment analysis in social networks
Current trends of opinion mining and sentiment analysis in social networksCurrent trends of opinion mining and sentiment analysis in social networks
Current trends of opinion mining and sentiment analysis in social networks
 
One Graduate Paper
One Graduate PaperOne Graduate Paper
One Graduate Paper
 
Structured system analysis and design
Structured system analysis and design Structured system analysis and design
Structured system analysis and design
 
Improved Text Mining for Bulk Data Using Deep Learning Approach
Improved Text Mining for Bulk Data Using Deep Learning Approach Improved Text Mining for Bulk Data Using Deep Learning Approach
Improved Text Mining for Bulk Data Using Deep Learning Approach
 
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...
 
E017433538
E017433538E017433538
E017433538
 

Viewers also liked

2013 CHINA
2013 CHINA2013 CHINA
2013 CHINALI HE
 
MMAE554 Term Paper
MMAE554 Term PaperMMAE554 Term Paper
MMAE554 Term PaperLI HE
 
2014 Central China Area
2014 Central China Area2014 Central China Area
2014 Central China AreaLI HE
 
MMAE557 Consulting Project-Li He(A20358122),Xingye Dai(A20365915)
MMAE557 Consulting Project-Li He(A20358122),Xingye Dai(A20365915)MMAE557 Consulting Project-Li He(A20358122),Xingye Dai(A20365915)
MMAE557 Consulting Project-Li He(A20358122),Xingye Dai(A20365915)LI HE
 
MMAE545-Final Report-Analysis of Aircraft Wing
MMAE545-Final Report-Analysis of Aircraft WingMMAE545-Final Report-Analysis of Aircraft Wing
MMAE545-Final Report-Analysis of Aircraft WingLI HE
 
Civilizacao Solar
Civilizacao Solar Civilizacao Solar
Civilizacao Solar guestd2646d
 
Facebook dmk proprty
Facebook  dmk  proprtyFacebook  dmk  proprty
Facebook dmk proprtygoogle
 
Oficinas de Garrigues en Colombia en español (abril 2014)
Oficinas de Garrigues en Colombia en español (abril 2014)Oficinas de Garrigues en Colombia en español (abril 2014)
Oficinas de Garrigues en Colombia en español (abril 2014)Garrigues abogados
 
5 praktische tips voor een veilige opslag
5 praktische tips voor een veilige opslag5 praktische tips voor een veilige opslag
5 praktische tips voor een veilige opslagManutan
 
Preguntas frecuentes
Preguntas frecuentesPreguntas frecuentes
Preguntas frecuentesWhops
 
Airiti books user_guide
Airiti books user_guideAiriti books user_guide
Airiti books user_guideairitiBooks
 
Seminario ortografia 2011
Seminario ortografia 2011Seminario ortografia 2011
Seminario ortografia 2011Josmiliteratura
 
Judicious use of custom development in an open source component architecture
Judicious use of custom development in an open source component architectureJudicious use of custom development in an open source component architecture
Judicious use of custom development in an open source component architectureSky Bristol
 
пакеты сети 2011 (source may '11)
пакеты сети 2011 (source may '11)пакеты сети 2011 (source may '11)
пакеты сети 2011 (source may '11)breus
 
2014 telpas reading_test_manual
2014 telpas reading_test_manual2014 telpas reading_test_manual
2014 telpas reading_test_manualLuis Acosta
 
Unidad_2_Internet
Unidad_2_InternetUnidad_2_Internet
Unidad_2_InternetITIC
 
Weekly update 30 june 2011
Weekly update 30 june 2011Weekly update 30 june 2011
Weekly update 30 june 2011Tammy Flores
 

Viewers also liked (20)

2013 CHINA
2013 CHINA2013 CHINA
2013 CHINA
 
MMAE554 Term Paper
MMAE554 Term PaperMMAE554 Term Paper
MMAE554 Term Paper
 
2014 Central China Area
2014 Central China Area2014 Central China Area
2014 Central China Area
 
MMAE557 Consulting Project-Li He(A20358122),Xingye Dai(A20365915)
MMAE557 Consulting Project-Li He(A20358122),Xingye Dai(A20365915)MMAE557 Consulting Project-Li He(A20358122),Xingye Dai(A20365915)
MMAE557 Consulting Project-Li He(A20358122),Xingye Dai(A20365915)
 
MMAE545-Final Report-Analysis of Aircraft Wing
MMAE545-Final Report-Analysis of Aircraft WingMMAE545-Final Report-Analysis of Aircraft Wing
MMAE545-Final Report-Analysis of Aircraft Wing
 
Civilizacao Solar
Civilizacao Solar Civilizacao Solar
Civilizacao Solar
 
Facebook dmk proprty
Facebook  dmk  proprtyFacebook  dmk  proprty
Facebook dmk proprty
 
Oficinas de Garrigues en Colombia en español (abril 2014)
Oficinas de Garrigues en Colombia en español (abril 2014)Oficinas de Garrigues en Colombia en español (abril 2014)
Oficinas de Garrigues en Colombia en español (abril 2014)
 
Katalog uz saytov
Katalog uz saytovKatalog uz saytov
Katalog uz saytov
 
5 praktische tips voor een veilige opslag
5 praktische tips voor een veilige opslag5 praktische tips voor een veilige opslag
5 praktische tips voor een veilige opslag
 
Made in italy
Made in italyMade in italy
Made in italy
 
Preguntas frecuentes
Preguntas frecuentesPreguntas frecuentes
Preguntas frecuentes
 
Airiti books user_guide
Airiti books user_guideAiriti books user_guide
Airiti books user_guide
 
Seminario ortografia 2011
Seminario ortografia 2011Seminario ortografia 2011
Seminario ortografia 2011
 
Judicious use of custom development in an open source component architecture
Judicious use of custom development in an open source component architectureJudicious use of custom development in an open source component architecture
Judicious use of custom development in an open source component architecture
 
пакеты сети 2011 (source may '11)
пакеты сети 2011 (source may '11)пакеты сети 2011 (source may '11)
пакеты сети 2011 (source may '11)
 
Binder1
Binder1Binder1
Binder1
 
2014 telpas reading_test_manual
2014 telpas reading_test_manual2014 telpas reading_test_manual
2014 telpas reading_test_manual
 
Unidad_2_Internet
Unidad_2_InternetUnidad_2_Internet
Unidad_2_Internet
 
Weekly update 30 june 2011
Weekly update 30 june 2011Weekly update 30 june 2011
Weekly update 30 june 2011
 

Similar to 2014 USA

Recommendation system using unsupervised machine learning algorithm & assoc
Recommendation system using unsupervised machine learning algorithm & assocRecommendation system using unsupervised machine learning algorithm & assoc
Recommendation system using unsupervised machine learning algorithm & associjerd
 
Caravan insurance data mining prediction models
Caravan insurance data mining prediction modelsCaravan insurance data mining prediction models
Caravan insurance data mining prediction modelsMuthu Kumaar Thangavelu
 
Caravan insurance data mining prediction models
Caravan insurance data mining prediction modelsCaravan insurance data mining prediction models
Caravan insurance data mining prediction modelsMuthu Kumaar Thangavelu
 
Record matching over multiple query result - Document
Record matching over multiple query result - DocumentRecord matching over multiple query result - Document
Record matching over multiple query result - DocumentNishna Ma
 
IRJET- Online Course Recommendation System
IRJET- Online Course Recommendation SystemIRJET- Online Course Recommendation System
IRJET- Online Course Recommendation SystemIRJET Journal
 
Corporate bankruptcy prediction using Deep learning techniques
Corporate bankruptcy prediction using Deep learning techniquesCorporate bankruptcy prediction using Deep learning techniques
Corporate bankruptcy prediction using Deep learning techniquesShantanu Deshpande
 
IRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
IRJET- Sentimental Analysis for Online Reviews using Machine Learning AlgorithmsIRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
IRJET- Sentimental Analysis for Online Reviews using Machine Learning AlgorithmsIRJET Journal
 
AN IMPROVED METHOD FOR IDENTIFYING WELL-TEST INTERPRETATION MODEL BASED ON AG...
AN IMPROVED METHOD FOR IDENTIFYING WELL-TEST INTERPRETATION MODEL BASED ON AG...AN IMPROVED METHOD FOR IDENTIFYING WELL-TEST INTERPRETATION MODEL BASED ON AG...
AN IMPROVED METHOD FOR IDENTIFYING WELL-TEST INTERPRETATION MODEL BASED ON AG...IAEME Publication
 
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...cscpconf
 
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...IRJET Journal
 
Query-drift prevention for robust query expansion
Query-drift prevention for robust query expansionQuery-drift prevention for robust query expansion
Query-drift prevention for robust query expansionLiron Zighelnic
 
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...ijcsa
 
On the Choice of Models of Computation for Writing Executable Specificatoins ...
On the Choice of Models of Computation for Writing Executable Specificatoins ...On the Choice of Models of Computation for Writing Executable Specificatoins ...
On the Choice of Models of Computation for Writing Executable Specificatoins ...ijeukens
 
IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...
IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...
IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...IRJET Journal
 
0912f50eedb48e44d7000000
0912f50eedb48e44d70000000912f50eedb48e44d7000000
0912f50eedb48e44d7000000Rakesh Sharma
 

Similar to 2014 USA (20)

美赛论文
美赛论文美赛论文
美赛论文
 
Recommendation system using unsupervised machine learning algorithm & assoc
Recommendation system using unsupervised machine learning algorithm & assocRecommendation system using unsupervised machine learning algorithm & assoc
Recommendation system using unsupervised machine learning algorithm & assoc
 
Caravan insurance data mining prediction models
Caravan insurance data mining prediction modelsCaravan insurance data mining prediction models
Caravan insurance data mining prediction models
 
Caravan insurance data mining prediction models
Caravan insurance data mining prediction modelsCaravan insurance data mining prediction models
Caravan insurance data mining prediction models
 
Record matching over multiple query result - Document
Record matching over multiple query result - DocumentRecord matching over multiple query result - Document
Record matching over multiple query result - Document
 
Developing Movie Recommendation System
Developing Movie Recommendation SystemDeveloping Movie Recommendation System
Developing Movie Recommendation System
 
IRJET- Online Course Recommendation System
IRJET- Online Course Recommendation SystemIRJET- Online Course Recommendation System
IRJET- Online Course Recommendation System
 
Corporate bankruptcy prediction using Deep learning techniques
Corporate bankruptcy prediction using Deep learning techniquesCorporate bankruptcy prediction using Deep learning techniques

2014 USA

  • 1. Team Control Number 24147. Problem Chosen: C. 2014 Mathematical Contest in Modeling (MCM/ICM) Summary Sheet.

Summary

In this paper, we establish two types of network, directed and undirected, to deal with the following five tasks.

For task one, a study is conducted on the largest connected graph. From the distributions of degree and distance, the 'small-world' and 'scale-free' properties can be observed. Then, based on information entropy theory, three indicators are constructed. Compared with the mean of each indicator over 100 networks of the same scale, our co-author network shows the properties of effectiveness, organization and cooperation.

For task two, four relevant indicators are built: cooperation time, representing research longevity; degree, showing direct cooperation; quadratic correlation, reflecting indirect cooperation; and betweenness, indicating the link of collaboration. Imitating the calculation of gross input in economics, we set up a calculable MEI assessment model, which reveals that the top two are HARARY, FRANK* and SOS, VERA TURAN. We then test the results by the Monte Carlo method.

For task three, a directed quotation network with several types of correlation is built. A "weak correlation" between papers in the same field is defined. The PageRank values computed through a modified PageRank algorithm show that the paper entitled Collective dynamics of "small-world" networks ranks first. With some improvement, the relationships among universities or departments can be depicted in the same way.

For task four, the dolphin network shows the 'small-world' feature rather than the 'scale-free' one. Its degree of order is higher than that of random networks. The most influential dolphin is called SN100. Comparing the two networks suggests that the degree of intelligence has some relevance to the scale-free property and to organization.

For task five, the pros and cons as well as the sensitivity of the presented model are summarized. Some suggestions are proposed for college students eager to enhance their influence.

Keywords: Co-author network, MEI, Modified PageRank, Weak correlation
  • 2. Contents
1. Introduction
2. Assumptions
3. The construction of co-author network and influence assessment
    3.1 Data extraction and establishment of the co-author network
        3.1.1 The original network
        3.1.2 The erection procedure
    3.2 Property analysis of the co-author network
        3.2.1 Diagrammatic sketch and fairly interesting findings
        3.2.2 Limitation of the size of our network by connected graph
        3.2.3 The Degree and its Distribution
    3.3 Analysis based on information entropy theory
        3.3.1 The explanations of timeliness and time efficacy entropy in network structure
        3.3.2 The explanations of quality and quality entropy in network structure
        3.3.3 The results and comparison
    3.4 Four fundamental indicators and Marshall Entropy Index (MEI)
        3.4.1 Introduction
        3.4.2 Calculating evaluating model
        3.4.3 Testing of evaluating model
4. Construction of quotation network and modified PageRank algorithm
    4.1 Conceptual framework
    4.2 Construction of quotation network
        4.2.1 Quotation network of direct correlation
        4.2.2 Slightly complicated quotation network
        4.2.3 The final quotation network
    4.3 Modified PageRank algorithm and its application
        4.3.1 The model corresponding PageRank algorithm
        4.3.2 Modified PageRank algorithm
    4.4 Results and further discussion
5. The dolphin network and the analysis
    5.1 The erection of dolphin network and diagrammatic sketch
    5.2 The value of indicators and further discussion
6. The assessment and popularization of our network
    6.1 The assessment and popularization of co-author and dolphin network
    6.2 The assessment and popularization of quotation network
    6.3 Some suggestions
References
  • 3. Team #24147 Page 1 of 20

1. Introduction

Real-world entities are often interconnected through explicit or implicit relationships and so form a complex network. Examples of complex networks appear in many areas, such as natural systems, engineering systems, economic systems and social systems. [1] Generally, each node in a complex network represents an individual in the real world, and each edge linking two nodes represents an interaction between the two. For the co-author system presented in ICM 2014, studying its internal properties is of great value for evaluating a scientist's achievement. Inspired by empirical studies of networked systems, researchers have in recent years developed a variety of techniques and models (the small-world effect, degree distributions, random graph models and so on) to help us understand or predict the behavior of these systems. [2] In this article, we accept that the influence of scientists is closely related to those for whom scientific publication is a primary means of scholarly communication. [3] Moreover, after completing the analysis of the co-author network, a network was built to analyze the relationships among dolphins.

2. Assumptions

(1) The original data is believable;
(2) There is no academic malpractice among the listed authors;
(3) There is no obvious preference in the statistical process;
(4) No article conflicts with the articles it cites.

3. The construction of co-author network and influence assessment

A co-author network is a collection of authors, each of whom is acquainted with some subset of the others. Such a network can be represented as a set of nodes denoting authors joined by edges denoting acquaintance. [4] This section describes the whole construction process of our co-author network in detail, including data extraction, modeling effects and the construction of four different types of indicator.
Through statistical analysis of the results, we draw our conclusion from the comprehensive indicator, the Marshall Entropy Index (MEI).

3.1 Data extraction and establishment of the co-author network

3.1.1 The original network

Initially, the data from the file on the given website is saved as a txt document and then imported into Matlab, a mathematical software package designed for data processing. Using cell arrays, the direct co-authors (those with no blank before their names in the given file) can be extracted; there are 511 of them, in a specific sequence. Subsequently, an original co-author network with 511 rows and 511 columns is set up.

3.1.2 The erection procedure

This part briefly describes how values are assigned.

  • 4. Firstly, all elements in the original network are set to zero, representing the initial state, which is convenient for the following process. Secondly, for each direct co-author we check whether each of the co-authors listed under that name (the co-authors of a co-author of Erdos) is also a direct co-author. If so, the corresponding elements are increased by 1. For example, ABBOTT, HARVEY LESLIE has a link with MEIR, AMRAM, and both are direct co-authors. ABBOTT, HARVEY LESLIE corresponds to row 1 and column 1; MEIR, AMRAM corresponds to row 232 and column 232. So the values of the elements A(1,232) and A(232,1) change from 0 to 1; this occurs when the computer inspects ABBOTT, HARVEY LESLIE. After this data processing procedure, the original co-author network becomes a symmetric matrix that contains only 0 or 1 in each element. In addition, if ABBOTT, HARVEY LESLIE did not have any link with ACZEL, JANOS D., the elements (1,2) and (2,1) in the co-author matrix remain 0; otherwise, the value becomes 1. Finally, by counting the non-zero values in each row, the number of co-authors of each author can be obtained. In matrix form:

$A = (A_{ij}), \quad 1 \le i \le 511,\ 1 \le j \le 511$ , (1)

The first six authors are listed below:

Table 1 The number of co-authors connected directly
Name | The number of co-authors
ABBOTT, HARVEY LESLIE | 7
ACZEL, JANOS D. | 2
AGOH, TAKASHI | 1
AHARONI, RON | 11
AIGNER, MARTIN S. | 5
AJTAI, MIKLOS | 9

3.2 Property analysis of the co-author network

The co-author network mentioned above is a symmetric matrix that contains only 0s and 1s. We now use several tables and charts to illustrate the details of our co-author network.

3.2.1 Diagrammatic sketch and fairly interesting findings
  • 5. Fig. 1. Diagrammatic sketch of co-author network

Fig. 1 shows the cooperative relationships among the 511 authors. Though it is somewhat intricate, the figure gives a simple picture of the connections among the 511 authors. It cannot be ignored that a considerable number of nodes are isolated, such as node 7, node 83 and node 375, which means that the corresponding authors have no direct cooperation with other authors listed in the network. As a result, to some extent, the isolated nodes affect the subsequent work, namely the calculation of betweenness and quadratic correlation, which in turn influences the final score of each author when computing MEI. This is puzzling enough that more attention has to be paid to this situation. We accept that one reason may be that an author who cooperated with Erdos only once is isolated from the other co-authors because he works in a different area from theirs. Interdisciplinary research may be another reason. So we boldly predict that an isolated author might be one of the leaders of a specific field in mathematics, or might be devoted to interdisciplinary research involving other subjects, such as music and economics. For example, the author ASHBACHER, CHARLES D. taught physics in college. However, he changed his direction from physics to the arts after graduating from college and composed a song with Wilson, Lewis, who is listed in the co-author order. As to the motivation for changing his profession, there are numerous stories suggesting that it may have been out of interest or a love affair. However, this is not certainly true. It looks fairly interesting; yet unfortunately, we are not able to find it out.

3.2.2 Limitation of the size of our network by connected graph

After this initial presentation, the paper narrows the size of our network to a connected graph, that is, a graph in which a path exists between any two nodes.
By selecting the largest connected subgraph, which has 466 nodes and numerous edges, we narrow our attention from 511 nodes to 466 nodes and take these 466 authors as our study area.

3.2.3 The Degree and its Distribution
  • 6. The degree and its distribution, which are significant features of any graph, are illustrated in this part. Before presenting the diagrams, we first define the notion of "degree".

Degree [5]: the degree of a node represents the number of collaborators that an author has. In graph theory, the degree of a node is the number of edges joined to the node. In matrix form:

$D_i = \sum_{j=1}^{466} A_{ij}, \quad 1 \le i \le 466,\ 1 \le j \le 466$ (2)

Fig. 2. Frequency histogram of degrees

Fig. 2 demonstrates that as the degree increases, the corresponding number of co-authors declines gradually. As we can see, most co-authors have no more than 10 collaborators, which is less than 3% of all co-authors.

Fig. 3. Frequency histogram of the length of the shortest path
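The quantities plotted in Fig. 2 and Fig. 3 can be computed directly from the adjacency matrix. The following is a minimal Python sketch, not the authors' Matlab code: it uses the plain Floyd-Warshall algorithm rather than the modified variant the paper refers to, and the 5-node matrix `toy` and all function names are hypothetical, chosen only for illustration.

```python
from collections import Counter


def degrees(adj):
    """Degree D_i = sum_j A_ij of every node in a symmetric 0/1 matrix."""
    return [sum(row) for row in adj]


def shortest_path_lengths(adj):
    """All-pairs shortest path lengths via the plain Floyd-Warshall algorithm."""
    n = len(adj)
    inf = float("inf")
    dist = [[0 if i == j else (1 if adj[i][j] else inf) for j in range(n)]
            for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def length_frequencies(dist):
    """Relative frequency of each shortest-path length over unordered pairs."""
    n = len(dist)
    counts = Counter(dist[i][j] for i in range(n) for j in range(i + 1, n))
    total = sum(counts.values())
    return {length: c / total for length, c in sorted(counts.items())}


# Hypothetical toy network with edges 0-1, 1-2, 2-3 and 1-4.
toy = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
]

print(degrees(toy))
print(length_frequencies(shortest_path_lengths(toy)))
```

Summing the frequencies of all lengths below six gives the kind of fraction the paper reports for the 466-node network.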
  • 7. The frequency of each shortest-path length between two nodes can be calculated with a modified Floyd algorithm, a commonly used method for computing shortest paths in graph theory. The fraction of node pairs whose shortest path is shorter than six is 0.9783, approximately equal to 1. The number "six" may seem puzzling, yet the famous "Six Degrees of Separation" [6], an idea first proposed by the Hungarian writer Frigyes Karinthy and later tested by the Harvard psychologist Stanley Milgram, states that everyone and everything is six or fewer steps away. Namely, people can connect with each other through a maximum of five middlemen. We therefore accept that our calculation is consistent with this well-known theory.

Fig. 4. Frequency histogram of degrees and its fitting curve

After inspecting the frequency histogram of degrees, we fit its pattern with three different curve types (Poly3, Exp2 and Fourier3), which are presented in Fig. 4. Furthermore, some statistical magnitudes, e.g. SSE and R-square, are worked out for each. By comparison, Exp2 gives the best fit: a few nodes possess a large number of edges while many nodes possess relatively few. More importantly, our co-author network can be regarded as a scale-free network [7], in which the distribution of degree follows a power law.

Table 2 Some Statistical Magnitudes of Distribution of Degree
Equation name | SSE | R-square | Adjusted R-square | RMSE
Poly3 | 0.01616 | 0.8219 | 0.8123 | 0.01699
Exp2 | 0.0009141 | 0.9899 | 0.9894 | 0.00404
Fourier3 | 0.004283 | 0.9528 | 0.9464 | 0.009075

3.3 Analysis based on information entropy theory

The degree of order from a strategic perspective may be defined by information entropy, which can be divided into the timeliness of information entropy and the quality of information entropy. [8] The degree of order in network structure (R) is an index that considers both the timeliness and the quality of information.

  • 8. It can be expressed as equation 3,

$R = \sqrt{R_1 R_2}$ , (3)

where R1 and R2 are the timeliness of information entropy and the quality of information entropy (this geometric-mean form reproduces the values of the degree of order reported in Table 3).

3.3.1 The explanations of timeliness and time efficacy entropy in network structure.

Timeliness in network structure can be defined as the degree of time used in transmitting information from one node to another; time efficacy entropy is the degree of uncertainty in timeliness, with the following formula,

[equation image missing in source] , (4)

Suppose there is a network structure with n nodes, and let i and j be two arbitrary nodes. The value of timeliness (R1) and time efficacy entropy (H1) can be calculated in the following steps.

(1) Calculate Lij. The length between node i and node j (Lij) is defined as the shortest path between the two nodes: the value is 1 if the two nodes connect directly and increases by one for each transit.
(2) Calculate A1. The total of timeliness in network structure (A1) can be calculated as formula 5.
[equation image missing in source] , (5)
(3) Calculate P1(ij). The probability of two nodes being connected (P1(ij)) is defined as eq. 6,
[equation image missing in source] , (6)
(4) Calculate H1(ij). The time efficacy entropy between node i and node j (H1(ij)) can be calculated in the following way,
[equation image missing in source] , (7)
(5) Obtain H1 and H1M. The total time efficacy entropy in network structure (H1) and the maximum of H1 (H1M) are given by equation 8 and equation 9.
[equation image missing in source] , (8)
[equation image missing in source] , (9)
(6) Obtain R1. After the above steps, the timeliness of the network structure (R1) is given by equation 10.
[equation image missing in source] . (10)

3.3.2 The explanations of quality and quality entropy in network structure.

Quality in network structure is defined as the degree of accuracy in transmitting information between one node and another; quality entropy is the degree of uncertainty in quality. Calculating quality and quality entropy differs only in how Lij and A1 are defined, as follows.
  • 9. The number of nodes directly connected to node i (Ki) and the total of Ki in the network structure (A2) replace the original Lij and A1. Apart from these two definitions, the remaining quantities, namely the probability of quality of node i (P2(i)), the quality entropy of node i (H2(i)), the total quality entropy in network structure (H2), the maximum of H2 (H2M) and the quality of the network (R2), are calculated in the same way as the timeliness and time efficacy entropy.

3.3.3 The results and comparison

The results of the three indicators (degree of timeliness, quality and order) are shown in Table 3:

Table 3 The results of information entropy
Network | timeliness | quality | the degree of order
The co-author net | 0.1022 | 0.2979 | 0.1745
The simulated net | 0.0341 | 0.4743 | 0.1271
* The number of nodes and edges in the simulated net is the same as that in the given data.

The simulated net has the same size as our co-author network, except that its connections are made at random. Having simulated a hundred such nets, we calculate the corresponding average values for comparison. Table 3 reveals that although the quality value is lower than that of the simulated network, the values of the other two indicators are higher than those of the simulated one, which cannot be ignored. That is to say, our co-author network seems to be more relevant and ordered.

3.4 Four fundamental indicators and Marshall Entropy Index (MEI)

The construction of our co-author network has been described above. In this section, we combine four fundamental indicators (quadratic correlation, time, betweenness and degree) into a comprehensive indicator, the Marshall Entropy Index (MEI).

3.4.1 Introduction

Degree: The degree of a node represents the number of collaborators that an author has. In graph theory, the degree of a node is the number of edges joined to the node.
In matrix form:

$D_i = \sum_{j=1}^{466} A_{ij}, \quad 1 \le i \le 466,\ 1 \le j \le 466$ , (11)

Betweenness: Betweenness is a global variable which reflects the impact and influence of a node or edge on the relationships in the network.

  • 10. It can be defined as equation (12):

$B_i = \sum_{m \ne n} b_i(m,n) = \sum_{m \ne n} \frac{g_i(m,n)}{g(m,n)}$ , (12)

where $g(m,n)$ is the number of shortest paths between node m and node n, and $g_i(m,n)$ is the number of those shortest paths that pass through node i.

Time: In this article, we consider that the earlier an author cooperated with Erdos, the greater his academic influence is. This supposition rests on two reasons. First, the earlier the cooperation, the higher the probability that Erdos acknowledged the author's academic potential. Second, authors who collaborated with Erdos earlier have had more time to contribute to the academic world. It can be defined as equation (13), where $t_i$ is the year in which author i first cooperated with Erdos:

$T_i = 2014 - t_i, \quad 1 \le i \le 466$ , (13)

Quadratic correlation: Quadratic correlation is an index that measures the indirect correlation of a node. We define the new measure as shown in Eq. (14), in which $A_i$ is the set of all nodes directly connected to node i, $A_i = \{A_i(1), A_i(2), A_i(3), \dots\}$; $n(A_i)$ is the number of elements in $A_i$; $E_i$ is the set of all nodes that reach node i in two steps but cannot reach it in one step. Furthermore, $E_i$ does not include node i itself, $E_i = \left(A_{A_i(1)} \cup A_{A_i(2)} \cup A_{A_i(3)} \cup \dots\right) \setminus \left(A_i \cup \{i\}\right)$; $n(E_i)$ is defined like $n(A_i)$, meaning the number of elements in $E_i$.

$Q_i = \frac{n(E_i)}{n(A_i)}$ , (14)

Marshall Entropy Index: In economics, when calculating an input-output table for the agriculture industry, the types of input or output are fairly different and uncorrelated, or only subtly linked, yet the gross value of input or output can be obtained by continued multiplication followed by taking the logarithm of the product. Standardization is applied when necessary.
Inspired by this idea, the four indicators are processed by standardization, continued multiplication and taking the logarithm. As a result, a comprehensive numerical value is obtained: the Marshall Entropy Index (MEI), an index illustrating the overall influence of each co-author.

[equation image missing in source] (15)

3.4.2 Calculating evaluating model

According to the principles elaborated above, the numerical values of each indicator for all 466 co-authors can be obtained. After sorting these figures in descending order, the top ten for each indicator are listed in Table 4.

Table 4 Ranking in Accordance with Four Fundamental Indicators and MEI

  • 11.
Rank | Time | Degree
1 | SZEKERES, GEORGE* (80) | ALON, NOGA M. (52)
2 | TURAN, PAL* (80) | GRAHAM, RONALD LEWIS (44)
3 | DAVENPORT, HAROLD* (78) | HARARY, FRANK* (44)
4 | FELDHEIM, ERVIN* (78) | BOLLOBAS, BELA (43)
5 | GALLAI, TIBOR* (GRUNWALD, TIBOR) (78) | RODL, VOJTECH (43)
6 | VAZSONYI, ANDREW* (WEISZFELD, ENDRE) (78) | FUREDI, ZOLTAN (40)
7 | GILLIS, JOSEPH E.* (77) | TUZA, ZSOLT (40)
8 | JARNIK, VOJTECH* (77) | SOS, VERA TURAN (38)
9 | OBLATH, RICHARD* (77) | SPENCER, JOEL HAROLD (35)
10 | GRUNWALD, GEZA* (76) | GYARFAS, ANDRAS (32)

Rank | Quadratic correlation | Betweenness | Marshall entropy index
1 | BARAK, AMNON B. (52) | HARARY, FRANK* (9587.7) | HARARY, FRANK* (21.9)
2 | COPELAND, ARTHUR HERBERT, SR.* (44) | SOS, VERA TURAN (8912.2) | SOS, VERA TURAN (21.8)
3 | HARZHEIM, EGBERT (44) | RUBEL, LEE ALBERT* (8573.3) | BOLLOBAS, BELA (21.8)
4 | MINC, HENRYK (44) | STRAUS, ERNST GABOR* (8300.7) | GRAHAM, RONALD LEWIS (21.7)
5 | SARKAR, AMITES (43) | POMERANCE, CARL BERNARD (7434.7) | STRAUS, ERNST GABOR* (21.5)
6 | ANDRASFAI, BELA (38) | FUREDI, ZOLTAN (7410.1) | ALON, NOGA M. (21.4)
7 | ZAREMBA, STANISLAW KRYSTYN* (38) | ALON, NOGA M. (6871.9) | FUREDI, ZOLTAN (21.3)
8 | LEWIN, MORDECHAI (29) | GRAHAM, RONALD LEWIS (6817.2) | HAJNAL, ANDRAS (21.2)
9 | PENNEY, DAVID EMROY, II (29) | BOLLOBAS, BELA (6699.5) | PACH, JANOS (20.8)
10 | SCHMUTZ, ERIC J. (29) | PACH, JANOS (6073.6) | TUZA, ZSOLT (20.7)
* The value in brackets is the score the author got in the related index.

The table indicates that HARARY, SOS and BOLLOBAS are the top three in the final MEI score. Unfortunately, HARARY, FRANK has already died. All three are also listed in the top 10 for both the Degree indicator and the Betweenness indicator, which means that Degree and Betweenness are relatively important in MEI even after standardizing. However, that does not mean that the Time and Quadratic correlation indicators are meaningless. On the contrary, Time and Quadratic correlation provide a different perspective from which to analyze the results of the evaluation.
For example, the time indicator suggests that the earlier one cooperated with Erdos, the more influential one tends to be, though those who rank in the top 10 for the time indicator do not rank in the top 10 for MEI. In conclusion, if we stand on the bank of the river of Network Science to judge who is the most influential, the answer may be HARARY, FRANK, who passed away years ago. If we attempt an assessment from the standpoint of the present, the most important researchers among the 466 given authors are SOS, VERA TURAN and BOLLOBAS, BELA.

3.4.3 Testing of evaluating model

In order to test the accuracy of the results, we remove the top two in MEI, HARARY, FRANK* and SOS, VERA TURAN, from the network and find that the largest connected component of the remaining network has 459 nodes. In addition, we remove two random nodes and record the size of the largest connected component, repeating the trial 2000 times, as shown in Fig. 5.
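The test procedure above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the graph is stored as a dict of neighbour sets, the 6-node graph `toy` is hypothetical, and the function names, the fixed seed and the depth-first search used to find components are choices made here.

```python
import random
from collections import Counter


def largest_component_size(graph, removed=frozenset()):
    """Size of the largest connected component after deleting `removed` nodes."""
    seen = set(removed)  # removed nodes are treated as already visited
    best = 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:  # iterative depth-first search over one component
            node = stack.pop()
            size += 1
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        best = max(best, size)
    return best


def random_removal_trials(graph, k=2, trials=2000, seed=0):
    """Histogram of largest-component sizes after removing k random nodes."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    sizes = Counter()
    for _ in range(trials):
        removed = frozenset(rng.sample(nodes, k))
        sizes[largest_component_size(graph, removed)] += 1
    return sizes


# Hypothetical toy graph: node 0 is a hub, plus one extra edge 1-2.
toy = {0: {1, 2, 3, 4, 5}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}, 5: {0}}

print(largest_component_size(toy, frozenset({0})))  # targeted removal of the hub
print(random_removal_trials(toy, k=2, trials=200))  # random baseline
```

Comparing the targeted result with the random histogram mirrors the paper's argument: if deleting the top-ranked nodes shrinks the largest component more than random deletions almost ever do, those nodes matter to the connectivity of the network.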
  • 12. [Fig. 5. 2000 trials of the random nodes. Horizontal axis: the maximum of connected nodes (459 to 466); vertical axis: frequencies (0 to 1200).]

Fig. 5 shows that the number of trials in which the largest connected component has fewer than 459 nodes is close to 0. Therefore, HARARY, FRANK* and SOS, VERA TURAN are clearly important to the connectivity of the co-author network, which means our evaluating model is believable.

4. Construction of quotation network and modified PageRank algorithm

In this section, the concepts of direct correlation, indirect correlation and weak correlation are introduced. Then, the sixteen network science research papers in the given file are used to build our quotation network, which embodies these three concepts. After that, we combine the quotation network with the classical PageRank algorithm and make some corresponding improvements. Having computed the relative influence of each paper via the modified PageRank algorithm, we find that the paper entitled 'Collective dynamics of "small-world" networks', written by Watts and Strogatz, is the most influential of the sixteen papers in network science. Finally, our quotation network model and modified PageRank algorithm are discussed further.

4.1 Conceptual framework

The quotation network cannot be established without a conceptual framework. Because of the requirements of technicality and rigor, citation between papers inevitably appears; when one paper cites another, the two papers have a direct correlation. Further, indirect correlation exists only when an article is related to a paper cited by a paper it cites. In this paper, both two-step and three-step citations are covered by this concept.
In addition, there are some relationships among all the articles published in the network field. For example, there may be some relation among the sixteen papers in network science, which we will discuss later, simply because their major focus is on the same field, especially when it comes to ideas, deductions and conclusions.

  • 13. So, the paper defines this relevance as weak correlation. In terms of influence, direct correlation is clearly greater than indirect correlation, while weak correlation is the smallest of the three. In this way, the conceptual framework is clear and definite.

4.2 Construction of quotation network

Overall, our quotation network is built in three stages. The first stage builds a quotation network with only direct correlation among all sixteen articles. The second stage builds a quotation network which contains both direct and indirect correlation. The final stage establishes the quotation network involving all three concepts. Next, each stage is described in detail, together with its diagrammatic sketch.

4.2.1 Quotation network of direct correlation

Since there are sixteen papers extracted from network science, a 16 × 16 matrix E should be established, in which $e_{ij} = 1$ if node i can directly reach node j and $e_{ij} = 0$ otherwise. The matrix can be defined as:

$E = (e_{ij}), \quad 1 \le i \le 16,\ 1 \le j \le 16$

Note that this matrix is quite different from $A_{ij}$, which we constructed in Section 3, because it is no longer symmetric: citation is one-way. That is to say, only papers published later can cite papers already published. The network is presented in Fig. 6.

Fig. 6 Original quotation network

4.2.2 Slightly complicated quotation network

In this step, the indirect quotation should be added to the original quotation network.
The matrix E is modified as follows:

$\tilde{E} = E + \frac{E^2}{e^2} + \frac{E^3}{e^3}$ , (16)

where e is the base of the natural exponential (approximately 2.718281828). This modification is made because the influence generated by two-step and three-step quotation is smaller than that of direct quotation.
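As a worked illustration of Eq. (16), the following minimal Python sketch applies the modification to a hypothetical 4-paper citation chain 0 → 1 → 2 → 3. The matrix `chain` and the function names are assumptions made here for illustration; the sixteen-paper matrix of the article is not reproduced.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def extended_citation_matrix(E):
    """E + E^2 / e^2 + E^3 / e^3: direct plus damped 2- and 3-step links."""
    n = len(E)
    E2 = mat_mul(E, E)
    E3 = mat_mul(E2, E)
    return [[E[i][j] + E2[i][j] / math.e ** 2 + E3[i][j] / math.e ** 3
             for j in range(n)] for i in range(n)]


# Hypothetical chain: 0 -> 1 -> 2 -> 3 (entry [i][j] = 1 if i reaches j directly).
chain = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

E_tilde = extended_citation_matrix(chain)
for row in E_tilde:
    print([round(x, 4) for x in row])
```

In the result, the direct link 0 → 1 keeps weight 1, the two-step link 0 → 2 gets weight e^-2 and the three-step link 0 → 3 gets weight e^-3, so longer indirect citation paths contribute progressively less.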
  • 14. 4.2.3 The final quotation network

Following the idea of 4.2.2, many more orders of variation could be added. However, the workload would be onerous. As the saying goes, simplicity is the ultimate sophistication. So the higher-order variations are simplified by adding 1/16 to each element to represent the weak influence. The result is presented in Fig. 7:

Fig. 7 Quotation network based on B~

We can see that Fig. 7 is far more complicated than Fig. 6.

4.3 Modified PageRank algorithm and its application

4.3.1 The model corresponding PageRank algorithm

The PageRank algorithm [9], proposed by Sergey Brin and Lawrence Page, is widely applied in search engines for information collection. In our modified PageRank algorithm, the papers are defined as the set $S = \{1, 2, \dots, N\}$. The PageRank value of paper n is denoted by $r_n$. $O_n$ is the out-degree of paper n, meaning the number of papers it cites. So the PageRank value of paper n can be expressed as:

$r_n = \sum_{m \in A_n} \frac{r_m}{O_m}$ , (17)

where the sum runs over the set $A_n$ of papers that cite paper n. According to this expression, the calculation of the PageRank value is illustrated schematically in Fig. 8.
Fig. 8 Diagrammatic sketch of computing the PageRank value. The numbers in the figure show the inflow and outflow capacity.

Then, by constructing the quotation transition probability matrix C, where $C_{ij} = 1/O_i$ if paper i cites paper j and $C_{ij} = 0$ otherwise, we obtain the model corresponding to the PageRank algorithm:

$$\begin{cases} G^{T} R = R \\ e^{T} R = 1 \\ R \ge 0 \end{cases} \qquad (18)$$

whose prominent characteristic is that the components of the eigenvector R sum to 1. The values of the components determine the ranking among papers: the larger the component, the more weight a paper carries and the higher it ranks.

4.3.2 Modified PageRank algorithm

The flow chart in Fig. 9 demonstrates our modified PageRank algorithm:

1. Commence. Input the adjacency matrix E, the size n and the convergence threshold sigma.
2. E = E + E^2/e^2 + E^3/e^3 + 1/n
3. Calculate the transition probability matrix C.
4. Generate the vector R randomly as the initial value of M-PR.
5. X = PR
6. max(X - R) > sigma? If TRUE, set R = X and return to step 5. If FALSE, continue.
7. R = X/sum(R). Output R. Finish.

Fig. 9. The flow chart of modified PageRank algorithm

4.4 Results and further discussion

Having calculated the modified PageRank values with the modified PageRank algorithm, we obtain the results in Table 5:

Table 5 The rank of the papers in the value of M-PageRank

| Rank | Value of M-PageRank | I.D. of Paper | Name of Paper |
|---|---|---|---|
| 1 | 0.2131 | 14 | Collective dynamics of 'small-world' networks |
| 2 | 0.1682 | 8 | Navigation in a small world. |
| 3 | 0.0839 | 3 | Power and Centrality: A family of measures |
| 4 | 0.0799 | 4 | Emergence of scaling in random networks |
| 5 | 0.0568 | 6 | Models of core/periphery structures |
| 6 | 0.0423 | 13 | Identity and search in social networks |
| 7 | 0.0420 | 1 | On Random Graphs |
| 7 | 0.0420 | 10 | The structure of scientific collaboration networks |
| 9 | 0.0383 | 11 | The structure and function of complex networks |
| 10 | 0.0361 | 2 | Statistical mechanics of complex networks |
| 10 | 0.0361 | 9 | Scientific collaboration networks |
| 12 | 0.0321 | 5 | Identifying sets of key players in a network. |
| 12 | 0.0321 | 7 | On properties of a well-known graph. |
| 12 | 0.0321 | 12 | Networks, influence, and public opinion formation |
| 12 | 0.0321 | 15 | Statistical models for social networks |
| 12 | 0.0321 | 16 | Social network thresholds in the diffusion of innovations |

As the table shows, Collective dynamics of 'small-world' networks is in first place, followed by Navigation in a small world and Power and Centrality: A family of measures. That is to say, Collective dynamics of 'small-world' networks, written by Watts, D. and Strogatz, S., is by this measure the most influential paper in network science.

A similar approach can be used to determine the role or influence of an individual network researcher. The fruits of collaboration between researchers are the theses they write together, although the collaboration itself has no clear direction. A realistic assumption is that one researcher is the 'leader' and the other is the 'helper'. When they finish a thesis in cooperation, the leader gains more relative influence and the helper gains less. The quotation network matrix is altered accordingly by changing the relative values of its entries. Thus, the influence of an individual network researcher can be obtained.
As for measuring the role, influence, or impact of a specific university, department, or journal in network science, our quotation network also comes in handy. A university or a department is a part of network science and can be represented as a node in graph theory. A university, for instance, may collaborate with others to some degree even though the collaboration is not intense, and we define this as a sort of weak correlation. The quotation network matrix is altered accordingly by adding a small number to each component.

From the above, we conclude that measuring the role, influence, or impact of a specific university, department, or journal in network science requires a more detailed methodology. Among all the alterations, weak correlation, whose magnitude is mainly determined by the number of nodes, should be emphasized. The data that need to be collected therefore include the cooperation projects between authors and their respective workloads, the number of publications and journals in network science, and the number of universities or departments researching network science.
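The procedure of Fig. 9 (Section 4.3.2) can be sketched in Python as follows. This is a rough reading of the flow chart, not a definitive implementation, and it rests on several assumptions: the powers of E are matrix powers, the transition matrix C is obtained by dividing each row of the modified matrix by its row sum, the matrix P in the step "X = PR" is taken to be the transpose of C, and the convergence test uses the absolute difference. The names `modified_pagerank`, `max_iter` and `seed`, and the four-paper example, are hypothetical.

```python
import numpy as np


def modified_pagerank(E, sigma=1e-10, max_iter=10_000, seed=0):
    """Sketch of the Fig. 9 flow chart under the assumptions stated above."""
    E = np.asarray(E, dtype=float)
    n = E.shape[0]

    # Direct + indirect (two- and three-step) + weak correlation (1/n per entry).
    E_mod = E + (E @ E) / np.e**2 + (E @ E @ E) / np.e**3 + 1.0 / n

    # Transition probabilities: each row is scaled to sum to 1 (assumption).
    C = E_mod / E_mod.sum(axis=1, keepdims=True)

    # Random initial vector R, scaled to sum to 1.
    rng = np.random.default_rng(seed)
    R = rng.random(n)
    R /= R.sum()

    # Repeat X = P R with P = C^T until the largest change is at most sigma.
    for _ in range(max_iter):
        X = C.T @ R
        converged = np.max(np.abs(X - R)) <= sigma
        R = X
        if converged:
            break

    return R / R.sum()


# Hypothetical example: E[i, j] = 1 means paper i cites the earlier paper j.
E = np.zeros((4, 4))
E[1, 0] = E[2, 0] = E[2, 1] = E[3, 2] = 1.0

scores = modified_pagerank(E)
ranking = np.argsort(-scores)   # paper indices from most to least influential
print(np.round(scores, 4), ranking)
```

Because 1/n is added to every entry, the modified matrix is strictly positive, so every row can be normalised and the iteration converges even for papers that cite nothing. This is one practical reason for including the weak correlation term.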
5. The dolphin network and the analysis

In this section, the dolphin network [10], which contains 62 nodes and a number of edges when depicted as a graph, is constructed. Then the key indicators, namely degree, quadratic correlation and betweenness, as well as timeliness entropy, quality entropy and the degree of order, are calculated; they indicate that the dolphin named SN100 ranks first. Finally, further discussion is made.

5.1 The erection of dolphin network and diagrammatic sketch

Following the construction steps introduced in section 2, a 62*62 matrix F has been built. The diagrammatic sketch is shown in Fig. 10:

Fig. 10 Diagrammatic sketch of the dolphin network

Further study shows that the property of Six Degrees of Separation is still notable, while the property of power-law degree distribution is no longer apparent.

5.2 The value of indicators and further discussion

According to the principle of calculation mentioned in section 2, the key indicators are computed and shown in Table 6 and Table 7:

Table 6 The evaluation of dolphins about their influence

| Rank | Degree | Quadratic correlation | Betweenness | Marshall entropy index |
|---|---|---|---|---|
| 1 | Grin (22) | Cross (10) | SN100 (454.27) | SN100 (9.48) |
| 2 | SN4 (18) | Five (10) | Beescratch (390.38) | SN9 (8.90) |
| 3 | Topless (11) | Fork (10) | SN9 (261.96) | Beescratch (8.86) |
| 4 | Scabs (17) | MN23 (9) | SN4 (253.58) | SN4 (8.71) |
| 5 | Trigger (10) | Quasi (9) | DN63 (216.38) | Kringel (8.45) |
| 6 | Jet (8) | SMN (9) | Jet (209.17) | DN63 (8.32) |
| 7 | Kringel (13) | TR82 (9) | Kringel (187.84) | Jet (7.91) |
| 8 | Patchback (19) | Whiteti (8) | Upbang (181.39) | Stripes (7.88) |
| 9 | Web (22) | SN89 (7.5) | Trigger (154.96) | Oscar (7.85) |
| 10 | Beescratch (11) | Vau (6.5) | Web (154.09) | Upbang (7.84) |

* The value in brackets is the score the dolphin obtained in the related index.

As Table 6 demonstrates, SN100 comes in first place, indicating that it is the most influential dolphin of all. The top four dolphins in betweenness also occupy the first four places in the Marshall entropy index, so the importance of the betweenness indicator shows up again, which is consistent with the results presented previously.

Table 7 The result of information entropy of the dolphin network

| Network | Timeliness | Quality | The degree of order |
|---|---|---|---|
| The dolphin network | 0.1379 | 0.3143 | 0.2081 |
| The simulated net | 0.1109 | 0.3002 | 0.1824 |

Table 7 reports the information entropy results for the dolphin network and our simulated network. All three indicators of the dolphin network are higher than those of the simulated network, revealing that the dolphin network, being more relevant, certain and well-organized, may be slightly different from the random net. Further, by comparison, the dolphin network differs from the co-author network to some extent. We venture to guess that 'degree of intelligence' may be a relevant factor.

6. The assessment and popularization of our network

Two kinds of network have been built in this article: undirected networks, used for the co-author network and the dolphin network, and a directed network, the quotation network. It is therefore reasonable to discuss them separately.

6.1 The assessment and popularization of co-author and dolphin network

Strengths:
1. The author or the dolphin can be abstracted as a node representing its role in the network, which can be extended to many settings, such as student unions and business dealings.
2. The edge represents the relationship between two authors, namely whether they have cooperated, or between two dolphins, namely whether they have a somewhat mysterious relationship.
3. The sensitivity of our methodology is relatively low, which means that the stability of our network is basically ensured.

Weaknesses:
1. The network cannot handle problems of directed graphs.
2. If there are too many nodes and edges, the computational workload would be immense.
3. The identity of each node may not be considered.

The power of network: Our network may be widely used in studying social relationships, advanced management and individual choice, in which the correlation is undirected or two-way.

6.2 The assessment and popularization of quotation network

Strengths:
1. The network reduces the computational load, so it decreases the time spent waiting for results.
2. The authority of each paper has been taken into account.
3. The sensitivity of our methodology is relatively low, which means that the stability of our algorithm is basically ensured.

Weakness: A more detailed study cannot be implemented. For example, the influence of weak correlation is difficult to measure.

The power of our network: Our network may be widely used in studying enterprise organization, self-promotion and capital accumulation, because these areas involve change or flow, perhaps from one person to another, from one position to another, or from the past to the present for the same person.

6.3 Some suggestions

Further, according to the MEI model, some suggestions are proposed for college students eager to enhance their influence:
1. doing one's best to join a research team and seeking opportunities for cooperation;
2. joining, as far as possible, a research team with more international exchanges;
3. choosing a newly emerging subject or interdisciplinary field in which to begin research.

References
[1] http://www.sitis-conf.org/en/workshop-on-complex-networks-and-their-applications-complex-networks-2012.php?Preview=ok.
[2] M. E. J. Newman, The structure and function of complex networks, SIAM Review, 45, 167–256 (2003).
[3] Redner S., How popular is your paper? An empirical study of the citation distribution, Eur Phys J B, 4, 131-134 (1998).
[4] Newman, M., The structure of scientific collaboration networks, Proc. Natl. Acad. Sci. USA, 98: 404-409, January 2001.
[5] Xu Ling, Research on co-author network based on SCIENCE [D], Shanghai Jiao Tong University, 2009.
[6] Zhao Yan, Research on routing protocols of opportunistic networks based six degrees of separation [D], Qiqihar University, 2012.
[7] Xiang Linying, Chen xiangqiang, Review on modeling, analysis and control of complex dynamic network, China academic journal electronic publishing house, 16:1543-1551, Nov. 2006.
[8] Delu Wang, Ziwei Li, Web-based organizational structure information entropy theory analysis, Modern Management Science, (1):65-66, 2007.
[9] Yue Xie, Research on PageRank algorithm and HITS algorithms in webpage sort [D], University of Electric Science and Technology of China, 2012.
[10] The data of dolphins comes from http://www.datatang.com/data/769/.