A social network-empowered research analytics framework for project selection

Thushari Silva a, Zhiling Guo a,⁎, Jian Ma a, Hongbing Jiang a,b, Huaping Chen b

a Department of Information Systems, City University of Hong Kong, Hong Kong
b School of Management, University of Science and Technology of China and USTC-CityU Joint Advanced Research Centre, Suzhou, PR China

Decision Support Systems 55 (2013) 957–968, http://dx.doi.org/10.1016/j.dss.2013.01.005
Article info: Available online 9 January 2013.
Keywords: Research project selection; Research social networks; Research analytics.

Abstract: Traditional approaches for research project selection by government funding agencies mainly focus on the matching of research relevance by keywords or disciplines. Other research-relevant information, such as the social connections (e.g., collaboration and co-authorship) and productivity (e.g., quality, quantity, and citations of published journal articles) of researchers, is largely ignored. To overcome these limitations, this paper proposes a social network-empowered research analytics framework (RAF) for research project selection. Scholarmate.com, a professional research social network with easy access to research-relevant information, serves as a platform to build researcher profiles along three dimensions, i.e., relevance, productivity and connectivity. Building upon the profiles of both proposals and researchers, we develop a unique matching algorithm to assist decision makers (e.g., panel chairs or division managers) in optimizing the assignment of reviewers to research project proposals. The proposed framework is implemented and tested by the largest government funding agency in China to aid the grant proposal evaluation process. The new system generated significant economic benefits, including substantial cost savings and quality improvement in the proposal evaluation process.

© 2013 Elsevier B.V. All rights reserved.
1. Introduction
There is a steadily growing trend for government funding agencies to support an increasing number of research proposals. For example, there were 42,225 research grant proposals submitted to the National Science Foundation (NSF) in the U.S. in 2010, and the number of submissions for 2012 was estimated to increase to 46,000. The number of proposals submitted to the National Natural Science Foundation of China (NSFC) has increased from 23,636 in 2001 to over 147,000 in 2011. The sheer volume of submissions poses a significant challenge for research project selection due to the difficulty of assigning the most suitable reviewers to the most relevant project proposals.
A research project can be characterized by a set of qualitative and quantitative, tangible and intangible attributes. Management scientists, economists, and IS practitioners have proposed various decision models, methodologies and decision support systems to assist decision-making tasks related to research project selection [13,15,34,35]. Traditional approaches based on mathematical programming and optimization are useful for handling large volumes of submissions, but are less efficient in dealing with subjective judgment and information. Machine learning techniques incorporating fuzzy logic, genetic algorithms and artificial intelligence are capable of learning complex patterns in data, but are limited in their ability to generalize from training data and optimize decisions over the entire decision space. Other traditional approaches involve manually assigning proposals to reviewers based on their claimed expertise, which is neither efficient nor practical given the increasing complexity of decision making faced by funding agencies.
Current computer-based methods mainly consider matching research relevance in terms of keywords or disciplines, while ignoring the social connections (e.g., collaboration and co-authorship) and productivity (e.g., quality, quantity, and citations of published journal articles) of researchers. It is desirable to incorporate all of these aspects into a unified evaluation framework. To achieve this goal, we propose a research analytics framework that is empowered by a research social network (www.scholarmate.com) for effective research project selection. Better identification of social connections can effectively cluster researchers based on topics of interest, methodologies, and research disciplines. Being able to identify community structure in the social network helps us understand and exploit the research network more effectively. On the one hand, such information can be used to identify the most suitable reviewers. On the other hand, it can help avoid conflicts of interest to ensure fair evaluation.
Speciļ¬cally, we propose to deļ¬ne proļ¬les of research entities (e.g.
project proposals, researchers) from three dimensions, i.e. relevance
(e.g., keywords and research disciplines), productivity (e.g., quality,
quantity, and citations of published journal articles), and connectivity
(e.g., project collaborators, co-authors and colleagues). Represented
by visual research CVs, proļ¬les of proposals and potential reviewers
are built by extracting information from multiple sources including
submitted proposals, bibliographic databases (e.g., ISI, Scopus, and
EI), and the research social network (i.e., www.scholarmate.com). By aggregating information along the three dimensions, we construct a unique matching algorithm to assist decision makers (e.g., panel chairs or division managers) in optimizing the assignment of reviewers to research project proposals.
To demonstrate the usability of the proposed framework, we implemented the system to aid China's largest government funding agency in its grant proposal evaluation. The research analytics framework builds upon scientometrics, business intelligence and social network analysis techniques. Its powerful search and data access capabilities provide timely and relevant information in visualized forms for research project evaluation. The implemented system generates significant economic benefits, including cost savings and quality improvement in the proposal evaluation process.
This paper is organized as follows. Section 2 reviews the relevant literature. Section 3 provides an overview of the research analytics framework and the Scholarmate research social network. Section 4 presents the detailed methods for profiling and the algorithms to calculate the key performance indicators. An optimization problem for reviewer assignment is proposed in Section 5. Section 6 reports the evaluation of the proposed system by China's largest government funding agency for its grant proposal evaluation. Section 7 concludes with a summary of contributions and directions for future research.
2. Literature review
The major challenge in reviewer assignment for proposal evaluation
is identifying and recommending the most suitable reviewers who have
a high level of expertise and will make valuable professional judgment
on given proposals [13,38]. In this paper, we propose a profile-based approach to assign reviewers for proposal evaluation.
Previous research has identified two approaches to scientific researcher profiling. One approach relies on subjective, self-claimed information declared by researchers themselves. The other approach is based on objective measurement obtained through automated inferences about a researcher's behavior patterns related to publications and citations derived from relevant resources [37]. The first approach uses qualitative methods (e.g., surveys, questionnaires, or interviews) and traditional information retrieval models (e.g., term-based modeling [3] and rough-set modeling [19]) to gain knowledge of a researcher's interests and build the resulting profiles. The latter approach utilizes various feature selection techniques in machine learning to learn user profiles [10].
Machine learning approaches learn a mapping between an incoming set of documents relevant to user input and real numbers that represent the strength of user preferences. The features of the documents are first extracted by widely used techniques, including information gain [8,21] and the correlation coefficient [32]. The key features are then used as attributes in the mapping functions. Some studies focus on techniques such as neural networks [22], Support Vector Machines (SVM) [11,16,29], K-Nearest Neighbors (K-NN) and logistic regression [6,40] before generating a mapping to a set of real numbers. Li et al. [20] proposed a rough threshold model (RTM) to analyze and extract keywords from scientific publications. In our approach, we augment the original rough threshold model with a phrase analysis algorithm to resolve semantic ambiguity that is not handled by the original rough threshold model for topic generation.
Collaboration networks are a popular type of social network that has been widely studied in the literature [4,5,23]. A property that many social networks have in common is clustering, or network transitivity [2,26,39]. The clustering coefficient is defined as the probability that two of one's friends are friends themselves [7,39]. It typically ranges from 0.1 to 0.5 in many real-world networks. A related concept is a community, in which connections within the same community are dense and connections outside the community are sparse. Community structure in a social network represents real social groupings by interest or background [27]. For example, communities in a citation network represent related papers on a single topic [31].
There are two broad classes of hierarchical clustering methods for detecting the community structure in a social network: agglomerative and divisive [28,30]. The agglomerative approach focuses on finding the strongly connected cores of communities by adding links [24], and the divisive approach uses information about edge betweenness to detect community boundaries by removing links [23]. For example, the Girvan–Newman algorithm [12] is one of the most widely used divisive methods and is effective at discovering the latent groups or communities that are defined by the link structure of a graph. Newman's fast algorithm [25] is an efficient reference algorithm for clustering in large networks. It falls into the general category of agglomerative hierarchical clustering methods. This method can easily be generalized to weighted networks in which each edge has a numeric value indicating link strength. It has been successfully applied to a collaboration network of more than 50,000 physicists. In this study, we adopt Newman's fast algorithm in our research social network analysis.
3. An overview of the RAF and Scholarmate
Research Analytics is the application of methods and theories in
scientometrics, business intelligence and social network analysis to
transform research related data into relevant information in research
management. In this paper we demonstrate the research analytics
framework in the context of reviewer recommendation for research
project selection.
3.1. The RAF for reviewer recommendation
This study takes a profile-based approach to reviewer recommendation. Fig. 1 illustrates the key framework.
Research Online (http://rol.scholarmate.com) is an institutional repository service provided by Scholarmate (http://www.scholarmate.com) to analyze proposals submitted through the Internet-based Science Information System (ISIS, https://isis.nsfc.gov.cn). It helps build standardized visual research CVs of researchers and identify the social groups to which they belong. These steps greatly ease the profiling of proposals and researchers. Key features and attributes, such as the discipline codes and keywords that represent proposals and researchers, are derived from the standard keyword dictionary. Phrase patterns are discovered by data mining the free-text categories of electronic documents from various databases (e.g., ISI, Scopus and EI). Based on the constructed comprehensive profiles of both the proposals and the potential reviewers, the system generates key performance indicators in three dimensions, i.e., relevance, productivity and connectivity. Finally, a matching algorithm that takes all three dimensional measures into account is proposed for reviewer recommendation.
Speciļ¬cally, relevance refers to the keywords, research discipline and
expertise area that are derived from both the researcher's scientiļ¬c pub-
lications and prior funded projects. Productivity is measured by quality,
quantity, citations, and impacts of one's research, as well as other
academic achievements. Connectivity among researchers is inferred
through collaborations, such as collaborators in projects, co-authorship
in publications, and colleagues in the same organizations. Their speciļ¬c
roles in the reviewer assignment process can be demonstrated in Fig. 2.
We will discuss each of them in detail in Section 4.
3.2. Scholarmate research social network
Scholarmate (http://www.scholarmate.com) is a professional research social network that connects people to research with the aim of "innovating smarter". It offers research social network services that help researchers find suitable funding opportunities and potential research collaborators. In addition to its important function of connecting
people with similar interests, Scholarmate has a search tool to help researchers extract their publications directly from existing bibliographic databases (e.g., ISI, Scopus), along with the citations of each paper and the impact factor of the journal. Moreover, Scholarmate provides researchers with the ability to disseminate research outcomes and information about their current interests over established social connections. On the one hand, researchers can use Scholarmate to manage their research outcomes and research in progress, including research proposal preparation. On the other hand, transparency in information sharing among scholars in Scholarmate opens an opportunity for researchers to participate in relevant scholarly activities in a timely manner, such as becoming potential reviewers. For example, a panel chair will be able to judge the recent research expertise of a researcher after analyzing that researcher's knowledge sharing activities in Scholarmate.
In Scholarmate, several types of networks can be constructed, such as citation networks, project collaboration networks and journal article co-authorship networks. An example of a collaboration network is presented in Fig. 3. The numbers beside the nodes are researcher identification numbers (RIDs). The numbers on the edges are the collaboration frequencies of two researchers. The frequency of collaboration is measured in terms of the number of co-authored publications, the number of collaborative projects and the number of co-cited papers extracted through the Scholarmate platform. Three major communities are identified and indicated by the ovals in the figure. The communities are derived according to research expertise. We are also able to identify the top researchers in the social network in terms of connectivity by degree, betweenness, and closeness, as shown in Table 1. The numbers in brackets denote the rankings under the corresponding measures. Researchers who have high ranks in the same community as a principal investigator are identified as potential reviewers, subject to the condition that there is no direct connection between the potential reviewers and the principal investigator. For example, researcher 51 is a principal investigator and researcher 55 is identified as a potential reviewer because these two researchers are in the same community but have no direct collaboration. The fact that both of them have collaborations with researchers 37 and 38 indicates a potential overlap of research interests in some common research areas.
The research social network can enhance data representation in several ways. For example, existing databases only store data about published articles. Working papers that reflect the most recent research activities cannot be obtained by a search in bibliographic databases, but may be available on the social network site. Similarly, a researcher who has secured an industry grant that is relevant to the required reviewer expertise may be suitable to serve as a potential reviewer. However, traditional methods cannot identify this researcher due to the inability to access such information. A social network facilitates real-time information sharing and is therefore effective for this type of information acquisition. Such additional information greatly enhances the completeness and timeliness of our data representation.
4. Proļ¬ling and key indices
In this section, we present a comprehensive representation of the proposal and researcher profiles built from both the available databases and the research social network, based on which three key performance indicators are derived: relevance, productivity, and connectivity. Fig. 4 shows the relationship between the three key performance indicators and their usage in reviewer recommendation.
[Fig. 1. The framework of profile-based reviewer recommendation. Proposal profiling and reviewer profiling (discipline codes, keywords and phrase patterns drawn from the keyword dictionary) are performed on the Research Online platform and the Scholarmate research social network; the research analytics layer derives the relevance, productivity and connectivity indicators that feed the matching algorithm for reviewer recommendation.]
[Fig. 2. Stage diagram for proposal-reviewer recommendation: proposal clustering, selection of eligible reviewers (relevance index), exclusion of conflict of interest (connectivity index), balancing of reviewer expertise (productivity index), and assignment of reviewers.]
Initially the system constructs profiles of proposals (indexed by i) and researchers (indexed by j), respectively. The proposal profiling and reviewer profiling are discussed in detail in Section 4.1. The three key indices are developed as follows. We first use a component-based matching algorithm to calculate the relevance index (r_ij), which denotes the degree of matching between the proposal profile and the reviewer profile. Based on the Scholarmate platform services, we construct the connectivity index (c_ij) via the collaboration network, indicating the frequency of research collaboration among reviewers, PIs and co-PIs. The generated collaboration network is analyzed by identifying communities and their features, such as structure and closeness, and those features are used in the generation of the connectivity index. The connectivity index is used to resolve conflicts of interest and to identify the most relevant reviewers. Finally, we generate the potential reviewers' productivity index (e_j), which considers the quality of publications, research impact and academic achievement. The productivity index is used to balance the expertise of the potential set of reviewers in the optimization program for reviewer recommendation.
4.1. Profiling
In general, profiling is the process of determining key attributes that can be used to characterize a given object. In our project selection context we focus on proposal profiling and researcher profiling. The objective of proposal profiling is to extract proposal-relevant features, and that of researcher profiling is to extract researcher expertise. The quality of profiling directly affects the effectiveness of research project selection. The integration of both subjective and objective information is necessary during the process of profile generation.
We ļ¬rst focus on proposal proļ¬ling. The proposal submitted
through the Internet-based Science Information System (ISIS) has
standard template to be ļ¬lled in up to two discipline codes and ļ¬ve
keywords. We express the self-claimed discipline code (Discode) and
keywords (Key) in the following sequence:
bPropNo; DisCode1; DisCode2; Key1; Key2; ā€¦; Key5 > Ć°1ƞ
where PropNo is the proposal number that uniquely identiļ¬es a proposal.
This sequence can be directly extracted from the proposal.
To verify whether the claimed information is accurate, an objective examination of the proposal title and abstract is necessary. The second type of information is obtained through data mining the title and abstract sections of the proposal. It can be expressed in the following sequence:

$\langle \text{PropNo}, \text{key}_1, \text{key}_2, \ldots, \text{key}_m \rangle \qquad (2)$

Note that we use lower-case key to represent keywords extracted from the non-standard content area (i.e., title and abstract). This set of keywords has some overlap with, but is generally larger than, the standard keyword database defined by the government funding agency. For a fair comparison of any two proposal documents, we extract m keywords from each document. The search algorithm that we discuss later determines the preferred number of keywords. Ideally we could add the whole content of the proposal to obtain the highest-ranked keywords through word frequency analysis. We found that this would increase the computational effort without adding much new insight. Mining the title and the abstract is accurate enough to classify proposals according to their keywords.
We next consider researcher profiling. The funding agency maintains an expert dictionary for the pool of potential reviewers. The expert
[Fig. 3. An example of a collaboration network.]
Table 1
Researchers' connectivity ranking.

RID   Degree   n-Degree   Betweenness   Closeness    Overall
37    11 (1)   0.1930     0.0345 (4)    0.2069 (1)   0.1685 (1)
18     9 (2)   0.1579     0.0459 (1)    0.1787 (3)   0.1459 (2)
27     6 (2)   0.1053     0.0382 (3)    0.1474 (7)   0.1129 (5)
10     6 (4)   0.1053     0.0453 (2)    0.1843 (2)   0.1328 (3)
15     5 (4)   0.0877     0.0143 (9)    0.1685 (4)   0.1134 (4)
31     5 (6)   0.0877     0.0244 (5)    –            –
19     5 (6)   0.0877     0.0169 (7)    –            –
38     5 (6)   0.0877     –             0.1345 (8)   –
52     4 (8)   0.0702     0.0122 (10)   0.1638 (5)   0.1054 (6)
43     4 (8)   0.0702     0.0163 (8)    –            –
34     –       –          0.0207 (6)    –            –
36     –       –          –             0.1340 (9)   –
44     –       –          –             0.1512 (6)   –
dictionary is standardized and the available choices are the same as those in the proposal application. Initially each potential reviewer chooses his/her own disciplines and expertise areas (expressed as keywords). The self-claimed discipline codes (DisCode) and keywords (Key) are expressed in the following sequence:

$\langle \text{ResearcherID}, \text{DisCode}_1, \text{DisCode}_2, \text{Key}_1, \text{Key}_2, \ldots, \text{Key}_5 \rangle \qquad (3)$

where ResearcherID uniquely identifies a potential reviewer.

Each potential reviewer may have successful grants from different funding agencies and may have publications, patents, or awards from various sources. We extract such objective information from several databases and list it as:

$\langle \text{ResearcherID}, \text{GrantNo}, \text{DisCode}_1, \text{DisCode}_2, \text{Key}_1, \text{Key}_2, \ldots, \text{Key}_5 \rangle \qquad (4)$
$\langle \text{ResearcherID}, \text{PubNo}, \text{key}_1, \text{key}_2, \ldots, \text{key}_m \rangle \qquad (5)$
In addition, the potential reviewers may have social tags. Social tags are labels about expertise areas that are maintained by friends or other concerned parties who may know the reviewers well in other capacities. For example, a panel chair may know the research expertise of a reviewer from his/her previous service to the funding agency. Information extracted from reviewers' social tags can be aggregated and expressed as:

$\langle \text{ResearcherID}, \text{key}_1, \text{key}_2, \ldots, \text{key}_m \rangle \qquad (6)$
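As an illustration only, the profile sequences in Eqs. (1)–(6) can be held in simple record types before matching. The Python sketch below is an assumed representation (the field names are ours, not those of the implemented system):

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ProposalProfile:
    """Self-claimed and mined attributes of a proposal (Eqs. (1)-(2))."""
    prop_no: str
    discipline_codes: List[str]                 # up to two standard codes
    claimed_keywords: List[str]                 # up to five standard keywords
    mined_keywords: List[str] = field(default_factory=list)  # from title/abstract

@dataclass
class ReviewerProfile:
    """Self-claimed, grant, publication and social-tag attributes (Eqs. (3)-(6))."""
    researcher_id: str
    discipline_codes: List[str]
    claimed_keywords: List[str]
    grant_keywords: Dict[str, List[str]] = field(default_factory=dict)  # grant_no -> keywords
    pub_keywords: Dict[str, List[str]] = field(default_factory=dict)    # pub_no -> mined keywords
    social_tags: List[str] = field(default_factory=list)
```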
4.2. Extracting topic features from texts
During the process of objective information extraction, it is necessary to analyze free-text areas such as the titles and abstracts of electronic documents. The determination of a set of topic features from these text fields follows several steps, including extracting phrases, filtering out non-key phrases, resolving semantic heterogeneity and constructing a keyword dictionary. In this study we combine several techniques, including the Rough Threshold Model and Database Tomography, and develop an algorithm to calculate the document phrase weight distribution.
When extracting information from texts such as the titles and abstracts of funded projects and publications, we first need to build a standard research keyword dictionary. Phrases (combinations of multiple words) rather than single words are used to resolve semantic ambiguity, as single words are rarely sufficient to accurately distinguish standing researcher interests [32]. Generally, phrases carry more meaning than single words. We find that phrases of two to four words are strong enough to capture the meaning effectively.
The free-text category fields of scientific publications (e.g., title, abstract and keywords) are analyzed and technical phrases are extracted using the Database Tomography (DT) process [17,18]. DT is a textual database analysis system that provides algorithms for extracting multi-word phrase frequencies together with their proximities. We apply the DT algorithm to extract all adjacent double, triple and quadruple word phrases from the text (i.e., title, abstract and keywords), along with their frequencies. We discard phrases with extremely high frequencies (not useful for distinguishing documents) and those with extremely low frequencies (not useful for comparing documents). Finally, the remaining phrases are built into the keyword dictionary.
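As a rough sketch of this step (not the actual Database Tomography implementation), the following Python code extracts adjacent two- to four-word phrases from free-text fields, counts their frequencies, and discards phrases outside chosen frequency cut-offs; the tokenization rule and thresholds are illustrative assumptions:

```python
import re
from collections import Counter
from typing import Dict

def extract_phrases(text: str, min_len: int = 2, max_len: int = 4) -> Counter:
    """Count all adjacent 2- to 4-word phrases in a free-text field."""
    words = re.findall(r"[a-z0-9\-]+", text.lower())
    counts: Counter = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts

def filter_phrases(counts: Counter, low: int, high: int) -> Dict[str, int]:
    """Discard phrases that are too rare or too common to discriminate documents."""
    return {p: f for p, f in counts.items() if low <= f <= high}

# Example: build phrase counts over a small corpus of titles, then filter.
corpus = ["Semi-supervised learning with support vector machines",
          "Spectral clustering and support vector machines for large datasets"]
total = Counter()
for doc in corpus:
    total += extract_phrases(doc)
dictionary_phrases = filter_phrases(total, low=2, high=50)  # e.g., keeps "support vector machines"
```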
[Fig. 4. Process model and relationship with key indices in reviewer recommendation. Proposal profiling (self-claimed disciplines and keywords from the proposal; keywords extracted from the proposal title and abstract) and researcher profiling (self-claimed disciplines and expertise from the keyword dictionary; keywords from previously funded proposal and publication titles and abstracts; PI/co-PI information from the research social network, e.g., ScholarMate; weighted publication scores from research databases such as ISI, Scopus and EI; citation scores based on SCI/SSCI searches and the H-index; academic ranking and institutional reputation) feed the three key performance indicators of the RAF: the relevance index, the connectivity index (used to remove conflicts of interest) and the productivity index (used to balance reviewer expertise).]
According to the Rough Threshold Model (RTM) [20], documents are represented in terms of a weight distribution over topic features. We use an augmented RTM topic filtering algorithm to generate topic features from the documents. Specifically, let $P = \{p_1, \ldots, p_m\}$ be the initial set of phrases extracted from all documents $D = \{d_1, d_2, \ldots, d_n\}$. Let $f_{ij}$ be the number of appearances of phrase $p_j$ in document $d_i$. A document $d_i$ can be expressed by a set of phrases with corresponding occurrence frequencies: $d_i = \{(p_1, f_{i1}), \ldots, (p_m, f_{im})\}$.

The initial phrase set of $d_i$ is $rp_i = \{p_j \mid f_{ij} > 0\}$. If two documents have the same phrase patterns, the two initial phrase patterns can be composed. For example, $\{(p_1,1),(p_2,3)\} \oplus \{(p_1,2),(p_2,2)\} = \{(p_1,3),(p_2,5)\}$, where $\oplus$ denotes the composition operation. We can group the initial phrase patterns that have the same phrase sets into clusters and use their composed phrase pattern to represent each cluster. Assume that there are $r < n$ clusters. A cluster can be represented by $crp_r = \{(p_1, cf_{r1}), (p_2, cf_{r2}), \ldots, (p_m, cf_{rm})\}$, where the cluster frequency $cf_{rk} = \sum_{i=1}^{|crp_r|} f_{ik}$, for $k = 1, 2, \ldots, m$, is the composed frequency in the cluster.
We deļ¬ne the support for phrase pattern rpi āˆˆcrpr as follows.
support crpr
Ć° ƞ Ā¼
crpr
j j
D
j j
ư7ƞ
Furthermore, āˆ‘rsupport crpr
Ć° ƞ Ā¼ 1.
The normal form of the cluster phrase pattern can be described by the following association mapping function: $\beta(crp_r) = \{(p_1, w_{r1}), (p_2, w_{r2}), \ldots, (p_m, w_{rm})\}$, where the normalized phrase frequency is defined as:

$$w_{rk} = \frac{cf_{rk}}{\sum_{i=1}^{m} cf_{ri}}, \quad k = 1, 2, \ldots, m. \qquad (8)$$
The relative importance weight of phrase $p_k$ in document $i$ over all documents can be defined as:

$$\beta_{ik} = \sum_{p_k \in rp_i \in \beta(crp_r)} \text{support}(crp_r)\, w_{rk}\, \frac{f_{ik}}{cf_{rk}}. \qquad (9)$$

Document $i$ can alternatively be represented by its phrase weight distribution $\beta_i = \{\beta_{i1}, \beta_{i2}, \ldots, \beta_{im}\}$.
For a given document (i.e., a set of publications and projects), all initial phrase patterns are calculated with their pattern frequencies. The generated patterns are combined to construct clusters, and clusters are labeled using the phrases in the combined patterns. Each pattern frequency in the cluster is normalized and the normalized weights are calculated. Finally, a document that is uniquely represented by its initial phrase patterns can be characterized by its phrase weight distribution across all documents. The algorithm is summarized in Table 2.
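For readers who prefer code, a minimal Python sketch of the same weighting scheme, under our reading of Eqs. (7)–(9), is given below; it groups documents with identical phrase sets, computes cluster support and normalized frequencies, and spreads them back to each document:

```python
from collections import defaultdict
from typing import Dict

def phrase_weight_distribution(docs: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, float]]:
    """Compute beta_ik for every document i and phrase k, following Eqs. (7)-(9)."""
    n_docs = len(docs)

    # Group documents whose initial phrase sets rp_i are identical (one cluster per set).
    clusters: Dict[frozenset, list] = defaultdict(list)
    for doc_id, freqs in docs.items():
        clusters[frozenset(p for p, f in freqs.items() if f > 0)].append(doc_id)

    beta: Dict[str, Dict[str, float]] = {}
    for phrase_set, members in clusters.items():
        # Composed cluster frequencies cf_rk and cluster support (Eq. (7)).
        cf = {p: sum(docs[d].get(p, 0) for d in members) for p in phrase_set}
        support = len(members) / n_docs
        total_cf = sum(cf.values())
        for d in members:
            beta[d] = {}
            for p in phrase_set:
                w_rk = cf[p] / total_cf                           # Eq. (8)
                beta[d][p] = support * w_rk * docs[d][p] / cf[p]  # Eq. (9)
    return beta
```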
4.3. Relevance
The relevance index is used to determine how well reviewer expertise matches the content of the proposal. It is calculated by matching the proposal and reviewer profiles. The task of profile matching is to decide whether a sequence of key phrases that describes the proposal profile attributes matches the key phrases that represent the reviewer profile attributes. Two widely accepted approaches for calculating the similarity between terms are the Euclidean distance and the cosine similarity measure [14]. For the self-claimed information that is extracted in standard terms, we use the Jaccard similarity measure [1] to perform component-based matching over reviewer and proposal profiles. Data extracted by Eqs. (1), (3) and (4) can be matched using this method.
The Jaccard index between reviewer $i$ and proposal $j$ is expressed as:

$$J_{ij} = \frac{F\left[(\text{Key}_{i1}, \text{Key}_{i2}, \ldots, \text{Key}_{i5}) \cap (\text{Key}_{j1}, \text{Key}_{j2}, \ldots, \text{Key}_{j5})\right]}{F\left[(\text{Key}_{i1}, \text{Key}_{i2}, \ldots, \text{Key}_{i5}) \cup (\text{Key}_{j1}, \text{Key}_{j2}, \ldots, \text{Key}_{j5})\right]} \qquad (10)$$
where $\text{Key}_{ik}$ and $\text{Key}_{jk}$, $k = 1, 2, \ldots, 5$, are the five standard keywords associated with reviewer $i$ and proposal $j$. The numerator denotes the number of keywords in common, and the denominator represents the total number of unique keywords in both profiles. As shown, the Jaccard similarity is measured by the frequency of the intersection divided by the frequency of the union of the two sets of keywords [1].
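A direct Python sketch of Eq. (10) over the two five-keyword lists (a plain set computation that assumes exact string matching of the standardized terms):

```python
def jaccard_index(reviewer_keys: set, proposal_keys: set) -> float:
    """Eq. (10): shared standard keywords over the union of both keyword sets."""
    if not reviewer_keys and not proposal_keys:
        return 0.0
    return len(reviewer_keys & proposal_keys) / len(reviewer_keys | proposal_keys)

# Example: two shared keywords out of eight distinct ones -> 0.25.
print(jaccard_index(
    {"machine learning", "spectral clustering", "svm", "kernels", "semi-supervised"},
    {"machine learning", "svm", "regression", "image processing", "data reduction"}))
```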
To determine the similarity of the non-standard phrase patterns, we adopt the cosine similarity measure. For researcher profile $i$ and proposal profile $j$, the similarity can be calculated as follows [9]:
$$C_{ij} = \frac{\beta_i \cdot \beta_j}{\lVert \beta_i \rVert\, \lVert \beta_j \rVert} = \frac{\sum_{k=1}^{m} \beta_{ik}\,\beta_{jk}}{\sqrt{\sum_{k=1}^{m} \beta_{ik}^{2} \sum_{k=1}^{m} \beta_{jk}^{2}}} \qquad (11)$$
where $\beta_{ik}$ and $\beta_{jk}$ are the normalized frequencies of phrase pattern $p_k$ in the two profiles $i$ and $j$. Phrase patterns extracted by Eqs. (2), (5), and (6) are processed by the algorithm presented in Table 2. The resulting weight distribution is used to derive the similarity measure.
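For the non-standard phrase patterns, Eq. (11) is the usual cosine between two weight distributions; a small sketch over the β dictionaries produced by the previous procedure (the helper name and sparse-dictionary representation are ours):

```python
import math
from typing import Dict

def cosine_similarity(beta_i: Dict[str, float], beta_j: Dict[str, float]) -> float:
    """Eq. (11): cosine between two phrase-weight distributions (missing phrases count as 0)."""
    dot = sum(w * beta_j.get(p, 0.0) for p, w in beta_i.items())
    norm_i = math.sqrt(sum(w * w for w in beta_i.values()))
    norm_j = math.sqrt(sum(w * w for w in beta_j.values()))
    return dot / (norm_i * norm_j) if norm_i and norm_j else 0.0
```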
Note that each researcher may have several grants or publications, and there are different ways to define the similarity measure within each category (grants or publications). The first possibility is to consolidate several documents in the same category into one integrated document that represents the researcher profile in that specific category. The algorithm then generates one weight distribution for the consolidated document, and within each category only one consolidated measure $C_{ij}$ is derived. Another method is to treat the documents separately. The algorithm then produces one weight distribution for each document, and pair-wise similarity measures can be calculated between each of the researcher's documents and the proposal. We then choose the maximum similarity in a category as the final measure of similarity between the proposal and the potential reviewer in that specific category.
Since multiple sources of information, both subjective and objective, need to be aggregated, an appropriate weighting strategy is needed to reflect their relative importance in the overall evaluation [9]. Denote by $r_{ij}$ the degree of matching between proposal $i$ and potential reviewer $j$. An aggregate measure in the relevance dimension can be obtained as follows:

$$r_{ij} = \alpha\, \text{Self}_{ij} + \beta\, \text{Grant}_{ij} + \gamma\, \text{Pub}_{ij} + \delta\, \text{Social}_{ij} \qquad (12)$$

where $\alpha + \beta + \gamma + \delta = 1$.
The four terms refer to self-claimed information, grants, publications, and social tags. Note that the self-claimed information from the proposal (Self) and the social tags that label the potential reviewers (Social) are related to subjective judgment, while grants and publications provide objective measures related to the match between proposals and potential
Table 2
Algorithm to calculate the document phrase weight distribution.

Input: a document set D and a phrase set P
Output: each document's phrase weight distribution β_i = (β_i1, β_i2, …, β_im)

Initialize RP = ∅
for each d_i ∈ D {
    for each p_j ∈ P:
        record d_i = {(p_1, f_i1), …, (p_m, f_im)}
    rp_i = {p_j | f_ij > 0}
    RP = RP ∪ {rp_i}
}
RP = ⊕RP  (compose patterns that share the same phrase set)
Cluster documents into crp_r based on rp_i
Calculate support(crp_r) based on Eq. (7)
crp_r = {(p_1, cf_r1), (p_2, cf_r2), …, (p_m, cf_rm)}; normalize to weights w_rk based on Eq. (8)
for each d_i ∈ D {
    for each p_j ∈ rp_i ∈ crp_r:
        calculate β_ik based on Eq. (9)
}
END
reviewers. Decision makers may assign different weights to aggregate
both subjective and objective information.
As shown in Fig. 5, the proposal-related information, including discipline codes, keywords, abstract and PI, is displayed at the top of the screen. The relevance score is calculated and displayed in the middle. Clicking on each tab shows the matches identified by the system.
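Using the component scores displayed in Fig. 5 (self-claimed 100, grants 40, publications 85, social 80) and the displayed weights (40%, 20%, 20%, 20%), Eq. (12) reproduces the overall relevance score of 81:

```python
# Weighted aggregation of Eq. (12) with the weights displayed in Fig. 5.
alpha, beta, gamma, delta = 0.4, 0.2, 0.2, 0.2      # alpha + beta + gamma + delta = 1
self_ij, grant_ij, pub_ij, social_ij = 100, 40, 85, 80
r_ij = alpha * self_ij + beta * grant_ij + gamma * pub_ij + delta * social_ij
print(r_ij)  # 81.0
```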
Efļ¬ciency of the matching algorithm can be calculated in terms of
time complexity. The algorithm requires a single traverse through all
the set of reviewer proļ¬les for each proposal. The matching between
pre-generated subjective, objective and social information patterns in
proposal and reviewer proļ¬les requires O(nāˆ—m) in its worst-case,
where n is the number of proposals and m is the number of reviewers.
The proposals are clustered according to their disciplines and a set of
proposals is matched against all the reviewers. In order to reduce the
computational complexity of the algorithm, proļ¬les of reviewers and
proposals are constructed beforehand.
4.4. Connectivity
The nature of the connections between reviewers, PIs and co-PIs is very important when assigning reviewers to evaluate proposals. Having the same expertise as the principal investigators and having no direct personal conflict with the PIs are essential constraints that should be satisfied by the reviewers. Thus, in this study we utilize social network analysis concepts, such as community structure and the closeness of individuals in the same community, to discover non-trivial relationships among researchers. After analyzing the individuals in one community, we are able to identify groups of individuals who have similar research interests, who are active in the corresponding research area, and who have close connections with PIs or co-PIs. Such information is then used to remove conflicts of interest and to aid the preferential assignment of the most relevant reviewers.
Several types of networks can be constructed using the available social network data in Scholarmate. For example, we can represent scientific papers as vertices in a graph, where vertices are connected by an edge when one paper cites the other. Alternatively, we can construct the researcher network. Each researcher is represented as a vertex in a graph. An edge is built when one researcher cites another's work (a directed citation network), or when one researcher co-authors with another researcher (an undirected collaboration network). We define the edge weight as the number of citations or collaborations between two researchers. A higher weight implies stronger connectivity between the two researchers.
We use graph clustering methods to detect communities in these graphs. Hierarchical clustering is a traditional method for detecting community structure. Here we focus on the collaboration network. We first assign a weight u_ij to each pair of vertices in the network, defined as the frequency of collaboration between the two researchers, which therefore represents how closely the researchers are connected. By analyzing the implicit community structure and estimating the strength of ties between individuals, we are able to discover nontrivial patterns of interaction in scientific collaboration networks.
Assume that there are $s$ predefined communities. Define $u_{IJ}$ as the fraction of collaboration frequency between researchers in community $I$ and those in community $J$. Denote $a_I = \sum_J u_{IJ}$, which represents the
[Fig. 5. Matching between a proposal and a reviewer to calculate the relevance score. The screen shows proposal 53361479 with discipline code F020508 (pattern recognition theory and application), keywords (machine learning, semi-supervised learning, spectral clustering, support vector machine), its abstract and principal investigator, the reviewer's matched grants (60775045, "Data Reduction Method for Machine Learning", discipline code F030504, data mining and machine learning; 61033013, "Theories and Technologies of Image Invariant Features Based on Cognitive Models", discipline codes F0205 and F020512), and the relevance scores: overall 81, self-claimed 100 (weight 40%), grants 40 (20%), publications 85 (20%), social 80 (20%).]
weighted fraction of edges that connect to vertices in community $I$ (i.e., the fraction of collaborations between researchers in community $I$ and researchers in other communities). Newman's fast algorithm is based on the idea of modularity [25]. Following this approach, we define the modularity measure for a network with $s$ communities as:

$$Q_s = \sum_{I=1}^{s} \left( u_{II} - a_I^{2} \right) \qquad (13)$$
where $u_{II}$ is the weighted fraction of edges in the network that connect vertices in the same community. A high value of $Q_s$ represents a good community division. However, optimizing $Q_s$ over all possible divisions is infeasible in practice for networks larger than about thirty vertices. Various approximation methods are available, such as simulated annealing and genetic algorithms. Here a standard "greedy" optimization algorithm is used. The algorithm that determines the optimal community structure takes the steps shown in Table 3.
The algorithm starts with n communities, where n is the total number of nodes in the collaboration network: each vertex is initially the sole member of a distinct community, and the algorithm iteratively merges pairs of communities that are connected by edges. The time taken to join any pair of communities is at most m, where m is the total number of edges in the graph. The change in $Q_s$ can be calculated in constant time in each iteration. Following a join, the elements of the matrix W are updated by adding together the rows and columns corresponding to the joined communities. Each step therefore takes worst-case time O(n+m), and at most n−1 joins are required for the algorithm to complete. The time complexity of the algorithm is thus O((n+m)n), or O(n²) for a sparse network.
Since the value of $Q_{|W|}$ is calculated in each iteration, finding the optimal community structure is straightforward. The hierarchical clustering method also enables us to define the community structure at the required level of granularity. To determine the connectivity, we extract all principal investigators and other members of proposal i. If none of them is in the same community as potential reviewer j, we deem the reviewer not to be an ideal candidate to review the proposal and label $g_{ij} \ll 1$ to indicate a mismatch. Otherwise, we label $g_{ij} = 1$, indicating a high goodness of fit.
Resolving conflicts of interest is an important step in the reviewer assignment process. For example, to ensure an objective review of the proposal, the government funding agency requires that applicants and reviewers should not have had a co-author relationship in the last five years. A conflict of interest can be immediately identified by a direct link in our collaboration network. If any of the primary members of proposal i has a conflict of interest with potential reviewer j, we label $c_{ij} = 0$, enforcing a "No" decision in the reviewer assignment.
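To make the two connectivity indicators concrete, the hedged sketch below builds a small weighted collaboration graph, detects communities with networkx's greedy modularity routine (the same CNM family as Newman's fast algorithm), and derives c_ij and g_ij for one PI-reviewer pair; the edge weights and the mismatch value for g_ij are illustrative assumptions:

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Weighted collaboration network: edge weight = collaboration frequency u_ij (assumed values).
G = nx.Graph()
G.add_weighted_edges_from([(37, 51, 3), (37, 55, 2), (38, 51, 1), (38, 55, 2), (18, 10, 4)])

# Communities via greedy modularity maximization over edge weights.
communities = list(greedy_modularity_communities(G, weight="weight"))
community_of = {v: idx for idx, comm in enumerate(communities) for v in comm}

def connectivity_scores(pi: int, reviewer: int, mismatch: float = 1e-3):
    """Return (c_ij, g_ij) for one PI and one potential reviewer."""
    # Conflict of interest: a direct collaboration link disqualifies the reviewer (c_ij = 0).
    c_ij = 0 if G.has_edge(pi, reviewer) else 1
    # Preferential weight: g_ij = 1 when PI and reviewer share a community, small otherwise.
    g_ij = 1.0 if community_of.get(pi) == community_of.get(reviewer) else mismatch
    return c_ij, g_ij

print(connectivity_scores(pi=51, reviewer=55))  # (1, 1.0): same community, no direct link
```

This mirrors the example around Fig. 3, where researcher 55 is an eligible reviewer for PI 51 because both collaborate with researchers 37 and 38 but never with each other.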
4.5. Productivity
The productivity index is calculated for potential reviewers and indicates their contribution to the field. For fair and unbiased project selection, productivity needs to be balanced among the reviewers who are assigned to evaluate the same proposals. We measure the productivity of a potential reviewer in terms of the number of publications, the quality of the publications and the citation impact over the past five years. The productivity index is computed by aggregating the quality and quantity of publications.
Generally, academic journals are classified into different disciplines and are assigned a rank, such as level A, level B or level C journals. As in [33], we assume that the journal rank reflects the quality of the articles published in that journal, since journal rank is widely used in research performance measurement related to merit increases and to the allocation of research funding in university settings [36]. Following [33], we adopt a weighted scheme to generate the productivity index as a measure of a researcher's overall contribution to the field. Let $q_{ij}$ be reviewer $j$'s total number of publications in rank-$i$ journals, where $i = A, B, C$. The publication score of reviewer $j$ is expressed as:

$$G_j = w_A q_{Aj} + w_B q_{Bj} + w_C q_{Cj} \qquad (14)$$

where $w_A > w_B > w_C$, indicating the emphasis on quality work. There are different ways to define the weights. For example, the average impact factor of all the journals classified at the same level can be used to define the corresponding weight.
Professional titles (e.g., senior scholars such as Professor and Associate Professor, or junior scholars such as Assistant Professor) and the H-index can also be taken into consideration when recommending reviewers for proposals. We may assign higher rank scores to higher professional titles. Let $R_j$ and $H_j$ be potential reviewer $j$'s rank score and H-index, respectively. An integrated research productivity measure can be obtained as follows:

$$e_j = u G_j + v R_j + t H_j \qquad (15)$$

where $u + v + t = 1$.
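A toy calculation of Eqs. (14) and (15); the journal-level weights, publication counts, rank score, H-index and aggregation weights below are illustrative assumptions, not values prescribed by the funding agency:

```python
# Publication score (Eq. (14)): counts in level A/B/C journals, with w_A > w_B > w_C.
w = {"A": 5.0, "B": 3.0, "C": 1.0}      # e.g., average impact factors per journal level
q = {"A": 2, "B": 4, "C": 6}            # reviewer j's publication counts by level
G_j = sum(w[level] * q[level] for level in w)   # 5*2 + 3*4 + 1*6 = 28.0

# Integrated productivity (Eq. (15)) with u + v + t = 1.
u, v, t = 0.5, 0.2, 0.3
R_j, H_j = 3, 15                        # rank score (e.g., Professor) and H-index
e_j = u * G_j + v * R_j + t * H_j       # 14.0 + 0.6 + 4.5 = 19.1
print(G_j, e_j)
```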
5. Assigning reviewers for proposal evaluation
The reviewer assignment process deals with assigning reviewers to evaluate proposals in specific discipline areas. Current practice is the manual matching of proposals to reviewers based on their declared expertise. This is inefficient because subjective expertise judgment alone is insufficient to determine reviewer expertise, as it lacks objective evidence. We introduce the relevance measure to balance the self-claimed expertise and the expertise induced from the derived objective information. The key objective is to maximize the relevance between proposals and potential reviewers.
Because the quality of evaluation largely depends on the experience and judgment of the reviewers, there is a need to balance expertise among the reviewers who are assigned to the same proposal. For example, senior scholars tend to place more weight on the innovativeness of a proposal than their junior counterparts, while junior scholars tend to place more weight on methodological rigor than senior fellows. Let e be the desired average productivity level of the potential reviewers. This can be determined by the relevant decision makers, such as panel chairs or division managers. We want the average reviewer expertise levels to be close enough to this desired level. For example, if a potential reviewer is a junior scholar whose e_j is significantly lower than e, then the proposal would need a senior scholar whose
Table 3
Algorithm to cluster the collaboration network into communities.

Step 1. Initially there are n vertices representing researchers, and u_ij is the collaboration frequency between researchers i and j. Each vertex starts as the sole member of a distinct community. Calculate the within- and between-community collaboration fractions u_II and u_IJ, and form the matrix

$$W = \begin{pmatrix} u_{11} & \cdots & u_{1n} \\ \vdots & \ddots & \vdots \\ u_{n1} & \cdots & u_{nn} \end{pmatrix}.$$

Calculate $a_I$.

Step 2. Calculate $\Delta Q_{IJ} = u_{IJ} + u_{JI} - 2 a_I a_J$. Choose $(I, J) = \arg\max \Delta Q_{IJ}$ to join if $\Delta Q_{IJ} \ge 0$, or $(I, J) = \arg\min \Delta Q_{IJ}$ if $\Delta Q_{IJ} < 0$.

Step 3. Update the matrix elements $u_{IJ}$ by adding together the rows and columns corresponding to the joined communities. Update $a_I$. Calculate $Q_{|W|}$ according to Eq. (13).

Step 4. Repeat steps 2 and 3 to join communities in pairs until all vertices are joined.

Step 5. The optimal community structure is determined by $s = \arg\max Q_{|W|}$.
productivity measure is significantly higher than e to review the proposal.
First we construct a network model in which each proposal and each potential reviewer is represented as a node. The potential reviewer node is called the supply node, and the proposal node is called the demand node. Assume that there is a set I of proposals and a set J of potential reviewers. Let x_ij be the binary decision variable indicating the assignment of proposal i to potential reviewer j: x_ij = 1 implies a recommended assignment and x_ij = 0 implies that the assignment is not recommended. We maximize the relevance subject to flow constraints that reflect management's requirements for the reviewer assignment. The optimization problem can be expressed as:
expressed as:
Max āˆ‘
iāˆˆI
āˆ‘jāˆˆJcijgijrijxij
s:t: āˆ‘jāˆˆJxij ā‰„ b for iāˆˆI
āˆ‘iāˆˆIxij ā‰¤ d for jāˆˆJ
āˆ‘jāˆˆJ ejxijāˆ’e
 
ā‰¤ Īµ for iāˆˆI
xijāˆˆ 0; 1
f g for iāˆˆI; jāˆˆJ
: ư16ƞ
The coefļ¬cients in the objective function ensure that we maximize
the overall relevance measure in the reviewer and proposal pools. cij
is the indicator variable to remove conļ¬‚ict of interest, and gij is the
coefļ¬cient for preferential assignment of reviewers in the same scien-
tiļ¬c research community.
The ļ¬rst set of constraints ensures that each proposal has at least b
reviewers. The second set of constraints guarantees that each reviewer
cannot review more than d proposals. In practice, usually b=3 and d=
20. The third set of constraints is used to balance reviewer expertise.
Note that, Īµ0 is the tolerance level that can be chosen by the panel
chair or the management team.
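One way to solve the program in Eq. (16) is with an off-the-shelf MILP solver. The sketch below uses PuLP, which is our assumption (the paper does not name a solver), and linearizes the absolute-value expertise-balance constraint as two inequalities:

```python
import pulp

def assign_reviewers(score, e, e_bar, b=3, d=20, eps=5.0):
    """score[i][j] = c_ij * g_ij * r_ij; e[j] = productivity of reviewer j; eps is assumed."""
    I, J = range(len(score)), range(len(score[0]))
    prob = pulp.LpProblem("reviewer_assignment", pulp.LpMaximize)
    x = pulp.LpVariable.dicts("x", (I, J), cat="Binary")

    # Objective: maximize overall conflict-free, community-weighted relevance.
    prob += pulp.lpSum(score[i][j] * x[i][j] for i in I for j in J)

    for i in I:
        prob += pulp.lpSum(x[i][j] for j in J) >= b             # at least b reviewers per proposal
        # |sum_j e_j x_ij - e_bar| <= eps, written as two linear constraints.
        prob += pulp.lpSum(e[j] * x[i][j] for j in J) <= e_bar + eps
        prob += pulp.lpSum(e[j] * x[i][j] for j in J) >= e_bar - eps
    for j in J:
        prob += pulp.lpSum(x[i][j] for i in I) <= d             # at most d proposals per reviewer

    prob.solve()
    return [[int(x[i][j].value()) for j in J] for i in I]
```

The defaults b = 3 and d = 20 follow the values quoted in the text; the tolerance eps is a placeholder for the level chosen by the panel chair.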
To implement this model, we first analyze the community structures to remove conflicts of interest and to identify potential reviewers. Next, we calculate the relevance degree between the reviewers and the proposals such that the PIs of the proposals belong to the same communities as their potential reviewers. The calculated relevance degrees are sorted, and reviewers with high relevance degrees are selected to evaluate those proposals. Finally, productivity among the reviewers assigned to one proposal is balanced and the workload is evenly distributed among reviewers.
To achieve a higher degree of computational performance, the collaboration networks for the reviewers and the PIs of the proposals under each division are constructed, and the optimal numbers of communities are derived, before the reviewer assignment process is carried out. First, the time complexity of traversing the community graph for the connectivity index calculation is O(n0 + m0), where n0 is the total number of nodes in one community and m0 is the number of connections between individuals. Second, it requires O(n0·m0) time to scan through the whole set of reviewers and proposals when generating the relevance degree matching. Third, the time complexity of sorting the end result is O(n1 log n1), where n1 is the number of matching results. Finally, the time complexity of balancing the productivity of reviewers in the same group is O(1) and is negligible. In summary, the worst-case computational complexity of the proposed technique is O(n0 + m0 + n0·m0 + n1 log n1).
6. Implementation and evaluation
The proposed research analytics framework has been implemented to aid the largest government funding agency in China in its grant proposal evaluation. The agency aims at funding scientific research projects that could make a large social impact. The organizational hierarchy of the funding agency consists of one general office, five bureaus, and eight scientific departments. These departments are responsible for funding and managing research projects. Each department is further divided into divisions that focus on more specific research areas.
There is intense competition for research project funding, with the most recent funding rate only 21% in 2011. The government funding agency received around 147,000 and 170,000 proposals in 2011 and 2012, respectively. Proposals are spread over many scientific disciplines. These conditions make it difficult for the evaluation committee to participate directly in every project evaluation. The committee groups the proposals into different areas and delegates its authority to groups of experts according to research area. Each area may consist of multiple related disciplines. For example, Business is an area that includes Management Science, Information Systems, and other business disciplines. There is a general budget to be distributed among the areas. The distribution of funds is not uniform and reflects priorities set by the evaluation committee of the funding agency. The distribution can be adjusted based on the quality and quantity of proposals submitted to each area.
Research project selection is a process that involves the multiple phases illustrated in Table A in Appendix A. To facilitate project selection, the government funding agency has established an evaluation system that includes peer review and expert panel evaluation. Division managers assign and invite external reviewers and panel experts to evaluate the proposals. The reviewers judge the quality of each project proposal based on their expertise and professional experience, and on the norms and criteria set by the funding agency. As such, reviewer assignment is the most important phase affecting the quality and efficiency of research project selection.
We provide computerized support for the second phase of research project selection. In the prototype implementation of our system, the distribution of funds is out of the scope of this study; our focus is the reviewer assignment recommendation. We have tested different subsets of proposals and reviewers. The system computes the matching score in the relevance dimension for each pair of proposal and potential reviewer. The final assignment problem can be solved in a reasonable amount of time. The solution is recommended to the review panels in their respective divisions. The review panels examine the recommendation and have the right to either accept or reject our recommended assignment. Additionally, we provide data visualization to help managers view the assignment progress. Fig. 6 shows an example of the visualization.
Overall, it takes a maximum of 6 hours to compute the matching degrees of 34,000 proposals and 30,000 reviewers, which is the largest number of proposals received by a single department of the government funding agency. Thus, if we use parallel and distributed computing for the assignment optimization in each department (there are eight departments in total), we can finish the recommendation task within 6 hours. This greatly improves work efficiency, as the manual process of assigning reviewers usually takes up to two weeks to complete.
The quality of the recommendations has been acknowledged by the review panels. The profile-based recommendation takes into consideration detailed information on relevance, productivity and connectivity. It can avoid conflicts of interest and provide decision makers with the most relevant information, which can hardly be obtained by manual processes. The largest government funding agency has agreed to adopt our recommendation system in the next round of proposal evaluation.
7. Conclusion
Building upon a research analytics framework, this study presents a new approach for research project selection in a research social network environment. We built profiles of research entities (e.g., research proposals, reviewers) from three aspects: relevance, productivity and connectivity. Information for building the profiles of research entities can be obtained from the research social network (Scholarmate). Degrees of matching based on the profiles of research
entities can be calculated by aggregating subjective, objective and social information collected from multiple sources. We implemented the system to aid the largest funding agency in China in optimizing reviewer recommendation and supporting reviewer assignment. The implementation results showed that the proposed method greatly improved work efficiency.
Our approach can easily be generalized to support different types of recommendations in a research social network environment. A direct application is journal article review. Based on the analysis of article features, our system can be used to select the initial pool of reviewers, calculate the degree of match between potential reviewers and the article, remove conflicts of interest, balance reviewer expertise and productivity, and make final reviewer assignment recommendations. The process can be automated and monitored by journal editors. In comparison with current practice, which mainly relies on editors' subjective judgment facilitated by automated search tools, our system has the ability to optimize reviewer recommendation empowered by more social functionalities. Improved accuracy and work efficiency can be expected.
Other potential applications include recommending funding opportunities, publication outlets for research articles, and potential research collaborators. For example, researchers can easily promote their recently published articles using social tools in the form of likes, tweets, shares, and more. They can even track results when their articles are cited by others. Meanwhile, the system may recommend researchers who work in the same research areas to each other, within and across different research communities. Based on a researcher's profile, the research social network may also recommend journals that have published relevant topics as potential outlets for working papers. All of these functions are very useful for promoting the timely distribution and targeted dissemination of research work.
There are a number of limitations and possible future research directions. First, a research project has various attributes that can potentially influence both the impact and the probability of success of the project. We do not model the decision makers' preferences, beliefs, priorities, or risk attitudes. Presumably, the reviewer assignment decision problem can be modeled as a multi-objective decision problem.
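One hedged way to write such a formulation, reusing the goodness of fit gij, the productivity index ej and the desired average productivity level from the notation table in Appendix A (the per-proposal reviewer count q, the workload cap b and the constraint structure are illustrative assumptions), is:

```latex
\max_{x_{ij}\in\{0,1\}} \; \sum_{i}\sum_{j} g_{ij}\,x_{ij}
\quad \text{s.t.} \quad
\sum_{j} x_{ij} = q \;\;\forall i, \qquad
\sum_{i} x_{ij} \le b \;\;\forall j, \qquad
\frac{1}{q}\sum_{j} e_{j}\,x_{ij} \ge \bar{e} \;\;\forall i,
```

where x_ij = 1 if proposal i is assigned to reviewer j and \bar{e} denotes the desired average productivity level set by panel chairs or division managers. Additional objectives reflecting decision makers' priorities or risk attitudes could be added as weighted terms to obtain a genuinely multi-objective formulation.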
Second, our proposed framework only focuses on the evaluation of individual projects without building a portfolio of the most promising projects among all submitted proposals. The portfolio of projects to be funded and the amount to be awarded to each project are beyond the scope of this research. Project evaluation, like product review, is highly subjective, and there is no feedback mechanism in the current framework to assess the quality of reviews. Historical records of funded projects, including their relevant characteristics, the evaluations given by reviewers, and the research output measured by publications, could be valuable for better evaluating new proposals and for selecting unbiased reviewers. Future extensions of the research framework may take these aspects into account.
Finally, the power of Scholarmate lies in its ability to extract and aggregate information from multiple sources. We need to continuously improve the search tool to meet the increasing search needs of users. Moreover, standardization of the keyword dictionary can greatly help phrase pattern recognition. While we keep evaluating and updating the keyword dictionary based on feedback on algorithm performance, we are aware that social voting is another efficient approach to identify relevant keywords and remove less meaningful ones. We have implemented many social tools to aid system improvement. The ultimate goal is to promote a healthy research environment for researchers to engage in innovative research production.
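As a simple illustration of the social-vote idea (the vote threshold and the data layout are assumptions for this sketch, not the platform's actual mechanism), a keyword could be kept in the dictionary only when enough researchers endorse it:

```python
from collections import Counter

def filter_keywords_by_votes(votes, min_votes=3):
    """Keep only keywords endorsed at least `min_votes` times.
    `votes` is an iterable of (researcher_id, keyword) endorsement pairs."""
    counts = Counter(keyword for _, keyword in votes)
    return {kw for kw, n in counts.items() if n >= min_votes}

# Example: a rarely endorsed (misspelled) keyword is dropped from the dictionary
votes = [(1, "social network analysis"), (2, "social network analysis"),
         (3, "social network analysis"), (4, "data minning")]
print(filter_keywords_by_votes(votes))  # {'social network analysis'}
```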
Acknowledgment
This research is partially funded by the General Research Fund of the
Hong Kong Research Grants Council (Project No: CityU 119611), the
National Natural Science Foundation of China (Project Nos: 71171172
and J1124003) and the City University of Hong Kong (Project No:
6000201).
Fig. 6. Visualization of the reviewer assignment progress. [Figure omitted: a management dashboard covering project clustering, reviewer clustering, reviewer assignment, reviewer invitation and review status, with counts of proposals and reviewers at each stage (e.g., clustered, assigned, invited, logged in, in progress, returned, unreturned, declined).]
Appendix A

Table A
Research project selection process at the government funding agency.

Phases in the R&D project selection process and their key decisions:
Call for proposal and proposal submission: 1) check the validity of the submitted proposal content; 2) verify fulfillment of application requirements by the principal investigator and by the proposal.
Identifying the most suitable external reviewers for proposal evaluation: 1) select potential reviewers based on claimed expertise; 2) assign external reviewers to validated proposals based on predefined criteria; 3) transfer proposals to the responsible divisions.
Peer review: 1) review the quality and content of proposals by external reviewers based on the provided guidelines; 2) validate the review content; 3) coordinate with external reviewers and complete the review process as scheduled.
Review results aggregation: 1) aggregate the review results, transform them into comparable measurements, and rank the proposals accordingly; 2) recommend proposals for panel evaluation.
Panel evaluation: 1) refine the suggested proposal list by making decisions on marginal proposals by a panel of experts; 2) suggest the list of projects to be funded.
Final decision making: 1) consider exceptional cases; 2) recommend the list of projects to be funded.

Table B
Table of notation.

Profiling
P = {p1, …, pm}: initial set of m phrases
D = {d1, d2, …, dn}: initial set of n documents
r: number of clusters
fik: occurrence frequency of phrase k in document di, k = 1, 2, …, m
rpi: initial phrase set of document di
crpr: cluster of phrases
support(crpr): supporting measure of cluster crpr
wrk: normalized phrase frequency for cluster r, k = 1, 2, …, m
βik: relative importance weight of phrase pk in document i, k = 1, 2, …, m
β(crpr): normal form of the cluster phrase patterns
βi = {βi1, βi2, …, βim}: phrase weighted distribution of document i

Relevance index rij
Jij: Jaccard similarity index of proposal i and reviewer j
Cij: cosine similarity index of proposal i and reviewer j

Connectivity index cij
uij: collaboration frequency between researchers i and j
uIJ: collaboration frequency among researchers in community I and those in community J
aI: weighted fraction of edges that connect to vertices in community I
Qs: modularity measure for a network with s communities
gij: goodness of fit between proposal i and reviewer j

Productivity index ej
Gj: potential reviewer j's publication score
Rj: potential reviewer j's academic rank
Hj: potential reviewer j's H-index
ej: potential reviewer j's productivity measure
e: desired average productivity level determined by panel chairs or division managers

References
[1] H. Abe, S. Tsumoto, Analysis of research keys as temporal patterns of technical term usages in bibliographical data, in: A. An, P. Lingras, S. Petty, R. Huang (Eds.), Active Media Technology, 6335, Springer, Berlin Heidelberg, 2010, pp. 150–157.
[2] E.M. Airoldi, X. Bai, K.M. Carley, Network sampling and classification: an investigation of network model representations, Decision Support Systems 51 (3) (2011) 506–518.
[3] R. Baeza-Yates, B. Ribeiro-Neto, Modern Information Retrieval, second edition, Addison-Wesley, Wokingham, UK, 2011.
[4] A. Bajaj, R. Russell, AWSM: allocation of workflows utilizing social network metrics, Decision Support Systems 50 (1) (2010) 191–202.
[5] A.L. Barabasi, H. Jeong, Z. Neda, E. Ravasz, A. Schubert, T. Vicsek, Evolution of the social network of scientific collaborations, Physica A: Statistical Mechanics and Its Applications 311 (3) (2002) 590–614.
[6] J.P. Caulkins, W. Ding, G.T. Duncan, R. Krishnan, E. Nyberg, A method for managing access to web pages: filtering by Statistical Classification (FSC) applied to text, Decision Support Systems 42 (1) (2006) 144–161.
[7] J. Choi, S. Yi, K.C. Lee, Analysis of keyword networks in MIS research and implications for predicting knowledge evolution, Information & Management 48 (8) (2011) 371–381.
[8] Y. Dang, Y. Zhang, P.J. Hu, S.A. Brown, H. Chen, Knowledge mapping for rapidly evolving domains: a design science approach, Decision Support Systems 50 (2) (2011) 415–427.
[9] Y. Dong, Z. Sun, H. Jia, A cosine similarity-based negative selection algorithm for time series novelty detection, Mechanical Systems and Signal Processing 20 (6) (2006) 1461–1472.
[10] W. Fan, M.D. Gordon, P. Pathak, Effective profiling of consumer information retrieval needs: a unified framework and empirical comparison, Decision Support Systems 40 (2) (2005) 213–233.
[11] M.A.H. Farquad, I. Bose, Preprocessing unbalanced data using support vector machine, Decision Support Systems 53 (1) (2012) 226–233.
[12] M. Girvan, M.E.J. Newman, Community structure in social and biological networks, Proceedings of the National Academy of Sciences of the United States of America 99 (12) (2002) 7821–7826.
[13] A.D. Henriksen, A.J. Traynor, A practical R&D project-selection scoring tool, IEEE Transactions on Engineering Management 46 (2) (1999) 158–170.
[14] E. Herrera-Viedma, C. Porcel, Using incomplete fuzzy linguistic preference relations to characterize user profiles in recommender systems, Ninth International Conference on Intelligent Systems Design and Applications, ISDA '09, 2009, pp. 90–95.
[15] C.C. Huang, P.Y. Chu, Y.H. Chiang, A fuzzy AHP application in government-sponsored R&D project selection, Omega 36 (6) (2008) 1038–1052.
[16] T. Joachims, A statistical learning model of text classification with support vector machines, Proceedings of ACM SIGIR'01, 2001, pp. 128–136.
[17] R.N. Kostoff, J.A. Del Rio, J.A. Humenik, E.O. Garcia, A.M. Ramirez, Citation mining: integrating text mining and bibliometrics for research user profiling, Journal of the American Society for Information Science and Technology 52 (13) (2001) 1148–1156.
[18] R.N. Kostoff, T. Braun, A. Schubert, D.R. Toothman, J.A. Humenik, Fullerene data mining using bibliometrics and database tomography, Journal of Chemical Information and Computer Sciences 40 (Jan–Feb 2000) 19–39.
[19] Y. Li, C. Zhang, J.R. Swan, An information filtering model on the web and its application in job agent, Knowledge-Based Systems 13 (5) (2000) 285–296.
[20] Y. Li, X. Zhou, P. Bruza, Y. Xu, R.Y.K. Lau, A two-stage decision model for information filtering, Decision Support Systems (2011), http://dx.doi.org/10.1016/j.dss.2011.11.005.
[21] T.M. Mitchell, Machine Learning, McGraw-Hill, New York, NY, 1997.
[22] J. Mostafa, W. Lam, Automatic classification using supervised learning in a medical document filtering application, Information Processing and Management 36 (3) (2000) 415–444.
[23] M.E.J. Newman, The structure of scientific collaboration networks, Proceedings of the National Academy of Sciences of the United States of America 98 (2001) 404–409.
[24] M.E.J. Newman, Coauthorship networks and patterns of scientific collaboration, Proceedings of the National Academy of Sciences of the United States of America 101 (Suppl. 1) (2004) 5200–5205.
[25] M.E.J. Newman, Fast algorithm for detecting community structure in networks, Physical Review E 69 (6) (2004).
[26] G. Oestreicher-Singer, A. Sundararajan, Recommendation networks and the long tail of electronic commerce, MIS Quarterly 36 (1) (2012) 65–83.
[27] J. Qiu, Z. Lin, A framework for exploring organizational structure in dynamic social networks, Decision Support Systems 51 (4) (2011) 760–771.
[28] S. Raghuram, P. Tuertscher, R. Garud, Research note: mapping the field of virtual work: a cocitation analysis, Information Systems Research 21 (4) (2010) 983–999.
[29] S. Robertson, I. Soboroff, The TREC 2002 Filtering Track Report, TREC, 2002.
[30] J. Scott, Social Network Analysis: A Handbook, Sage Publications, London, 2000.
[31] N. Shibata, Y. Kajikawa, I. Sakata, Measuring relatedness between communities in a citation network, Journal of the American Society for Information Science and Technology 62 (7) (2011) 1360–1369.
[32] T. Strzalkowski, Robust text processing in automated information retrieval, Proceedings of the 4th Applied Natural Language Processing Conference (ANLP), 1994, pp. 168–173.
[33] Y.H. Sun, J. Ma, Z. Fan, J. Wang, A group decision support approach to evaluate experts for R&D project selection, IEEE Transactions on Engineering Management 55 (1) (2008) 158–170.
[34] Y.H. Sun, J. Ma, Z.P. Fan, J. Wang, A hybrid knowledge and model approach for reviewer assignment, Expert Systems with Applications 34 (2008) 817–824.
[35] Q. Tian, J. Ma, J. Liang, R.C.W. Kwok, O. Liu, An organizational decision support system for effective R&D project selection, Decision Support Systems 39 (2005) 403–413.
[36] E. Turban, D. Zhou, J. Ma, A group decision support approach to evaluating journals, Information & Management 42 (1) (2004) 31–44.
[37] A.S. Vivacqua, J. Oliveira, J.M. De Souza, i-ProSE: inferring user profiles in a scientific context, The Computer Journal 52 (7) (2009) 789–798.
[38] K.M. Wang, C.K. Wang, C. Hu, Analytic hierarchy process with fuzzy scoring in evaluating multidisciplinary R&D projects in China, IEEE Transactions on Engineering Management 52 (1) (2005) 119–129.
[39] D.J. Watts, S.H. Strogatz, Collective dynamics of 'small-world' networks, Nature 393 (1998) 440–442.
[40] Z. Zheng, K. Chen, G. Sun, H. Zha, A regression framework for learning ranking functions using relative relevance judgments, Proceedings of ACM SIGIR'07, 2007, pp. 287–294.
Thushari Silva is currently pursuing her PhD in the Department of Information Systems at the City University of Hong Kong. She received her M.Sc. in Information and Communication Technology from the Asian Institute of Technology, Thailand, in 2010. Her research interests include research social network analysis, recommender systems, business intelligence and the semantic web.
Zhiling Guo is an Assistant Professor in Information Systems at the City University of
Hong Kong. She received her Ph.D. in Management Science and Information Systems
from The University of Texas at Austin in 2005. Dr. Guo's general research interests
include online auctions, electronic markets, cloud computing, crowdsourcing, social
networks, social media marketing, and supply chain risk management. Dr. Guo's
papers have been published in Management Science, Information Systems Research,
Journal of Management Information Systems, Decision Support Systems, among others.
Jian Ma is a Professor in the Department of Information Systems at the City University of Hong Kong. He received his Doctor of Engineering degree in Computer Science from the Asian Institute of Technology in 1991. Prof. Ma's general research interests include business intelligence, research and innovation social networks, research information systems and decision support systems. His past research has been published in IEEE Transactions on Engineering Management, IEEE Transactions on Education, IEEE Transactions on Systems, Man and Cybernetics, Decision Support Systems and European Journal of Operational Research, among others.
Hongbing Jiang is currently pursuing his PhD at the University of Science and Technology of China–City University of Hong Kong Joint Advanced Research Centre, Suzhou. His research interests include recommendation systems and social network analysis.
Huaping Chen is a Professor in the School of Management at the University of Science and Technology of China. His research interests include information strategies, business intelligence and applications. His past research has been published in Journal of Operations Management, Decision Support Systems and Computers & Operations Research, among others.
A Social Network-Empowered Research Analytics Framework For Project Selection

  • 1. A social network-empowered research analytics framework for project selection Thushari Silva a , Zhiling Guo a, āŽ, Jian Ma a , Hongbing Jiang a,b , Huaping Chen b a Department of Information Systems, City University of Hong Kong, Hong Kong b School of Management, University of Science and Technology of China and USTC-CityU Joint Advanced Research Centre, Suzhou, PR China a r t i c l e i n f o a b s t r a c t Available online 9 January 2013 Traditional approaches for research project selection by government funding agencies mainly focus on the matching of research relevance by keywords or disciplines. Other research relevant information such as social Keywords: connections (e.g., collaboration and co-authorship) and productivity (e.g., quality, quantity, and citations Research project selection of published journal articles) of researchers is largely ignored. To overcome these limitations, this paper Research social networks proposes a social network-empowered research analytics framework (RAF) for research project selections. Research analytics Scholarmate.com, a professional research social network with easy access to research relevant information, serves as a platform to build researcher proļ¬les from three dimensions, i.e., relevance, productivity and con- nectivity. Building upon proļ¬les of both proposals and researchers, we develop a unique matching algorithm to assist decision makers (e.g. panel chairs or division managers) in optimizing the assignment of reviewers to research project proposals. The proposed framework is implemented and tested by the largest government funding agency in China to aid the grant proposal evaluation process. The new system generated signiļ¬cant economic beneļ¬ts including great cost savings and quality improvement in the proposal evaluation process. Ā© 2013 Elsevier B.V. All rights reserved. 1. Introduction There is a steadily growing trend for government funding agencies to support an increasing number of research proposals. For example, there were 42,225 research grant proposals submitted to the National Science Foundation (NSF) in the U.S. in 2010. The estimated number of submission for 2012 will increase to 46,000. The number of proposals submitted to the National Natural Science Foundation of China (NSFC) has increased from 23,636 in 2001 to over 147,000 in 2011. The sheer volume of submission has posed a signiļ¬cant challenge for research project selection due to difļ¬culties of assigning the most suitable reviewers to the most relevant project proposals. A research project can be characterized by a set of qualitative and quantitative, tangible and intangible attributes. Management scientists, Economist and IS practitioners have proposed various decision models, methodologies and decision support systems to assist decision making tasks related to research project selection [13,15,34,35]. Traditional approaches based on mathematical programming and optimization are useful for handling large volume of submissions, but are less efļ¬- cient in dealing with subjective judgment and information. Machine learning techniques incorporating fuzzy logic, genetic algorithms and artiļ¬cial intelligence techniques are capable of learning complex pat- terns in data, but are limited by their ability to generalize from training data and optimize decisions over the entire decision space. 
Other tradi- tional approaches involve manually assigning proposals to reviewers based on their claimed expertise, which is neither efļ¬cient nor practical to support increasing complexity of decision making faced by funding agencies. Current computer-based methods mainly consider matching research relevance in terms of keywords or disciplines, while ignoring the social connections (e.g., collaboration and co-authorship) and productivity (e.g., quality, quantity, and citations of published journal articles) of researchers. It is desirable to incorporate all these aspects into a uniļ¬ed evaluation framework. To achieve this goal, we propose a research analytics framework that is empowered by a research social network (www.scholarmate.com) for effective research project selection. Better identiļ¬cation of social connection can effectively cluster researchers based on topics of interests, methodologies, and research disciplines. Being able to identify community structure in the social network helps us understand and exploit the research network more effectively. On the one hand, such information can be used to identify most suitable reviewers. On the other hand, it can help avoid conļ¬‚ict of interests to ensure fair evaluation. Speciļ¬cally, we propose to deļ¬ne proļ¬les of research entities (e.g. project proposals, researchers) from three dimensions, i.e. relevance (e.g., keywords and research disciplines), productivity (e.g., quality, quantity, and citations of published journal articles), and connectivity (e.g., project collaborators, co-authors and colleagues). Represented by visual research CVs, proļ¬les of proposals and potential reviewers are built by extracting information from multiple sources including submitted proposals, bibliographic databases (e.g., ISI, Scopus, and Decision Support Systems 55 (2013) 957ā€“968 āŽ Corresponding author. E-mail addresses: tpsilva2@student.cityu.edu.hk (T. Silva), zhiling.guo@cityu.edu.hk (Z. Guo), isjian@cityu.edu.hk (J. Ma), jhbymx@foxmail.com (H. Jiang), hpchen@ustc.edu.cn (H. Chen). 0167-9236/$ ā€“ see front matter Ā© 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.dss.2013.01.005 Contents lists available at ScienceDirect Decision Support Systems journal homepage: www.elsevier.com/locate/dss
  • 2. EI), and research social network (i.e. www.Scholarmate.com). By aggregating information in the three dimensions, we construct a unique matching algorithm to assist decision makers (e.g. panel chairs or divi- sion managers) in optimizing the assignment of reviewers to research project proposals. To demonstrate the usability of the proposed framework, we implemented the system to aid China's largest government funding agency in its grant proposal evaluation. The research analytics frame- work builds upon scientometrics, business intelligence and social network analysis techniques. Its powerful search and data access capa- bilities provide timely and relevant information in visualized forms for research project evaluation. The implemented system generates signif- icant economic beneļ¬ts including cost savings and quality improve- ment in the proposal evaluation process. This paper is organized as follows. Section 2 reviews the relevant literature. Section 3 provides an overview of the research analytics framework and the Scholarmate research social network. Section 4 presents the detailed methods for proļ¬ling and algorithms to calculate the key performance indicators. An optimization problem for reviewer assignment is proposed in Section 5. Section 6 reports evaluation of the proposed system by China's largest government funding agency for its grant proposal evaluation. Section 7 concludes with a summary of contribution and directions for future research. 2. Literature review The major challenge in reviewer assignment for proposal evaluation is identifying and recommending the most suitable reviewers who have a high level of expertise and will make valuable professional judgment on given proposals [13,38]. In this paper, we propose a proļ¬le-based approach to assign reviewers for proposal evaluation. Previous research has identiļ¬ed two approaches to scientiļ¬c re- searcher proļ¬ling. One approach relies on subjective self-claimed infor- mation declared by researchers themselves. The other approach is based on objective measurement obtained through automated inferences about the researcher's behavior patterns related to publications and citations derived from relevant resources [37]. The ļ¬rst approach uses qualitative methods (e.g. surveys, questionnaires, or interviews) and traditional information retrieval models (e.g., term-based modeling [3] and rough-set modeling [19]) to gain knowledge of a researcher's inter- ests and resulted proļ¬les. The latter approach utilizes various feature selection techniques in machine learning to learn user proļ¬le [10]. The machine learning approaches tend to learn the mapping between incoming set of documents relevant to user input and real numbers which represent the strength of user preferences. The features of the documents are ļ¬rst extracted by widely used techniques including information gain [8,21] and correlation coefļ¬cient [32]. Then the key fea- tures are used as attributes in the mapping functions. Some studies focus on techniques such as neural networks [22], Support Vector Machine (SVM) [11,16,29], K-Nearest Neighbors (K-NN) and logistic regres- sion [6,40] before generating a mapping with a set of real numbers. Li et al. [20] proposed a rough threshold model (RTM) to analyze and extract keywords from the scientiļ¬c publications. 
In our approach we augment the original rough threshold model with phrase analysis algo- rithm to resolve semantic ambiguity that is not handled by the original rough threshold model for topic generation. Collaboration network is one type of popular social networks that has been widely studied in the literature [4,5,23]. A property that many social networks have in common is clustering, or network tran- sitivity [2,26,39]. Clustering coefļ¬cient is deļ¬ned as the probability that two of one's friends are friends themselves [7,39]. It typically ranges from 0.1 to 0.5 in many real-world networks. A related concept is com- munity in which connection within the same community is dense and outside the community is sparse. Community structure in a social net- work represents real social groupings by interest or background [27]. For example, communities in a citation network represent related papers on a single topic [31]. There are two broad classes of hierarchical clustering methods to detect the community structure in a social network: agglomerative and divisive [28,30]. The agglomerative approach focuses on ļ¬nding the strongly connected cores of communities by adding links [24], and the divisive approach uses information about edge betweenness to detect community boundaries by removing links [23]. For example, the Girvanā€“Newman algorithm [12] is one of the most widely used divisive methods and is effective at discovering the latent groups or communities that are deļ¬ned by the link structure of a graph. Newman's fast algorithm [25] is an efļ¬cient reference algorithm for clustering in large networks. It falls in the general category of agglom- erative hierarchical clustering methods. This method can be easily generalized to weighted networks in which each edge has a numeric value indicating link strength. It has been successfully applied to a collaboration network of more than 50,000 physicists. In this study, we adopt Newman's fast algorithm in our research social network analysis. 3. An overview of the RAF and Scholarmate Research Analytics is the application of methods and theories in scientometrics, business intelligence and social network analysis to transform research related data into relevant information in research management. In this paper we demonstrate the research analytics framework in the context of reviewer recommendation for research project selection. 3.1. The RAF for reviewer recommendation This study takes a proļ¬le-based approach to reviewer recommen- dation. Fig. 1 illustrates the key framework. Research Online (http://rol.scholarmate.com) is an institutional repository service provided by Scholarmate (http://www.scholarmate. com) to analyze proposals submitted through the Internet-based Science Information System (ISIS, https://isis.nsfc.gov.cn). It helps build standardized visual research CVs of researchers and identify the social groups to which they belong. These steps greatly ease the proļ¬l- ing of proposals and researchers. Key features and attributes such as discipline codes and keywords to represent proposals and researchers are derived from the standard keyword dictionary. Phrase patterns are discovered by data mining the free text categories of the electronic documents from various databases (e.g. ISI, Scopus and EI). Based on the constructed comprehensive proļ¬les of both the proposals and potential reviewers, the system generates key performance indicators in three dimensions, i.e., relevance, productivity and connectivity. 
Finally, a matching algorithm that takes all three dimensions into account is proposed for reviewer recommendation.
Specifically, relevance refers to the keywords, research discipline and expertise area that are derived from both the researcher's scientific publications and prior funded projects. Productivity is measured by the quality, quantity, citations, and impact of one's research, as well as other academic achievements. Connectivity among researchers is inferred through collaborations, such as collaborators in projects, co-authorship in publications, and colleagues in the same organizations. Their specific roles in the reviewer assignment process are illustrated in Fig. 2. We discuss each of them in detail in Section 4.

3.2. Scholarmate research social network

Scholarmate (http://www.scholarmate.com) is a professional research social network that connects people to research with the aim of "innovating smarter". It offers research social network services that help researchers find suitable funding opportunities and potential research collaborators. In addition to its important function of connecting
people with similar interests, Scholarmate has a search tool to help researchers extract their publications directly from existing bibliographic databases (e.g., ISI, Scopus), along with the citations of each paper and the impact factor of the journal. Moreover, Scholarmate provides researchers with the ability to disseminate research outcomes and information about their current interests over established social connections. On the one hand, researchers can use Scholarmate to manage their research outcomes and research in progress, including research proposal preparation. On the other hand, transparency in information sharing among scholars in Scholarmate opens an opportunity for researchers to participate in relevant scholarly activities in a timely manner, such as becoming potential reviewers. For example, a panel chair will be able to judge the recent research expertise of a researcher after analyzing that researcher's knowledge sharing activities in Scholarmate.
In Scholarmate, several types of networks can be constructed, such as citation networks, project collaboration networks and journal article co-authorship networks. An example of the collaboration network is presented in Fig. 3. The numbers beside the nodes are researcher identification numbers (RIDs). The numbers on the edges are the collaboration frequencies of two researchers. The frequency of collaboration is measured in terms of the number of co-authored publications, the number of collaborative projects and the number of co-cited papers extracted through the Scholarmate platform. Three major communities are identified and are indicated by the ovals in the figure. The communities are derived according to research expertise. We are also able to identify top researchers in the social network in terms of connectivity by degree, betweenness, and closeness, as shown in Table 1. The numbers in brackets denote the rankings under the corresponding measures. Researchers who have high ranks in the same community as principal investigators are identified as potential reviewers, subject to the condition that there is no direct connection between the potential reviewers and the principal investigators. For example, researcher 51 is a principal investigator and researcher 55 is identified as a potential reviewer because these two researchers are in the same community but have no direct collaboration. The fact that both of them have collaborations with researchers 37 and 38 indicates a potential overlap of research interests in some common research areas.
The research social network can enhance data representation in several ways. For example, existing databases only store data about published articles. Working papers that reflect the most recent research activities cannot be obtained by a search in bibliographic databases, but may be available on the social network site. Similarly, a researcher who has secured an industry grant that is relevant to the required reviewer expertise may be suitable to serve as a potential reviewer. However, traditional methods cannot identify this researcher because they cannot access such information. The social network facilitates real-time information sharing and is therefore effective for this type of information acquisition. Such additional information greatly enhances the completeness and timeliness of our data representation.
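To make the connectivity measures above concrete, the following minimal Python sketch shows how a Table 1-style ranking could be reproduced from a weighted collaboration network using the open-source networkx library. It is an illustration rather than the system's actual implementation: the edge list is hypothetical, and the overall score is formed here as a simple average of the three normalized centralities, an aggregation the paper does not specify.

```python
import networkx as nx

# Hypothetical weighted collaboration edges: (RID, RID, collaboration frequency).
edges = [(37, 18, 3), (37, 38, 2), (18, 10, 1), (10, 27, 2), (27, 15, 1),
         (51, 37, 1), (51, 38, 1), (55, 37, 1), (55, 38, 2)]

G = nx.Graph()
G.add_weighted_edges_from(edges)

degree = dict(G.degree())                   # raw degree, as reported in Table 1
n_degree = nx.degree_centrality(G)          # degree normalized by (n - 1)
betweenness = nx.betweenness_centrality(G)
closeness = nx.closeness_centrality(G)

# One plausible overall score (the paper does not specify the aggregation):
# the mean of the three normalized measures.
overall = {v: (n_degree[v] + betweenness[v] + closeness[v]) / 3 for v in G}

for v in sorted(G, key=overall.get, reverse=True):
    print(f"RID {v}: degree={degree[v]}, n-degree={n_degree[v]:.4f}, "
          f"betweenness={betweenness[v]:.4f}, closeness={closeness[v]:.4f}, "
          f"overall={overall[v]:.4f}")
```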
4. Profiling and key indices

In this section, we present a comprehensive representation of the proposal and researcher profiles built from both the available databases and the research social network, based on which three key performance indicators are derived: relevance, productivity, and connectivity. Fig. 4 shows the relationship between the three key performance indicators and their usage in reviewer recommendation.

Fig. 1. The framework of profile-based reviewer recommendation. (Proposal profiling and reviewer profiling draw discipline codes, keywords and phrase patterns from the keyword dictionary, the Research Online platform and the Scholarmate research social network; the resulting relevance, productivity and connectivity indicators feed the matching algorithm for reviewer recommendation.)

Fig. 2. Stage diagram for proposal-reviewer recommendation. (Proposal clustering, selection of eligible reviewers via the relevance index, exclusion of conflicts of interest via the connectivity index, balancing of reviewer expertise via the productivity index, and assignment of reviewers.)
Initially the system constructs profiles of proposals (indexed by i) and researchers (indexed by j), respectively. Proposal profiling and reviewer profiling are discussed in detail in Section 4.1. The three key indices are developed as follows. We first use a component-based matching algorithm to calculate the relevance index (r_ij), which denotes the degree of matching between the proposal profile and the reviewer profile. Based on the Scholarmate platform services, we construct the connectivity index (c_ij) via the collaboration network, which indicates the frequency of research collaboration among reviewers, PIs and co-PIs. The generated collaboration network is analyzed by identifying communities and their features, such as structure and closeness, and those features are used in the generation of the connectivity index. The connectivity index is used to resolve conflicts of interest and to identify the most relevant reviewers. Finally, we generate the potential reviewers' productivity index (e_j), which considers the quality of their publications, research impact and academic achievement. The productivity index is used to balance the expertise of the potential set of reviewers in the optimization program for reviewer recommendation.

4.1. Profiling

In general, profiling is the process of determining key attributes that can be used to characterize a given object. In our project selection context we focus on proposal profiling and researcher profiling. The objective of proposal profiling is to extract proposal relevant features, and that of researcher profiling is to extract researcher expertise. The quality of profiling directly affects the effectiveness of research project selection. The integration of both subjective and objective information is necessary during the process of profile generation.
We first focus on proposal profiling. A proposal submitted through the Internet-based Science Information System (ISIS) follows a standard template in which up to two discipline codes and five keywords are filled in. We express the self-claimed discipline codes (DisCode) and keywords (Key) in the following sequence:

$$\langle \mathrm{PropNo};\ \mathrm{DisCode}_1;\ \mathrm{DisCode}_2;\ \mathrm{Key}_1;\ \mathrm{Key}_2;\ \ldots;\ \mathrm{Key}_5 \rangle \qquad (1)$$

where PropNo is the proposal number that uniquely identifies a proposal. This sequence can be directly extracted from the proposal.
To verify whether the claimed information is accurate, an objective examination of the proposal title and abstract is necessary. The second type of information is obtained through data mining the title and abstract sections of the proposal. It can be expressed in the following sequence:

$$\langle \mathrm{PropNo};\ \mathrm{key}_1;\ \mathrm{key}_2;\ \ldots;\ \mathrm{key}_m \rangle. \qquad (2)$$

Note here that we use lower case key to represent keywords extracted from the non-standard content area (i.e., title and abstract). This set of keywords has some overlap with, but is generally larger than, the standard keyword database defined by the government funding agency. For a fair comparison of any two proposal documents, we extract m keywords from each document. The search algorithm that we discuss later determines the preferred number of keywords. Ideally we could add the whole content of the proposal to obtain the highest ranked keywords through word frequency analysis. We found that this would increase the computational effort without adding much new insight. Mining the title and the abstract is accurate enough to classify proposals according to the keywords.
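The following minimal sketch, which is our illustration rather than the production profiling code, shows how the self-claimed sequence (1) and the mined sequence (2) could be represented and populated for one proposal. The tokenizer, stop-word list and the choice of m are simplified placeholders; the proposal number and keywords are taken from the Fig. 5 example.

```python
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import List

STOP_WORDS = {"the", "a", "an", "and", "of", "for", "to", "in", "on", "is", "that", "this"}

@dataclass
class ProposalProfile:
    prop_no: str
    discipline_codes: List[str]           # up to two self-claimed codes (sequence 1)
    claimed_keywords: List[str]           # up to five self-claimed keywords (sequence 1)
    mined_keywords: List[str] = field(default_factory=list)   # sequence 2

def mine_keywords(title: str, abstract: str, m: int = 10) -> List[str]:
    """Return the m most frequent content words from the title and abstract."""
    tokens = re.findall(r"[a-z]+", (title + " " + abstract).lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(m)]

profile = ProposalProfile(
    prop_no="53361479",
    discipline_codes=["F020508"],
    claimed_keywords=["machine learning", "semi-supervised learning",
                      "spectral clustering", "support vector machine"],
)
profile.mined_keywords = mine_keywords(
    "Semi-supervised learning with support vector machines",
    "This project studies spectral clustering and support vector machines "
    "for semi-supervised learning on very large datasets.")
print(profile.mined_keywords)
```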
We next consider researcher profiling. The funding agency maintains an expert dictionary for the pool of potential reviewers.

Fig. 3. An example of the collaboration network.

Table 1. Researchers' connectivity ranking.

RID   Degree   n-Degree   Betweenness   Closeness    Overall
37    11 (1)   0.1930     0.0345 (4)    0.2069 (1)   0.1685 (1)
18     9 (2)   0.1579     0.0459 (1)    0.1787 (3)   0.1459 (2)
27     6 (2)   0.1053     0.0382 (3)    0.1474 (7)   0.1129 (5)
10     6 (4)   0.1053     0.0453 (2)    0.1843 (2)   0.1328 (3)
15     5 (4)   0.0877     0.0143 (9)    0.1685 (4)   0.1134 (4)
31     5 (6)   0.0877     0.0244 (5)    –            –
19     5 (6)   0.0877     0.0169 (7)    –            –
38     5 (6)   0.0877     –             0.1345 (8)   –
52     4 (8)   0.0702     0.0122 (10)   0.1638 (5)   0.1054 (6)
43     4 (8)   0.0702     0.0163 (8)    –            –
34     –       –          0.0207 (6)    –            –
36     –       –          –             0.1340 (9)   –
44     –       –          –             0.1512 (6)   –
The expert dictionary is standardized and the available choices are the same as those in the proposal application. Initially each potential reviewer chooses his/her own disciplines and expertise areas (expressed as keywords). The self-claimed discipline codes (DisCode) and keywords (Key) are expressed in the following sequence:

$$\langle \mathrm{ResearcherID};\ \mathrm{DisCode}_1;\ \mathrm{DisCode}_2;\ \mathrm{Key}_1;\ \mathrm{Key}_2;\ \ldots;\ \mathrm{Key}_5 \rangle \qquad (3)$$

where ResearcherID is used to uniquely identify a potential reviewer.
Each potential reviewer may have successful grants from different funding agencies and have publications, patents, or awards from various sources. We extract such objective information from several databases and list it as:

$$\langle \mathrm{ResearcherID};\ \mathrm{GrantNo};\ \mathrm{DisCode}_1;\ \mathrm{DisCode}_2;\ \mathrm{Key}_1;\ \mathrm{Key}_2;\ \ldots;\ \mathrm{Key}_5 \rangle \qquad (4)$$

$$\langle \mathrm{ResearcherID};\ \mathrm{PubNo};\ \mathrm{key}_1;\ \mathrm{key}_2;\ \ldots;\ \mathrm{key}_m \rangle. \qquad (5)$$

In addition, the potential reviewers may have social tags. Social tags are labels about expertise areas that are maintained by friends or other concerned parties who may know the reviewers well in other capacities. For example, a panel chair may know the research expertise of a reviewer from his/her previous service to the funding agency. Information extracted from reviewers' social tags can be aggregated and expressed as:

$$\langle \mathrm{ResearcherID};\ \mathrm{key}_1;\ \mathrm{key}_2;\ \ldots;\ \mathrm{key}_m \rangle. \qquad (6)$$

4.2. Extracting topic features from texts

During the process of objective information extraction, it is necessary to analyze free-text areas such as the titles and abstracts of electronic documents. The determination of a set of topic features from these fields follows several steps, including extracting phrases, filtering out non-key phrases, resolving semantic heterogeneity and constructing the keyword dictionary. In this study we combine several techniques, including the Rough Threshold Model and Database Tomography, and develop an algorithm to calculate the document phrase weight distribution.
When extracting information from texts such as the titles and abstracts of funded projects and publications, we first need to build a standard research keyword dictionary. Phrases (combinations of multiple words) rather than single words are used to resolve semantic ambiguity, as single words are rarely sufficient to accurately distinguish standing researcher interests [32]. Generally phrases carry more meaning than single words. We find that a phrase of two to four words is strong enough to capture the meaning effectively. The free-text category fields of scientific publications (e.g. title, abstract and keywords) are analyzed and technical phrases are extracted using the Database Tomography (DT) process [17,18]. DT is a textual database analysis system that provides algorithms for extracting multi-word phrase frequencies together with their proximities. We applied the DT algorithm to extract all adjacent double, triple and quadruple word phrases from the text (i.e. title, abstract and keywords) along with their frequencies. We discarded phrases with extremely high frequencies (not useful for distinguishing documents) and those with extremely low frequencies (not useful for comparing documents). Finally, the remaining phrases are built into the keyword dictionary.
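As a hedged sketch of the Database Tomography step just described, the code below extracts all adjacent two- to four-word phrases from a handful of short texts and filters out phrases whose frequencies are too high or too low to be useful. The cut-off values are illustrative assumptions; the paper does not report the thresholds used in the deployed system.

```python
import re
from collections import Counter
from typing import Dict, List

def adjacent_phrases(text: str, min_len: int = 2, max_len: int = 4) -> List[str]:
    """All adjacent 2-, 3- and 4-word phrases in the text (DT-style extraction)."""
    words = re.findall(r"[a-z]+", text.lower())
    phrases = []
    for n in range(min_len, max_len + 1):
        phrases += [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return phrases

def build_phrase_dictionary(documents: List[str],
                            low_cut: int = 2,       # illustrative thresholds
                            high_cut: int = 50) -> Dict[str, int]:
    counts = Counter()
    for doc in documents:
        counts.update(adjacent_phrases(doc))
    # keep phrases frequent enough to compare documents, but not so frequent
    # that they no longer distinguish them
    return {p: c for p, c in counts.items() if low_cut <= c <= high_cut}

docs = [
    "Semi-supervised learning with support vector machines",
    "Support vector machine methods for spectral clustering",
    "Data reduction method for machine learning",
]
print(sorted(build_phrase_dictionary(docs).items()))
```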
Fig. 4. Process model and relationship with the key indices in reviewer recommendation. (Proposal profiling and researcher profiling, based on self-claimed disciplines and keywords plus keywords mined from proposal, grant and publication titles and abstracts, feed the relevance index; PI/co-PI information and the Scholarmate research social network feed the connectivity index, which is used to remove conflicts of interest; weighted publication scores from research databases (e.g., ISI, Scopus, EI), citation scores based on SCI/SSCI searches and the H-index, and academic ranking and institutional reputation feed the productivity index, which is used to balance reviewer expertise.)
According to the Rough Threshold Model (RTM) [20], documents are represented in terms of a weight distribution over topic features. We use an augmented RTM topic filtering algorithm to generate topic features from the documents.
Specifically, let P = {p_1, …, p_m} be the initial set of phrases extracted from all documents D = {d_1, d_2, …, d_n}. Let f_ij be the number of appearances of phrase j in document d_i. A document d_i can be expressed by a set of phrases with their occurrence frequencies: d_i = {(p_1, f_i1), …, (p_m, f_im)}. The initial phrase set of d_i is rp_i = {p_j | f_ij > 0}.
If two documents have the same phrase patterns, the two initial phrase patterns can be composed. For example, {(p_1, 1), (p_2, 3)} āŠ• {(p_1, 2), (p_2, 2)} = {(p_1, 3), (p_2, 5)}, where āŠ• denotes the composition operation. We can group the initial phrase patterns that have the same phrase sets into clusters and use their composed phrase pattern to represent the cluster. Assume that there are r < n clusters. A cluster can be represented by crp_r = {(p_1, cf_r1), (p_2, cf_r2), …, (p_m, cf_rm)}, where the cluster frequency

$$cf_{rk} = \sum_{i=1}^{|crp_r|} f_{ik}, \qquad k = 1, 2, \ldots, m,$$

is the composed frequency in the cluster. We define the support for a phrase pattern rp_i ∈ crp_r as follows:

$$\mathrm{support}(crp_r) = \frac{|crp_r|}{|D|} \qquad (7)$$

Furthermore, $\sum_r \mathrm{support}(crp_r) = 1$. The normal form of the cluster phrase pattern can be described by the following association mapping function: β(crp_r) = {(p_1, w_r1), (p_2, w_r2), …, (p_m, w_rm)}, where the normalized phrase frequency is defined as:

$$w_{rk} = \frac{cf_{rk}}{\sum_{i=1}^{m} cf_{ri}}, \qquad k = 1, 2, \ldots, m. \qquad (8)$$

The relative importance weight of phrase p_k in document i over all documents can be defined as:

$$\beta_{ik} = \sum_{p_k \in rp_i \in \beta(crp_r)} \mathrm{support}(crp_r)\, w_{rk}\, \frac{f_{ik}}{cf_{rk}}. \qquad (9)$$

Document i can then alternatively be represented by its phrase weight distribution β_i = {β_i1, β_i2, …, β_im}.
For a given document set (i.e. a set of publications and projects), all initial phrase patterns are calculated with their pattern frequencies. The generated patterns are combined to construct clusters, and clusters are labeled using the phrases in the combined patterns. Each pattern frequency in the cluster is normalized and the normalized weights are calculated. Finally, a document that is uniquely represented by its initial phrase patterns can be characterized by its phrase weight distribution across all documents. The algorithm is described in Table 2.

Table 2. Algorithm to calculate the document phrase weight distribution.

Input: a document set D and a phrase set P
Output: each document's phrase weight distribution β_i = (β_i1, β_i2, …, β_im)
  Initialize RP = Φ
  for each d_i ∈ D {
      for each p_j ∈ P, build d_i = {(p_1, f_i1), …, (p_m, f_im)}
      rp_i = {p_j | f_ij > 0}
      RP = RP ∪ {rp_i}
  }
  Compose patterns with identical phrase sets: RP = āŠ• RP
  Cluster documents into crp_r based on rp_i
  Calculate support(crp_r) based on Eq. (7)
  Normalize β(crp_r) = {(p_1, w_r1), (p_2, w_r2), …, (p_m, w_rm)} based on Eq. (8)
  for each d_i ∈ D {
      for each p_j ∈ rp_i ∈ crp_r, calculate β_ik based on Eq. (9)
  }
  END
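The sketch below is one possible Python reading of Eqs. (7) to (9) and the algorithm in Table 2, not the authors' code: documents with identical phrase sets are clustered, cluster support and normalized weights are computed, and each document is re-expressed as its phrase weight distribution β_i. The tiny phrase lists at the bottom are illustrative.

```python
from collections import Counter, defaultdict
from typing import Dict, List

def phrase_weight_distribution(docs: List[List[str]]) -> List[Dict[str, float]]:
    """docs[i] is the list of phrases occurring in document i (with repeats)."""
    freqs = [Counter(doc) for doc in docs]                 # f_ik
    phrase_sets = [frozenset(f) for f in freqs]            # rp_i

    # Cluster documents that share exactly the same phrase set.
    clusters = defaultdict(list)                           # phrase set -> doc indices
    for i, rp in enumerate(phrase_sets):
        clusters[rp].append(i)

    betas = [dict() for _ in docs]
    for rp, members in clusters.items():
        support = len(members) / len(docs)                 # Eq. (7)
        cf = Counter()                                     # composed cluster frequencies
        for i in members:
            cf.update(freqs[i])
        total = sum(cf.values())
        w = {p: cf[p] / total for p in cf}                 # Eq. (8)
        for i in members:                                  # Eq. (9)
            for p in phrase_sets[i]:
                betas[i][p] = support * w[p] * freqs[i][p] / cf[p]
    return betas

docs = [["support vector", "support vector", "spectral clustering"],
        ["support vector", "spectral clustering"],
        ["data reduction", "machine learning"]]
for beta in phrase_weight_distribution(docs):
    print(beta)
```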
4.3. Relevance

The relevance index is used to determine how well reviewer expertise matches the content of the proposal. It is calculated by matching the proposal and reviewer profiles. The task of profile matching is to decide whether a sequence of key phrases that describes the proposal profile attributes matches the key phrases that represent the reviewer profile attributes. Two widely accepted approaches for calculating the similarity between terms are the Euclidean distance and the cosine similarity measure [14]. For the self-claimed information that is extracted in standard terms, we use the Jaccard similarity measure [1] to perform component-based matching over reviewer and proposal profiles. Data extracted by Eqs. (1), (3) and (4) can be matched using this method. The Jaccard index between reviewer i and proposal j is expressed as:

$$J_{ij} = \frac{F\big[(\mathrm{Key}_{i1}, \mathrm{Key}_{i2}, \ldots, \mathrm{Key}_{i5}) \cap (\mathrm{Key}_{j1}, \mathrm{Key}_{j2}, \ldots, \mathrm{Key}_{j5})\big]}{F\big[(\mathrm{Key}_{i1}, \mathrm{Key}_{i2}, \ldots, \mathrm{Key}_{i5}) \cup (\mathrm{Key}_{j1}, \mathrm{Key}_{j2}, \ldots, \mathrm{Key}_{j5})\big]} \qquad (10)$$

where Key_ik and Key_jk, k = 1, 2, …, 5, are the five keywords associated with reviewer i and proposal j in standard terms. The numerator denotes the number of keywords in common, and the denominator represents the total number of unique keywords in both profiles. As shown, the Jaccard similarity is measured by the ratio of the frequency of the intersection to the frequency of the union of the two sets of keywords [1].
To determine the similarity of the non-standard phrase patterns, we adopt the cosine similarity measure. For researcher profile i and proposal profile j, the similarity can be calculated as follows [9]:

$$C_{ij} = \frac{\beta_i \cdot \beta_j}{\lVert\beta_i\rVert\,\lVert\beta_j\rVert} = \frac{\sum_{k=1}^{m} \beta_{ik}\beta_{jk}}{\sqrt{\sum_{k=1}^{m} \beta_{ik}^{2}\,\sum_{k=1}^{m} \beta_{jk}^{2}}} \qquad (11)$$

where β_ik and β_jk are the normalized frequencies of phrase pattern p_k in the two profiles i and j. Phrase patterns extracted by Eqs. (2), (5), and (6) are processed by the algorithm presented in Table 2. The resulting weight distribution is used to derive the similarity measure.
Note that each researcher may have several grants or publications. There are different ways to define the similarity measure in the respective categories (grant or publication). The first possibility is to consolidate several documents in the same category into one integrated document that represents the researcher profile in that specific category. The algorithm then generates one weight distribution for the consolidated document, and within each category only one consolidated measure C_ij is derived. Another method is to treat the documents separately. The algorithm then produces one weight distribution for each document, and a pair-wise similarity measure can be calculated between each of the researcher's documents and the proposal. We then choose the maximum similarity in a category as the final measure of similarity between the proposal and the potential reviewer in that specific category.
Since multiple sources of information, both subjective and objective, need to be aggregated, an appropriate weighting strategy is needed to reflect their relative importance in the overall evaluation [9]. Denote r_ij as the degree of matching between proposal i and the potential reviewer j. An aggregate measure in the relevance dimension can be obtained as follows:

$$r_{ij} = \alpha\,\mathrm{Self}_{ij} + \beta\,\mathrm{Grant}_{ij} + \gamma\,\mathrm{Pub}_{ij} + \delta\,\mathrm{Social}_{ij} \qquad (12)$$

where α + β + γ + Γ = 1. The four terms refer to self-claimed information, grants, publications, and social tags. Note that the self-claimed information from the proposal (Self) and the social tags that label the potential reviewers (Social) are related to subjective judgment, while grants and publications provide objective measures related to the match between proposals and potential reviewers. Decision makers may assign different weights to aggregate both subjective and objective information.
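To illustrate how the three formulas combine, the following minimal sketch computes the Jaccard similarity of Eq. (10), the cosine similarity of Eq. (11) and the aggregate relevance of Eq. (12). The 0.4/0.2/0.2/0.2 weights mirror the split shown in the Fig. 5 example; all keyword sets, phrase weights and component scores are hypothetical.

```python
import math
from typing import Dict, Set

def jaccard(keys_i: Set[str], keys_j: Set[str]) -> float:               # Eq. (10)
    if not keys_i and not keys_j:
        return 0.0
    return len(keys_i & keys_j) / len(keys_i | keys_j)

def cosine(beta_i: Dict[str, float], beta_j: Dict[str, float]) -> float:  # Eq. (11)
    dot = sum(beta_i[p] * beta_j.get(p, 0.0) for p in beta_i)
    norm = math.sqrt(sum(v * v for v in beta_i.values())) * \
           math.sqrt(sum(v * v for v in beta_j.values()))
    return dot / norm if norm else 0.0

def relevance(self_ij: float, grant_ij: float, pub_ij: float, social_ij: float,
              alpha: float = 0.4, beta: float = 0.2,
              gamma: float = 0.2, delta: float = 0.2) -> float:          # Eq. (12)
    return alpha * self_ij + beta * grant_ij + gamma * pub_ij + delta * social_ij

# Example: one proposal-reviewer pair with hypothetical profile data.
self_score = jaccard({"machine learning", "spectral clustering"},
                     {"machine learning", "data mining"})
pub_score = cosine({"support vector": 0.27, "spectral clustering": 0.13},
                   {"support vector": 0.13, "spectral clustering": 0.13})
print(relevance(self_score, grant_ij=0.4, pub_ij=pub_score, social_ij=0.8))
```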
As shown in Fig. 5, the proposal related information, including discipline codes, keywords, abstract and PI, is displayed at the top of the screen. The relevance score is calculated and displayed in the middle. Clicking on each tab shows the matches identified by the system.

Fig. 5. Matching between a proposal and a reviewer to calculate the relevance score. (In the example, proposal 53361479 on semi-supervised machine learning is matched against a reviewer's self-claimed expertise, grants, publications and social tags; the overall relevance score of 81 aggregates component scores of 100, 40, 85 and 80 with weights of 40%, 20%, 20% and 20%.)

The efficiency of the matching algorithm can be assessed in terms of time complexity. The algorithm requires a single pass through the set of reviewer profiles for each proposal. Matching the pre-generated subjective, objective and social information patterns in the proposal and reviewer profiles requires O(n·m) time in the worst case, where n is the number of proposals and m is the number of reviewers. The proposals are clustered according to their disciplines, and each set of proposals is matched against all the reviewers. To reduce the computational cost, the profiles of reviewers and proposals are constructed beforehand.

4.4. Connectivity

The nature of the connections between reviewers, PIs and co-PIs is very important when assigning reviewers to evaluate proposals. Having the same expertise as the principal investigators and having no direct personal conflict with them are essential constraints that must be satisfied by the reviewers. Thus, in this study we utilize social network analysis concepts, such as community structure and the closeness of individuals in the same community, to discover non-trivial relationships among researchers. After analyzing the individuals in one community, we are able to identify groups of individuals who have similar research interests, who are active in the corresponding research area, and who have close connections with PIs or co-PIs. Such information is then used to remove conflicts of interest and to aid preferential assignment to the most relevant reviewers.
Several types of networks can be constructed using the available social network data in Scholarmate. For example, we can represent scientific papers as vertices in a graph, with vertices connected by edges when one paper cites another. Alternatively, we can construct the researcher network. Each researcher is represented as a vertex in a graph. An edge is built when one researcher cites another's work (a directed citation network), or when one researcher co-authors with another researcher (an undirected collaboration network). We define the edge weight as the number of citations or collaborations between two researchers. A higher weight implies stronger connectivity between the two researchers.
We use graph clustering methods to cluster these graphs. Hierarchical clustering is a traditional method for detecting community structure. Here we focus on the collaboration network. We first assign a weight u_ij to each pair of vertices in the network, defined as the frequency of collaboration between the two researchers, which therefore represents how closely the researchers are connected. By analyzing the implicit community structure and estimating the strength of ties between individuals, we are able to discover nontrivial patterns of interaction in the scientific collaboration networks.
Assume that there are s predefined communities. Define u_IJ as the fraction of collaboration frequency between researchers in community I and those in community J. Denote $a_I = \sum_J u_{IJ}$, which represents the weighted fraction of edges that connect to vertices in community I (i.e., the fraction of collaborations in which researchers in community I collaborate with researchers in other communities).
Newman's fast algorithm is based on the idea of modularity [25]. Following this approach, we define the modularity measure for a network with s communities as:

$$Q_s = \sum_{I=1}^{s} \left( u_{II} - a_I^{2} \right) \qquad (13)$$

where u_II is the weighted fraction of edges in the network that connect vertices within the same community. A high value of Q_s represents a good community division. However, optimizing Q_s over all possible divisions is infeasible in practice for networks larger than thirty vertices. Various approximation methods are available, such as simulated annealing and genetic algorithms. Here a standard "greedy" optimization algorithm is used. The algorithm that determines the optimal community structure takes the steps listed in Table 3.

Table 3. Algorithm to cluster the collaboration network into communities.

Step 1. Initially there are n vertices representing researchers, and u_ij is the collaboration frequency between researchers i and j. Each vertex is the sole member of a distinct community. Calculate the within- and between-community collaboration fractions u_II and u_IJ, form the matrix W = (u_IJ) with rows (u_I1, …, u_In) for I = 1, …, n, and calculate a_I.
Step 2. Calculate Ī”Q_IJ = u_IJ + u_JI āˆ’ 2 a_I a_J. Choose (I, J) = argmax Ī”Q_IJ to join if Ī”Q_IJ ≄ 0, or (I, J) = argmin Ī”Q_IJ if Ī”Q_IJ < 0.
Step 3. Update the matrix elements u_IJ by adding together the rows and columns corresponding to the joined communities. Update a_I. Calculate Q_|W| according to Eq. (13).
Step 4. Repeat Steps 2 and 3, joining communities in pairs, until all vertices are joined.
Step 5. The optimal community structure is determined by s = argmax Q_|W|.

The algorithm starts with n communities, where n is the total number of nodes in the collaboration network. Each vertex begins as the sole member of a distinct community, and the algorithm iteratively merges pairs of communities that are connected by edges. The time taken to join any pair of communities is at most m, where m is the total number of edges in the graph. The change in Q_s can be calculated in constant time in each iteration. Following a join, some elements of the matrix W are updated by adding together the rows and columns corresponding to the joined communities. At each step the algorithm takes worst-case time O(n + m). When the algorithm runs to completion, n āˆ’ 1 joins have been performed, so the time complexity of the algorithm is O((n + m)n), or O(n²). Since the value of Q_|W| is calculated in each iteration, finding the optimal community structure is straightforward. The hierarchical clustering method also enables us to define the community structure at the required level of granularity.
To determine the connectivity, we extract all principal investigators and other members of proposal i. If none of them is in the same community as the potential reviewer j, we deem the reviewer not to be an ideal candidate to review the proposal and label g_ij ≪ 1 to indicate a mismatch. Otherwise, we label g_ij = 1, indicating a high goodness of fit.
Resolving conflicts of interest is an important step in the reviewer assignment process. For example, to ensure an objective review of the proposal, the government funding agency requires that applicants and reviewers should not have had a co-author relationship in the last five years. A conflict of interest can be immediately identified by a direct link in our collaboration network. If any of the primary members of proposal i has a conflict of interest with a potential reviewer j, we label c_ij = 0, enforcing a "No" decision in the reviewer assignment.
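The code below is a compact, from-scratch sketch of the greedy merging procedure in Table 3, written directly from the steps above rather than taken from the authors' implementation. For simplicity it always merges the pair with the largest modularity change and remembers the best partition seen, and the edge list is hypothetical; in practice a library routine such as networkx's greedy_modularity_communities would likely be preferred for large networks.

```python
from itertools import combinations
from typing import List, Tuple

def greedy_communities(edges: List[Tuple[int, int, float]]) -> List[set]:
    nodes = sorted({v for i, j, _ in edges for v in (i, j)})
    total = sum(w for _, _, w in edges)
    comm = {v: {v} for v in nodes}                 # community label -> members
    # u[I][J]: fraction of total collaboration weight between communities I and J
    u = {I: {J: 0.0 for J in nodes} for I in nodes}
    for i, j, w in edges:
        u[i][j] += w / (2 * total)
        u[j][i] += w / (2 * total)

    def a(I):                                      # a_I = sum_J u_IJ
        return sum(u[I].values())

    def modularity():                              # Eq. (13)
        return sum(u[I][I] - a(I) ** 2 for I in comm)

    best_q, best = modularity(), [set(c) for c in comm.values()]
    while len(comm) > 1:
        # Step 2 (simplified): merge the pair with the largest modularity gain.
        I, J = max(combinations(comm, 2),
                   key=lambda p: u[p[0]][p[1]] + u[p[1]][p[0]] - 2 * a(p[0]) * a(p[1]))
        # Step 3: fold community J into community I and update the matrix.
        comm[I] |= comm.pop(J)
        u[I][I] += u[J][J] + u[I][J] + u[J][I]
        for K in comm:
            if K == I:
                continue
            u[I][K] += u[J][K]
            u[K][I] += u[K][J]
            u[J][K] = u[K][J] = 0.0
        u[I][J] = u[J][I] = u[J][J] = 0.0
        q = modularity()                           # Steps 4-5: keep the best split seen
        if q > best_q:
            best_q, best = q, [set(c) for c in comm.values()]
    return best

edges = [(37, 18, 3), (18, 10, 2), (10, 27, 2), (37, 27, 1),
         (51, 38, 1), (55, 38, 2), (51, 55, 1)]
print(greedy_communities(edges))
```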
4.5. Productivity

The productivity index is calculated for potential reviewers and is used to indicate the contribution they have made to the field. For fair and unbiased project selection, productivity needs to be balanced among the reviewers who are assigned to evaluate the same proposals. We measure the productivity of a potential reviewer in terms of the number of publications, the quality of the publications and the citation impact over the past five years. A productivity index can be computed by aggregating the quality and quantity of publications.
Generally, academic journals are classified into different disciplines and assigned a rank, such as level A, level B or level C. As in [33], we assume that the journal rank reflects the quality of the articles published in that journal, as it is widely used in many research performance measurement activities related to merit increases and the allocation of research funding in university settings [36]. Following [33], we adopt a weighted scheme to generate the productivity index as a measure of a researcher's overall contribution to the field. Let q_ij be reviewer j's total number of publications in journals ranked at level i, where i = A, B, C. The publication score of reviewer j is expressed as:

$$G_j = w_A q_{Aj} + w_B q_{Bj} + w_C q_{Cj} \qquad (14)$$

where w_A > w_B > w_C, indicating the emphasis on quality work. There are different ways to define the weights. For example, the average impact factor of all the journals classified at the same level can be used to define the corresponding weight.
Professional titles (e.g. senior scholars such as Professor and Associate Professor, or junior scholars such as Assistant Professor) and the H-index can also be taken into consideration when recommending reviewers for proposals. We may assign a higher rank score to higher professional titles. Let R_j and H_j be potential reviewer j's rank score and H-index, respectively. An integrated research productivity measure can be obtained as follows:

$$e_j = u G_j + v R_j + t H_j \qquad (15)$$

where u + v + t = 1.
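A minimal sketch of Eqs. (14) and (15) follows. The journal-level weights, the rank score and the u, v, t mix are illustrative assumptions; the text only requires w_A > w_B > w_C and u + v + t = 1.

```python
def publication_score(q_a: int, q_b: int, q_c: int,
                      w_a: float = 5.0, w_b: float = 3.0, w_c: float = 1.0) -> float:
    """Eq. (14): weighted count of level A/B/C journal publications."""
    return w_a * q_a + w_b * q_b + w_c * q_c

def productivity_index(g_j: float, rank_score: float, h_index: float,
                       u: float = 0.5, v: float = 0.2, t: float = 0.3) -> float:
    """Eq. (15): aggregate the publication score, academic rank and H-index."""
    return u * g_j + v * rank_score + t * h_index

g = publication_score(q_a=4, q_b=6, q_c=10)     # a hypothetical reviewer
print(productivity_index(g, rank_score=3.0, h_index=12))
```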
5. Assigning reviewers for proposal evaluation

The reviewer assignment process deals with assigning reviewers to evaluate proposals in a specific discipline area. The current practice is manual matching of proposals to reviewers based on their declared expertise. This is inefficient, and subjective expertise judgment alone is insufficient to determine reviewer expertise because it lacks objective evidence. We introduce the relevance measure to balance the self-claimed expertise against the expertise induced from the derived objective information. The key objective is to maximize the relevance between proposals and potential reviewers.
Because the quality of evaluation largely depends on the experience and judgment of the reviewers, there is also a need to balance expertise among the reviewers who are assigned to the same proposal. For example, senior scholars tend to give higher weight to the innovativeness of a proposal than their junior counterparts, while junior scholars tend to put higher weight on methodological rigor than senior fellows. Let e be the desired average productivity level of the potential reviewers. This can be determined by the relevant decision makers, such as panel chairs or division managers. We want the average reviewer expertise levels to be close enough to this desired level. For example, if a potential reviewer is a junior scholar whose e_j is significantly lower than e, then the proposal would need a senior scholar whose productivity measure is significantly higher than e to review the proposal.
First we construct a network model in which each proposal and each potential reviewer is represented as a node. The potential reviewer node is called the supply node, and the proposal node is called the demand node. Assume that there is a set I of proposals and a set J of potential reviewers. Let x_ij be the binary decision variable indicating the assignment of proposal i to potential reviewer j: x_ij = 1 implies a recommended assignment and x_ij = 0 implies that the assignment is not recommended. We maximize the relevance subject to the flow constraints, which reflect the management's requirements for the reviewer assignment. The optimization problem can be expressed as:

$$\begin{aligned}
\max \quad & \sum_{i \in I} \sum_{j \in J} c_{ij}\, g_{ij}\, r_{ij}\, x_{ij} \\
\text{s.t.} \quad & \sum_{j \in J} x_{ij} \ge b \quad \text{for } i \in I \\
& \sum_{i \in I} x_{ij} \le d \quad \text{for } j \in J \\
& \sum_{j \in J} e_j x_{ij} - e \le \varepsilon \quad \text{for } i \in I \\
& x_{ij} \in \{0, 1\} \quad \text{for } i \in I,\ j \in J
\end{aligned} \qquad (16)$$

The coefficients in the objective function ensure that we maximize the overall relevance measure across the reviewer and proposal pools: c_ij is the indicator variable that removes conflicts of interest, and g_ij is the coefficient for preferential assignment of reviewers in the same scientific research community. The first set of constraints ensures that each proposal has at least b reviewers. The second set of constraints guarantees that no reviewer reviews more than d proposals. In practice, usually b = 3 and d = 20. The third set of constraints is used to balance reviewer expertise. Note that ε > 0 is the tolerance level that can be chosen by the panel chair or the management team.
To implement this model, we first analyze the community structures to remove conflicts of interest and to identify potential reviewers. Next, we calculate the relevance degree between the reviewers and the proposals in such a way that the PIs of the proposals belong to the same community as their potential reviewers. The calculated relevance degrees are sorted, and reviewers with a high relevance degree are selected to evaluate those proposals. Finally, productivity among the reviewers assigned to a proposal is balanced and the workload is evenly distributed among reviewers.
To achieve a higher degree of computational performance, the collaboration networks for the reviewers and the PIs of the proposals under each division are constructed, and the optimal numbers of communities are derived, before the reviewer assignment process is carried out. First, the time complexity of traversing the community graph for the connectivity index calculation is O(n0 + m0), where n0 is the total number of nodes in one community and m0 is the number of connections between individuals. Second, spanning the whole set of reviewers and proposals to generate the relevance degree matching requires O(n0m0) time. Third, the time complexity of sorting the end result is O(n1 log n1), where n1 is the number of matching results. Finally, the time complexity of balancing the productivity of reviewers in the same group is O(1) and is negligible. In summary, the worst-case computational complexity of the proposed technique is O(n0 + m0 + n0m0 + n1 log n1).
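The sketch below expresses model (16) with the open-source PuLP modelling library and its bundled CBC solver; it is an illustration under assumed data rather than the agency's production optimizer. All proposals, reviewers and coefficient values are hypothetical, and the conflict-of-interest and community coefficients enter through the objective exactly as in Eq. (16).

```python
from pulp import LpBinary, LpMaximize, LpProblem, LpVariable, lpSum

proposals = ["P1", "P2"]
reviewers = ["R1", "R2", "R3", "R4"]
b, d, e_target, eps = 2, 2, 2.0, 1.5        # reviewers per proposal, load cap, tolerance

# c (conflict of interest), g (same-community preference), r (relevance), e (productivity)
c = {(i, j): 0 if (i, j) == ("P1", "R2") else 1 for i in proposals for j in reviewers}
g = {(i, j): 1 for i in proposals for j in reviewers}
r = {("P1", "R1"): 0.9, ("P1", "R2"): 0.8, ("P1", "R3"): 0.4, ("P1", "R4"): 0.6,
     ("P2", "R1"): 0.3, ("P2", "R2"): 0.7, ("P2", "R3"): 0.8, ("P2", "R4"): 0.5}
e = {"R1": 2.5, "R2": 1.0, "R3": 2.0, "R4": 0.8}

model = LpProblem("reviewer_assignment", LpMaximize)
x = LpVariable.dicts("x", [(i, j) for i in proposals for j in reviewers], cat=LpBinary)

# Objective: maximize overall relevance, weighted by c_ij and g_ij.
model += lpSum(c[i, j] * g[i, j] * r[i, j] * x[i, j] for i in proposals for j in reviewers)
for i in proposals:
    model += lpSum(x[i, j] for j in reviewers) >= b                       # at least b reviewers
    model += lpSum(e[j] * x[i, j] for j in reviewers) - e_target <= eps   # expertise balance
for j in reviewers:
    model += lpSum(x[i, j] for i in proposals) <= d                       # workload cap

model.solve()
assignment = [(i, j) for i in proposals for j in reviewers if x[i, j].value() == 1]
print(assignment)
```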
6. Implementation and evaluation

The proposed research analytics framework has been implemented to aid the largest government funding agency in China with its grant proposal evaluation. The agency aims at funding scientific research projects that can make a large social impact.
The organizational hierarchy of the funding agency consists of one general office, five bureaus, and eight scientific departments. These departments are responsible for funding and managing research projects. Each department is further divided into divisions that focus on more specific research areas. There is intense competition for research funding, with a funding rate of only 21% in 2011. The funding agency received around 147,000 and 170,000 proposals in 2011 and 2012, respectively. Proposals are spread over many scientific disciplines. These conditions make it difficult for the evaluation committee to participate directly in every project evaluation. The committee groups the proposals into different areas and delegates its authority to groups of experts according to research area. Each area may consist of multiple related disciplines. For example, Business is an area that includes Management Science, Information Systems, and other business disciplines. There is a general budget to be distributed among the areas. The distribution of funds is not uniform and represents priorities set by the evaluation committee of the funding agency. The distribution can be adjusted based on the quality and quantity of proposals submitted to each area.
Research project selection is a process that involves the multiple phases illustrated in Table A in Appendix A. To facilitate project selection, the government funding agency has established an evaluation system that includes peer review and expert panel evaluation. Division managers assign and invite external reviewers and panel experts to evaluate the proposals. The reviewers judge the quality of a project proposal based on their expertise and professional experience and on the norms and criteria set by the funding agency. As can be seen, reviewer assignment is the most important phase affecting the quality and efficiency of research project selection.
We provide computerized support for the second phase of research project selection. In the prototype implementation of our system, the distribution of funds is out of the scope of this study; our focus is the reviewer assignment recommendation. We have tested different subsets of proposals and reviewers. The system computes the matching score in the relevance dimension for each pair of proposal and potential reviewer. The final assignment problem can be solved in a reasonable amount of time. The solution is recommended to the review panels in their respective divisions. The review panels examine the recommendation and have the right to either accept or reject the recommended assignment. Additionally, we provide data visualization to help managers view the assignment progress. Fig. 6 shows an example of the visualization.

Fig. 6. Visualization of the reviewer assignment progress. (The dashboard tracks, for example, how many proposals have been clustered and assigned, how many reviewers have been invited or have logged in, and how many reviews are in progress, returned, unreturned or declined.)

Overall, it takes a maximum of 6 hours to compute the matching degrees between 34,000 proposals and 30,000 reviewers, which corresponds to the largest number of proposals received by a single department of the funding agency. Thus, if we use parallel and distributed computing for the assignment optimization in each department (there are 8 distinct departments in total), we can finish the recommendation task within 6 hours. This greatly improves work efficiency, as the manual process of assigning reviewers usually takes up to two weeks to complete. The quality of the recommendations is acknowledged by the review panels.
The profile-based recommendation takes into consideration detailed information on relevance, productivity and connectivity. It can avoid conflicts of interest and provide decision makers with the most relevant information, which can hardly be obtained through manual processes. The largest government funding agency has agreed to adopt our recommendation system in the next round of proposal evaluation.

7. Conclusion

Building upon a research analytics framework, this study presents a new approach to research project selection in a research social network environment. We built profiles of research entities (e.g. research proposals and reviewers) along three dimensions: relevance, productivity and connectivity. The information for building these profiles can be obtained from the research social network (Scholarmate). Degrees of matching based on the profiles of research entities can be calculated by aggregating subjective, objective and social information collected from multiple sources.
We implemented the system to aid the largest funding agency in China in optimizing reviewer recommendation and supporting reviewer assignment. The implementation results showed that the proposed method greatly improves work efficiency.
Our approach can easily be generalized to support other types of recommendation in a research social network environment. A direct application is journal article review. Based on an analysis of article features, our system can be used to select the initial pool of reviewers, calculate the degree of match between potential reviewers and the article, remove conflicts of interest, balance reviewer expertise and productivity, and make final reviewer assignment recommendations. The process can be automated and monitored by journal editors. In comparison with the current practice, which mainly relies on editors' subjective judgment facilitated by automated search tools, our system is able to optimize reviewer recommendation empowered by richer social functionalities. Improved accuracy and work efficiency can be expected.
Other potential applications include recommending funding opportunities, publication outlets for research articles, and potential research collaborators. For example, researchers can easily promote their recently published articles using social tools in the form of likes, tweets, shares, and more. They can even track when their articles are cited by others. Meanwhile, the system may recommend researchers who work in the same research areas to each other, within and across different research communities. Based on a researcher's profile, the research social network may also recommend journals that have published relevant topics as potential outlets for working papers. All these functions are very useful for promoting the timely distribution and targeted dissemination of research work.
There are a number of limitations and possible future research directions. First, a research project has various attributes that can potentially influence both the impact and the probability of success of the project. We do not model the decision makers' preferences, beliefs, priorities, or risk attitudes. Presumably, the reviewer assignment decision problem can be modeled as a multi-objective decision problem. Second, our proposed framework only focuses on the evaluation of individual projects without building a portfolio of the most promising projects among all submitted proposals. The portfolio of projects to be funded and the amount to be awarded to each project are out of the scope of this research. Third, project evaluation, like product review, is highly subjective, and there is no feedback mechanism in the current framework to assess the quality of reviews. Historical records of funded projects, including their relevant characteristics, the evaluations given by the reviewers, and the research output measured by publications, could be valuable for better evaluating new proposals and selecting unbiased reviewers. Future extensions of the research framework may take these aspects into account. Finally, the power of Scholarmate is its ability to extract and aggregate information from multiple sources. We need to continuously improve the search tool to meet the increasing search needs of users.
Moreover, standardization of the keyword dictionary can greatly help phrase pattern recognition. While we keep evaluating and updating the keyword dictionary based on feedback on algorithm performance, we are aware that social voting is another efficient approach to identify relevant keywords and remove the less meaningful ones. We have implemented many social tools to aid this system improvement. The ultimate goal is to promote a healthy research environment in which researchers can engage in innovative research production.

Acknowledgment

This research is partially funded by the General Research Fund of the Hong Kong Research Grants Council (Project No: CityU 119611), the National Natural Science Foundation of China (Project Nos: 71171172 and J1124003) and the City University of Hong Kong (Project No: 6000201).
Appendix A

Table A. Research project selection process at the government funding agency.

Phase in R&D project selection: Call for proposal and proposal submission
Key decisions: 1) Check the validity of the submitted proposal content. 2) Check fulfillment of the application requirements by the principal investigator and by the proposal.

Phase: Identifying the most suitable external reviewers for proposal evaluation
Key decisions: 1) Selection of potential reviewers based on claimed expertise. 2) Assignment of external reviewers to validated proposals based on predefined criteria. 3) Transfer of proposals to the responsible divisions.

Phase: Peer review
Key decisions: 1) Review of the quality and content of proposals by external reviewers based on the provided guidelines. 2) Validation of the review content. 3) Coordination with external reviewers and completion of the review process as scheduled.

Phase: Review results aggregation
Key decisions: 1) Aggregation of the review results, transformation of the results into comparable measurements and ranking of the proposals accordingly. 2) Recommendation of proposals for panel evaluation.

Phase: Panel evaluation
Key decisions: 1) Refinement of the suggested proposal list by a panel of experts making decisions on marginal proposals. 2) Suggestion of the funded project list.

Phase: Final decision making
Key decisions: 1) Consideration of exceptional cases. 2) Recommendation of the list of projects to be funded.

Table B. Table of notation.

Profiling
P = {p1, …, pm}: initial set of m phrases
D = {d1, d2, …, dn}: initial set of n documents
r: number of clusters
fik: occurrence frequency of phrase k in document di, k = 1, 2, …, m
rpi: initial phrase set of document di
crpr: cluster of phrase patterns
support(crpr): supporting measure of cluster crpr
wrk: normalized phrase frequency for cluster r, k = 1, 2, …, m
βik: relative importance weight of phrase pk in document i, k = 1, 2, …, m
β(crpr): normal form of the cluster phrase patterns
βi = {βi1, βi2, …, βim}: phrase weight distribution of document i

Relevance index rij
Jij: Jaccard similarity index of proposal i and reviewer j
Cij: cosine similarity index of proposal i and reviewer j

Connectivity index cij
uij: collaboration frequency between researchers i and j
uIJ: fraction of collaboration frequency between researchers in community I and those in community J
aI: weighted fraction of edges that connect to vertices in community I
Qs: modularity measure for a network with s communities
gij: goodness of fit between proposal i and reviewer j

Productivity index ej
Gj: potential reviewer j's publication score
Rj: potential reviewer j's academic rank score
Hj: potential reviewer j's H-index
ej: potential reviewer j's productivity measure
e: desired average productivity level determined by panel chairs or division managers

References

[1] H. Abe, S. Tsumoto, Analysis of research keys as temporal patterns of technical term usages in bibliographical data, in: A. An, P. Lingras, S. Petty, R. Huang (Eds.), Active Media Technology, 6335, Springer, Berlin Heidelberg, 2010, pp. 150–157.
[2] E.M. Airoldi, X. Bai, K.M. Carley, Network sampling and classification: an investigation of network model representations, Decision Support Systems 51 (3) (2011) 506–518.
[3] R. Baeza-Yates, B. Ribeiro-Neto, Modern Information Retrieval, 2nd edition, Addison-Wesley, Wokingham, UK, 2011.
[4] A. Bajaj, R. Russell, AWSM: allocation of workflows utilizing social network metrics, Decision Support Systems 50 (1) (2010) 191–202.
[5] A.L. Barabasi, H. Jeong, Z. Neda, E. Ravasz, A. Schubert, T. Vicsek, Evolution of the social network of scientific collaborations, Physica A: Statistical Mechanics and Its Applications 311 (3) (2002) 590–614.
[6] J.P. Caulkins, W. Ding, G.T. Duncan, R. Krishnan, E. Nyberg, A method for managing access to web pages: filtering by Statistical Classification (FSC) applied to text, Decision Support Systems 42 (1) (2006) 144–161.
[7] J. Choi, S. Yi, K.C. Lee, Analysis of keyword networks in MIS research and implications for predicting knowledge evolution, Information & Management 48 (8) (2011) 371–381.
[8] Y. Dang, Y. Zhang, P.J. Hu, S.A. Brown, H. Chen, Knowledge mapping for rapidly evolving domains: a design science approach, Decision Support Systems 50 (2) (2011) 415–427.
[9] Y. Dong, Z. Sun, H. Jia, A cosine similarity-based negative selection algorithm for time series novelty detection, Mechanical Systems and Signal Processing 20 (6) (2006) 1461–1472.
[10] W. Fan, M.D. Gordon, P. Pathak, Effective profiling of consumer information retrieval needs: a unified framework and empirical comparison, Decision Support Systems 40 (2) (2005) 213–233.
[11] M.A.H. Farquad, I. Bose, Preprocessing unbalanced data using support vector machine, Decision Support Systems 53 (1) (2012) 226–233.
[12] M. Girvan, M.E.J. Newman, Community structure in social and biological networks, Proceedings of the National Academy of Sciences of the United States of America 99 (12) (2002) 7821–7826.
[13] A.D. Henriksen, A.J. Traynor, A practical R&D project-selection scoring tool, IEEE Transactions on Engineering Management 46 (2) (1999) 158–170.
[14] E. Herrera-Viedma, C. Porcel, Using incomplete fuzzy linguistic preference relations to characterize user profiles in recommender systems, Ninth International Conference on Intelligent Systems Design and Applications, ISDA '09, 2009, pp. 90–95.
[15] C.C. Huang, P.Y. Chu, Y.H. Chiang, A fuzzy AHP application in government-sponsored R&D project selection, Omega 36 (6) (2008) 1038–1052.
[16] T. Joachims, A statistical learning model of text classification with support vector machines, Proceedings of ACM SIGIR'01, 2001, pp. 128–136.
[17] R.N. Kostoff, J.A. Del Rio, J.A. Humenik, E.O. Garcia, A.M. Ramirez, Citation mining: integrating text mining and bibliometrics for research user profiling, Journal of the American Society for Information Science and Technology 52 (13) (2001) 1148–1156.
[18] R.N. Kostoff, T. Braun, A. Schubert, D.R. Toothman, J.A. Humenik, Fullerene data mining using bibliometrics and database tomography, Journal of Chemical Information and Computer Sciences 40 (2000) 19–39.
[19] Y. Li, C. Zhang, J.R. Swan, An information filtering model on the web and its application in job agent, Knowledge-Based Systems 13 (5) (2000) 285–296.
[20] Y. Li, X. Zhou, P. Bruza, Y. Xu, R.Y.K. Lau, A two-stage decision model for information filtering, Decision Support Systems (2011), http://dx.doi.org/10.1016/j.dss.2011.11.005.
[21] T.M. Mitchell, Machine Learning, McGraw-Hill, New York, NY, 1997.
[22] J. Mostafa, W. Lam, Automatic classification using supervised learning in a medical document filtering application, Information Processing and Management 36 (3) (2000) 415–444.
[23] M.E.J. Newman, The structure of scientific collaboration networks, Proceedings of the National Academy of Sciences of the United States of America 98 (2001) 404–409.
[24] M.E.J. Newman, Coauthorship networks and patterns of scientific collaboration, Proceedings of the National Academy of Sciences of the United States of America 101 (Suppl. 1) (2004) 5200–5205.
[25] M.E.J. Newman, Fast algorithm for detecting community structure in networks, Physical Review E 69 (6) (2004).
[26] G. Oestreicher-Singer, A. Sundararajan, Recommendation networks and the long tail of electronic commerce, MIS Quarterly 36 (1) (2012) 65–83.
[27] J. Qiu, Z. Lin, A framework for exploring organizational structure in dynamic social networks, Decision Support Systems 51 (4) (2011) 760–771.
[28] S. Raghuram, P. Tuertscher, R. Garud, Research note: mapping the field of virtual work: a cocitation analysis, Information Systems Research 21 (4) (2010) 983–999.
[29] S. Robertson, I. Soboroff, The TREC 2002 Filtering Track Report, TREC, 2002.
[30] J. Scott, Social Network Analysis: A Handbook, Sage Publications, London, 2000.
[31] N. Shibata, Y. Kajikawa, I. Sakata, Measuring relatedness between communities in a citation network, Journal of the American Society for Information Science and Technology 62 (7) (2011) 1360–1369.
[32] T. Strzalkowski, Robust text processing in automated information retrieval, Proceedings of the 4th Applied Natural Language Processing Conference (ANLP), 1994, pp. 168–173.
[33] Y.H. Sun, J. Ma, Z. Fan, J. Wang, A group decision support approach to evaluate experts for R&D project selection, IEEE Transactions on Engineering Management 55 (1) (2008) 158–170.
[34] Y.H. Sun, J. Ma, Z.P. Fan, J. Wang, A hybrid knowledge and model approach for reviewer assignment, Expert Systems with Applications 34 (2008) 817–824.
[35] Q. Tian, J. Ma, J. Liang, R.C.W. Kwok, O. Liu, An organizational decision support system for effective R&D project selection, Decision Support Systems 39 (2005) 403–413.
[36] E. Turban, D. Zhou, J. Ma, A group decision support approach to evaluating journals, Information & Management 42 (1) (2004) 31–44.
[37] A.S. Vivacqua, J. Oliveira, J.M. De Souza, i-ProSE: inferring user profiles in a scientific context, The Computer Journal 52 (7) (2009) 789–798.
[38] K.M. Wang, C.K. Wang, C. Hu, Analytic hierarchy process with fuzzy scoring in evaluating multidisciplinary R&D projects in China, IEEE Transactions on Engineering Management 52 (1) (2005) 119–129.
[39] D.J. Watts, S.H. Strogatz, Collective dynamics of 'small-world' networks, Nature 393 (1998) 440–442.
[40] Z. Zheng, K. Chen, G. Sun, H. Zha, A regression framework for learning ranking functions using relative relevance judgments, Proceedings of SIGIR'07, 2007, pp. 287–294.

Thushari Silva is currently pursuing her PhD in the Department of Information Systems at the City University of Hong Kong. She received her MSc in Information and Communication Technology from the Asian Institute of Technology, Thailand, in 2010. Her research interests include research social network analysis, recommender systems, business intelligence and the semantic web.

Zhiling Guo is an Assistant Professor in Information Systems at the City University of Hong Kong. She received her Ph.D. in Management Science and Information Systems from The University of Texas at Austin in 2005. Dr. Guo's general research interests include online auctions, electronic markets, cloud computing, crowdsourcing, social networks, social media marketing, and supply chain risk management. Her papers have been published in Management Science, Information Systems Research, Journal of Management Information Systems, and Decision Support Systems, among others.

Jian Ma is a Professor in the Department of Information Systems at the City University of Hong Kong. He received his Doctor of Engineering degree in Computer Science from the Asian Institute of Technology in 1991. Prof. Ma's general research interests include business intelligence, research and innovation social networks, research information systems and decision support systems. His past research has been published in IEEE Transactions on Engineering Management, IEEE Transactions on Education, IEEE Transactions on Systems, Man and Cybernetics, Decision Support Systems and European Journal of Operational Research, among others.

Hongbing Jiang is currently pursuing his PhD in the University of Science and Technology of China–City University of Hong Kong Joint Advanced Research Center, Suzhou. His research interests include recommendation systems and social network analysis.

Huaping Chen is a Professor in the School of Management at the University of Science and Technology of China. His research interests include information strategies, business intelligence and its applications. His past research has been published in the Journal of Operations Management, Decision Support Systems and Computers & Operations Research, among others.