This report discusses recommendations for improving a paper on semi-automating the study selection process in systematic literature reviews. It provides background on systematic reviews and their goal of reducing bias. It then outlines recommendations for enhancing the paper's approach to constructing a bag-of-words model and linking key phrases to semantic web resources to better represent concepts. Specific suggestions include considering related work sections, keywords, and common properties among resources when building the model.
1. Report for the course “Empirical
Methods in Software Engineering”.
Faisal Razzak (faisal.razzak@polito.it)
2. What is a Systematic Review?
My Goal.
Recommendations.
3. Literature review to answer a set of
research questions.
Pre-defined protocol for selecting and
categorizing primary sources of
literature.
Objective evaluation of findings.
Reducing subjectivity bias.
4. Steps of a systematic review:
1. Identification of research.
2. Selection of studies.
3. Study quality assessment.
4. Data extraction and monitoring progress.
5. Data synthesis.
5. Tomassetti et al. [2] proposed semi-automating
the study selection process (Step 2) to reduce
the manual work needed and the resulting
subjectivity bias.
This informal report provides
recommendations to improve the work
presented in [2].
[Note] The recommendations are based only on
a review of the paper; no development or
testing has been performed.
6. To construct the bag-of-words model (M)
from the initial set of sources (I0), the
‘Related Work’ or ‘Literature Review’
sections should be considered.
These sections already point to related
papers that discuss similar topics. The
titles, abstracts and introductions of
those papers should be used to construct
M.
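The recommendation above could be sketched as follows. This is a minimal illustration, not the implementation used in [2]: the record layout (`title`, `abstract`, `introduction`), the function names, the tiny stop-word list and the sample texts are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "on", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens minus stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def build_bag_of_words(papers, fields=("title", "abstract", "introduction")):
    """Count tokens over the chosen fields of every paper; missing fields are skipped."""
    bag = Counter()
    for paper in papers:
        for field in fields:
            bag.update(tokenize(paper.get(field, "")))
    return bag


# Illustrative records standing in for papers cited in a 'Related Work' section.
related_papers = [
    {"title": "Procedures for performing systematic reviews",
     "abstract": "Guidelines for systematic reviews in software engineering."},
    {"title": "Selection of studies in systematic reviews",
     "abstract": "Study selection is a manual step of the review.",
     "introduction": "Systematic reviews answer research questions."},
]

M = build_bag_of_words(related_papers)
print(M.most_common(3))
```

Restricting `fields` to the title, abstract and introduction keeps the model focused on the parts of a paper that summarize its topic.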
7. Moreover, many papers now list
‘keywords’ that describe their content.
Keywords are not a perfect description, but
including some keywords from I0 might help
model M.
BibBase [5] is also a useful resource
for extracting paper information.
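One possible way to fold author keywords into M is sketched below. The `keywords` field, the `add_keywords` helper and the `weight` parameter are hypothetical; nothing here is taken from [2] or from the BibBase service.

```python
from collections import Counter


def add_keywords(bag, papers, weight=1):
    """Return a copy of the bag with each paper's author keywords added.

    Keywords are kept as whole lowercase phrases. `weight` is a hypothetical
    knob for counting a keyword more heavily than an ordinary word.
    """
    extended = Counter(bag)
    for paper in papers:
        for keyword in paper.get("keywords", []):
            extended[keyword.strip().lower()] += weight
    return extended


# Illustrative inputs: a small existing model and three papers from I0.
M = Counter({"systematic": 4, "reviews": 4})
I0 = [
    {"keywords": ["Systematic Review", "Linked Data"]},
    {"keywords": ["linked data ", "study selection"]},
    {},  # a paper without keywords is simply skipped
]

M_with_keywords = add_keywords(M, I0, weight=2)
print(M_with_keywords)
```

Returning a copy leaves the original model untouched, so the effect of adding keywords can be compared against the baseline.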
8. Each paper (wi) in I0 is processed to extract key
phrases (K).
Each key phrase (ki ∈ K) is linked to the corresponding
DBpedia resource (if available).
Besides DBpedia, resources such as WordNet [8] and
MultiWordNet [7], or their Linked Data
representations, e.g., GeoWordNet [4] or WordNet in
RDF [6], might work better for this linking.
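The linking step can be illustrated with a naive sketch that only builds a candidate DBpedia URI from a key phrase. It relies on the DBpedia convention that resource URIs start with `http://dbpedia.org/resource/` followed by the Wikipedia page title with underscores for spaces. The helper name is hypothetical, and the sketch does not check that the resource exists or disambiguate between concepts, which a real linker would have to do.

```python
from urllib.parse import quote

DBPEDIA_RESOURCE_PREFIX = "http://dbpedia.org/resource/"


def candidate_dbpedia_uri(key_phrase):
    """Guess the DBpedia URI for a key phrase, or return None for an empty phrase.

    Words are joined with underscores and the first letter is capitalized,
    mirroring how Wikipedia page titles are written.
    """
    words = key_phrase.strip().split()
    if not words:
        return None
    label = "_".join(words)
    label = label[0].upper() + label[1:]
    return DBPEDIA_RESOURCE_PREFIX + quote(label, safe="_(),")


# Illustrative key phrases K extracted from a paper.
K = ["systematic review", "linked data"]
links = {k: candidate_dbpedia_uri(k) for k in K}
print(links)
```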
9. These lexical databases are a better source for
the description of a concept and for its possible
equivalent concepts (key phrases).
Moreover, if the operation has to be restricted to
Linked Data, the Sindice API [3] can be used to
search for relevant concepts in the Linked Data Cloud.
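Expanding key phrases with equivalent concepts might look like the sketch below. The `EQUIVALENTS` table is a hand-written, hypothetical stand-in for a lookup against a lexical database. No WordNet, MultiWordNet or Sindice call is made, and the entries are illustrative only.

```python
# Hypothetical stand-in for a lexical database: concept -> equivalent phrases.
EQUIVALENTS = {
    "systematic review": {"systematic literature review"},
    "study selection": {"selection of studies"},
}


def expand_key_phrases(key_phrases, equivalents):
    """Return the normalized key phrases together with their known equivalents."""
    expanded = set()
    for phrase in key_phrases:
        normalized = phrase.strip().lower()
        expanded.add(normalized)
        # Phrases without an entry contribute only themselves.
        expanded |= equivalents.get(normalized, set())
    return expanded


K = ["Systematic Review", "linked data"]
print(sorted(expand_key_phrases(K, EQUIVALENTS)))
```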
10. In the current version, the bag of words
contains words from all the statements of
a resource. It might be better to find a
common property among such resources
and use only that property to construct key phrases.
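A minimal sketch of this idea, assuming each resource is available as a mapping from property name to a list of literal values. The property names, values and helper functions are illustrative assumptions, not data or code from [2].

```python
from collections import Counter


def common_properties(resources):
    """Return the property names that occur in every resource."""
    if not resources:
        return set()
    return set.intersection(*(set(resource) for resource in resources))


def bag_from_property(resources, prop):
    """Build a bag of words from the values of one property only."""
    bag = Counter()
    for resource in resources:
        for value in resource.get(prop, []):
            bag.update(value.lower().split())
    return bag


# Illustrative resources; the values are made up for the example.
resources = [
    {"rdfs:label": ["Systematic review"],
     "rdfs:comment": ["A review of research literature"],
     "dbo:wikiPageID": ["123"]},
    {"rdfs:label": ["Linked data"],
     "rdfs:comment": ["Structured data interlinked on the web"]},
]

shared = common_properties(resources)
bag = bag_from_property(resources, "rdfs:comment")
print(sorted(shared), bag)
```

Statements such as the page identifier never reach the bag, because only the values of the selected shared property are tokenized.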
11. References
1. Kitchenham B. (2004) Procedures for performing systematic reviews. Technical Report.
2. Tomassetti F., Rizzo G., Vetro' A., Ardito L., Torchiano M., Morisio M. (2011) Linked Data approach for selection process automation in Systematic Reviews. In: 15th Annual Conference on Evaluation & Assessment in Software Engineering (EASE 2011), Durham City (UK), 11/04/2011-12/04/2011. pp. 31-35.
3. http://sindice.com/
4. http://geowordnet.semanticmatching.org/
5. http://bibbase.org/
6. http://www.w3.org/TR/wordnet-rdf/
7. http://multiwordnet.fbk.eu/english/home.php
8. http://wordnet.princeton.edu/