Report for the course “Empirical
Methods in Software Engineering”.
 Faisal Razzak (faisal.razzak@polito.it)
• What is a Systematic Review?
• My Goal.
• Recommendations.




• A literature review conducted to answer a set of
  research questions.
• Follows a pre-defined protocol for selecting and
  categorizing primary sources of literature.
• Provides an objective evaluation of findings.
• Reduces subjectivity bias.



1.   Identification of research.
2.   Selection of studies.
3.   Study quality assessment.
4.   Data extraction and monitoring progress.
5.   Data synthesis.



• Tomassetti et al. [2] proposed semi-automating the
  selection of studies (Step 2) to reduce the manual
  work needed and the resulting subjectivity bias.
• This informal report provides recommendations to
  improve the work presented in [2].
• [Note] The recommendations are based only on
  reviewing the paper; no development or testing
  has been performed.

• To construct the bag-of-words model (M) from the
  initial set of sources (I0), the ‘Related Work’ or
  ‘Literature Review’ sections should be considered.
• These sections already cite related papers on
  similar topics; the titles, abstracts, and
  introductions of those papers should be used to
  construct M.
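
The recommendation above can be sketched as follows. This is a minimal illustration, not the implementation of [2]: the stop-word list, the field names (`title`, `abstract`, `introduction`) and the sample abstract texts are placeholder assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}

def tokenize(text):
    """Lowercase the text and return its alphabetic tokens, minus stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def build_model(papers):
    """Build a bag-of-words Counter from title, abstract and introduction."""
    model = Counter()
    for paper in papers:
        for field in ("title", "abstract", "introduction"):
            model.update(tokenize(paper.get(field, "")))
    return model

# Papers cited in the 'Related Work' sections of I0.
# The abstracts are placeholder text, not the actual abstracts.
related_papers = [
    {"title": "Procedures for performing systematic reviews",
     "abstract": "Guidelines for systematic reviews in software engineering."},
    {"title": "Linked Data approach for selection process automation",
     "abstract": "Automation of the selection of studies."},
]
M = build_model(related_papers)
```

Here `M` maps each word to the number of times it occurs across the chosen sections of the cited papers.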
• Moreover, many papers now list ‘keywords’ that
  describe their content. Keywords are not a perfect
  summary, but including some keywords from I0
  might be helpful for model M.
• BibBase [5] is also a useful resource for
  extracting paper information.
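
One possible way to fold author keywords into M is to add them with a higher weight than ordinary words. The sketch below assumes M is a word-count mapping; the function name `add_keywords` and the default `weight` of 2 are hypothetical choices for illustration, not values taken from [2].

```python
from collections import Counter

def add_keywords(model, keywords, weight=2):
    """Return a copy of the model with each keyword's words boosted by `weight`."""
    boosted = Counter(model)
    for keyword in keywords:
        for word in keyword.lower().split():
            boosted[word] += weight
    return boosted

# Toy model and keyword list, for illustration only.
M = Counter({"systematic": 2, "review": 3})
M_with_keywords = add_keywords(M, ["Systematic Review", "Linked Data"])
```

The original model is left unchanged, so the effect of the keywords can be compared against the baseline.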



• Each paper (wi) in I0 is processed to extract key
  phrases (K).

• Each key phrase (ki ∈ K) is linked to the corresponding
  DBpedia resource (if available).

• Besides DBpedia, lexical resources such as WordNet [8]
  and MultiWordNet [7], or Linked Data representations
  such as GeoWordNet [4] and the RDF representation of
  WordNet [6], might serve better.
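
As a rough illustration of the linking step, the sketch below builds a candidate DBpedia URI for each key phrase. It relies on the convention that DBpedia resource names mirror English Wikipedia article titles (first letter capitalized, spaces replaced by underscores). It only constructs a candidate; whether the resource actually exists would still have to be checked against DBpedia, and this is not necessarily how [2] performs the linking. The function names are hypothetical.

```python
from urllib.parse import quote

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"

def candidate_dbpedia_uri(key_phrase):
    """Build a candidate DBpedia URI for a key phrase, or None if it is empty."""
    words = key_phrase.split()
    if not words:
        return None
    # Wikipedia-style title: underscores for spaces, first letter capitalized.
    title = "_".join(words)
    title = title[0].upper() + title[1:]
    return DBPEDIA_RESOURCE + quote(title, safe="_(),'")

def link_key_phrases(key_phrases):
    """Map each key phrase to its candidate URI (existence is not verified)."""
    return {k: candidate_dbpedia_uri(k) for k in key_phrases}

links = link_key_phrases(["systematic review", "linked data"])
```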

• These word databases are a better resource for
  obtaining the description of a concept and its
  possible equivalent concepts (key phrases).
• Moreover, if the operation has to be restricted to
  Linked Data, the Sindice API [3] can be used to
  search for relevant concepts on the Linked Data cloud.
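
To show what "equivalent concepts" could add, the sketch below expands key phrases with synonyms. The synonym sets are hand-written and hypothetical; they stand in for a lookup against WordNet or a similar database, which this example does not call.

```python
# Hypothetical, hand-written synonym sets standing in for a WordNet lookup.
TOY_SYNSETS = [
    {"review", "survey"},
    {"study", "investigation"},
]

def equivalents(word, synsets=TOY_SYNSETS):
    """Return all words that share a synonym set with `word`."""
    result = set()
    for synset in synsets:
        if word in synset:
            result |= synset
    result.discard(word)
    return result

def expand_key_phrases(key_phrases, synsets=TOY_SYNSETS):
    """Add variants of each phrase with one word swapped for an equivalent."""
    expanded = set(key_phrases)
    for phrase in key_phrases:
        words = phrase.split()
        for i, word in enumerate(words):
            for alt in equivalents(word, synsets):
                expanded.add(" ".join(words[:i] + [alt] + words[i + 1:]))
    return expanded

K_expanded = expand_key_phrases(["systematic review"])
```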




• In the current version, the bag of words contains
  words from all the statements of a resource. It
  might be better to identify a property common to
  such resources and use only its values to
  construct key phrases.
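
A minimal sketch of this idea, assuming each resource is available as a mapping from property name to a list of literal values. The resource descriptions below are made-up placeholders, and the helper names are hypothetical.

```python
def common_properties(resources):
    """Return the property names present in every resource description."""
    property_sets = [set(statements) for statements in resources.values()]
    return set.intersection(*property_sets) if property_sets else set()

def words_from_property(resources, prop):
    """Collect lowercase words from the values of one property only."""
    words = []
    for statements in resources.values():
        for value in statements.get(prop, []):
            words.extend(value.lower().split())
    return words

# Placeholder descriptions, not actual DBpedia data.
resources = {
    "dbpedia:Systematic_review": {
        "rdfs:label": ["Systematic review"],
        "rdfs:comment": ["placeholder description text"],
    },
    "dbpedia:Linked_data": {
        "rdfs:label": ["Linked data"],
        "ex:other": ["placeholder"],
    },
}
shared = common_properties(resources)
words = words_from_property(resources, "rdfs:label")
```

Only the shared property (`rdfs:label` in this toy data) contributes words, so statements that appear on just one resource no longer dilute the model.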




1.   Kitchenham B. (2004) Procedures for performing systematic reviews.
     Technical Report.
2.   Tomassetti F., Rizzo G., Vetro' A., Ardito L., Torchiano M., Morisio M.
     (2011) Linked Data approach for selection process automation in
     Systematic Reviews. In: 15th Annual Conference on Evaluation &
     Assessment in Software Engineering (EASE 2011), Durham City (UK),
     11/04/2011-12/04/2011. pp. 31-35.
3.   Sindice. http://sindice.com/
4.   GeoWordNet. http://geowordnet.semanticmatching.org/
5.   BibBase. http://bibbase.org/
6.   WordNet RDF. http://www.w3.org/TR/wordnet-rdf/
7.   MultiWordNet. http://multiwordnet.fbk.eu/english/home.php
8.   WordNet. http://wordnet.princeton.edu/

Faisal Razzak (Faisal.razzak@polito.it)

Recommendations for selection process automation in systematic reviews
