Information Retrieval (IR) methods have been widely adopted to identify traceability links based on the textual similarity of software artifacts. However, noise due to word usage in software artifacts might negatively affect the recovery accuracy. We propose the use of smoothing filters to reduce the effect of noise in software artifacts and improve the performances of traceability recovery methods. An empirical evaluation performed on two repositories indicates that the usage of a smoothing filter is able to significantly improve the performances of Vector Space Model and Latent Semantic Indexing. Such a result suggests that, besides being used for traceability recovery, the proposed filter can be used to improve the performances of various other software engineering approaches based on textual analysis.
Cross-project defect prediction is very appealing because (i) it allows predicting defects in projects for which the availability of data is limited, and (ii) it allows producing generalizable prediction models. However, existing research suggests that cross-project prediction is particularly challenging and, due to heterogeneity of projects, prediction accuracy is not always very good. This paper proposes a novel, multi-objective approach for cross-project defect prediction, based on a multi-objective logistic regression model built using a genetic algorithm. Instead of providing the software engineer with a single predictive model, the multi-objective approach allows software engineers to choose predictors achieving a compromise between number of likely defect-prone artifacts (effectiveness) and LOC to be analyzed/tested (which can be considered as a proxy of the cost of code inspection). Results of an empirical evaluation on 10 datasets from the Promise repository indicate the superiority and the usefulness of the multi-objective approach with respect to single-objective predictors. Also, the proposed approach outperforms an alternative approach for cross-project prediction, based on local prediction upon clusters of similar classes.
How the Evolution of Emerging Collaborations Relates to Code Changes: An Empirical Study (Sebastiano Panichella)
Developers contributing to open source projects spontaneously group into "emerging" teams, reflected by messages exchanged over mailing lists, issue trackers and other communication means. Previous studies suggested that such teams somewhat mirror the software modularity. This paper empirically investigates how, when a project evolves, emerging teams re-organize themselves, e.g., by splitting or merging. We relate the evolution of teams to the files they change, to investigate whether teams split to work on cohesive groups of files. Results of this study conducted on the evolution history of four open source projects, namely Apache HTTPD, Eclipse JDT, Netbeans, and Samba, provide indications of what happens in the project when teams reorganize. Specifically, we found that emerging team splits imply working on more cohesive groups of files, and emerging team merges imply working on groups of files that are cohesive from a structural perspective. Such indications serve to better understand the evolution of software projects. More importantly, the observation of how emerging teams change can serve to suggest software remodularization actions.
Knowledge-poor and Knowledge-rich Approaches for Multilingual Terminology Extraction
We present two terminology extraction tools to compare a knowledge-poor and a knowledge-rich approach. Both tools process SWT and MWT and are designed to handle multilingualism. We run an evaluation on 6 languages and 2 different domains using crawled comparable corpora and hand-crafted reference term lists (RTL). We discuss the 3 main results achieved for terminology extraction. The first two evaluation scenarios concern the knowledge-rich framework. Scenario 1 (S1) compares performances for each of the languages depending on the ranking that is applied: specificity score vs. the number of occurrences. Scenario 2 (S2) examines the relevancy of the term variant identification to increase the precision ranking for any of the languages. Scenario 3 (S3) compares both tools and demonstrates that a probabilistic term extraction approach, developed with minimal effort, achieves satisfactory results when compared to a rule-based method.
Conference: CICLing 2013, Samos
Word2Vec model to generate synonyms on the fly in Apache Lucene (Sease)
If you want to expand your queries/documents with synonyms in Apache Lucene, you need a predefined file containing the list of terms that share the same semantics. It’s not always easy to find a list of basic synonyms for a language and, even if you find one, it doesn’t necessarily match your contextual domain.
The term “daemon” in the domain of operating system articles is not a synonym of “devil” but it’s closer to the term “process”.
Word2Vec is a two-layer neural network that takes as input a text and outputs a vector representation for each word in the dictionary. Two words with similar meanings are identified with two vectors close to each other.
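As a minimal illustration of "close vectors", here is a stdlib-only cosine-similarity sketch. The 3-dimensional vectors below are invented for the example; real Word2Vec embeddings have hundreds of dimensions and are learned from a corpus:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm

# Invented 3-dimensional "embeddings" for illustration only.
vectors = {
    "daemon":  [0.9, 0.1, 0.2],
    "process": [0.8, 0.2, 0.3],
    "devil":   [0.1, 0.9, 0.8],
}

# In a corpus of operating-system articles we would expect "daemon"
# to land near "process" and far from "devil".
sim_process = cosine(vectors["daemon"], vectors["process"])
sim_devil = cosine(vectors["daemon"], vectors["devil"])
```

In a real setup these similarities would come from a trained model, and the top-k nearest words would be injected as synonyms at query or index time.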
On how to change the utility curve of deep learning to make deep learning projects deliver an ROI no matter how accurate the machine learning system is - presented at the Nasscom Analytics Summit 2018.
The presentation was given at the Rivier Scala / Clojure User Group meeting on 10.6.2013. It is a half-baked presentation; I will upload the final version when ready.
The first part is about DSLs in general, complexities in software engineering, and abstraction. The second part presents a quick overview of DSLs in Scala and touches on some of the technologies used for deep embedding.
Maliheh (Mali) Izadi, PhD, Andrea Di Sorbo, and Sebastiano Panichella co-chaired the 3rd Intl. Workshop on NL-based Software Engineering, April 20, 2024, Lisbon, Portugal.
Diversity-guided Search Exploration for Self-driving Cars Test Generation through Frenet Space Encoding (Sebastiano Panichella)
Timo Blattner, Christian Birchler, Timo Kehrer, Sebastiano Panichella: Diversity-guided Search Exploration for Self-driving Cars Test Generation through Frenet Space Encoding. Intl. Workshop on Search-Based and Fuzz Testing (SBFT). 2024
SBFT Tool Competition 2024 -- Python Test Case Generation Track (Sebastiano Panichella)
Nicolas Erni, Al-Ameen, Mohammed, Christian Birchler, Pouria Derakhshanfar, Stephan Lukasczyk, Sebastiano Panichella: SBFT Tool Competition 2024 -- Python Test Case Generation Track. 17th International Workshop on Search-Based and Fuzz Testing
SBFT Tool Competition 2024 - CPS-UAV Test Case Generation Track (Sebastiano Panichella)
Sajad Khatiri, Prasun Saurabh, Timothy Zimmermann, Charith Munasinghe, Christian Birchler, Sebastiano Panichella: SBFT Tool Competition 2024 - CPS-UAV Test Case Generation Track. 17th International Workshop on Search-Based and Fuzz Testing
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist (Sebastiano Panichella)
Sajad Khatiri, Sebastiano Panichella, Paolo Tonella: Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist. International Conference on Software Engineering. 2024
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective Test Generation and Selection (Sebastiano Panichella)
Lecture entitled "Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective Test Generation and Selection" at the International Summer School on Search- and Machine Learning-based Software Engineering
June 22-24, 2022 - Córdoba, Spain
Sebastiano Panichella and Christian Birchler
COSMOS: DevOps for Complex Cyber-physical Systems
Sebastiano Panichella
Zurich University of Applied Sciences (ZHAW)
Workshop on Adaptive CPSoS (WASOS) 2023
Testing and Development Challenges for Complex Cyber-Physical Systems: Insights from the COSMOS H2020 Project (Sebastiano Panichella)
Keynote presentation at ICST (AIST workshop) entitled "Testing and Development Challenges for Complex Cyber-Physical Systems: Insights from the COSMOS H2020 Project"
An Empirical Characterization of Software Bugs in Open-Source Cyber-Physical Systems (Sebastiano Panichella)
Presentation at the 16th IEEE International Conference on Software Testing, Verification and Validation (ICST): An Empirical Characterization of Software Bugs in Open-Source Cyber-Physical Systems. Journal of Systems & Software (JSS).
Automated Identification and Qualitative Characterization of Safety Concerns Reported in UAV Software Platforms (Sebastiano Panichella)
Presentation at the IEEE/ACM International Conference on Automated Software Engineering (ASE 2023): “Automated Identification and Qualitative Characterization of Safety Concerns Reported in UAV Software Platforms”. Transactions on Software Engineering and Methodology.
Simulation-based Test Case Generation for Unmanned Aerial Vehicles in the Neighborhood of Real Flights (Sebastiano Panichella)
Here are the slides of the presentation of the paper entitled "Simulation-based Test Case Generation for Unmanned Aerial Vehicles in the Neighborhood of Real Flights". It was presented at the IEEE International Conference on Software Testing, Verification, and Validation (ICST) 2023.
The presentation concerns the ongoing research in the COSMOS H2020 project (https://www.cosmos-devops.org/), as outlined by the ICST Program (https://conf.researchr.org/program/icst-2023/program-icst-2023/?past=Show%20upcoming%20events%20only).
Exposed! A case study on the vulnerability-proneness of Google Play Apps (Sebastiano Panichella)
Title: Exposed! A case study on the vulnerability-proneness of Google Play Apps
Authors:
Andrea Di Sorbo, Sebastiano Panichella
Venue:
ESEC/FSE - Journal First Presentation
14-18 November 2022, Singapore
Video:
https://www.youtube.com/watch?v=9lv3WGuNM0A&ab_channel=Sebastiano
Search-based Software Testing (SBST) '22
Workshop Co-Chairs:
Giovani Guizzo
UNIVERSITY COLLEGE LONDON, UNITED KINGDOM
Sebastiano Panichella
ZURICH UNIVERSITY OF APPLIED SCIENCE, SWITZERLAND
Competition Co-Chairs:
Alessio Gambi
UNIVERSITY OF PASSAU, GERMANY
Gunel Jahangirova
UNIVERSITÀ DELLA SVIZZERA ITALIANA, SWITZERLAND
Vincenzo Riccio
UNIVERSITÀ DELLA SVIZZERA ITALIANA, SWITZERLAND
Fiorella Zampetti
UNIVERSITY OF SANNIO, ITALY
Website Chair:
Rebecca Moussa
UNIVERSITY COLLEGE LONDON, UNITED KINGDOM
Program Committee:
Nazareno Aguirre, Universidad Nacional de Río Cuarto - CONICET, Argentina
Aldeida Aleti, Monash University, Australia
Giuliano Antoniol, Ecole Polytechnique de Montréal, Canada
Kate Bowers, Oakland University, USA
Jose Campos, University of Washington, USA
Thelma E. Colanzi, State University of Maringá, Brazil
Byron DeVries, Grand Valley State University, USA
Gordon Fraser, University of Passau, Germany
Erik Fredericks, Oakland University, USA
Gregory Gay, Chalmers and the University of Gothenburg, Sweden
Alessandra Gorla, IMDEA Software Institute, Spain
Gregory Kapfhammer, Allegheny College, USA
Yiling Lou, Peking University, China
Mitchell Olsthoorn, Delft University of Technology, Netherlands
Justyna Petke, University College London, UK
Silvia R. Vergilio, Universidade Federal do Paraná, Brazil
Simone do Rocio Senger de Souza, University of São Paulo, Brazil
Thomas Vogel, Humboldt-Universität zu Berlin, Germany
Jie Zhang, University College London, UK
Tool Competition
Introduction
NLP-based approaches and tools have been proposed to improve the efficiency of software engineers, processes, and products, by automatically processing natural language artifacts (issues, emails, commits, etc.).
We believe that the availability of accurate tools is becoming increasingly necessary to improve Software Engineering (SE) processes. One important process is issue management and prioritization where developers have to understand, classify, prioritize, assign, etc. incoming issues reported by end-users and developers.
This year, we are pleased to announce the first edition of the NLBSE’22 tool competition on issue report classification, an important task in issue management and prioritization.
For the competition, we provide a dataset encompassing more than 800k labeled issue reports (as bugs, enhancements, and questions) extracted from real open-source projects. You are invited to leverage this dataset for evaluating your classification approaches and compare the achieved results against a proposed baseline approach (based on FastText).
Competition overview
We created a Colab notebook with detailed information about the competition (provided data, baseline approach, paper submission, paper format, etc.).
If you want to participate, you must:
Train and tune a multi-class classifier using the provided training set. The classifier should assign exactly one label (bug, enhancement, or question) to each issue.
Evaluate your classifier on the provided test set
Write a paper (4 pages max.) describing:
The architecture and details of the classifier
The procedure used to pre-process the data
The procedure used to tune the classifier on the training set
The results of your classifier on the test set
Additional info.: provide a link to your code/tool with proper documentation on how to run it
Submit the paper by emailing the tool competition organizers (see below)
Submissions will be evaluated and accepted based on correctness and reproducibility, defined by the following criteria:
Clarity and detail of the paper content
Availability of the code/tool, released as open-source
Correct training/tuning/evaluation of your code/tool on the provided data
Clarity of the code documentation
The accepted submissions will be published at the workshop proceedings.
The submissions will be ranked based on the F1 score achieved by the proposed classifiers on the test set, as indicated in the papers.
The submission with the highest F1 score will be the winner of the competition.
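As an illustration of the ranking metric, here is a stdlib-only sketch of how an F1 score can be computed from per-class confusion counts. The counts below and the micro-averaging choice are assumptions made for the example, not the competition's official evaluation procedure:

```python
def f1_score(tp, fp, fn):
    """F1 from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

# Hypothetical per-class (tp, fp, fn) counts on a test set.
counts = {"bug": (80, 10, 20),
          "enhancement": (60, 15, 10),
          "question": (30, 5, 15)}

# Micro-averaging pools the counts over all classes before computing
# F1 (macro-averaging would instead average per-class F1 scores).
tp = sum(c[0] for c in counts.values())
fp = sum(c[1] for c in counts.values())
fn = sum(c[2] for c in counts.values())
micro_f1 = f1_score(tp, fp, fn)
```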
How to participate?
Email your paper to Oscar Chaparro (oscarch@wm.edu) and Rafael Kallis (rk@rafaelkallis.com) by the submission deadline.
ICPC 2011 - Improving IR-based Traceability Recovery Using Smoothing Filters
1. Improving IR-based Traceability Recovery Using Smoothing Filters
Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella
2. Software traceability
“The degree to which a relationship can be established between two products of a software development process” [IEEE Glossary for Software Terminology]
[Diagram: traceability links between use cases, source code, and test cases]
Important for: program comprehension, requirement tracing, impact analysis, software reuse, …
Up-to-date traceability links rarely exist → need to recover them
3. IR-based traceability recovery
Antoniol et al., 2002 (VSM + probabilistic model)
Marcus and Maletic, 2003 (LSI)
4. Traditional IR vs. IR applied to Software Engineering
Traditional IR deals with documents that are heterogeneous for what concerns linguistic choices, syntax, and semantics; we just live with those differences.
IR applied to SE deals with sets of documents that are homogeneous for what concerns syntax and linguistic choices. Examples: use cases, test documents, and design documents follow a common template and contain recurrent words.
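This homogeneity is what IR-based recovery exploits: artifacts are turned into term vectors and compared by similarity. A minimal VSM (tf-idf + cosine) sketch, with invented toy artifacts standing in for real use cases and code files:

```python
from collections import Counter
from math import log, sqrt

def tfidf_vectors(docs):
    """Turn tokenized documents into tf-idf dicts sharing one idf."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    return [{t: tf * log(n / df[t]) for t, tf in Counter(doc).items()}
            for doc in docs]

def cosine(u, v):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = sqrt(sum(w * w for w in u.values()))
    nv = sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Invented toy artifacts (real studies use full use cases and code).
use_cases = [["modify", "visit", "patient"], ["print", "report"]]
code_files = [["visit", "patient", "date", "change"],
              ["report", "printer", "queue"]]

vecs = tfidf_vectors(use_cases + code_files)   # one shared vocabulary
uc_vecs, code_vecs = vecs[:len(use_cases)], vecs[len(use_cases):]
# Candidate links, ranked by similarity: one row per use case.
sims = [[cosine(u, c) for c in code_vecs] for u in uc_vecs]
```

Candidate links are then presented to the analyst in decreasing order of similarity.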
5. Problem
Different kinds of software artifacts require specific preprocessing. Example test case:
Test case: C51, "Change the date for a visit" (Version: 0 02 000)
Use case: UcModVis, "Satisfies the request to modify a visit for a patient"
Priority: High
....
Test description:
Input: Select a visit: 26/09/2003 11:00 First visit; Change: 03/10/2003 11:00
Oracle: Invalid sequence: The system does not allow to change a booking
Coverage: Valid classes: CE1 CE8 CE14 CE19 CE21; Invalid classes: None
6. Problem
Different kinds of software artifacts require specific preprocessing: in the test case above, artifact-specific words do not bring useful information.
8. Noisy images
Noise appears as pixels with peaks of low or high color intensity.
9. Reducing noise using smoothing filters
Mean filter: g(x, y) = (1/M) · Σ_{f(n,m) ∈ S} f(n, m), where S is the neighborhood of pixel (x, y) and M is the number of pixels in S.
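A minimal sketch of such a mean filter on a 2D grid; clipping the window at the image borders is an implementation choice, not prescribed by the slide:

```python
def mean_filter(img, radius=1):
    """Replace each pixel with the mean of its (2r+1) x (2r+1)
    neighborhood S; the window is clipped at the image borders."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[n][m]
                      for n in range(max(0, y - radius), min(h, y + radius + 1))
                      for m in range(max(0, x - radius), min(w, x + radius + 1))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out

# A flat image with one noisy spike: smoothing pulls the spike
# toward its neighbours.
noisy = [[10, 10, 10],
         [10, 100, 10],
         [10, 10, 10]]
smoothed = mean_filter(noisy)   # smoothed[1][1] == 20.0
```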
10. Image vs. traceability noise
Image noise: pixels with high or low color intensity; pixels are position dependent.
Traceability noise: terms and linguistic patterns occurring in many artifacts of a given category (use cases, test cases, …); artifacts (columns) are position independent.
11. Representing the noise
Source documents s1, s2, …, sk and target documents t1, t2, …, tz are represented as term-by-document matrices, where entry v[i,j] is the weight of word_i in document j. Each matrix contains linguistic information strictly belonging to its own set (source or target), plus common information shared by all documents of that set.
12. Representing the noise
The common information of each set is captured by its mean vector:
Mean source vector: S̄[i] = (1/k) · Σ_{j=1..k} v[i,j]
Mean target vector: T̄[i] = (1/z) · Σ_{j=k+1..k+z} v[i,j]
The mean vectors are like the continuous component of a signal…
13. Representing the noise
The filter subtracts the mean vector of each set from every document of that set: subtracting S̄ (mean source vector) from each source column yields the filtered source set; subtracting T̄ (mean target vector) from each target column yields the filtered target set.
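The subtraction step described in slide 13 can be sketched as follows. For readability the matrix is stored as documents x terms (the transpose of the slides' term-by-document layout), and the same function would be applied separately to the source set and to the target set:

```python
def smooth_docs(matrix):
    """Subtract the per-term mean over the set from every document.
    `matrix` holds one list of term weights per document."""
    n_docs, n_terms = len(matrix), len(matrix[0])
    mean = [sum(doc[i] for doc in matrix) / n_docs for i in range(n_terms)]
    return [[doc[i] - mean[i] for i in range(n_terms)] for doc in matrix]

# Toy source set: term 0 has a similar weight in every document
# ("common information"); the filter pushes it toward zero.
source_docs = [[0.9, 0.1, 0.0],
               [1.0, 0.0, 0.5],
               [1.1, 0.2, 0.1]]
filtered_source = smooth_docs(source_docs)
```

After filtering, each term's weights are centered on zero within the set, so terms occurring uniformly across all artifacts of a category no longer dominate the similarity computation.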
14. Empirical Study
Goal: analyze the effect of the smoothing filter
Purpose: investigating how the filter affects traceability recovery
Quality focus: traceability recovery performance (precision and recall)
Perspective: researchers (evaluating the novel technique) and project managers (adopting a better traceability recovery technique)
Context: artifacts from two systems, EasyClinic and Pine
15. Context
                 EasyClinic                       Pine
Description      Medical doctor office mgmt.     Text-based email client
Language         Java                            C
Files/Classes    37                              31
KLOC             20                              130
Documents        113                             100
Doc. language    Italian                         English
Artifacts        Use cases, interaction          Requirements, use cases
                 diagrams, source code,
                 test cases
16. Research Questions and Factors
RQ1: Does the smoothing filter improve the recovery performances of VSM-based traceability recovery?
RQ2: Does the smoothing filter improve the recovery performances of LSI-based traceability recovery?
RQ3: How do the performances vary for different types of artifacts?
Factors:
Use of filter: YES, NO
Technique: VSM, LSI
Artifact: Req., UC, Int. Diagrams, Code, TC
System: EasyClinic, Pine
17. Analysis Method
Performances evaluated by precision and recall:
precision = |correct ∩ retrieved| / |retrieved|
recall = |correct ∩ retrieved| / |correct|
We statistically compare the number of false positives of different methods (e.g., M1 vs. M2, per-link false-positive counts) for each correct link identified, using:
Wilcoxon Rank Sum test
Cliff’s delta effect size
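The two metrics can be computed directly from link sets; the link identifiers below are hypothetical:

```python
def precision_recall(correct, retrieved):
    """Precision and recall of retrieved links against an oracle."""
    correct, retrieved = set(correct), set(retrieved)
    hits = correct & retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(correct) if correct else 0.0
    return precision, recall

# Hypothetical link sets (pairs: use-case id, code-file id).
oracle = {("UC1", "C1"), ("UC2", "C2"), ("UC3", "C3"), ("UC4", "C4")}
retrieved = {("UC1", "C1"), ("UC2", "C2"), ("UC3", "C3"),
             ("UC2", "C7"), ("UC5", "C9")}
p, r = precision_recall(oracle, retrieved)   # p = 3/5, r = 3/4
```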
21. EasyClinic: test cases onto source code (LSI)
[Precision/recall curves: Filtered vs. Not Filtered]
Test cases are: short documents, with a limited vocabulary, mostly consistent with source code
22. Pine: use cases onto requirements (LSI)
[Precision/recall curves: Filtered vs. Not Filtered]
24. Link precision improvement
Login Patient vs. Person: poor vocabulary overlap (10%)
25. Threats to validity
Construct validity: mainly related to our oracle, provided by developers and, for EasyClinic, also peer-reviewed
Internal validity: improvements could be due to other reasons; however, we compared different techniques (VSM, LSI), and the approach works well regardless of stop word removal/stemming and use of tf-idf
Conclusion validity: conclusions based on proper (non-parametric) statistics
External validity: we considered systems with different characteristics and artifacts, but further studies are desirable
26. Conclusions
We proposed the use of a smoothing filter to improve the performances of IR-based traceability recovery; the idea is inspired by digital signal processing
The filter significantly improves IR-based traceability recovery based on VSM (RQ1) and LSI (RQ2)
The filter is particularly suitable for artifacts having a higher verbosity (RQ3), e.g., requirements and use cases
It is less useful for artifacts composed of short sentences and using a limited vocabulary, e.g., test cases
27. Work-in-progress
Study replication: different systems and artifacts
Use of relevance feedback
More sophisticated smoothing techniques: non-linear filters
Use in other applications of IR to software engineering, e.g., impact analysis or feature location