
Promise 2011: "Selecting Discriminating Terms for Bug Assignment: A Formal Analysis"

by West Virginia University on Sep 21, 2011


Promise 2011:
"Selecting Discriminating Terms for Bug Assignment: A Formal Analysis"
Ibrahim Aljarah, Shadi Banitaan, Sameer Abufardeh, Wei Jin and Saeed Salem.

Promise 2011: "Selecting Discriminating Terms for Bug Assignment: A Formal Analysis" - Presentation Transcript

• Selecting Discriminating Terms for Bug Assignment: A Formal Analysis
  Ibrahim Aljarah, Shadi Banitaan, Sameer Abufardeh, Wei Jin and Saeed Salem
  North Dakota State University, Fargo, ND, USA
  This research is supported by
• Presentation Outline
  - Bug Assignment Problem Overview
  - Bug Assignment Steps
  - Term Selection
  - Log Odds Ratio based Term Selection Techniques
  - Experimental Results
  - Conclusion
  - Future Directions
• Bug Assignment Problem
  Suggest whom to assign this bug to: assign the bug to an appropriate developer.
  [Diagram: a bug triager routes new bugs (B1-B7) to developers D1-D4.]
• Bug Assignment Steps
• Bug Reports Preprocessing
• Bug-term matrix (M) and Bug-developer vector (Y) construction
  - T = {t1, t2, ..., tR} is a set of R terms.
  - D = {d1, ..., dL} is a set of L pre-defined developers.
  - B = {b1, ..., bN} is a set of N bug reports to be assigned.
  - A value in {0, 1} is assigned to each entry of the bug-term matrix M; Y records the developer for each bug report.

        t1 t2 t3 ... tR | Y
  b1     0  0  1 ...  1 | d1
  b2     1  1  1 ...  0 | d1
  b3     0  0  1 ...  1 | d3
  b4     1  1  0 ...  0 | d1
  b5     0  0  1 ...  1 | d5
  ...
  bN     .  .  . ...  . | d9
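The matrix construction above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the reports, terms, and the helper name `build_matrix` are all hypothetical, and each entry is simply 1 when the term occurs in the report.

```python
# Sketch of building the binary bug-term matrix M and the
# bug-developer vector Y from preprocessed bug reports.

def build_matrix(reports, terms):
    """reports: list of (set_of_terms, developer) pairs.
    Returns the binary bug-term matrix M and developer vector Y."""
    M = [[1 if t in report_terms else 0 for t in terms]
         for report_terms, _ in reports]
    Y = [dev for _, dev in reports]
    return M, Y

# Toy data (illustrative only, not from the paper's dataset).
reports = [
    ({"crash", "npe"}, "d1"),
    ({"ui", "button"}, "d2"),
    ({"crash", "ui"}, "d1"),
]
terms = ["crash", "npe", "ui", "button"]
M, Y = build_matrix(reports, terms)
```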
• Term Selection
  - Term selection picks a subset of terms to describe each bug report.
  - It reduces computation time and can lead to a significant improvement in classification performance.
  - Common techniques: Information Gain, Latent Semantic Analysis.
• Discriminating Terms
  - A discriminating term is commonly found in the bug reports fixed by a specific developer, but rarely found in other bug reports.
  - The Log Odds Ratio score is used to decide which terms are discriminating.
  - Research goal: improve classification quality by discarding non-discriminating terms before performing the classification task (bug assignment).
• Log Odds Ratio (LOR)
  - The LOR score is calculated with respect to each individual developer (class), and it discriminates the terms within that class.
  - A high score means the term is more discriminating.
  - The LOR score is calculated as in the worked example on the next slide:
    LogOdds(t | d) = P(t | d) * log( P(t | d) / P(t | not d) )
• Log Odds Ratio Calculation Example

                Term1 Term2 Term3 Class
  Bug Report1     1     1     1    D1
  Bug Report2     1     0     0    D1
  Bug Report3     0     0     1    D1
  Bug Report4     0     1     1    D2
  Bug Report5     0     0     0    D2
  Bug Report6     0     0     1    D3
  Bug Report7     1     0     0    D3

  LogOdds(Term1 | D1) = 2/3 * log((2/3)/(1/4)) = 1.78  (Term1 has the highest Log Odds Ratio)
  LogOdds(Term2 | D1) = 1/3 * log((1/3)/(1/4)) = 0.44
  LogOdds(Term3 | D1) = 2/3 * log((2/3)/(2/4)) = 0.88
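The calculation above can be sketched as follows. The log base on the slide is not stated, so base 2 is an assumption here; absolute scores may therefore differ from the slide's numbers, but the ranking of terms (Term1 highest, then Term3, then Term2) comes out the same.

```python
import math

# Log Odds Ratio as in the worked example:
# LogOdds(t | d) = P(t | d) * log(P(t | d) / P(t | not d)).
# Base-2 log is an assumption; the slide does not state the base.

def log_odds(M, Y, term_idx, dev):
    in_class = [row[term_idx] for row, d in zip(M, Y) if d == dev]
    out_class = [row[term_idx] for row, d in zip(M, Y) if d != dev]
    p = sum(in_class) / len(in_class)    # P(t | d)
    q = sum(out_class) / len(out_class)  # P(t | not d)
    if p == 0 or q == 0:
        return 0.0  # degenerate cases scored 0 in this sketch
    return p * math.log2(p / q)

# The 7 bug reports x 3 terms table from the example, classes D1..D3.
M = [[1, 1, 1], [1, 0, 0], [0, 0, 1],
     [0, 1, 1], [0, 0, 0], [0, 0, 1], [1, 0, 0]]
Y = ["D1", "D1", "D1", "D2", "D2", "D3", "D3"]

scores = [log_odds(M, Y, t, "D1") for t in range(3)]
```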
• Proposed Term Selection Techniques: Log-Odds-Ratio-based
  - Terms From All selection (TFA):
  - In this method, the R′ terms with the highest LOR scores are chosen without considering the distribution of terms over developers.
  - All LOR scores for the terms of every class are combined in one common list, the scores are sorted, and finally the R′ highest-scoring terms are extracted from the list.
• Terms From All selection (TFA)
  We have 12 bug reports, 3 developers and 10 different terms. To select 6 terms for the reduced bug-term matrix M′, TFA takes the 6 highest LOR scores regardless of their distribution across developers.

  LOR values (virtual):
         d1    d2    d3
  t1    1.04  1.95  1.33
  t2    1.75  1.64  1.02
  t3    1.07  1.43  1.35
  t4    1.54  1.88  1.62
  t5    1.85  1.16  1.53
  t6    1.19  1.23  1.23
  t7    1.63  1.43  1.67
  t8    1.12  1.92  1.43
  t9    1.12  1.12  1.39
  t10   1.13  1.98  1.11

  [Slide also shows the 12 x 10 bug-term matrix M and the bug-developer vector Y.]
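TFA can be sketched directly from the description above. The LOR values are the "virtual" ones shown on the slide (the column-to-developer assignment is an assumption, since the slide's header is garbled); `tfa_select` is an illustrative name, not the paper's code.

```python
# Terms From All (TFA): pool every (term, developer) LOR score into
# one list, sort descending, and keep the terms behind the R'
# highest scores. Duplicate terms are kept once, so the result can
# hold fewer than R' terms.

def tfa_select(lor, r_prime):
    """lor: dict term -> dict developer -> LOR score."""
    pooled = [(score, term)
              for term, per_dev in lor.items()
              for score in per_dev.values()]
    pooled.sort(reverse=True)
    selected = []
    for _, term in pooled:
        if term not in selected:
            selected.append(term)
        if len(selected) == r_prime:
            break
    return selected

# Virtual LOR values from the slide (columns assumed to be d1, d2, d3).
lor = {
    "t1": {"d1": 1.04, "d2": 1.95, "d3": 1.33},
    "t2": {"d1": 1.75, "d2": 1.64, "d3": 1.02},
    "t3": {"d1": 1.07, "d2": 1.43, "d3": 1.35},
    "t4": {"d1": 1.54, "d2": 1.88, "d3": 1.62},
    "t5": {"d1": 1.85, "d2": 1.16, "d3": 1.53},
    "t6": {"d1": 1.19, "d2": 1.23, "d3": 1.23},
    "t7": {"d1": 1.63, "d2": 1.43, "d3": 1.67},
    "t8": {"d1": 1.12, "d2": 1.92, "d3": 1.43},
    "t9": {"d1": 1.12, "d2": 1.12, "d3": 1.39},
    "t10": {"d1": 1.13, "d2": 1.98, "d3": 1.11},
}
selected = tfa_select(lor, 6)
```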
• Proposed Term Selection Techniques: Log-Odds-Ratio-based
  - Term-Class Related selection (TCR):
  - Idea: select k terms from each class (developer).
  - It refines the selection criterion by targeting the terms with the highest LOR scores in each class.
  - Two ways are suggested to specify k: Equally Likely and Variable.
• Proposed Term Selection Techniques: Log-Odds-Ratio-based
  - TCR - ki Equally Likely:
  - Choose a fixed number of terms (k) for each class.
  - For example, with 10 classes (developers) and 100 terms to select, we take the 10 highest-LOR-scored terms for each developer.
  - We maintain a unique set of terms, i.e., the number of obtained terms R′ can be less than or equal to k × L.
• TCR - k Equally Likely
  We have 12 bug reports, 3 developers and 10 different terms. To select 6 terms for the reduced bug-term matrix M′: 2 terms from d1, 2 terms from d2, 2 terms from d3.

  LOR values (virtual):
         d1    d2    d3
  t1    1.04  1.95  1.33
  t2    0.75  1.64  1.02
  t3    1.07  1.43  1.35
  t4    1.54  1.88  1.62
  t5    1.85  1.16  1.53
  t6    1.19  1.23  1.23
  t7    1.63  0.43  1.67
  t8    1.12  1.92  1.43
  t9    1.12  1.12  1.39
  t10   1.13  1.98  1.11

  [Slide also shows the bug-term matrix M (b1-b12) and the bug-developer vector Y.]
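The equally-likely variant can be sketched as below, using the virtual LOR values from this slide (again assuming the columns are d1, d2, d3). Note that a term can rank in the top k of two developers, which is why the union can end up smaller than k × L.

```python
# TCR with equally-likely k: take the k highest-LOR terms from each
# developer's column and union them; duplicates are kept once, so at
# most k * L terms are selected.

def tcr_equal(lor, k):
    """lor: dict term -> dict developer -> score; returns set of terms."""
    devs = {d for per_dev in lor.values() for d in per_dev}
    selected = set()
    for d in sorted(devs):
        ranked = sorted(lor, key=lambda t: lor[t][d], reverse=True)
        selected.update(ranked[:k])
    return selected

# Virtual LOR values from the slide (columns assumed to be d1, d2, d3).
lor = {
    "t1": {"d1": 1.04, "d2": 1.95, "d3": 1.33},
    "t2": {"d1": 0.75, "d2": 1.64, "d3": 1.02},
    "t3": {"d1": 1.07, "d2": 1.43, "d3": 1.35},
    "t4": {"d1": 1.54, "d2": 1.88, "d3": 1.62},
    "t5": {"d1": 1.85, "d2": 1.16, "d3": 1.53},
    "t6": {"d1": 1.19, "d2": 1.23, "d3": 1.23},
    "t7": {"d1": 1.63, "d2": 0.43, "d3": 1.67},
    "t8": {"d1": 1.12, "d2": 1.92, "d3": 1.43},
    "t9": {"d1": 1.12, "d2": 1.12, "d3": 1.39},
    "t10": {"d1": 1.13, "d2": 1.98, "d3": 1.11},
}
selected = tcr_equal(lor, 2)  # t7 is top-2 for both d1 and d3
```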
• Proposed Term Selection Techniques: Log-Odds-Ratio-based
  - TCR - ki Variable:
  - Choose a variable number of terms ki for each class.
  - ki is specified based on the developer's fixing rate.
  - Fixing rate: proportional to the number of bug reports assigned to the developer out of all available bug reports.
  - Example: selection of the highest-scored terms (R′ = 20) from 100 bug reports and 5 developers. [Table shown on slide.]
• TCR - ki Variable
  We have 12 bug reports, 3 developers and 10 different terms. To select 6 terms for the reduced bug-term matrix M′: 1 term from d3, 2 terms from d2, 3 terms from d1.

  LOR values (virtual):
         d1    d2    d3
  t1    1.04  1.95  1.33
  t2    1.75  1.64  1.02
  t3    1.07  1.43  1.35
  t4    1.54  1.88  1.62
  t5    1.85  1.16  1.53
  t6    1.19  1.23  1.23
  t7    1.13  1.43  1.67
  t8    1.12  1.92  1.43
  t9    1.63  1.12  1.39
  t10   1.13  1.98  1.11

  [Slide also shows the bug-term matrix M (b1-b12) and the bug-developer vector Y.]
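The per-developer quotas ki can be derived from fixing rates as sketched below. With the slide's labels (d1 fixed 6 of 12 reports, d2 and d3 fixed 3 each) and R′ = 6, this yields the 3/2/1 split shown above. The rounding scheme (floor plus largest fractional remainders) is an assumption; these slides do not spell it out.

```python
from math import floor

# TCR with variable k_i: each developer's quota of selected terms is
# proportional to their fixing rate (share of bug reports they fixed).
# The floor-plus-largest-remainder rounding below is a hypothetical
# choice, not confirmed by the slides.

def fixing_rate_quotas(Y, r_prime):
    """Y: developer label per bug report. Returns dict dev -> k_i."""
    counts = {}
    for d in Y:
        counts[d] = counts.get(d, 0) + 1
    n = len(Y)
    quotas = {d: floor(r_prime * c / n) for d, c in counts.items()}
    # Hand out any leftover slots to the developers with the largest
    # fractional parts of their ideal share.
    remainder = r_prime - sum(quotas.values())
    by_frac = sorted(counts, key=lambda d: (r_prime * counts[d] / n) % 1,
                     reverse=True)
    for d in by_frac[:remainder]:
        quotas[d] += 1
    return quotas

# Developer labels for the 12 bug reports on the slide.
Y = ["d1", "d2", "d2", "d1", "d3", "d1",
     "d1", "d3", "d1", "d1", "d2", "d3"]
quotas = fixing_rate_quotas(Y, 6)
```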
• Reduced bug-term matrix M′
  Term selection maps the full N × R bug-term matrix M onto a reduced N × R′ matrix M′ that keeps only the selected terms; the bug-developer vector Y is unchanged.
• Training and Testing Data Preparation
  The labeled bug reports are split into a training data set and a testing data set, and 5-fold cross-validation is applied.
  [Slide shows the bug-term matrix rows divided into the two sets.]
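The 5-fold split can be sketched in plain Python; this is a minimal illustration of the cross-validation scheme (no stratification, round-robin fold assignment), not the authors' setup.

```python
# Minimal k-fold cross-validation split: each fold serves once as
# the test set while the remaining folds form the training set.

def k_fold(indices, k=5):
    folds = [indices[i::k] for i in range(k)]  # round-robin assignment
    for i in range(k):
        test = folds[i]
        train = [x for j, f in enumerate(folds) if j != i for x in f]
        yield train, test

# 10 toy bug-report indices split into 5 folds of 2.
splits = list(k_fold(list(range(10)), k=5))
```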
• Experimental results
  Eclipse Project Bugs Dataset:
  - A variety of open bug repositories are used in open source development; our experiments were applied to the Bugzilla repository of the Eclipse project (https://bugs.eclipse.org).
  - Bugs reported in 2009:
    Total reported: 38,843 bugs
    FIXED: 20,502 bugs
    WONTFIX: 1,182 bugs
    DUPLICATE: 3,120 bugs
    WORKSFORME: 1,362 bugs
    INVALID: 1,465 bugs
    Not Eclipse: 365 bugs
    Other (REASSIGNED, NEW, REOPENED; still without resolution): 10,847 bugs
• Bug Report Status and Resolutions
  [Pie chart of 2009 resolutions: FIXED 53%, Other 28%, DUPLICATE 8%, with WONTFIX, WORKSFORME, INVALID and Not Eclipse sharing the remaining ~11%.]
• Experimental results
  Eclipse Bug Report Components:
  - The Bugzilla repository of the Eclipse project is divided into 907 different components.
  - We use the components with the most fixed bugs:
  - Core component: JDT Core, the Java infrastructure of the Java IDE (http://www.eclipse.org/jdt/core/index.php)
  - UI component: Java Development Tools UI (http://www.eclipse.org/jdt/ui/index.html)
  - SWT component: the Eclipse Standard Widget Toolkit (http://www.eclipse.org/swt/)
• Number of Fixed Bugs Per Component
  [Bar chart: count of fixed bugs per component (UI, Core, SWT); y-axis 0-2500.]
• Experimental results
  Evaluation:
  - Precision is the ratio of correctly classified bug reports to all bug reports assigned to a developer, i.e. true positives / (true positives + false positives).
  - Recall is the ratio of correctly classified bug reports to all bug reports actually fixed by that developer, i.e. true positives / (true positives + false negatives).
  - We used the Bayesian network classifier.
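The two measures can be sketched per developer as below; the actual/predicted labels are toy values for illustration only.

```python
# Per-developer precision and recall from actual vs. predicted labels.

def precision_recall(actual, predicted, dev):
    tp = sum(1 for a, p in zip(actual, predicted) if p == dev and a == dev)
    fp = sum(1 for a, p in zip(actual, predicted) if p == dev and a != dev)
    fn = sum(1 for a, p in zip(actual, predicted) if p != dev and a == dev)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

actual    = ["d1", "d1", "d2", "d2", "d1"]
predicted = ["d1", "d2", "d2", "d1", "d1"]
p, r = precision_recall(actual, predicted, "d1")  # 2 TP, 1 FP, 1 FN
```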
• Experimental results
  Techniques used for comparison:
  - Information Gain, which is calculated for each term with respect to all classes; the terms with the top information gain values are returned.
  - Latent Semantic Analysis, which transforms terms into concepts by extracting relations between terms in the selected bug reports.
• Experimental results
  F-measure results of the five term selection methods using different numbers of terms. These methods were applied on the Core component, and only active developers were considered.
• Experimental results
  Results for the SWT component: TCR - ki Variable had the highest precision (0.59) and the highest recall (0.55).
• Experimental results
  Results for the UI component: TCR - ki Variable achieved the highest precision (0.56) and one of the highest recall values (0.46).
• Conclusion
  - This research investigates the impact of several term selection methods on classification effectiveness.
  - Three Log Odds Ratio (LOR) based selection methods were proposed.
  - The proposed selection methods were compared against the Information Gain (IG) and Latent Semantic Analysis (LSA) techniques.
  - The LOR-based selection method TCR - ki Variable achieved up to a 30% improvement in precision and up to 5% in recall.
  - These results demonstrate the impact of effective term selection techniques on classification performance.
• Future Directions
  - Investigate alternative weighting schemes to better identify discriminating terms and improve classification accuracy.
  - Explore incorporating external domain knowledge and other evidence sources to better address the general bug assignment task.
  - Expand the data sets to multiple domains to further examine the effectiveness of the proposed term selection techniques.
• Thank You. Any Questions?