AAAI 2011

CosTriage: A Cost-Aware Algorithm for Bug Reporting Systems
Jin-woo Park1, Mu-Woong Lee1, Jinhan Kim1, Seung-won Hwang1, Sunghun Kim2
1: POSTECH, Korea, Republic of
2: HKUST, Hong Kong
Information & Database Systems Lab
Bug reporting systems
  Bugs!!
    More than 300 bug reports per day in Mozilla (a large software project)

  Bug Solving
    One of the important issues in the software development process
    Bug reports are posted, discussed, and assigned to developers

  Open source projects
    Apache
    Eclipse
    Linux kernel
    Mozilla
Bug reporting systems
  Bug reports
    Have a bug ID, title, description, status, and other metadata
    Assigned to developers
    Fixed by developers

  Challenges
    Bug triage
    Duplicate bug detection
    …

  [Figure: annotated bug report showing bug ID, title (summary), status, bug fix history (time), other data, and description]
Bug reporting systems
  Bug Triage
    Assigning a new bug report to a suitable developer
    Bottleneck of the bug fixing process
      Labor intensive
      Misassignment can lead to slow bug fixes

  [Figure: Open Source Project -> Bug Reports -> Triager -> (assign) -> Developers. The triager is the bottleneck of the bug fixing process. Can be automated!!!]
Preliminary (recommendation)
  Recommender algorithms
    Content-based recommendation (CBR)
      Predicting a user’s interests based on item features
      Machine learning methods
      Over-specialization problem

    Collaborative filtering recommendation (CF)
      Predicting a user’s interests in items based on the interests of similar users
      User neighborhood
      Sparsity problem

    Hybrid recommendation
      Content-boosted collaborative filtering (CBCF)
      Combining an existing CBR with a CF
      Better performance than either approach alone
Preliminary (recommendation)
  Recommender algorithms
    Content-based recommendation (CBR)
      Predict a user’s interests based on item features
        Title, Status, Description, …
      Learn features using machine learning methods
      Recommend similar items
      Over-specialization problem

      Bug Report 4 (new)
      Feature   word1   word2   word3
      Count       1       1       4

      Rating score table (rows: developers, columns: bug titles)
               Bug 1   Bug 2   Bug 3   Bug 4
      Dev 1     10       9       1       ?
      Dev 2      6       5      10       ?
      Dev 3      8       7       7       ?
Preliminary (recommendation)
  Recommender algorithms
    Content-based recommendation (CBR)
      Predicted ratings for the new Bug Report 4 (word1: 1, word2: 1, word3: 4)

      Rating score table (rows: developers, columns: bug titles)
               Bug 1   Bug 2   Bug 3   Bug 4
      Dev 1     10       9       1       9
      Dev 2      6       5      10       5
      Dev 3      8       7       7       7
Preliminary (recommendation)
  Recommender algorithms
    Collaborative filtering recommendation (CF)
      Predicting a user’s interests in items based on the interests of similar users
      User neighborhood
      Sparsity problem

      Bug reports
                     Bug 1   Bug 2   Bug 3
      Developer 1     10       5      10
      Developer 2      5      10       6
      Developer 3      7       7       7
      Developer 4      9       5       ?
Preliminary (recommendation)
  Recommender algorithms
    Collaborative filtering recommendation (CF)
      Predicted rating of Developer 4 for Bug 3

      Bug reports
                     Bug 1   Bug 2   Bug 3
      Developer 1     10       5      10
      Developer 2      5      10       6
      Developer 3      7       7       7
      Developer 4      9       5       9
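
The user-neighborhood idea can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the talk: the choice of cosine similarity, the neighborhood size `k`, and the function names `cosine` and `predict` are all assumptions. The ratings are the ones shown in the table above.

```python
from math import sqrt

# Ratings from the slide: rows are developers, columns are Bug 1..Bug 3.
# None marks the rating to predict.
ratings = {
    "Developer 1": [10, 5, 10],
    "Developer 2": [5, 10, 6],
    "Developer 3": [7, 7, 7],
    "Developer 4": [9, 5, None],
}


def cosine(u, v):
    """Cosine similarity over the items that both users have rated."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not pairs:
        return 0.0
    dot = sum(a * b for a, b in pairs)
    norm_u = sqrt(sum(a * a for a, _ in pairs))
    norm_v = sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict(ratings, user, item, k=1):
    """Similarity-weighted average of the k most similar users' ratings."""
    neighbors = [
        (cosine(ratings[user], row), row[item])
        for name, row in ratings.items()
        if name != user and row[item] is not None
    ]
    neighbors.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbors[:k]
    total = sum(sim for sim, _ in top)
    return sum(sim * r for sim, r in top) / total if total else None


# Bug 3 has column index 2.
print(predict(ratings, "Developer 4", 2, k=1))
```

With `k=1` the nearest neighbor of Developer 4 is Developer 1, so the sketch returns Developer 1's rating for Bug 3. The value 9 on the slide is illustrative and is not reproduced by this particular similarity choice.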
Preliminary (recommendation)
  Recommender algorithms
    Collaborative filtering recommendation (CF)
      Unsuitable for triaging because:
        No one has solved the new bug (Bug Report 4)!!

                     Bug 1   Bug 2   Bug 3   Bug 4
      Developer 1     10       5      10       ?
      Developer 2      5      10       6       ?
      Developer 3      7       7       7       ?
      Developer 4      9       5       9       ?
Preliminary (recommendation)
  Recommender algorithms
    Collaborative filtering recommendation (CF)
      Unsuitable for triaging because:
        The ratings are extremely sparse!!

                     Bug 1   Bug 2   Bug 3   Bug 4
      Developer 1     10       5       ?       ?
      Developer 2      ?       ?       ?       ?
      Developer 3      ?       ?      10       ?
      Developer 4      ?       ?       ?       3
Preliminary (recommendation)
  Recommender algorithms
    Hybrid recommendation
      Content-boosted collaborative filtering (CBCF)
      Combining an existing CBR with a CF
      Better performance than either approach alone

    Two phases
      - CBR phase
      - CF phase

    Bug Report 5 (new)
      Feature   word1   word2   word3
      Count       3       2       4

               Bug 1   Bug 2   Bug 3   Bug 4   Bug 5
      Dev 1     10       ?       ?       ?       ?
      Dev 2      ?       8       3       ?       ?
      Dev 3      ?       ?       ?       7       ?
Preliminary (recommendation)
  Recommender algorithms
    Hybrid recommendation
      Content-boosted collaborative filtering (CBCF)

    CBR phase (missing ratings, including those for Bug Report 5, are filled in)
               Bug 1   Bug 2   Bug 3   Bug 4   Bug 5
      Dev 1     10      10      10      10      10
      Dev 2      3       8       3       8       3
      Dev 3      7       7       7       7       7
Preliminary (recommendation)
  Recommender algorithms
    Hybrid recommendation
      Content-boosted collaborative filtering (CBCF)
      Combining an existing CBR with a CF
      Improving CBR/CF by combining the complementary strengths of CBR and CF

    CF phase
               Bug 1   Bug 2   Bug 3   Bug 4   Bug 5
      Dev 1     10       9       7       9       8
      Dev 2      5       8       3       8       5
      Dev 3      7       8       5       7       6

  Existing recommendation approaches are not suitable!
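
The two CBCF phases can be sketched as follows. This is a simplified illustration under stated assumptions, not the original CBCF algorithm: the CBR phase here copies the rating of the most content-similar rated bug, and the CF phase is a plain similarity-weighted average. Only Bug 5's word counts (3, 2, 4) and the sparse ratings come from the slides; the feature vectors of Bugs 1 to 4 and the names `cbr_fill` and `cf_predict` are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Word-count features per bug. Only Bug 5 is taken from the slide;
# the vectors for Bugs 1-4 are hypothetical values for illustration.
features = {
    "Bug 1": [4, 0, 1],
    "Bug 2": [0, 3, 1],
    "Bug 3": [1, 0, 5],
    "Bug 4": [0, 4, 2],
    "Bug 5": [3, 2, 4],
}
bugs = list(features)

# Sparse ratings from the slide (None = unknown).
sparse = {
    "Dev 1": [10, None, None, None, None],
    "Dev 2": [None, 8, 3, None, None],
    "Dev 3": [None, None, None, 7, None],
}


def cbr_fill(row):
    """CBR phase: copy the rating of the most content-similar rated bug.

    Assumes the developer has at least one known rating.
    """
    rated = [i for i, r in enumerate(row) if r is not None]
    filled = list(row)
    for i, r in enumerate(row):
        if r is None:
            best = max(
                rated,
                key=lambda j: cosine(features[bugs[i]], features[bugs[j]]),
            )
            filled[i] = row[best]
    return filled


def cf_predict(dense, dev, item):
    """CF phase: similarity-weighted average over the dense pseudo-ratings.

    The developer's own pseudo-rating gets weight 1.0 (an assumption).
    """
    num = den = 0.0
    for other, row in dense.items():
        weight = 1.0 if other == dev else cosine(dense[dev], row)
        num += weight * row[item]
        den += weight
    return num / den


dense = {dev: cbr_fill(row) for dev, row in sparse.items()}
print(dense)
print(cf_predict(dense, "Dev 2", bugs.index("Bug 5")))
```

With these hypothetical features the filled matrix matches the CBR-phase table on the slide. The CF-phase numbers on the slide are not reproduced, since the weighting used there is not given.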
Preliminary (Bug triage)
  PureCBR [Anvik06]
    Constructs a multi-class classifier using an SVM
    Each bug report B is converted into a pair <F(B), D> for training
      Bug report history -> input data
      Developers -> classes
    F(B) is the feature vector holding the count of each keyword w in the description of B
    D is the developer who fixed B (the class)

  [Figure: the bug fix history is used for training; for a new bug report, the classifier’s scores assign the new bug to Dev 1]
Preliminary (Bug triage)
  PureCBR [Anvik06]
    Good performance
    Problem
      This approach only considers accuracy
        Many bugs -> one super-developer
        Over-specialization problem
      Are developers happy?
        We consider the developer’s cost (e.g., interests, bug fix time, and expertise)
        We assume that a faster bug fix time means a lower developer cost

  [Figure: Bug Report 1 (50 days) and Bug Report 2 (2 days)]
Goal
  Goal
    Find an efficient bug-developer matching
      Optimizing not only accuracy but also cost
    Use a modified CBCF approach
      Constructing developer profiles for cost

  Challenge
    Enhancing the CBCF approach for sparse data
    Extreme sparseness of the past bug fix history data
      Each bug is fixed by a single developer
      Need to reduce sparseness to enhance the quality of CBCF

  [Figure: bug fix time from bug fix history]
Overview
  [Figure: a new bug report feeds two paths. Cost: developer profiles -> cost score. Accuracy: bug classifier <SVM> -> accuracy score. An aggregation step combines both scores into the recommended developer.]

  Merging the classifier’s scores and the developer’s cost scores
    The accuracy scores are obtained using PureCBR [Anvik06]
    The developer cost scores are obtained from the “de-sparsified” bug fix history
    The two scores are then merged for prediction

  [Figure: accuracy scores + cost scores = hybrid scores; the new bug is assigned to Dev 1]
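
The aggregation step can be sketched as a weighted blend. The slides do not state the merging formula, so everything here is an assumption: the weight `alpha`, the normalization by the best accuracy score and the lowest cost, the function names, and the example numbers. Cost is treated as an estimated fix time, so lower is better and it enters inversely.

```python
def hybrid_scores(accuracy, cost, alpha=0.5):
    """Blend normalized accuracy scores with normalized inverse costs.

    accuracy: developer -> classifier score (higher is better, max > 0)
    cost:     developer -> estimated fix time (lower is better, all > 0)
    alpha:    hypothetical weight; 1.0 uses accuracy only, 0.0 cost only
    """
    max_acc = max(accuracy.values())
    min_cost = min(cost.values())
    return {
        dev: alpha * accuracy[dev] / max_acc
        + (1 - alpha) * min_cost / cost[dev]
        for dev in accuracy
    }


def recommend(accuracy, cost, alpha=0.5):
    """Return the developer with the highest hybrid score."""
    scores = hybrid_scores(accuracy, cost, alpha)
    return max(scores, key=scores.get)


# Hypothetical scores for three developers.
accuracy = {"Dev 1": 0.6, "Dev 2": 0.5, "Dev 3": 0.1}
cost = {"Dev 1": 50.0, "Dev 2": 2.0, "Dev 3": 10.0}

print(recommend(accuracy, cost, alpha=1.0))  # accuracy only
print(recommend(accuracy, cost, alpha=0.5))  # accuracy and cost
```

In this made-up example, accuracy alone picks Dev 1, while the blended score prefers Dev 2, who is almost as likely to be the right assignee and much faster.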
CosTriage (Cost Estimation)
  CosTriage: A Cost-aware Triage Algorithm for bug reporting systems
  How to estimate the developer cost?
  How to reduce the sparseness problem?
    Using topic modeling

  [Figure: bug fix time from bug fix history; categorizing bugs to reduce the sparseness]
CosTriage
  Categorizing bugs
    Topic modeling
      Latent Dirichlet Allocation (LDA) [BleiNg03]
      Each topic represents a bug type
      The topic distribution of a report determines its bug type
    We adopt the divergence measure proposed in [Arun, R. PAKDD ’10]
      Finding the natural number of topics (# bug types)

  [Figure: t is the natural number of bug types]
CosTriage
  Developer profiles modeling
    Developer profiles
      T-dimensional feature vector
      The element of a developer profile, P_u[i], denotes the developer cost for ith-type bugs

        P_u = <P_u[1], P_u[2], …, P_u[T]>

      T denotes the number of bug types

    Developer cost
      The average time to fix ith-type bugs
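
A developer profile as described above can be computed directly from the bug fix history. The sketch below is illustrative: the record layout `(developer, bug_type, fix_days)`, the 0-based type indices, the use of `None` for unseen types, and the name `build_profiles` are assumptions, and the history is made up.

```python
from collections import defaultdict


def build_profiles(history, num_types):
    """Average fix time per bug type for each developer.

    history: iterable of (developer, bug_type, fix_days) records,
             with bug_type in range(num_types).
    Returns developer -> list of length num_types; an entry is None when
    the developer has never fixed a bug of that type.
    """
    sums = defaultdict(lambda: [0.0] * num_types)
    counts = defaultdict(lambda: [0] * num_types)
    for dev, bug_type, days in history:
        sums[dev][bug_type] += days
        counts[dev][bug_type] += 1
    return {
        dev: [s / c if c else None for s, c in zip(sums[dev], counts[dev])]
        for dev in sums
    }


# Hypothetical bug fix history.
history = [
    ("Dev 1", 0, 4.0),
    ("Dev 1", 0, 6.0),
    ("Dev 1", 2, 10.0),
    ("Dev 2", 1, 3.0),
]
profiles = build_profiles(history, num_types=3)
print(profiles)
```

The `None` entries are the missing values that still have to be predicted before the profile can be used for every bug type.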
CosTriage
  Predicting missing values in profiles
    Using CF for developer profiles
      Similarity measure: [equation not preserved in this extraction]
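
Filling missing profile entries with CF can be sketched as below. The similarity measure from the slide is not available here, so this sketch substitutes a plain cosine similarity over the bug types that both developers have fixed; that substitution, the neighborhood size `k`, the function names, and the example profiles are all assumptions.

```python
from math import sqrt


def similarity(p, q):
    """Cosine similarity restricted to bug types observed in both profiles."""
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    if not common:
        return 0.0
    dot = sum(a * b for a, b in common)
    norm_p = sqrt(sum(a * a for a, _ in common))
    norm_q = sqrt(sum(b * b for _, b in common))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def fill_profile(profiles, dev, k=1):
    """Fill the missing costs of `dev` from the k most similar developers."""
    target = profiles[dev]
    filled = list(target)
    for i, value in enumerate(target):
        if value is not None:
            continue
        neighbors = sorted(
            (
                (similarity(target, p), p[i])
                for other, p in profiles.items()
                if other != dev and p[i] is not None
            ),
            key=lambda pair: pair[0],
            reverse=True,
        )[:k]
        weight = sum(sim for sim, _ in neighbors)
        if weight > 0:
            filled[i] = sum(sim * c for sim, c in neighbors) / weight
    return filled


# Hypothetical profiles (average fix time in days per bug type).
profiles = {
    "Dev 1": [5.0, None, 10.0],
    "Dev 2": [5.0, 3.0, 10.0],
    "Dev 3": [20.0, 8.0, 1.0],
}
print(fill_profile(profiles, "Dev 1", k=1))
```

Here Dev 2 has the same costs as Dev 1 on the shared bug types, so Dev 1's missing entry is taken from Dev 2.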
CosTriage
  Obtaining the developer’s cost for a new bug report

  [Figure: the new bug report is categorized (bug type = 1), and the developer profiles give the developer cost for the new bug]
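
The lookup for a new bug report can be sketched as follows, assuming the bug type is the most probable topic of the report and that profiles are already filled. The function names, the 0-based type index, and the numbers are hypothetical.

```python
def dominant_type(topic_distribution):
    """Bug type = index of the most probable topic (0-based)."""
    return max(
        range(len(topic_distribution)),
        key=topic_distribution.__getitem__,
    )


def costs_for_new_bug(profiles, topic_distribution):
    """Look up each developer's profile entry for the new bug's type."""
    bug_type = dominant_type(topic_distribution)
    return {dev: profile[bug_type] for dev, profile in profiles.items()}


# Hypothetical filled profiles and topic distribution of a new report.
profiles = {"Dev 1": [5.0, 3.0, 10.0], "Dev 2": [7.0, 1.5, 4.0]}
topics = [0.1, 0.7, 0.2]

print(dominant_type(topics))
print(costs_for_new_bug(profiles, topics))
```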
Merging
  [Figure: bug reports feed two paths. Cost: developer profiles <Bug types> -> cost score. Accuracy: bug classifier <SVM> -> accuracy score. An aggregation step combines both scores into the recommended developer.]

  Merging the classifier’s scores and the developer’s cost scores
    Accuracy scores [Anvik06] + cost scores (CosTriage) = hybrid scores
Experiments
  Subject systems
    97,910 valid bug reports
    255 active developers
    From four open source projects

  Approaches
    PureCBR: state-of-the-art CBR-based approach
    CBCF: original CBCF
    CosTriage: our approach
Experiments
  Two research questions
    Q1. How much can our approach improve cost (bug fix time) without sacrificing bug assignment accuracy?
    Q2. What are the trade-offs between accuracy and cost (bug fix time)?

  Evaluation measures
    [Equations not preserved in this extraction]
    W is the set of bug reports predicted correctly.
    N is the number of bug reports in the test set.
    Because the real fix time is unknown, we only use the fix time for correctly matched bugs.
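
One plausible reading of the two measures, given the definitions of W and N, is accuracy as |W| / N and cost as the average fix time over W. The sketch below follows that reading; it is an assumption, since the formulas themselves are not in the text, and the function name and example data are made up.

```python
def evaluate(predicted, actual, fix_time):
    """Accuracy and average fix time over correctly predicted bugs.

    predicted, actual: bug id -> developer
    fix_time:          bug id -> real fix time in days
    """
    correct = [bug for bug in predicted if predicted[bug] == actual[bug]]  # W
    total = len(predicted)                                                 # N
    accuracy = len(correct) / total if total else 0.0
    avg_time = (
        sum(fix_time[bug] for bug in correct) / len(correct)
        if correct
        else None
    )
    return accuracy, avg_time


# Hypothetical test set of four bug reports.
predicted = {1: "a", 2: "b", 3: "a", 4: "c"}
actual = {1: "a", 2: "a", 3: "a", 4: "c"}
fix_time = {1: 2.0, 2: 50.0, 3: 4.0, 4: 6.0}

print(evaluate(predicted, actual, fix_time))
```

Bug 2 is predicted incorrectly, so its 50-day fix time does not enter the average.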
Experiments
  Relative errors of expected bug fix time

  Improvement of bug fix time (Q1)
    CosTriage improves the cost by up to 30% without seriously compromising accuracy
Experiments
                                      Trade-off between accuracy and bug fix time (Q2)
Conclusion
  We proposed a new bug triaging technique
    Optimizes not only accuracy but also cost
    Solves the data sparseness problem by using topic modeling

  Experiments using four real bug report corpora
    Improves the cost without heavy losses of accuracy
Q&A
                                                Thank you!


                                           Do you have any questions?
Contact
                                                     Jin-woo Park
                                               jwpark85@postech.ac.kr
Back up
  We adopt the divergence measure proposed in [Arun10]
    Finding the natural number of topics (# bug types)
    t is the natural number of bug types

  [Figure: panels for Mozilla and Apache]
Back up - Bug features
  Bug features
    Keywords of title and description
    Other metadata

  New bug report
    Title: Traditional Memory Rendering refactoring request
    Description: Request additional refactoring so we can … Traditional Rendering.

  Remove stopwords
    Traditional Memory Rendering refactoring request Request refactoring … Traditional Rendering
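
The keyword extraction step can be sketched as tokenizing, dropping stopwords, and counting what remains. The stopword list below is a tiny hypothetical one, the function names are made up, and only the visible part of the description from the slide is used (the elided part is left out).

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real system would use a larger one.
STOPWORDS = {"a", "an", "the", "so", "we", "can", "of", "to", "and", "in"}


def keywords(text):
    """Split text into alphabetic tokens and drop stopwords."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return [t for t in tokens if t.lower() not in STOPWORDS]


def feature_vector(title, description):
    """Case-insensitive keyword counts over title and description."""
    return Counter(w.lower() for w in keywords(title + " " + description))


title = "Traditional Memory Rendering refactoring request"
description = "Request additional refactoring so we can"

print(keywords(description))
print(feature_vector(title, description))
```

The resulting counts play the role of the feature vector F(B) that is fed to the classifier.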

CosTriage: A Cost-Aware Algorithm for Bug Reporting Systems (AAAI 2011)

  • 1.
    AAAI 2011 CosTriage: A Cost-Aware Algorithm for Information & Database Systems Lab Bug Reporting Systems Jin-woo Park1, Mu-Woong Lee1, Jinhan Kim1, Seung-won Hwang1, Sunghun Kim2 POSTECH, Korea, Republic of1 HKUST, Hong Kong2
  • 2.
    Bug reporting systems  Bugs!!  More than 300 bug reports per day in Mozilla (a big software project) Information & Database Systems Lab  Bug Solving  One of the important issues in a software development process  Bug reports are posted, discussed, and assigned to developers  Open sources projects  Apache  Eclipse  Linux kernel  Mozilla
  • 3.
    Bug reporting systems  Bug reports  Has Bug ID, title, description, status, and other meta data  Assigned to developers  Fixed by developers Information & Database Systems Lab  Challenges  Bug triage  Duplicate bug detection  … bug ID title (summary) status bug fix history (time) other data description
  • 4.
    Bug reporting systems  Bug Triage  Assigning a new bug report to a suitable developer  Bottleneck of bug fixing process  Labor intensive Information & Database Systems Lab  Miss-assignment can lead to slow bug fix Bottleneck of the bug fixing process assign Open Source Bug Project Reports Triager Developers Can be automated!!!
  • 5.
    Preliminary (recommendation)  Recommender algorithms  Content-based recommendation (CBR)  Predicting user’s interests based on item features  Machine learning methods Information & Database Systems Lab  Over-specialization problem  Collaborative filtering recommendation (CF)  Predicting user’s interests based on affinity’s interests for items  User neighborhood  Sparsity problem  Hybrid recommendation  Content-boosted collaborative filtering (CBCF)  Combining an existing CBR with a CF  Better performance than either approach alone
  • 6.
    Preliminary (recommendation)
    Recommender algorithms
      - Content-based recommendation (CBR)
          - Predict a user's interests based on item features (title, status, description, …)
          - Learn features using machine learning methods
          - Recommend similar items
          - Over-specialization problem

    Bug Report 4 title features:
      Feature   word1   word2   word3
      Count     1       1       4

    Rating score table (developers × bugs):
               Bug 1   Bug 2   Bug 3   Bug 4
      Dev 1    10      9       1       ?
      Dev 2    6       5       10      ?
      Dev 3    8       7       7       ?
  • 7.
    Preliminary (recommendation)
    Recommender algorithms
      - Content-based recommendation (CBR)
          - Predict a user's interests based on item features (title, status, description, …)
          - Learn features using machine learning methods
          - Recommend similar items
          - Over-specialization problem

    Bug Report 4 title features:
      Feature   word1   word2   word3
      Count     1       1       4

    Rating score table (developers × bugs), with the Bug 4 column now predicted:
               Bug 1   Bug 2   Bug 3   Bug 4
      Dev 1    10      9       1       9
      Dev 2    6       5       10      5
      Dev 3    8       7       7       7
  • 8.
    Preliminary (recommendation)
    Recommender algorithms
      - Collaborative filtering recommendation (CF)
          - Predicting a user's interests based on the interests of similar users (affinity) in items
          - User neighborhood
          - Sparsity problem

    Bug reports:
                     Bug 1   Bug 2   Bug 3
      Developer 1    10      5       10
      Developer 2    5       10      6
      Developer 3    7       7       7
      Developer 4    9       5       ?
  • 10.
    Preliminary (recommendation)
    Recommender algorithms
      - Collaborative filtering recommendation (CF)
          - Predicting a user's interests based on the interests of similar users (affinity) in items
          - User neighborhood
          - Sparsity problem

    Bug reports, with Developer 4's rating for Bug 3 now predicted:
                     Bug 1   Bug 2   Bug 3
      Developer 1    10      5       10
      Developer 2    5       10      6
      Developer 3    7       7       7
      Developer 4    9       5       9
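
The neighborhood-based prediction on this slide can be sketched as follows. This is a minimal illustration, assuming Pearson correlation as the similarity and a mean-centered weighted average as the predictor; the slides do not say which CF variant produced the value shown, so the number this sketch computes need not match the slide. The names `pearson`, `mean`, and `predict` are hypothetical.

```python
from math import sqrt

# Ratings from the slide; None marks the value to predict.
ratings = {
    "Developer 1": [10, 5, 10],
    "Developer 2": [5, 10, 6],
    "Developer 3": [7, 7, 7],
    "Developer 4": [9, 5, None],
}

def pearson(u, v):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    common = [i for i in range(len(u)) if u[i] is not None and v[i] is not None]
    if len(common) < 2:
        return 0.0
    mu_u = sum(u[i] for i in common) / len(common)
    mu_v = sum(v[i] for i in common) / len(common)
    num = sum((u[i] - mu_u) * (v[i] - mu_v) for i in common)
    den = sqrt(sum((u[i] - mu_u) ** 2 for i in common)
               * sum((v[i] - mu_v) ** 2 for i in common))
    return num / den if den else 0.0

def mean(r):
    """Mean of the known ratings of one user."""
    known = [x for x in r if x is not None]
    return sum(known) / len(known)

def predict(ratings, user, item):
    """User mean plus the similarity-weighted, mean-centered ratings of the neighbors."""
    target = ratings[user]
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or r[item] is None:
            continue
        s = pearson(target, r)
        num += s * (r[item] - mean(r))
        den += abs(s)
    return mean(target) if den == 0 else mean(target) + num / den

prediction = predict(ratings, "Developer 4", 2)  # Bug 3 has index 2
```

Developer 4 agrees with Developer 1 and disagrees with Developer 2 on the first two bugs, so the prediction is pulled toward Developer 1's high rating for Bug 3.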
  • 11.
    Preliminary (recommendation)
    Recommender algorithms
      - Collaborative filtering recommendation (CF)
          - Predicting a user's interests based on the interests of similar users (affinity) in items
          - User neighborhood
          - Sparsity problem

    Unsuitable for triaging because:
      - No one has solved the new bug!!

    Bug reports (Bug 4 is the new Bug Report 4):
                     Bug 1   Bug 2   Bug 3   Bug 4
      Developer 1    10      5       10      ?
      Developer 2    5       10      6       ?
      Developer 3    7       7       7       ?
      Developer 4    9       5       9       ?
  • 12.
    Preliminary (recommendation)
    Recommender algorithms
      - Collaborative filtering recommendation (CF)
          - Predicting a user's interests based on the interests of similar users (affinity) in items
          - User neighborhood
          - Sparsity problem

    Unsuitable for triaging because:
      - Ratings are extremely sparse!!

    Bug reports (Bug 4 is Bug Report 4):
                     Bug 1   Bug 2   Bug 3   Bug 4
      Developer 1    10      5       ?       ?
      Developer 2    ?       ?       ?       ?
      Developer 3    ?       ?       10      ?
      Developer 4    ?       ?       ?       3
  • 13.
    Preliminary (recommendation)
    Recommender algorithms
      - Hybrid recommendation
          - Content-boosted collaborative filtering (CBCF)
          - Combining an existing CBR with a CF
          - Better performance than either approach alone

    Two phases:
      - CBR phase
      - CF phase

    Bug Report 5 features:
      Feature   word1   word2   word3
      Count     3       2       4

    Rating score table:
               Bug 1   Bug 2   Bug 3   Bug 4   Bug 5
      Dev 1    10      ?       ?       ?       ?
      Dev 2    ?       8       3       ?       ?
      Dev 3    ?       ?       ?       7       ?
  • 14.
    Preliminary (recommendation)
    Recommender algorithms
      - Hybrid recommendation
          - Content-boosted collaborative filtering (CBCF)
          - Combining an existing CBR with a CF
          - Better performance than either approach alone

    CBR phase (Bug 5 is Bug Report 5):
               Bug 1   Bug 2   Bug 3   Bug 4   Bug 5
      Dev 1    10      10      10      10      10
      Dev 2    3       8       3       8       3
      Dev 3    7       7       7       7       7
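
The CBR phase can be read as filling every missing cell with a content-based prediction while keeping the observed ratings. The sketch below illustrates only that fill step. The function name `fill_with_cbr` is hypothetical, and the `cbr_predicted` values are an assumption chosen so that the result reproduces the table on this slide; in practice they would come from a trained content-based predictor.

```python
def fill_with_cbr(observed, cbr_predicted):
    """Keep observed ratings; use the content-based prediction where a rating is missing."""
    return {
        dev: [o if o is not None else p
              for o, p in zip(observed[dev], cbr_predicted[dev])]
        for dev in observed
    }

# Sparse ratings before the CBR phase (None = unknown).
observed = {
    "Dev 1": [10, None, None, None, None],
    "Dev 2": [None, 8, 3, None, None],
    "Dev 3": [None, None, None, 7, None],
}

# Assumed content-based predictions for every (developer, bug) cell.
cbr_predicted = {
    "Dev 1": [10, 10, 10, 10, 10],
    "Dev 2": [3, 8, 3, 8, 3],
    "Dev 3": [7, 7, 7, 7, 7],
}

pseudo_ratings = fill_with_cbr(observed, cbr_predicted)
```

The resulting dense matrix is what the CF phase then operates on.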
  • 15.
    Preliminary (recommendation)
    Recommender algorithms
      - Hybrid recommendation
          - Content-boosted collaborative filtering (CBCF)
          - Combining an existing CBR with a CF
          - Improving CBR/CF by combining the complementary strengths of CBR and CF

    CF phase (Bug 5 is Bug Report 5):
               Bug 1   Bug 2   Bug 3   Bug 4   Bug 5
      Dev 1    10      9       7       9       8
      Dev 2    5       8       3       8       5
      Dev 3    7       8       5       7       6

    Existing recommendation approaches are not suitable!
  • 16.
    Preliminary (Bug triage)
    PureCBR [Anvik06]
      - Constructs a multi-class classifier using an SVM
      - Each bug report B is converted into a pair <F(B), D> for training
          - Bug report history → input data
          - Developers → classes
      - F(B) is the feature vector holding the count of each keyword w in the description of B
      - D is the developer who fixed B (the class)
    [Figure: the bug fix history trains the classifier; for a new bug report, the classifier's scores assign the new bug to Dev 1]
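
The conversion of bug reports into training pairs <F(B), D> can be sketched as below. This is an illustration only: the tokenizer, the function name `features`, and the three-report `history` are assumptions, and the SVM training step itself is left out.

```python
import re
from collections import Counter

def features(description):
    """F(B): keyword counts of a bug description (lowercased alphanumeric tokens)."""
    return Counter(re.findall(r"[a-z0-9]+", description.lower()))

# Hypothetical bug fix history: (description, developer who fixed the bug).
history = [
    ("Crash when rendering memory view", "Dev 1"),
    ("Rendering glitch in memory view toolbar", "Dev 1"),
    ("Network timeout on login", "Dev 2"),
]

# Training pairs <F(B), D>: the feature vector and the class label.
training_pairs = [(features(text), dev) for text, dev in history]
```

Each pair would then be fed to a multi-class classifier, with the developer as the class to predict.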
  • 17.
    Preliminary (Bug triage)
    PureCBR [Anvik06]
      - Good performance
    Problem
      - This approach considers only accuracy
          - Many bugs → one super-developer
          - Over-specialization problem
      - Are developers happy?
          - We consider the developer's cost (e.g., interests, bug fix time, and expertise)
          - We assume that a longer bug fix time means a higher developer cost
    [Figure: Bug Report 1 takes 50 days to fix; Bug Report 2 takes 2 days]
  • 18.
    Goal
    Goal
      - Find an efficient bug-developer matching
          - Optimize not only accuracy but also cost
      - Use a modified CBCF approach
          - Construct developer profiles for cost
    Challenge
      - Enhancing the CBCF approach for sparse data
          - Extreme sparseness of the past bug fix history data: each bug is fixed by one developer
          - Need to reduce sparseness to improve the quality of CBCF
    [Figure: bug fix time from bug fix history]
  • 19.
    Overview
    [Figure: a new bug report feeds two paths: the bug classifier (SVM) yields an accuracy score, and the developer profiles yield a cost score; aggregating the two gives the recommended developer]
      - Merging the classifier's scores and the developers' cost scores
          - The accuracy scores are obtained using PureCBR [Anvik06]
          - The developer cost scores are obtained from the "de-sparsified" bug fix history
          - The two scores are then merged for prediction
    [Figure: accuracy scores + cost scores = hybrid scores; assign a new bug to Dev 1]
  • 20.
    CosTriage (Cost Estimation)
    CosTriage: a cost-aware triage algorithm for bug reporting systems
      - Challenge: how to estimate the developer cost?
      - How to reduce the sparseness problem?
          - Using topic modeling
    [Figure: bug fix time from bug fix history; categorizing bugs to reduce the sparseness]
  • 21.
    CosTriage
    Categorizing bugs
      - Topic modeling
          - Latent Dirichlet Allocation (LDA) [BleiNg03]
          - Each topic represents a bug type
          - The topic distribution of a report determines its bug type
      - We adopt the divergence measure proposed in [Arun, R. PAKDD '10]
          - Finding the natural number of topics (# bug types)
    [Figure: t is the natural number of bug types]
  • 22.
    CosTriage
    Developer profiles modeling
      - Developer profiles
          - T-dimensional feature vector, where T denotes the number of bug types
          - The element P_u[i] of a developer profile denotes the developer cost for i-th type bugs
      - Developer cost
          - The average time to fix i-th type bugs
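
A developer profile as described here can be sketched as a per-type average of fix times. The record layout `(developer, bug_type, fix_days)`, the zero-based type index, and the function name `build_profiles` are assumptions for illustration; types a developer has never fixed stay `None`, which is the sparseness the later slides address.

```python
from collections import defaultdict

def build_profiles(history, num_types):
    """Return {developer: [average fix time per bug type, or None if never fixed]}."""
    sums = defaultdict(lambda: [0.0] * num_types)
    counts = defaultdict(lambda: [0] * num_types)
    for dev, bug_type, days in history:
        sums[dev][bug_type] += days
        counts[dev][bug_type] += 1
    return {
        dev: [sums[dev][i] / counts[dev][i] if counts[dev][i] else None
              for i in range(num_types)]
        for dev in sums
    }

# Hypothetical bug fix history: (developer, bug type index, days to fix).
history = [
    ("Dev 1", 0, 50),
    ("Dev 1", 0, 2),
    ("Dev 2", 1, 4),
]
profiles = build_profiles(history, num_types=2)
```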
  • 23.
    CosTriage
    Predicting missing values in profiles
      - Using CF for developer profiles
    [Equation: similarity measure between developer profiles; k=1]
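
The similarity formula did not survive in this transcript, so the sketch below substitutes plain cosine similarity over the bug types both developers have fixed. That choice is an assumption and not necessarily the authors' measure. It fills each missing profile entry from the k most similar developers who have a value for that type, with `k=1` as the default; the names `cosine` and `fill_profile` are hypothetical.

```python
from math import sqrt

def cosine(p, q):
    """Cosine similarity over the entries known in both profiles (0.0 if none)."""
    common = [i for i in range(len(p)) if p[i] is not None and q[i] is not None]
    if not common:
        return 0.0
    dot = sum(p[i] * q[i] for i in common)
    norm = sqrt(sum(p[i] ** 2 for i in common)) * sqrt(sum(q[i] ** 2 for i in common))
    return dot / norm if norm else 0.0

def fill_profile(profiles, dev, k=1):
    """Fill missing entries of one profile from its k most similar neighbors."""
    target = profiles[dev]
    filled = list(target)
    for i, value in enumerate(target):
        if value is not None:
            continue
        neighbors = sorted(
            ((cosine(target, p), p[i])
             for d, p in profiles.items() if d != dev and p[i] is not None),
            key=lambda pair: pair[0],
            reverse=True,
        )[:k]
        weight = sum(s for s, _ in neighbors)
        if weight > 0:
            # Similarity-weighted average of the neighbors' costs for type i.
            filled[i] = sum(s * v for s, v in neighbors) / weight
    return filled

# Hypothetical profiles (average fix days per bug type, None = unknown).
profiles = {
    "A": [10.0, 20.0, None],
    "B": [10.0, 20.0, 7.0],
    "C": [20.0, 10.0, 3.0],
}
filled_a = fill_profile(profiles, "A")
```

With `k=1`, developer A inherits the missing cost from B, whose known entries match A's most closely.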
  • 24.
    CosTriage
    Obtaining the developer's cost for a new bug report
    [Figure: the new bug report is categorized as bug type 1; the developer profile entry for that type gives the developer cost for the new bug]
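
The lookup on this slide can be sketched as two steps: determine the bug type of the new report, then read each developer's profile entry for that type. Taking the dominant topic as the bug type is an assumption, as are the function names and the example values; type indices are zero-based here.

```python
def bug_type(topic_distribution):
    """Index of the dominant topic, used here as the bug type (an assumption)."""
    return max(range(len(topic_distribution)), key=lambda i: topic_distribution[i])

def developer_costs(profiles, topic_distribution):
    """Cost of each developer for a new bug: the profile entry of its bug type."""
    t = bug_type(topic_distribution)
    return {dev: profile[t] for dev, profile in profiles.items()}

# Hypothetical filled profiles (average fix days per bug type).
profiles = {
    "Dev 1": [26.0, 12.0, 3.0],
    "Dev 2": [5.0, 4.0, 9.0],
}
# Hypothetical topic distribution of a new bug report.
costs = developer_costs(profiles, [0.1, 0.7, 0.2])
```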
  • 25.
    Merging
    [Figure: bug reports feed two paths: the bug classifier (SVM) yields an accuracy score, and the developer profiles (bug types) yield a cost score; aggregating the two gives the recommended developer]
      - Merging the classifier's scores and the developers' cost scores
    [Figure: accuracy scores [Anvik06] + cost scores (CosTriage) = hybrid scores]
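
The slides do not give the merging formula, so the following is only one plausible way to combine the two score lists: a weighted sum of the normalized accuracy score and the normalized inverse cost. The weight `alpha`, the normalization, and the function name `hybrid_scores` are all assumptions. The sketch requires positive costs and at least one positive accuracy score.

```python
def hybrid_scores(accuracy, cost, alpha=0.5):
    """Weighted sum of normalized accuracy and normalized inverse cost per developer."""
    max_acc = max(accuracy.values())
    min_cost = min(cost.values())
    return {
        dev: alpha * accuracy[dev] / max_acc + (1 - alpha) * min_cost / cost[dev]
        for dev in accuracy
    }

# Hypothetical classifier scores and developer costs (fix days) for one new bug.
accuracy = {"Dev 1": 0.8, "Dev 2": 0.6}
cost = {"Dev 1": 50.0, "Dev 2": 2.0}

scores = hybrid_scores(accuracy, cost, alpha=0.5)
recommended = max(scores, key=scores.get)
```

Setting `alpha=1.0` reduces the sketch to accuracy-only ranking, while smaller values shift the recommendation toward developers with lower cost.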
  • 26.
    Experiments
    Subject systems
      - 97,910 valid bug reports
      - 255 active developers
      - From four open source projects
    Approaches
      - PureCBR: state-of-the-art CBR-based approach
      - CBCF: original CBCF
      - CosTriage: our approach
  • 27.
    Experiments
    Two research questions
      - Q1. How much can our approach improve cost (bug fix time) without sacrificing bug assignment accuracy?
      - Q2. What are the trade-offs between accuracy and cost (bug fix time)?
    Evaluation measures
      - W is the set of bug reports predicted correctly
      - N is the number of bug reports in the test set
      - Because the real fix time of a mismatched bug is unknown, we use the fix time only for correctly matched bugs
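
The formulas for the evaluation measures are missing from this transcript. Reading them from the definitions of W and N, one natural interpretation is accuracy = |W| / N and an average fix time taken over W only. The sketch below follows that reading, which is an assumption; the function name and the data are hypothetical.

```python
def evaluate(predictions, truth, fix_time):
    """Return (accuracy, average fix time over correctly predicted bugs)."""
    correct = [bug for bug in predictions if predictions[bug] == truth[bug]]  # W
    n = len(predictions)                                                      # N
    accuracy = len(correct) / n
    avg_fix_time = (sum(fix_time[bug] for bug in correct) / len(correct)
                    if correct else None)
    return accuracy, avg_fix_time

# Hypothetical test set: predicted developer, actual fixer, and fix time in days.
predictions = {"b1": "A", "b2": "B", "b3": "A", "b4": "A"}
truth = {"b1": "A", "b2": "B", "b3": "B", "b4": "A"}
fix_time = {"b1": 4, "b2": 10, "b3": 100, "b4": 1}

accuracy, avg_fix_time = evaluate(predictions, truth, fix_time)
```

Bug `b3` is mismatched, so its fix time is excluded from the average.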
  • 28.
    Experiments
    Relative errors of expected bug fix time
    Improvement of bug fix time (Q1)
      - CosTriage reduces cost by up to 30% without seriously compromising accuracy
  • 29.
    Experiments
    Trade-off between accuracy and bug fix time (Q2)
  • 30.
    Conclusion
    We proposed a new bug triaging technique
      - Optimizes not only accuracy but also cost
      - Solves the data sparseness problem by using topic modeling
    Experiments using four real bug report corpora
      - Improves the cost without heavy losses of accuracy
  • 31.
    Q&A
    Thank you! Do you have any questions?
  • 32.
    Contact
    Jin-woo Park, jwpark85@postech.ac.kr
  • 33.
    Back up
      - We adopt the divergence measure proposed in [Arun10]
          - Finding the natural number of topics (# bug types)
    [Figure: panels for Mozilla and Apache; t is the natural number of bug types]
  • 34.
    Back up - Bug features
    Bug features
      - Keywords of title and description
      - Other metadata
    New bug report
      Title: Traditional Memory Rendering refactoring request
      Description: Request additional refactoring so we can … Traditional Rendering.
    After removing stopwords
      Traditional Memory Rendering refactoring request
      Request refactoring … Traditional Rendering
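
The stopword removal step can be sketched as a simple filter. The stopword list below is hypothetical, chosen only so that the words dropped in the slide's example (additional, so, we, can) are removed; the actual list used by the authors is not given.

```python
# Hypothetical stopword list; the list used in the slides is not stated.
STOPWORDS = {"additional", "so", "we", "can", "a", "an", "the", "to", "of"}

def remove_stopwords(text):
    """Split on whitespace and drop tokens whose lowercase form is a stopword."""
    return [token for token in text.split() if token.lower() not in STOPWORDS]

title_keywords = remove_stopwords("Traditional Memory Rendering refactoring request")
description_keywords = remove_stopwords("Request additional refactoring so we can")
```

The surviving keywords are what the feature vector F(B) counts.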