Ticket Tagger
machine learning driven ticket classification
github.com/rafaelkallis/ticket-tagger
Rafael Kallis (1), Andrea Di Sorbo (2), Gerardo Canfora (2), Sebastiano Panichella (3)
(1) University of Zurich, Switzerland
(2) University of Sannio, Italy
(3) Zurich University of Applied Sciences, Switzerland
Introduction
• Issue trackers are essential tools for creating, managing and
addressing issues that occur in software systems.
• A critical aspect for handling and prioritizing issues involves the
assignment of labels to them in order to determine their
type [4].
• The labeling process has a positive impact on the effectiveness
of issue processing [6].
[Screenshot: GitHub issue tracker of Microsoft's vscode]
Motivation
• Manually assigning labels to issues is a labor-intensive and
time-consuming task for project managers [3].
• The existing labeling mechanism is rarely used on GitHub [1, 2].
Goal
Create a tool that:
• Automatically predicts labels to assign to issues.
• Stimulates the use of labeling mechanisms in software projects.
• Facilitates the issue management and prioritization processes.
Tool
We introduce Ticket Tagger, a tool that leverages machine learning
strategies on issue titles and descriptions for automatically labeling
GitHub issues.
It is freely available to any developer and can be integrated into
existing repositories with minimal effort.
github.com/apps/ticket-tagger
Architecture
• Microservice based on Node.js, running on a low-end server
(1 vCPU, 512 MB RAM, 20 GB SSD).
• GitHub serves as the frontend through the GitHub Apps API.
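The flow implied by this architecture can be sketched as follows. This is a minimal illustration in Python, not the tool's actual Node.js code: it assumes the service reacts to GitHub's `issues` webhook event with action `opened`, and that it applies the predicted label through GitHub's REST endpoint for adding labels to an issue. The function name `label_request_for` and the `classify` callable are hypothetical.

```python
def label_request_for(event_name, payload, classify):
    """Build the REST call that would label a newly opened issue.

    `event_name` is the value of the X-GitHub-Event webhook header,
    `payload` is the decoded webhook JSON, and `classify` is any
    callable mapping issue text to a label (hypothetical stand-in
    for the trained model). Returns None for events that are ignored.
    """
    # Assumption: only newly opened issues are labeled.
    if event_name != "issues" or payload.get("action") != "opened":
        return None

    issue = payload["issue"]
    # Title and description are the inputs named on the "Tool" slide.
    text = f"{issue['title']} {issue.get('body') or ''}"
    label = classify(text)

    repo = payload["repository"]["full_name"]  # e.g. "owner/name"
    return {
        "method": "POST",
        "path": f"/repos/{repo}/issues/{issue['number']}/labels",
        "json": {"labels": [label]},
    }
```

Keeping the handler free of network calls makes it easy to test; an authenticated HTTP client would send the returned request using the app's installation token.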
Model Selection
• Ticket Tagger uses fastText [5] for labeling tickets.
• fastText is less resource-intensive than deep learning models
while remaining competitive with them.
• The model was trained on 10k GitHub issues drawn at random for each
of the labels “bug”, “enhancement”, and “question” (30k in total).
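The balanced sampling described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' data pipeline: issues are assumed to arrive as `(label, text)` pairs, and the function name `balanced_sample` and the fixed `seed` are hypothetical.

```python
import random
from collections import defaultdict

LABELS = ("bug", "enhancement", "question")


def balanced_sample(issues, per_label, seed=0):
    """Draw `per_label` issues at random for each label in LABELS.

    `issues` is an iterable of (label, text) pairs; issues carrying
    any other label are ignored.
    """
    by_label = defaultdict(list)
    for label, text in issues:
        if label in LABELS:
            by_label[label].append((label, text))

    rng = random.Random(seed)  # seeded so the draw is reproducible
    sample = []
    for label in LABELS:
        pool = by_label[label]
        if len(pool) < per_label:
            raise ValueError(f"only {len(pool)} issues labeled {label!r}")
        sample.extend(rng.sample(pool, per_label))  # without replacement

    rng.shuffle(sample)  # avoid feeding the trainer one class at a time
    return sample
```

With `per_label=10_000` this yields the 30k-example training set mentioned above, with equal class sizes.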
Performance Evaluation
             Bug      Enhancement   Question
Precision    82.2%    89.4%         78.1%
Recall       84.1%    76.3%         87.4%
F-measure    83.1%    82.3%         82.5%

Results obtained with 10-fold cross-validation on the training set.
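The F-measure row follows from the precision and recall rows if it is read as the standard F1 score, the harmonic mean of the two. The slides do not state the formula, so that reading is an assumption; the short check below recomputes the row from the reported values.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the table above.
reported = {
    "bug": (0.822, 0.841),
    "enhancement": (0.894, 0.763),
    "question": (0.781, 0.874),
}

for label, (p, r) in reported.items():
    print(f"{label}: {f_measure(p, r):.1%}")
```

The recomputed values round to 83.1%, 82.3%, and 82.5%, matching the table.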
Training-Set Preview
__label__bug "scala presentation compiler Scala...
__label__bug "Failed to create an external role...
__label__bug support inline image link bot...
__label__enhancement "Autoplay video on channel...
__label__enhancement Resume from backup with...
__label__enhancement "replace redux store...
__label__question How to disable the log? Your...
__label__question "How to read gradle command...
__label__question Results in Transition...
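Each line in the preview follows fastText's supervised input format: one example per line, with the class given as a `__label__`-prefixed token followed by the text. Below is a minimal sketch of producing such lines from an issue's title and description. The helper name `to_fasttext_line` and the whitespace normalization are assumptions, and the authors' actual preprocessing may differ.

```python
import re


def to_fasttext_line(label, title, body):
    """Format one issue as a single fastText training line."""
    text = f"{title} {body}"
    # fastText reads one example per line, so collapse newlines and
    # runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    return f"__label__{label} {text}"


# Hypothetical issue used only for illustration.
line = to_fasttext_line(
    "question",
    "How to disable the log?",
    "I could not find\na setting for it.",
)
print(line)
```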
References
[1] T. F. Bissyandé, D. Lo, L. Jiang, L. Réveillere, J. Klein, and Y. Le Traon. Got issues? Who cares about it? A large scale investigation of issue trackers from GitHub. In 2013 IEEE 24th International Symposium on Software Reliability Engineering (ISSRE), pages 188–197. IEEE, 2013.
[2] J. Cabot, J. L. C. Izquierdo, V. Cosentino, and B. Rolandi. Exploring the use of labels to categorize issues in open-source software projects. In 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER), pages 550–554, 2015.
[3] Q. Fan, Y. Yu, G. Yin, T. Wang, and H. Wang. Where is the road for issue reports classification based on text mining? In International Symposium on Empirical Software Engineering and Measurement, ESEM 2017, pages 121–130, 2017.
[4] J. L. C. Izquierdo, V. Cosentino, B. Rolandi, A. Bergel, and J. Cabot. GiLA: GitHub label analyzer. In 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering, SANER 2015, Montreal, QC, Canada, March 2-6, 2015, pages 479–483, 2015.
[5] A. Joulin, E. Grave, P. Bojanowski, and T. Mikolov. Bag of tricks for efficient text classification. arXiv preprint arXiv:1607.01759, 2016.
[6] Z. Liao, D. He, Z. Chen, X. Fan, Y. Zhang, and S. Liu. Exploring the characteristics of issue-related behaviors in GitHub using visualization techniques. IEEE Access, 6:24003–24015, 2018.

Ticket Tagger at IEEE ICSME 2019
