Development Emails Content Analyzer:
Intention Mining in Developer Discussions
Andrea Di Sorbo, Sebastiano Panichella, Corrado Visaggio, Massimiliano Di Penta, Gerardo Canfora, Harald Gall
Outline
Context: Written Development Discussions
Case Study: Development Mailing Lists of 2 Open Source Projects
Results: Automatic Classification of Relevant Contents in Developers’ Communication
Open Source (OS) and Industrial Projects
Development Communication Means
Recommender systems:
- Bug Triaging [1]
- Suggest Mentors [2]
- Code re-documentation [3]
- Etc.
[1] Anvik et al., “Who should fix this bug?”.
[2] Canfora et al., “Who is going to mentor newcomers in open source projects?”.
[3] Panichella et al., “Mining source code descriptions from developer communications”.
Development Communication Means
[1] Bacchelli et al., “Content classification of development emails”.
[2] Cerulo et al., “A Hidden Markov Model to detect coded information islands in free text”.
Different Kinds of Data
Structured
Semi-Structured
Unstructured
A Considerable Effort for Developers
Many messages
Developers get lost in unnecessary details, missing potentially useful information…
Previous Work
Hana et al.:
“…‘Lazy’ RTC occurs when a core developer posts a change to a mailing list and nobody responds; it is assumed that other developers reviewed the code…”
Approaches for:
- Generating summaries of emails
    → Lam et al.
    → Rambow et al.
- Generating summaries of bug reports
    → Rastkar et al.
Different Purposes
Feature requests
Bug disclosures
Project Management
DECA (Development Email Content Analyzer)
An Approach to Classify Paragraphs According to Intentions
http://www.ifi.uzh.ch/seal/people/panichella/tools/DECA.html
Why use NLP for Classifying Paragraphs According to Intentions?
Example
i. We could use a leaky bucket algorithm to limit the bandwidth
ii. The leaky bucket algorithm fails in limiting the bandwidth
The two sentences:
- share a high percentage of words
- discuss the same topic
- have different intentions
“Techniques based on lexicon analysis, such as VSM [1], LSI [2], or LDA [3] would not be sufficient to classify paragraphs according to intentions.”
[1] Baeza-Yates et al., “Modern Information Retrieval”.
[2] de Marneffe et al., “The Stanford typed dependencies representation”.
[3] Blei et al., “Latent Dirichlet allocation”.
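The lexical overlap between the two example sentences can be checked with a few lines of code. The sketch below is only an illustration of why a bag-of-words view struggles here. It assumes plain lowercase tokenization with no stemming or stop-word removal, and the helper names (`tokens`, `cosine`) are made up for this example rather than taken from DECA.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens; no stemming or stop-word removal (assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between the raw term-count vectors of two texts."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


S1 = "We could use a leaky bucket algorithm to limit the bandwidth"
S2 = "The leaky bucket algorithm fails in limiting the bandwidth"

shared = set(tokens(S1)) & set(tokens(S2))
similarity = cosine(S1, S2)  # 6/11, about 0.55

print(sorted(shared))
print(round(similarity, 3))
```

The shared words are exactly the topical ones (leaky, bucket, algorithm, bandwidth), while the words that carry the intention ("could use" versus "fails") are a small part of each vector.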
Perspective
Case Study
Goal: Understand to what extent NL parsing can be used to recognize informative text fragments in emails, from a software maintenance and evolution perspective.
Quality focus: Detection of text paragraphs in development discussions containing helpful information for developers.
Perspective: Guide developers in maintaining and evolving their products.
Research Questions
RQ1: Can an NLP approach (i.e., DECA) be effective in classifying writers’ intentions in development emails?
RQ2: Is DECA more effective than existing Machine Learning techniques in classifying development email content?
Context
Qt
Ubuntu
STEPS:
1) Taxonomy Definition
2) Classification Based on DECA (NLP Analyzer)
Taxonomy Definition
Sampling
We selected 100 … of the … project
Clustering
Clusters:
Implementation
Technical Infrastructure
Project Status
Social Interactions
Usage
Discarded
Guzzi et al., MSR 2013
Guzzi et al., ICSE 2012
The final taxonomy
Differences with Guzzi et al.
Examples
DECA (Development Email Content Analyzer)
Natural Language Parsing
Recurrent Linguistic Patterns
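Why NL parsing?
Well-defined predicate-argument structures
Dependency parse of “We could use a leaky bucket algorithm to limit the bandwidth” (root: use):
    nsubj(use, we), aux(use, could), dobj(use, algorithm), xcomp(use, limit)
    det(algorithm, a), amod(algorithm, leaky), nn(algorithm, bucket)
    aux(limit, to), dobj(limit, bandwidth)
    det(bandwidth, the)
Dependency parse of “The leaky bucket algorithm fails in limiting the bandwidth” (root: fails):
    nsubj(fails, algorithm), prep(fails, in)
    det(algorithm, the), amod(algorithm, leaky), nn(algorithm, bucket)
    pcomp(in, limiting)
    dobj(limiting, bandwidth)
    det(bandwidth, the)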
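To make the predicate-argument idea concrete, the sketch below encodes the typed dependencies of the two example sentences by hand and extracts the governing verb with its direct arguments. It is an illustration under stated assumptions: the triples are transcribed from the slide rather than produced by a parser, tokens are identified by word form only (a simplification that would break on repeated content words), and `Dep`, `root`, and `frame` are hypothetical names.

```python
from collections import namedtuple

# One typed dependency: relation(head, dependent)
Dep = namedtuple("Dep", "rel head dependent")

# Hand-transcribed from the slide; a real parser would produce these.
SENTENCE_1 = [
    Dep("nsubj", "use", "we"), Dep("aux", "use", "could"),
    Dep("dobj", "use", "algorithm"), Dep("xcomp", "use", "limit"),
    Dep("det", "algorithm", "a"), Dep("amod", "algorithm", "leaky"),
    Dep("nn", "algorithm", "bucket"), Dep("aux", "limit", "to"),
    Dep("dobj", "limit", "bandwidth"), Dep("det", "bandwidth", "the"),
]
SENTENCE_2 = [
    Dep("nsubj", "fails", "algorithm"), Dep("prep", "fails", "in"),
    Dep("det", "algorithm", "the"), Dep("amod", "algorithm", "leaky"),
    Dep("nn", "algorithm", "bucket"), Dep("pcomp", "in", "limiting"),
    Dep("dobj", "limiting", "bandwidth"), Dep("det", "bandwidth", "the"),
]


def root(deps):
    """The root is the only head that never appears as a dependent."""
    candidates = {d.head for d in deps} - {d.dependent for d in deps}
    if len(candidates) != 1:
        raise ValueError("expected exactly one root, got %r" % candidates)
    return candidates.pop()


def frame(deps):
    """Root predicate plus its direct (relation, argument) pairs."""
    r = root(deps)
    return r, sorted((d.rel, d.dependent) for d in deps if d.head == r)


frame_1 = frame(SENTENCE_1)
frame_2 = frame(SENTENCE_2)
print(frame_1)
print(frame_2)
```

Although the two sentences share most of their vocabulary, their frames differ: one is governed by "use" with the auxiliary "could", the other by "fails".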
NL parsing
Natural Language Templates
“[someone] could use [something]”: nsubj(use, [someone]), aux(use, could), dobj(use, [something])
“[something] fails”: nsubj(fails, [something])
NLP Heuristics
NLP Parser
raw text → NLP parser → NLP heuristics
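The flow from raw text through a parser to heuristics can be sketched as follows. This is a toy illustration, not DECA's implementation: the parser is replaced by a lookup table of hand-written dependencies, the two templates mirror the ones shown on the slides, and the category labels ("solution proposal", "problem discovery") are placeholders assumed for this example.

```python
# Stand-in for a real dependency parser (assumption): pre-computed
# (relation, head, dependent) triples for the two example sentences.
PARSES = {
    "We could use a leaky bucket algorithm to limit the bandwidth": [
        ("nsubj", "use", "we"), ("aux", "use", "could"),
        ("dobj", "use", "algorithm"), ("xcomp", "use", "limit"),
    ],
    "The leaky bucket algorithm fails in limiting the bandwidth": [
        ("nsubj", "fails", "algorithm"), ("prep", "fails", "in"),
    ],
}

# Each heuristic is a label plus the triples that must all be present.
# None marks an open slot such as [someone] or [something].
# The labels are hypothetical placeholders, not taken from the slides.
HEURISTICS = [
    ("solution proposal", [("nsubj", "use", None),
                           ("aux", "use", "could"),
                           ("dobj", "use", None)]),
    ("problem discovery", [("nsubj", "fails", None)]),
]


def parse(sentence):
    """Look up the pre-computed parse; raises KeyError for unknown text."""
    return PARSES[sentence]


def matches(pattern, deps):
    """True if every required triple is satisfied by some dependency."""
    return all(
        any(all(p is None or p == v for p, v in zip(required, dep))
            for dep in deps)
        for required in pattern
    )


def classify(sentence):
    """raw text -> parser -> heuristics: labels of all matching templates."""
    deps = parse(sentence)
    return [label for label, pattern in HEURISTICS if matches(pattern, deps)]


for text in PARSES:
    print(classify(text), "<-", text)
```

Because the templates constrain grammatical relations rather than word counts, the shared vocabulary of the two sentences does not confuse the classification.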
RQ1: Is DECA effective in classifying writers’ intentions in development emails?
Experiment I
training, test (values shown on the slide: 102, 87, 100)
Experiment I → (False Negative) → Experiment II
Experiment II
training, test (values shown on the slide: 100, 169, 100)
Experiment II → (False Negative) → Experiment III
Experiment III
training, test (values shown on the slide: 100, 231, 100)
RQ2: Is the proposed approach more effective than existing ML techniques in classifying development email content?
ML for Email Classification
An Approach Based on ML for Email Content Classification
    → Antoniol et al., CASCON 2008
    → Zhou et al., ICSME 2014
1) Text Features
2) Split training and test sets
3) Oracle building
4) Classification (training, prediction)
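The four steps above can be sketched end to end in a few lines. The slides do not say which learner was used, so this example swaps in a small multinomial Naive Bayes with add-one smoothing purely for illustration. The sentences, the two labels, and the 6/2 split are invented assumptions, not data from the study.

```python
import math
import re
from collections import Counter, defaultdict


def features(text):
    """Step 1: text features as bag-of-words term counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


# Step 3: oracle of manually labeled paragraphs (invented examples).
ORACLE = [
    ("we could use a cache to speed up the build", "proposal"),
    ("maybe we should add a retry option", "proposal"),
    ("we could add a flag for verbose logging", "proposal"),
    ("the build fails on the latest release", "problem"),
    ("the installer crashes when the disk is full", "problem"),
    ("the parser fails on empty input", "problem"),
    ("we could use a timeout to avoid the hang", "proposal"),
    ("the upload fails with large files", "problem"),
]

# Step 2: split into training and test sets (fixed split, assumption).
TRAIN, TEST = ORACLE[:6], ORACLE[6:]


def train(examples):
    """Step 4a: collect per-class word counts and class frequencies."""
    word_counts, class_counts = defaultdict(Counter), Counter()
    for text, label in examples:
        word_counts[label].update(features(text))
        class_counts[label] += 1
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def predict(model, text):
    """Step 4b: pick the class with the highest smoothed log-probability."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word, n in features(text).items():
            if word in vocab:  # unseen words are ignored
                score += n * math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
predictions = [predict(model, text) for text, _ in TEST]
print(predictions)
```

A learner of this kind sees only word frequencies, which is the limitation the leaky bucket example points at.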
Summary
•  RQ1: the automatic classification performed by DECA achieves very good results in terms of precision, recall, and F-measure (over all the experiments).
•  RQ2: DECA outperforms traditional ML techniques in terms of recall, precision, and F-measure when classifying email content.
“…it took the MSR community more than 10 years to figure out that machine learning is not the best method for analyzing human-written text. Thank you for helping move the field forward…” [One of the ASE Reviewers]
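For reference, precision, recall, and F-measure for one category are defined as P = TP / (TP + FP), R = TP / (TP + FN), and F = 2PR / (P + R). The sketch below computes them from a list of manual labels and a list of predicted labels. The labels and values are made up for illustration and are not results from the study.

```python
def precision_recall_f1(gold, predicted, target):
    """Precision, recall and F-measure for a single target category."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if p == target and g == target)
    fp = sum(1 for g, p in pairs if p == target and g != target)
    fn = sum(1 for g, p in pairs if p != target and g == target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented labels: manual classification versus tool output.
GOLD = ["problem", "problem", "proposal", "problem", "proposal", "other"]
PREDICTED = ["problem", "proposal", "proposal", "problem", "problem", "other"]

p, r, f = precision_recall_f1(GOLD, PREDICTED, "problem")
print(round(p, 3), round(r, 3), round(f, 3))
```

Here two of the three predicted "problem" paragraphs are correct and two of the three true ones are found, so all three measures equal 2/3.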
Code re-documentation
→ Panichella et al., ICPC 2012
Extract methods’ descriptions from developers’ discussions
→ Vector Space Models
→ ad hoc heuristics
“… several are the discourse patterns that characterize false negative method descriptions …”
Conclusion
Future work
1) DECA as preprocessing support to discard irrelevant sentences in summarization approaches
2) DECA in combination with topic models for mining contents with the same intentions and the same topics

More Related Content

What's hot

IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...IRJET Journal
 
Natural Language Processing (NLP) for Requirements Engineering (RE): an Overview
Natural Language Processing (NLP) for Requirements Engineering (RE): an OverviewNatural Language Processing (NLP) for Requirements Engineering (RE): an Overview
Natural Language Processing (NLP) for Requirements Engineering (RE): an Overviewalessio_ferrari
 
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...PhD Assistance
 
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...PhD Assistance
 
Nlp 2020 global ai conf -jeff_shomaker_final
Nlp 2020 global ai conf -jeff_shomaker_finalNlp 2020 global ai conf -jeff_shomaker_final
Nlp 2020 global ai conf -jeff_shomaker_finalJeffrey Shomaker
 
Senjuti Kundu - Resume
Senjuti Kundu - ResumeSenjuti Kundu - Resume
Senjuti Kundu - ResumeSenjuti Kundu
 
Thesis+of+latifa+guerrouj.ppt
Thesis+of+latifa+guerrouj.pptThesis+of+latifa+guerrouj.ppt
Thesis+of+latifa+guerrouj.pptPtidej Team
 
Conversational AI:An Overview of Techniques, Applications & Future Scope - Ph...
Conversational AI:An Overview of Techniques, Applications & Future Scope - Ph...Conversational AI:An Overview of Techniques, Applications & Future Scope - Ph...
Conversational AI:An Overview of Techniques, Applications & Future Scope - Ph...PhD Assistance
 
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...kevig
 
Butler
ButlerButler
Butleranesah
 
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...Tao Xie
 
Intelligent Software Engineering: Synergy between AI and Software Engineering...
Intelligent Software Engineering: Synergy between AI and Software Engineering...Intelligent Software Engineering: Synergy between AI and Software Engineering...
Intelligent Software Engineering: Synergy between AI and Software Engineering...Tao Xie
 
Quality in use of domain-specific languages: a case study
Quality in use of domain-specific languages: a case studyQuality in use of domain-specific languages: a case study
Quality in use of domain-specific languages: a case studyAnkica Barisic
 
130817 latifa guerrouj - context-aware source code vocabulary normalization...
130817   latifa guerrouj - context-aware source code vocabulary normalization...130817   latifa guerrouj - context-aware source code vocabulary normalization...
130817 latifa guerrouj - context-aware source code vocabulary normalization...Ptidej Team
 

What's hot (17)

CV - DCHATTERJI
CV - DCHATTERJICV - DCHATTERJI
CV - DCHATTERJI
 
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
 
Natural Language Processing (NLP) for Requirements Engineering (RE): an Overview
Natural Language Processing (NLP) for Requirements Engineering (RE): an OverviewNatural Language Processing (NLP) for Requirements Engineering (RE): an Overview
Natural Language Processing (NLP) for Requirements Engineering (RE): an Overview
 
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
 
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
 
Nlp 2020 global ai conf -jeff_shomaker_final
Nlp 2020 global ai conf -jeff_shomaker_finalNlp 2020 global ai conf -jeff_shomaker_final
Nlp 2020 global ai conf -jeff_shomaker_final
 
Icpc13.ppt
Icpc13.pptIcpc13.ppt
Icpc13.ppt
 
Senjuti Kundu - Resume
Senjuti Kundu - ResumeSenjuti Kundu - Resume
Senjuti Kundu - Resume
 
Thesis+of+latifa+guerrouj.ppt
Thesis+of+latifa+guerrouj.pptThesis+of+latifa+guerrouj.ppt
Thesis+of+latifa+guerrouj.ppt
 
Conversational AI:An Overview of Techniques, Applications & Future Scope - Ph...
Conversational AI:An Overview of Techniques, Applications & Future Scope - Ph...Conversational AI:An Overview of Techniques, Applications & Future Scope - Ph...
Conversational AI:An Overview of Techniques, Applications & Future Scope - Ph...
 
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
 
Butler
ButlerButler
Butler
 
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...
 
Intelligent Software Engineering: Synergy between AI and Software Engineering...
Intelligent Software Engineering: Synergy between AI and Software Engineering...Intelligent Software Engineering: Synergy between AI and Software Engineering...
Intelligent Software Engineering: Synergy between AI and Software Engineering...
 
Quality in use of domain-specific languages: a case study
Quality in use of domain-specific languages: a case studyQuality in use of domain-specific languages: a case study
Quality in use of domain-specific languages: a case study
 
130817 latifa guerrouj - context-aware source code vocabulary normalization...
130817   latifa guerrouj - context-aware source code vocabulary normalization...130817   latifa guerrouj - context-aware source code vocabulary normalization...
130817 latifa guerrouj - context-aware source code vocabulary normalization...
 
Modest Formalization of Software Design Patterns
Modest Formalization of Software Design PatternsModest Formalization of Software Design Patterns
Modest Formalization of Software Design Patterns
 

Similar to Email Content Analyzer Classifies Developer Discussions

A novel approach based on topic
A novel approach based on topicA novel approach based on topic
A novel approach based on topiccsandit
 
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic MinerAutomatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic MinerFrancesco Osborne
 
Naver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNaver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNAVER Engineering
 
A preliminary study on using code smells to improve bug localization
A preliminary study on using code smells to improve bug localizationA preliminary study on using code smells to improve bug localization
A preliminary study on using code smells to improve bug localizationkrws
 
‘CodeAliker’ - Plagiarism Detection on the Cloud
‘CodeAliker’ - Plagiarism Detection on the Cloud ‘CodeAliker’ - Plagiarism Detection on the Cloud
‘CodeAliker’ - Plagiarism Detection on the Cloud acijjournal
 
Project t Proposal Bangla alphabet handwritten recognition using deep learnin...
Project t Proposal Bangla alphabet handwritten recognition using deep learnin...Project t Proposal Bangla alphabet handwritten recognition using deep learnin...
Project t Proposal Bangla alphabet handwritten recognition using deep learnin...KhondokerAbuNaim
 
Finding Bad Code Smells with Neural Network Models
Finding Bad Code Smells with Neural Network Models Finding Bad Code Smells with Neural Network Models
Finding Bad Code Smells with Neural Network Models IJECEIAES
 
Deepcoder to Self-Code with Machine Learning
Deepcoder to Self-Code with Machine LearningDeepcoder to Self-Code with Machine Learning
Deepcoder to Self-Code with Machine LearningIRJET Journal
 
A novel approach for clone group mapping
A novel approach for clone group mappingA novel approach for clone group mapping
A novel approach for clone group mappingijseajournal
 
IRJET- Semantic Question Matching
IRJET- Semantic Question MatchingIRJET- Semantic Question Matching
IRJET- Semantic Question MatchingIRJET Journal
 
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language ModelsDataScienceConferenc1
 
Big Data: the weakest link
Big Data: the weakest linkBig Data: the weakest link
Big Data: the weakest linkCS, NcState
 
How a Social Knowledge Graph Improves Remote Working by Capturing Context fro...
How a Social Knowledge Graph Improves Remote Working by Capturing Context fro...How a Social Knowledge Graph Improves Remote Working by Capturing Context fro...
How a Social Knowledge Graph Improves Remote Working by Capturing Context fro...Sabine Seitz
 
Dominik Kowald PhD Defense Recommender Systems
Dominik Kowald PhD Defense Recommender SystemsDominik Kowald PhD Defense Recommender Systems
Dominik Kowald PhD Defense Recommender SystemsDominik Kowald
 
Re2018 Semios for Requirements
Re2018 Semios for RequirementsRe2018 Semios for Requirements
Re2018 Semios for RequirementsClément Portet
 
Our research lines on Model-Driven Engineering and Software Engineering
Our research lines on Model-Driven Engineering and Software EngineeringOur research lines on Model-Driven Engineering and Software Engineering
Our research lines on Model-Driven Engineering and Software EngineeringJordi Cabot
 
Knowledge Graph and Similarity Based Retrieval Method for Query Answering System
Knowledge Graph and Similarity Based Retrieval Method for Query Answering SystemKnowledge Graph and Similarity Based Retrieval Method for Query Answering System
Knowledge Graph and Similarity Based Retrieval Method for Query Answering SystemIRJET Journal
 
Requirementv4
Requirementv4Requirementv4
Requirementv4stat
 
Reusability Metrics for Object-Oriented System: An Alternative Approach
Reusability Metrics for Object-Oriented System: An Alternative ApproachReusability Metrics for Object-Oriented System: An Alternative Approach
Reusability Metrics for Object-Oriented System: An Alternative ApproachWaqas Tariq
 

Similar to Email Content Analyzer Classifies Developer Discussions (20)

A novel approach based on topic
A novel approach based on topicA novel approach based on topic
A novel approach based on topic
 
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic MinerAutomatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
 
short-story.pptx
short-story.pptxshort-story.pptx
short-story.pptx
 
Naver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNaver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltc
 
A preliminary study on using code smells to improve bug localization
A preliminary study on using code smells to improve bug localizationA preliminary study on using code smells to improve bug localization
A preliminary study on using code smells to improve bug localization
 
‘CodeAliker’ - Plagiarism Detection on the Cloud
‘CodeAliker’ - Plagiarism Detection on the Cloud ‘CodeAliker’ - Plagiarism Detection on the Cloud
‘CodeAliker’ - Plagiarism Detection on the Cloud
 
Project t Proposal Bangla alphabet handwritten recognition using deep learnin...
Project t Proposal Bangla alphabet handwritten recognition using deep learnin...Project t Proposal Bangla alphabet handwritten recognition using deep learnin...
Project t Proposal Bangla alphabet handwritten recognition using deep learnin...
 
Finding Bad Code Smells with Neural Network Models
Finding Bad Code Smells with Neural Network Models Finding Bad Code Smells with Neural Network Models
Finding Bad Code Smells with Neural Network Models
 
Deepcoder to Self-Code with Machine Learning
Deepcoder to Self-Code with Machine LearningDeepcoder to Self-Code with Machine Learning
Deepcoder to Self-Code with Machine Learning
 
A novel approach for clone group mapping
A novel approach for clone group mappingA novel approach for clone group mapping
A novel approach for clone group mapping
 
IRJET- Semantic Question Matching
IRJET- Semantic Question MatchingIRJET- Semantic Question Matching
IRJET- Semantic Question Matching
 
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
 
Big Data: the weakest link
Big Data: the weakest linkBig Data: the weakest link
Big Data: the weakest link
 
How a Social Knowledge Graph Improves Remote Working by Capturing Context fro...
How a Social Knowledge Graph Improves Remote Working by Capturing Context fro...How a Social Knowledge Graph Improves Remote Working by Capturing Context fro...
How a Social Knowledge Graph Improves Remote Working by Capturing Context fro...
 
Dominik Kowald PhD Defense Recommender Systems
Dominik Kowald PhD Defense Recommender SystemsDominik Kowald PhD Defense Recommender Systems
Dominik Kowald PhD Defense Recommender Systems
 
Re2018 Semios for Requirements
Re2018 Semios for RequirementsRe2018 Semios for Requirements
Re2018 Semios for Requirements
 
Our research lines on Model-Driven Engineering and Software Engineering
Our research lines on Model-Driven Engineering and Software EngineeringOur research lines on Model-Driven Engineering and Software Engineering
Our research lines on Model-Driven Engineering and Software Engineering
 
Knowledge Graph and Similarity Based Retrieval Method for Query Answering System
Knowledge Graph and Similarity Based Retrieval Method for Query Answering SystemKnowledge Graph and Similarity Based Retrieval Method for Query Answering System
Knowledge Graph and Similarity Based Retrieval Method for Query Answering System
 
Requirementv4
Requirementv4Requirementv4
Requirementv4
 
Reusability Metrics for Object-Oriented System: An Alternative Approach
Reusability Metrics for Object-Oriented System: An Alternative ApproachReusability Metrics for Object-Oriented System: An Alternative Approach
Reusability Metrics for Object-Oriented System: An Alternative Approach
 

More from Sebastiano Panichella

The 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software EngineeringThe 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software EngineeringSebastiano Panichella
 
Diversity-guided Search Exploration for Self-driving Cars Test Generation thr...
Diversity-guided Search Exploration for Self-driving Cars Test Generation thr...Diversity-guided Search Exploration for Self-driving Cars Test Generation thr...
Diversity-guided Search Exploration for Self-driving Cars Test Generation thr...Sebastiano Panichella
 
SBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation TrackSBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation TrackSebastiano Panichella
 
SBFT Tool Competition 2024 - CPS-UAV Test Case Generation Track
SBFT Tool Competition 2024 - CPS-UAV Test Case Generation TrackSBFT Tool Competition 2024 - CPS-UAV Test Case Generation Track
SBFT Tool Competition 2024 - CPS-UAV Test Case Generation TrackSebastiano Panichella
 
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSimulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSebastiano Panichella
 
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective ...
Testing with Fewer Resources:  Toward Adaptive Approaches for Cost-effective ...Testing with Fewer Resources:  Toward Adaptive Approaches for Cost-effective ...
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective ...Sebastiano Panichella
 
COSMOS: DevOps for Complex Cyber-physical Systems
COSMOS: DevOps for Complex Cyber-physical SystemsCOSMOS: DevOps for Complex Cyber-physical Systems
COSMOS: DevOps for Complex Cyber-physical SystemsSebastiano Panichella
 
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...Sebastiano Panichella
 
An Empirical Characterization of Software Bugs in Open-Source Cyber-Physical ...
An Empirical Characterization of Software Bugs in Open-Source Cyber-Physical ...An Empirical Characterization of Software Bugs in Open-Source Cyber-Physical ...
An Empirical Characterization of Software Bugs in Open-Source Cyber-Physical ...Sebastiano Panichella
 
Automated Identification and Qualitative Characterization of Safety Concerns ...
Email Content Analyzer Classifies Developer Discussions

  • 1. Development Emails Content Analyzer: Intention Mining in Developer Discussions. Andrea Di Sorbo, Sebastiano Panichella, Corrado Visaggio, Massimiliano Di Penta, Gerardo Canfora, Harald Gall
  • 2. Outline. Context: Written Development Discussions. Case Study: Development Mailing Lists of 2 Open Source Projects. Results: Automatic Classification of Relevant Contents in Developers’ Communication
  • 3-6. Open Source (OS) and Industrial Projects
  • 7. Development Communication Means. Recommender systems: Bug Triaging [1]; Suggest Mentors [2]; Code re-documentation [3]; etc. [1] Anvik et al., “Who should fix this bug?” [2] Canfora et al., “Who is going to mentor newcomers in open source projects?” [3] Panichella et al., “Mining source code descriptions from developer communications”
  • 9. Development Communication Means. [1] Bacchelli et al., “Content classification of development emails”. [2] Cerulo et al., “A Hidden Markov Model to detect coded information islands in free text”.
  • 10. Different Kinds of Data: Structured, Semi-Structured, Unstructured
  • 11. A Considerable Effort for Developers. Many messages: developers get lost in unnecessary details, missing potentially useful information…
  • 12. Previous Work. Hana et al.: “…‘Lazy’ RTC occurs when a core developer posts a change to a mailing list and nobody responds; it is assumed that other developers reviewed the code…”
  • 13. Previous Work. Approaches for: generating summaries of emails (Lam et al., Rambow et al.); generating summaries of bug reports (Rastkar et al.)
  • 14. Different Purposes: Feature requests, Bug disclosures, Project Management
  • 15. DECA (Development Email Content Analyzer): An Approach to Classify Paragraphs According to Intentions. http://www.ifi.uzh.ch/seal/people/panichella/tools/DECA.html
  • 16. Why use NLP for Classifying Paragraphs According to Intentions?
  • 17-21. Example. (i) We could use a leaky bucket algorithm to limit the bandwidth. (ii) The leaky bucket algorithm fails in limiting the bandwidth. The two sentences have a high percentage of words in common and discuss the same topic, but they have different intentions. “Techniques based on lexicon analysis, such as VSM [1], LSI [2], or LDA [3] would not be sufficient to classify paragraphs according to intentions”. [1] Baeza-Yates et al., “Modern Information Retrieval”. [2] de Marneffe et al., “The Stanford typed dependencies representation”. [3] Blei et al., “Latent Dirichlet allocation”.
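The point of the example can be checked numerically. The sketch below is only an illustration, not part of DECA: it builds a plain bag-of-words vector for each sentence and computes their cosine similarity, which is the kind of lexical signal a VSM-style technique relies on. The helper names `bag_of_words` and `cosine_similarity` are made up for this sketch.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphabetic word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


S1 = "We could use a leaky bucket algorithm to limit the bandwidth"
S2 = "The leaky bucket algorithm fails in limiting the bandwidth"

# Words that appear in both sentences, regardless of what the writer intends.
shared = sorted(set(bag_of_words(S1)) & set(bag_of_words(S2)))
similarity = cosine_similarity(bag_of_words(S1), bag_of_words(S2))

print(shared)
print(round(similarity, 3))
```

The shared vocabulary consists of the topic words, while the words that carry the intention ("could use" versus "fails") are exactly the ones a lexical overlap measure treats as noise.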
  • 23. Case Study. Goal: understanding to what extent NL parsing could be used in recognizing informative text fragments in emails from a software maintenance and evolution perspective. Quality focus: detection of text paragraphs in development discussions containing helpful information for developers. Perspective: guide developers in maintaining and evolving their products.
  • 24. Research Questions. RQ1: Can an NLP approach (i.e., DECA) be effective in classifying writers’ intentions in development emails? RQ2: Is DECA more effective than existing Machine Learning techniques in classifying development email content?
  • 26. Steps: 1) Taxonomy Definition; 2) Classification Based on DECA (NLP Analyzer)
  • 28. Sampling. We selected 100 [...] of the [...] Project
  • 29. Clustering. Clusters: Implementation, Technical Infrastructure, Project Status, Social Interactions, Usage, Discarded. Guzzi et al., MSR 2013
  • 30. Clustering. Guzzi et al., ICSE 2012
  • 32. Differences with Guzzi et al.
  • 34. Natural Language Parsing. DECA (Development Email Content Analyzer)
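  • 36. Why NL parsing? Well-defined predicate-argument structures. (i) “we could use a leaky bucket algorithm to limit the bandwidth”: nsubj(use, we), aux(use, could), dobj(use, algorithm), xcomp(use, limit), det(algorithm, a), amod(algorithm, leaky), nn(algorithm, bucket), aux(limit, to), dobj(limit, bandwidth), det(bandwidth, the). (ii) “the leaky bucket algorithm fails in limiting the bandwidth”: nsubj(fails, algorithm), prep(fails, in), det(algorithm, the), amod(algorithm, leaky), nn(algorithm, bucket), pcomp(in, limiting), dobj(limiting, bandwidth), det(bandwidth, the).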
  • 37-39. NL parsing: Natural Language Templates. “[someone] could use [something]”: nsubj(use, [someone]), aux(use, could), dobj(use, [something]). “[something] fails”: nsubj(fails, [something]).
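A minimal sketch of how such templates could be matched against a dependency parse is given below. It is not DECA's implementation: the parses are written out by hand as (relation, governor, dependent) triples in Stanford typed-dependency notation instead of being produced by a parser, and the labels "suggestion" and "problem report" are hypothetical names chosen for this illustration.

```python
# Hand-written dependency parses: (relation, governor, dependent) triples.
PARSE_I = [
    ("nsubj", "use", "we"), ("aux", "use", "could"),
    ("dobj", "use", "algorithm"), ("xcomp", "use", "limit"),
    ("det", "algorithm", "a"), ("amod", "algorithm", "leaky"),
    ("nn", "algorithm", "bucket"), ("aux", "limit", "to"),
    ("dobj", "limit", "bandwidth"), ("det", "bandwidth", "the"),
]
PARSE_II = [
    ("nsubj", "fails", "algorithm"), ("prep", "fails", "in"),
    ("det", "algorithm", "the"), ("amod", "algorithm", "leaky"),
    ("nn", "algorithm", "bucket"), ("pcomp", "in", "limiting"),
    ("dobj", "limiting", "bandwidth"), ("det", "bandwidth", "the"),
]

# Hypothetical templates: each lists the relations that must all be present.
# None marks an open slot such as [someone] or [something].
TEMPLATES = [
    ("suggestion", [("nsubj", "use", None),
                    ("aux", "use", "could"),
                    ("dobj", "use", None)]),
    ("problem report", [("nsubj", "fails", None)]),
]


def matches(pattern, triple):
    """True if every non-wildcard field of the pattern equals the triple's."""
    return all(p is None or p == t for p, t in zip(pattern, triple))


def classify(parse):
    """Return the label of the first template fully satisfied by the parse."""
    for label, required in TEMPLATES:
        if all(any(matches(pat, tr) for tr in parse) for pat in required):
            return label
    return None


print(classify(PARSE_I))
print(classify(PARSE_II))
```

Because the templates constrain the predicate and its arguments and ignore the topic words, the two sentences fall under different templates even though they share most of their vocabulary.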
  • 41. NLP Parser: raw text, NLP parser, NLP heuristics
  • 44. RQ1: Is DECA effective in classifying writers’ intentions in development emails?
  • 55. RQ2: Is the proposed approach more effective than existing ML techniques in classifying development email content?
  • 56-60. ML for Email Classification. An Approach Based on ML for Email Content Classification: 1) Text features; 2) Split training and test sets; 3) Oracle building; 4) Classification (training, prediction). Antoniol et al., CASCON 2008; Zhou et al., ICSME 2014
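The four steps of the ML baseline can be sketched end to end with a small multinomial Naive Bayes classifier written from scratch. This is an illustration under stated assumptions, not the setup used in the study: the slides do not say which learners or features were used, the labelled paragraphs are invented toy data, and the labels "proposal" and "problem" are hypothetical. In code the oracle has to exist before it can be split, so step 3 appears before step 2.

```python
import math
import re
from collections import Counter, defaultdict


def features(text):
    """Step 1: bag-of-words text features."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


# Step 3: a toy, invented oracle of manually labelled paragraphs.
ORACLE = [
    ("we could add a cache to speed up the parser", "proposal"),
    ("we could use a queue to limit the requests", "proposal"),
    ("maybe we should add an option for the timeout", "proposal"),
    ("the parser fails with a null pointer exception", "problem"),
    ("the build fails on windows with an error", "problem"),
    ("the cache crashes when the queue is empty", "problem"),
    ("we could add an option to limit the cache", "proposal"),
    ("the timeout option fails with an exception", "problem"),
]

# Step 2: split the oracle into a training set and a test set.
TRAIN, TEST = ORACLE[:6], ORACLE[6:]


def train(examples):
    """Step 4a: count words per label for multinomial Naive Bayes."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(features(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def predict(model, text):
    """Step 4b: pick the label with the highest log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word, n in features(text).items():
            if word in vocab:  # ignore words never seen in training
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += n * math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
predictions = [predict(model, text) for text, _ in TEST]
print(predictions)
```

A classifier of this kind sees only word frequencies, so it depends on the training vocabulary covering the words that signal each intention.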
  • 70-71. Summary. RQ1: the automatic classification performed by DECA achieves very good results in terms of precision, recall, and F-measure (over all the experiments). RQ2: DECA outperforms traditional ML techniques in terms of recall, precision, and F-measure when classifying e-mail content. “…it took the MSR community more than 10 years to figure out that machine learning is not the best method for analyzing human-written text. Thank you for helping move the field forward…” [One of the ASE Reviewers]
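For readers unfamiliar with the three measures named in the summary, the sketch below computes them for a single category from a list of gold labels and a list of predicted labels. The function name, the category names, and the data are invented for illustration; they are not results from the study.

```python
def precision_recall_f1(gold, predicted, label):
    """Precision, recall and F-measure (F1) for one category."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented labels for five paragraphs.
gold = ["request", "request", "other", "other", "request"]
predicted = ["request", "other", "other", "request", "request"]

precision, recall, f1 = precision_recall_f1(gold, predicted, "request")
print(precision, recall, f1)
```

Precision is the share of paragraphs assigned to the category that really belong to it, recall is the share of the category's paragraphs that were found, and the F-measure is their harmonic mean.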
  • 73-79. Code re-documentation. Panichella et al., ICPC 2012: extract methods’ descriptions from developers’ discussions, using Vector Space Models and ad hoc heuristics. “… several are the discourse patterns that characterize false negative method descriptions …”
  • 87-88. Future work: 1) DECA as preprocessing support to discard irrelevant sentences in summarization approaches; 2) DECA in combination with topic models for mining contents with the same intentions and the same topics