Exploratory Study of Slack Q&A Chats
as a Mining Source for
Software Engineering Tools
Preetha Chatterjee Kostadin Damevski Lori Pollock Vinay Augustine Nicholas A. Kraft
8 million daily active users
Given Slack’s increased use, are Slack Q&A chats a good mining source for
Software Engineering tools?
[Figure: Number of Slack users in thousands. Source: https://www.statista.com/statistics/652779/worldwide-slack-users-total-vs-paid/]
Research Questions
RQ1. How prevalent are the kinds of information that have
been successfully mined from the Stack Overflow Q&A
forum to support software engineering tools in developer
Q&A chats such as Slack?
RQ2. Do Slack Q&A chats have characteristics that might
inhibit automatic mining of information to support
software engineering tools?
Data Sets
Community (Slack channels)   #Conversations               Community (SO tags)   #Posts
                             Slack_auto   Slack_manual                          SO_auto   SO_manual
clojurians#clojure           5,013        80              clojure               13,920    80
elmlang#beginners            7,627        80              elm                   1,019     160
elmlang#general              5,906        80              -                     -         -
pythondev#help               3,768        80              python                806,763   80
racket#general               1,579        80              racket                3,592     80
Total                        23,893       400             Total                 825,294   400
Data Preparation:
• Chat Disentanglement [Elsner and Charniak 2008]
• LDA topic model
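To illustrate what chat disentanglement means in practice (separating interleaved conversations in one channel), here is a minimal sketch. It is not the Elsner and Charniak model cited above. It is a naive greedy heuristic, and the `Message` and `Conversation` types, the `disentangle` function, and the 300-second `max_gap` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    ts: float   # timestamp in seconds (assumed format)
    user: str
    text: str


@dataclass
class Conversation:
    messages: list = field(default_factory=list)

    @property
    def users(self):
        return {m.user for m in self.messages}

    @property
    def last_ts(self):
        return self.messages[-1].ts


def disentangle(messages, max_gap=300.0):
    """Greedy heuristic: attach each message to the most recently active
    conversation that is within max_gap seconds and involves either the
    author or a user the message @-mentions; otherwise start a new one."""
    conversations = []
    for msg in sorted(messages, key=lambda m: m.ts):
        mentioned = {
            w.lstrip("@").rstrip(",.:!?")
            for w in msg.text.split()
            if w.startswith("@")
        }
        candidates = [
            c for c in conversations
            if msg.ts - c.last_ts <= max_gap
            and (msg.user in c.users or mentioned & c.users)
        ]
        if candidates:
            target = max(candidates, key=lambda c: c.last_ts)
        else:
            target = Conversation()
            conversations.append(target)
        target.messages.append(msg)
    return conversations
```

A heuristic like this would likely miss replies that neither mention a user nor come from an earlier participant, which is one reason a trained model is used instead.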
How has Stack Overflow been used as a
mining resource?
Code:
• IDE code recommendation [DeSouza’14, Rahman’14, Cordeiro’12, Ponzanelli’14,
Bacchelli’12, Amintabar’15]
• Automatic generation of comments [Wong’13, Rahman’15]
API:
• Learning and recommendation of APIs [Chen’16, Rahman’16, Wang’13]
• Augmenting API documentation [Treude’16, Subramanian’14, Chen’14]
Other:
• Building thesaurus of software-specific terms [Tian’14, Chen’17]
• Gender bias and emotions [Novielli’14, Morgan’17, Ford’16]
RQ1: Prevalence of information
Study Measures
Measure
Document length
Code snippet count
Code snippet length
Bad code snippets
Gist links
Stack Overflow links
API mentions in code snippets
API mentions in text
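A few of the measures above could be computed roughly as follows. This is a sketch under stated assumptions, not the study's procedure: it assumes code snippets are delimited by triple backticks, counts document length in words of non-code text, and counts snippet length in lines. The `measure` function and the regular expressions are hypothetical. Bad code snippets and API mentions are not covered.

```python
import re

FENCE = "`" * 3  # triple-backtick delimiter (assumed snippet marker)
CODE_BLOCK = re.compile(re.escape(FENCE) + r"(.*?)" + re.escape(FENCE), re.DOTALL)
GIST_LINK = re.compile(r"https?://gist\.github\.com/\S+")
SO_LINK = re.compile(r"https?://(?:www\.)?stackoverflow\.com/\S+")


def measure(document: str) -> dict:
    """Compute a subset of the study measures for one conversation or post."""
    snippets = [s.strip() for s in CODE_BLOCK.findall(document)]
    text = CODE_BLOCK.sub(" ", document)  # natural-language part only
    return {
        "document_length": len(text.split()),                       # words
        "code_snippet_count": len(snippets),
        "code_snippet_lengths": [len(s.splitlines()) for s in snippets],  # lines
        "gist_links": len(GIST_LINK.findall(text)),
        "stack_overflow_links": len(SO_LINK.findall(text)),
    }
```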
RQ1: Prevalence of information
Study Results
Much of the information mined from Stack Overflow is also available on Slack
Q&A channels.
API mentions are available in larger quantities on Slack Q&A channels.
Links are rare on both Slack and Stack Overflow Q&A.
Measure
Participant count
Questions with no answer
Answer count
Indicators of accepted answers
Questions with no accepted answer
NL text context per code snippet
Incomplete sentences
Noise in document
Knowledge construction process *
* A. Zagalsky, D. M. German, M.-A. Storey, C. G. Teshima, and G. Poo-Caamaño, “How the R community creates and
curates knowledge: An extended study of Stack Overflow and mailing lists,” Empirical Software Engineering, 2017.
RQ2: Challenges of Mining Slack
Study Measures
Words/Phrases: good find; Thanks for your help; cool; this works; that’s it, thanks
a bunch for the swift and adequate pointers; Ah, ya that works; thx for the info;
alright, thx; awesome; that would work; your suggestion is what I landed on; will
have a look thank you; checking it out now thanks; that what i thought; Ok; okay;
kk; maybe this is what i am searching for; handy trick; I see, I’ll give it a whirl;
thanks for the insight!; thanks for the quick response @user, that was extremely
helpful!; That’s a good idea! ; gotcha; oh, I see; Ah fair; that really helps; ah, I
think this is falling into place; that seems reasonable; Thanks for taking the time to
elaborate; Yeah, that did it; why didn’t I try that?
Emojis:
Accepted Answer Indicators
RQ2: Challenges of Mining Slack
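The indicator phrases listed above suggest a simple matching heuristic, sketched below. The deck does not describe an automated procedure, so everything here is an assumption: the `INDICATORS` subset, the `find_accepted_answer` function, and the rule that the accepted answer is the last message from another participant before the asker's indicator.

```python
import re

# A small subset of the indicator phrases from the slide, lowercased.
INDICATORS = [
    "this works", "that works", "that did it", "that really helps",
    "thanks for your help", "thx for the info", "good find", "gotcha",
]


def _normalize(text):
    """Lowercase and collapse whitespace before phrase matching."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def find_accepted_answer(conversation):
    """conversation: list of (user, text) tuples; the first is the question.

    Returns the index of the message the asker appears to accept,
    or None if no indicator phrase follows an answer."""
    asker = conversation[0][0]
    last_other = None
    for idx, (user, text) in enumerate(conversation[1:], start=1):
        if user != asker:
            last_other = idx
        elif last_other is not None and any(
            phrase in _normalize(text) for phrase in INDICATORS
        ):
            return last_other
    return None
```

Short indicators such as "Ok" or "cool" are left out on purpose, since plain substring matching on them would probably produce many false positives.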
RQ2: Challenges of Mining Slack
Study Results
Measure                              Results
Participant frequency                1 < 2 < 34
Questions with no answer             15.75%
Answer frequency                     0 < 1 < 5
Questions with no accepted answer    52.25%
NL text context per code snippet     0 < 2 < 13
Incomplete sentences                 12.63%
Noise in document                    10.5%
Knowledge construction               61.5% crowd; 38.5% participatory
Accepted answers are available in chat conversations, but require more effort
to discern.
Participatory conversations provide additional value but require deeper analysis
of conversational context.
Percentages of incomplete sentences and noise are low.
P. Chatterjee, M. A. Nishi, K. Damevski, V. Augustine, L. Pollock and N. A. Kraft, "What information about code
snippets is available in different software-related documents? An exploratory study," 2017 IEEE 24th International
Conference on Software Analysis, Evolution and Reengineering (SANER), Klagenfurt, 2017, pp. 382-386.
The largest proportion of Slack Q&A conversations discusses software design.
Analyzing Types of Information in Chats
Related Work on Analyzing Chats
• Learn developer behaviors [Elliot’03, Shihab’09, Yu’11, Lin’16]
• Filter out off-topic discussion [Chowdhury and Hindle’15]
• Extraction of rationale [Alkadhi’17, ’18]
• Chatbots [Lebeuf’17, Paikari’18]
Conclusions
Q&A chats provide, in lesser quantities, the same information as can be
found in Q&A posts on Stack Overflow.
Adapting the disentanglement technique and training sets can achieve high
accuracy in disentangling Slack conversations.
It is feasible to apply automated mining approaches to chat conversations
from Slack. However, identifying an accepted answer is non-trivial.
Future Work
Investigate linking public Slack channels to Stack Overflow.
Mine conversations for software development insights.
Mine opinion statements available in public Slack channels.
preethac@udel.edu
@PreethaChatterj
Supported by:
• NSF grant nos. 1812968, 1813253
• DARPA MUSE program, Air Force Research
Lab contract no. FA8750-16-2-0288.
Preprint:
https://tinyurl.com/yxmown4x

Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
 
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
 
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 
Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 

Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineering Tools

  • 1. Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineering Tools. Preetha Chatterjee, Kostadin Damevski, Lori Pollock, Vinay Augustine, Nicholas A. Kraft
  • 2. 2
  • 3. 8 million daily active users. Given Slack’s increased use, are Slack Q&A chats a good mining source for Software Engineering tools? [Figure: bar chart of Slack users over time; y-axis: number of users in thousands, rising from 16 to 10,000. Source: https://www.statista.com/statistics/652779/worldwide-slack-users-total-vs-paid/]
  • 4. Research Questions. RQ1. How prevalent in developer Q&A chats such as Slack are the kinds of information that have been successfully mined from the Stack Overflow Q&A forum to support software engineering tools? RQ2. Do Slack Q&A chats have characteristics that might inhibit automatic mining of information to support software engineering tools?
  • 5. Data Sets. Data preparation: chat disentanglement [Elsner and Charniak 2008]; LDA topic model.
      Community (Slack channel) | #Conversations (auto) | #Conversations (manual) | Community (SO tag) | #Posts (auto) | #Posts (manual)
      clojurians#clojure | 5,013 | 80 | clojure | 13,920 | 80
      elmlang#beginners | 7,627 | 80 | elm | 1,019 | 160
      elmlang#general | 5,906 | 80 | - | - | -
      pythondev#help | 3,768 | 80 | python | 806,763 | 80
      racket#general | 1,579 | 80 | racket | 3,592 | 80
      Total | 23,893 | 400 | Total | 825,294 | 400
  • 6. Research Questions. RQ1. How prevalent in developer Q&A chats such as Slack are the kinds of information that have been successfully mined from the Stack Overflow Q&A forum to support software engineering tools? RQ2. Do Slack Q&A chats have characteristics that might inhibit automatic mining of information to support software engineering tools?
  • 7. How has Stack Overflow been used as a mining resource? (RQ1: Prevalence of information) Code: IDE code recommendation [DeSouza’14, Rahman’14, Cordeiro’12, Ponzanelli’14, Bacchelli’12, Amintaber’15]; automatic generation of comments [Wong’13, Rahman’15]. API: learning and recommendation of APIs [Chen’16, Rahman’16, Wang’13]; augmenting API documentation [Treude’16, Subramanian’14, Chen’14]. Other: building a thesaurus of software-specific terms [Tian’14, Chen’17]; gender bias and emotions [Novielli’14, Morgan’17, Ford’16].
  • 8. Study Measures (RQ1: Prevalence of information): document length; code snippet count; code snippet length; bad code snippets; Gist links; Stack Overflow links; API mentions in code snippets; API mentions in text.
  • 10. Study Results (RQ1: Prevalence of information): Much of the information mined from Stack Overflow is also available on Slack Q&A channels. API mentions are available in larger quantities on Slack Q&A channels. Links are rare on both Slack and Stack Overflow Q&A.
  • 11. Research Questions. RQ1. How prevalent in developer Q&A chats such as Slack are the kinds of information that have been successfully mined from the Stack Overflow Q&A forum to support software engineering tools? RQ2. Do Slack Q&A chats have characteristics that might inhibit automatic mining of information to support software engineering tools?
  • 12. Study Measures (RQ2: Challenges of Mining Slack): participant count; questions with no answer; answer count; indicators of accepted answers; questions with no accepted answer; NL text context per code snippet; incomplete sentences; noise in document; knowledge construction process*. * A. Zagalsky, D. M. German, M.-A. Storey, C. G. Teshima, and G. Poo-Caamaño, “How the R community creates and curates knowledge: An extended study of Stack Overflow and mailing lists,” Empirical Software Engineering, 2017.
  • 13. Accepted Answer Indicators (RQ2: Challenges of Mining Slack). Words/Phrases: good find; Thanks for your help; cool; this works; that’s it, thanks a bunch for the swift and adequate pointers; Ah, ya that works; thx for the info; alright, thx; awesome; that would work; your suggestion is what I landed on; will have a look thank you; checking it out now thanks; that what i thought; Ok; okay; kk; maybe this is what i am searching for; handy trick; I see, I’ll give it a whirl; thanks for the insight!; thanks for the quick response @user, that was extremely helpful!; That’s a good idea!; gotcha; oh, I see; Ah fair; that really helps; ah, I think this is falling into place; that seems reasonable; Thanks for taking the time to elaborate; Yeah, that did it; why didn’t I try that? Emojis: (emoji images are not captured in this transcript)
  • 14. Study Results (RQ2: Challenges of Mining Slack). Participant frequency: 1 < 2 < 34; questions with no answer: 15.75%; answer frequency: 0 < 1 < 5; questions with no accepted answer: 52.25%; NL text context per code snippet: 0 < 2 < 13; incomplete sentences: 12.63%; noise in document: 10.5%; knowledge construction: 61.5% crowd, 38.5% participatory.
  • 15. Study Results (RQ2: Challenges of Mining Slack): Accepted answers are available in chat conversations, but require more effort to discern. Participatory conversations provide additional value but require deeper analysis of conversational context. Percentages of incomplete sentences and noise are low.
  • 16. Analyzing Types of Information in Chats: The largest proportion of Slack Q&A conversations discuss software design. Reference: P. Chatterjee, M. A. Nishi, K. Damevski, V. Augustine, L. Pollock and N. A. Kraft, "What information about code snippets is available in different software-related documents? An exploratory study," 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER), Klagenfurt, 2017, pp. 382-386.
  • 17. Related Work on Analyzing Chats: learn developer behaviors [Elliot’03, Shihab’09, Yu’11, Lin’16]; filter out off-topic discussion [Chowdhury and Hindle’15]; extraction of rationale [Alkadhi’17, ’18]; chatbots [Lebeuf’17, Paikari’18].
  • 18. Conclusions: Q&A chats provide, in lesser quantities, the same information as can be found in Q&A posts on Stack Overflow. Adapting techniques and training sets can achieve high accuracy in disentangling Slack conversations. It is feasible to apply automated mining approaches to chat conversations from Slack. However, identifying an accepted answer is non-trivial. Future Work: Investigate linking between public Slack channels and Stack Overflow. Mine conversations for software development insights. Mine opinion statements available in public Slack channels.
  • 19. Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineering Tools. Contact: preethac@udel.edu, @PreethaChatterj. Supported by: NSF grant nos. 1812968 and 1813253; DARPA MUSE program, Air Force Research Lab contract no. FA8750-16-2-0288. Preprint: https://tinyurl.com/yxmown4x
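
As a rough illustration of how some of the RQ1 measures on slide 8 might be computed automatically, the sketch below counts code snippets, Gist links, and Stack Overflow links in a single conversation. It is not the authors' tooling: the function name `rq1_measures`, the use of triple-backtick fences to delimit code, the word-count definition of document length, and the URL patterns are all assumptions made for this example.

```python
import re

# Assumption: code snippets are delimited by Slack-style triple-backtick fences.
FENCE = "`" * 3
CODE_FENCE = re.compile(re.escape(FENCE) + r"(.*?)" + re.escape(FENCE), re.DOTALL)

# Assumption: simple URL patterns are enough to spot Gist and Stack Overflow links.
GIST_LINK = re.compile(r"https?://gist\.github\.com/\S+")
SO_LINK = re.compile(r"https?://(?:www\.)?stackoverflow\.com/\S+")


def rq1_measures(conversation: str) -> dict:
    """Compute a few illustrative RQ1-style measures for one conversation."""
    snippets = CODE_FENCE.findall(conversation)
    # Natural-language text is whatever remains after removing fenced code.
    text_only = CODE_FENCE.sub(" ", conversation)
    return {
        # Assumption: document length is measured in whitespace-separated words.
        "document_length": len(text_only.split()),
        "code_snippet_count": len(snippets),
        # Assumption: snippet length is measured in lines of code.
        "code_snippet_lengths": [len(s.strip().splitlines()) for s in snippets],
        "gist_links": len(GIST_LINK.findall(conversation)),
        "stack_overflow_links": len(SO_LINK.findall(conversation)),
    }
```

Measures such as bad code snippets or API mentions would need more than pattern matching, which is consistent with the study's use of manually analyzed subsets.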

Editor's Notes

  1. Thank you. I’m Preetha Chatterjee, a PhD student at the University of Delaware. Today, I will describe our work on “Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineering Tools.” My coauthors are Kostadin Damevski, Lori Pollock, Vinay Augustine, and Nicholas Kraft.
  2. With increased online sharing, developers are having conversations about software via online chat services. (click) Developers use these communities to ask and answer specific development questions, with the aim of improving their own skills and helping others. Slack is currently the most popular platform which hosts many active public channels focused on software development technologies.
  3. Over 8 million active users participate daily on Slack, and this graph shows how the number of Slack users has increased over the past few years. In this study we ask: given Slack’s increased use, are Slack Q&A chats a good mining source for software engineering tools?
  4. For RQ1, we compare the content in Q&A-focused public chat communities (e.g., Slack) with Q&A-based discussion forums (e.g., Stack Overflow). We explore the availability and prevalence on Slack of the kinds of information that have been mined from SO, which gives us a first insight into the prospects of chat communities as a mining source. For RQ2, we investigate the feasibility of applying automatic information extraction techniques to chat messages.
  5. We curated a comparison data set on Slack and SO by using LDA and a modified chat disentanglement technique which was initially proposed by Elsner and Charniak. We gathered around 24k Slack conversations and 800k SO posts. Since all the measures for this study could not be computed automatically with high accuracy, we created smaller subsets of data each containing 400 conversations and posts for manual analysis.
  6. I will first present the methodology and results of RQ1.
  7. This slide shows an example Slack conversation and a Stack Overflow post on a similar topic, to highlight their differences in form and structure. Chat conversations are transient, and as a result important information and advice are lost over time. SO is an archival resource, and developers can easily refer back to the information later. Chats are an informal communication platform where developers exchange a lot of information in a short time, while SO has more in-depth questions with well-thought-out answers. Unlike SO, chat conversations lack a formal structure and are often interleaved. (I don’t think we have time to show this slide.)
  8. The literature shows that code and NL text from SO have been mined by researchers for several software engineering tasks, such as IDE recommendation, augmenting API documentation, and building a thesaurus of software-specific terms. Collectively, these prior works suggest that specific types of information embedded in software-related documents could be used in building or improving software engineering tools.
  9. To answer RQ1, we focused on similar information that has been commonly mined in SO. Specifically, we analyzed code snippets, links to external resources, and API mentions.
  10. To answer RQ1, we focused on similar information that has been commonly mined in SO. Specifically, we analyzed code snippets, links to external resources, and API mentions.
  11. We display the results primarily as box plots. Read takeaways and add: However, most of this information is available in larger quantities on Stack Overflow. Specifically for API mentions in text, both sources had a fairly low median occurrence, but Slack had a higher value and more variance. Before the study, we anticipated that developers on Slack would often use links to answer questions, saving time by pointing askers to an existing information source, such as Stack Overflow. Alternatively, we expected askers to use Gist to post code prior to asking questions, in order to benefit from the clean formatting that enables the display of a larger block of code. While both of these behaviors did occur, they were fairly infrequent.
  12. Next I will discuss the methodology and results of RQ2.
  13. To answer RQ2, we focused on measures that could provide some insights into the form of Slack Q&A conversations (participant count, questions with no answer, answer count) and measures that could indicate challenges in automation (how participants indicate accepted answers, questions with no accepted answer, natural language text describing code snippets, incomplete sentences, noise within a document, and knowledge construction process) that suggest a need to filter. Since RQ2 investigates challenges in mining information in developer chat communications to support software engineering tools, we only computed the measures on Slack.
  14. We observed the common words/phrases that indicate answer acceptance in Slack conversations. The most prevalent indicator is “Thanks/thank you”, followed by phrases acknowledging the participant’s help such as “okay”, ”got it”, and other positive sentiment indicators such as “this worked”, “cool”, and “great”. Accepted answers were also commonly indicated using emojis as listed in the table.
  15. Results represented as percentages are reported directly, while other results, computed as simple counts, are reported as minimum < median < maximum.
  16. 1) The results indicate that the proportion of incomplete sentences describing code is low, 13%, and similarly the noise in a conversation is low, at about 11%. 2) There is a significant proportion of accepted answers available in Slack. However, an automatic mining tool needs to identify which sentence in a conversation is an answer to a question, and which question it is answering. This implies that NLP techniques and sentiment analysis will most likely be needed to automatically identify and match answers with questions. 3) Nearly 40% of conversations on Slack Q&A channels were participatory, with multiple individuals working together to produce an answer to the initial question. These conversations present an additional mining challenge: utterances form a complex dependence graph because answers are contributed and debated concurrently.
  17. To gain insight into the semantic information, we analyzed the kinds of information provided in the conversations. Using the labels defined in one of our previous works, we observed that the most prevalent type of information on Slack is “Design”, which includes information on the programming language, framework, and time/space complexity of the code snippet. This aligns with the fact that the main purpose of developer Q&A chats is to ask and answer questions about alternatives for a particular task, specific to a particular language or technology. Often the focal point of a conversation is an API, where a developer is asking experts on the channel for suggestions on an API or on proper idioms for API usage.
  18. Other researchers have conducted studies analyzing chats. However, they have focused on learning developer behaviors. Chowdhury and Hindle proposed an approach to automatically filter out off-topic IRC discussions by exploiting Stack Overflow programming discussions and YouTube video comments. Alkadhi et al. examined the frequency and completeness of available rationale in chat messages, the contribution of rationale by developers, and the potential of automatic techniques for rationale extraction. Researchers have also investigated the role of chatbots in software development activities.
  19. In summary, Q&A chats provide information similar to what can be found on Q&A forums such as Stack Overflow. Adapting existing techniques and training sets can achieve high accuracy in disentangling Slack conversations. And finally, the low percentages of noise and incomplete sentences show that it is feasible to apply automatic mining approaches to extract information from Slack chats. 1) While there were few explicit links to Stack Overflow and GitHub Gists in our dataset, we believe that information is often duplicated on these platforms, and that answers on one platform can be used to complement the other. Future work includes further investigating this linking between public Slack channels and Stack Overflow. 2) Participatory Q&A conversations are available on Slack in large quantities. These conversations often provide interesting insights about various technologies and their use, incorporating various design choices. As future work, we intend to investigate mining such conversations for software development insights. 3) We also observed that developers use Slack to share opinions on best practices, APIs, or tools (e.g., API X has better design or usability than API Y). Stack Overflow explicitly forbids the use of opinions on its site. Opinions are valuable to software developers, and they could also lead to new mining opportunities for software tools. Hence, we plan to investigate the mining of opinion statements available in public Slack channels.
  20. This concludes my talk. I will be happy to answer questions now.
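
Notes 14 and 15 describe two mechanical steps that can be sketched in a few lines: spotting accepted-answer indicator phrases, and reporting count measures as minimum < median < maximum. The sketch below is only an illustration under stated assumptions, not the study's method: `ACCEPT_PHRASES` is a small hand-picked subset of the phrases on slide 13, the function names are hypothetical, and the rule that only the original asker can signal acceptance is an assumption of this example.

```python
import re
from statistics import median

# Hypothetical subset of the accepted-answer indicator phrases from slide 13.
ACCEPT_PHRASES = [
    "thanks", "thank you", "thx", "this works", "that works",
    "that did it", "gotcha", "that really helps",
]
_INDICATOR = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in ACCEPT_PHRASES) + r")\b",
    re.IGNORECASE,
)


def find_acceptance(messages):
    """Return the index of the first asker message containing an indicator.

    `messages` is a list of (author, text) pairs; the first message is
    assumed to be the question. Returns None if no indicator is found.
    """
    asker = messages[0][0]
    for i, (author, text) in enumerate(messages[1:], start=1):
        # Assumption: only the asker can signal that an answer was accepted.
        if author == asker and _INDICATOR.search(text):
            return i
    return None


def min_median_max(counts):
    """Format count data in the 'minimum < median < maximum' style of note 15."""
    return f"{min(counts)} < {median(counts):g} < {max(counts)}"
```

A keyword match like this only flags that some answer was acknowledged; as note 16 points out, deciding which answer was accepted, and for which question, would still require deeper analysis of the conversation.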