Community-Assisted
Software Engineering
Decision Making
Gregory Gay and Mats Heimdahl
University of Minnesota
AI in SE: A Success Story
Large, active field, with:
● Growing research community
● Numerous conferences and workshops,
such as MSR, PROMISE, RAISE
● Large data repositories
● History of collaboration between industry
and academia
We're already good at drawing useful
conclusions. We expect further algorithmic
improvements.
But...
We need to improve our data!
Problem 1:
We don't know what data we need.
We try to solve complex problems by guessing
what to measure, then collecting data.
The result: missing attributes and added noise.
Problem 2:
The data we have is often weak.
Solution quality depends on data quality.
Some commonly-used data sets are infamous for
missing values, unhelpful attributes, and poor
recording standards.
We should improve data standards, but...
We need to use the data we have.
Synergy of human feedback and AI to turn
static data models into dynamic models.
Bring a Wikipedia model to data sets.
Inspiration: Recommender Systems
Enhanced Feedback Loop
An example exchange in the feedback loop:
● Recommendation: MC/DC
● Helpful? Yes
● New values for existing attributes:
Num. Boolean Expressions: 219
Num. Numeric Calculations: 73
● New attributes to collect (and values):
Ratio of Boolean to Numeric Calculations: 3:1
● Data to delete: Projects 1, 3, 7
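The exchange above could be sketched as a minimal dynamic data model. This is only an illustration of the idea: the class, method names, and the project identifier used in the usage example are hypothetical, not part of the original proposal.

```python
from dataclasses import dataclass, field

@dataclass
class DynamicDataModel:
    """A data set that evolves through user feedback (hypothetical sketch)."""
    records: dict = field(default_factory=dict)  # project id -> {attribute: value}
    votes: dict = field(default_factory=dict)    # recommendation -> list of votes

    def record_feedback(self, recommendation, helpful):
        # Track whether users found a recommendation (e.g., "MC/DC") helpful.
        self.votes.setdefault(recommendation, []).append(helpful)

    def update_attributes(self, project, values):
        # Accept new values for existing attributes, or entirely new attributes.
        self.records.setdefault(project, {}).update(values)

    def delete_projects(self, projects):
        # Phase out data the community flags as low quality.
        for p in projects:
            self.records.pop(p, None)

# The example exchange from the slide, applied to a hypothetical project 5:
model = DynamicDataModel()
model.record_feedback("MC/DC", helpful=True)
model.update_attributes(5, {"Num. Boolean Expressions": 219,
                            "Num. Numeric Calculations": 73,
                            "Ratio of Boolean to Numeric Calculations": "3:1"})
model.delete_projects([1, 3, 7])
```

In a real system, automated AI techniques would sit alongside these user-driven updates, deciding when un-updated records should be aged out rather than waiting for explicit deletion requests.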
Why should we enhance our data?
These dynamic data models allow:
● Low start-up costs.
● A body of evidence that grows over time.
● Mitigation of data quality issues.
● Human-in-the-loop feedback.
Challenge 1:
How do we collect feedback?
Challenge 2:
How do we use feedback?
Fundamental trade-off between human curation
and automated AI learning.
When should attributes be filtered? When should
stale data be phased out? When should new
data be added?
Challenge 3:
Motivating Users
How do we motivate users to:
● Provide feedback.
● Add new data.
● Update old data.
Motivation requires:
1. Incentive.
2. Ease of use/contribution.
3. Utility from and trust in the model.
We propose feedback-driven dynamic
data models maintained by a synergy of
user feedback and automated AI techniques.
Such dynamic data will allow for
low start-up costs, a stronger body of
evidence over time, and adaptation to
changing industrial conditions.
For discussion...
1. Is this even a good idea?
2. What can we do to solve data quality
issues? (other than just the idea suggested
here)
3. What kind of data would benefit from
dynamic adaptation?
4. How do we motivate users to provide
feedback, add new data, and update old data?