  • 1. Community-Assisted Software Engineering Decision Making. Gregory Gay and Mats Heimdahl, University of Minnesota
  • 2. AI in SE: A Success Story. A large, active field, with:
    ● A growing research community
    ● Numerous conferences and workshops, such as MSR, PROMISE, and RAISE
    ● Large data repositories
    ● A history of collaboration between industry and academia
  • 3. We're already good at drawing useful conclusions, and we expect further algorithmic improvements. But... we need to improve our data!
  • 4. Problem 1: We don't know what data we need. We are trying to solve complex problems: we make guesses, then collect data. This results in missing attributes and added noise.
  • 5. Problem 2: The data we have is often weak. Solution quality depends on data quality, and some commonly used data sets are infamous for missing values, unhelpful attributes, and poor recording standards.
  • 6. We should improve data standards, but... we need to use the data we have. A synergy of human feedback and AI can turn static data models into dynamic models: bring a Wikipedia model to data sets.
  • 7. Inspiration: Recommender Systems
  • 8. Enhanced Feedback Loop
    Recommendation: MC/DC. Helpful? Yes.
    ● New values for existing attributes: Num. Boolean Expressions: 219; Num. Numeric Calculations: 73
    ● New attributes to collect (and values): Ratio of Boolean to Numeric Calculations: 3:1
    ● Data to delete: Projects 1, 3, 7
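The feedback loop on this slide could be sketched as a single update step over a data model. The field names and values (MC/DC, the attribute counts, the deleted projects) come from the slide itself; the `DataModel` class, its method, and the initial attribute values are hypothetical illustrations, not part of any proposed API.

```python
# Minimal sketch of applying one round of user feedback to a dynamic
# data model, as in the Enhanced Feedback Loop slide. The DataModel
# class and its starting values are hypothetical.

class DataModel:
    def __init__(self, attributes, projects):
        self.attributes = dict(attributes)  # attribute name -> value
        self.projects = set(projects)       # project ids backing the model

    def apply_feedback(self, feedback):
        # 1. Correct values for attributes we already track.
        self.attributes.update(feedback.get("updated_values", {}))
        # 2. Start tracking newly suggested attributes.
        self.attributes.update(feedback.get("new_attributes", {}))
        # 3. Drop data points the community flagged for deletion.
        self.projects -= set(feedback.get("delete_projects", []))

model = DataModel(
    attributes={"num_boolean_expressions": 200,
                "num_numeric_calculations": 70},
    projects=range(1, 11),
)

# Feedback gathered after the model recommended the MC/DC criterion.
feedback = {
    "recommendation": "MC/DC",
    "helpful": True,
    "updated_values": {"num_boolean_expressions": 219,
                       "num_numeric_calculations": 73},
    "new_attributes": {"bool_to_numeric_ratio": "3:1"},
    "delete_projects": [1, 3, 7],
}
model.apply_feedback(feedback)
```

The point of the sketch is only that each piece of feedback on the slide (corrected values, new attributes, deletions) maps to a mechanical update on the stored data, which is what makes the model "dynamic" rather than static.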
  • 9. Why should we enhance our data? These dynamic data models allow:
    ● Low start-up costs
    ● Building a body of evidence over time
    ● Addressing data quality issues
    ● Human-in-the-loop feedback
  • 10. Challenge 1: How do we collect feedback?
  • 11. Challenge 2: How do we use feedback? There is a fundamental trade-off between human curation and automated AI learning. When should attributes be filtered? When should un-updated data be phased out? When should new data be added?
  • 12. Challenge 3: Motivating Users. How do we motivate users to:
    ● Provide feedback
    ● Add new data
    ● Update old data
  • 13. Motivation requires:
    1. Incentive.
    2. Ease of use and contribution.
    3. Utility from, and trust in, the model.
  • 14. We propose feedback-driven dynamic data models maintained by a synergy of user feedback and automated AI techniques. We propose that dynamic data will allow for low start-up costs, a stronger body of evidence over time, and adaptation to changing industrial conditions.
  • 15. For discussion...
    1. Is this even a good idea?
    2. What can we do to solve data quality issues, other than the idea suggested here?
    3. What kind of data would benefit from dynamic adaptation?
    4. How do we motivate users to provide feedback, add new data, and update old data?