Can Development Work Describe Itself?

MSR 2010, Cape Town,
South Africa, May 2010

Walid Maalej, Technische Universität München 
Hans‐Jörg Happel, FZI Research Center Karlsruhe

Executive Summary

1  Informal notes that describe developers' work contain well-defined semantics, granularity levels, and information patterns

2  To a large extent, work descriptions can be generated by observing the work context of developers and their interactions

Grounded Theory on Work Descriptions

© W. Maalej, May 2010  Analyzing Work Description – MSR 2010

Outline

1  Motivation

2  Research Setting

3  Research Results

4  Work Description Automation


What Are Work Descriptions?

Artifacts including work descriptions:
•  Personal note
•  Commit message
•  Comments
•  Social media
•  Time sheet

A work description is an informal text written by a
knowledge worker to summarize achievements and
other notable issues of a particular work session


Previous Studies Showed Interesting Properties
of Work Descriptions

Effort and Quality Issues
•  5% of developers' time is spent describing work (30 min. per day)
•  10% of the sessions have pseudo descriptions (either no time or no motivation)

Regularities in Content and Metadata
•  The overall vocabulary usage seems to be predictable
•  The vocabulary size is rather small
•  Different projects have similar ranking of terms

To what extent can developers' work descriptions be automated?


Research Questions

Content of Work Descriptions: the semantics of information included in work descriptions

Information Entities: text fragments with similar semantics
•  Occurrences: Which information entities are included and how often?
•  Combinations: How are these entities combined?
•  Preferences: Do certain developers prefer certain information?

Information Granularity: the levels of detail included (abstraction levels)
•  Levels: What are the granularity levels?
•  Causes: Which properties affect the granularity?


Data Sets Collected in Different Contexts

Data set  Summary                                                   Period       Developers  Entries
MyComp    Developers' personal notes at a German software company   2001 – 2009  25          38,005
Apache    Commit messages and code comments of all Apache projects  2001 – 2009  1,145       598,418
Unicase   Commit messages and code comments of the unicase project  2008 – 2009  18          5,097
Eureka    Personal notes in an observational study at 5 companies   2008         21          91


The Data Analysis Process
 


Information Entities and Their Usage Frequencies

Occurrences (%)

Entity      Average  Apache  MyComp  Unicase  Eureka
Activity    71       69      76      71       67
Artifact    55       60      53      49       58
Problem     47       47      47      49       45
Rationale   28       30      29      25       31
New Work    24       24      20      28       22
Status      19       24      20      17       15
Reference   15       15      19      17       10
Solution    15       19      15      16       11
Experience  10       11      6       9        13


Findings on Information Entities

1  The majority of information on performed activities (82%) is combined with concerned artifacts

2  Information on problems is used to describe work done, work that needs to be done, and the context of experiences

3  The combination patterns show that sharing knowledge and managing work are two goals of work descriptions

4  There are two clusters of developers: those who prefer to use artifacts and those who prefer to use problems to describe work


Granularity Levels and Usage Frequencies

Occurrences (%)

Granularity  Level           Average  Apache  MyComp  Unicase  Eureka
Domain       Implementation  54       58      37      62       60
Domain       Project         31       29      34      29       30
Domain       Requirement     12       10      26      6        7
Object       Method          33       33      49      28       20
Object       Class           29       29      25      31       32
Object       Line            17       17      8       17       27
Object       Component       15       14      16      19       10
Activity     Edit            53       55      41      57       60
Activity     SE Process      36       34      42      30       39
Activity     Knowledge       12       13      15      11       9


Findings on Information Granularity

1  The majority of work descriptions (62%) include information from a single granularity level

2  Developers think consistently (in a single abstraction level) when taking notes about artifacts

3  The shorter the session, the more fine-grained the described artifacts

4  Levels of activity granularity overlap (edit, process, and knowledge)


Two Main Enablers for Automating Work
Descriptions

•  Shared semantics of developers' working context, i.e. activities, artifacts, and problems
•  Heuristics derived from empirical findings on developers' behavior


Shared Semantics to Annotate Context:
Developers' Interactions


Shared Semantics to Annotate Context:
Developers' Artifacts


Heuristics to Generate Work Descriptions

Four factors to generate a work description:

Relevant vs. Irrelevant Context
•  Only a subset of artifacts concerned by the interactions is included in the description
•  Useful metrics are accumulated usage duration, usage age, and usage frequency

Appropriate Granularity
•  Guess the appropriate level of detail

Problem-Solution States
•  Detect if a developer is encountering a problem, searching for a solution, or applying a solution
•  Indicators are error messages, breakpoint usage, searches, or usage of particular keywords

Developers' Preferences
•  Learn from previous behavior of developers and which information they describe in which situation
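
A minimal sketch of two of these heuristics, under stated assumptions: the slides name the metrics (accumulated usage duration, usage age, usage frequency) and the indicators (error messages, breakpoint usage, searches), but not how they are combined. The `ArtifactUsage` schema, the weights, the threshold, the event names, and the "none" default state below are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class ArtifactUsage:
    """Aggregated interaction data for one artifact (hypothetical schema)."""
    name: str
    duration_s: float  # accumulated usage duration in seconds
    age_s: float       # seconds since the artifact was last used
    frequency: int     # number of separate uses in the session


def relevance(u, session_s, w_dur=0.5, w_age=0.3, w_freq=0.2):
    """Combine the three metrics into a score in [0, 1]; weights are assumed."""
    dur = min(u.duration_s / session_s, 1.0)
    recency = 1.0 - min(u.age_s / session_s, 1.0)  # recently used -> close to 1
    freq = min(u.frequency / 10.0, 1.0)            # assumed cap: 10 uses
    return w_dur * dur + w_age * recency + w_freq * freq


def relevant_artifacts(usages, session_s, threshold=0.4):
    """Keep only artifacts whose score reaches the (assumed) threshold."""
    scored = [(relevance(u, session_s), u.name) for u in usages]
    return [name for score, name in sorted(scored, reverse=True)
            if score >= threshold]


def detect_state(events):
    """Tiny state machine over interaction events (event names are assumed)."""
    state = "none"  # assumed default: no problem observed yet
    for ev in events:
        if ev in ("error_message", "breakpoint"):
            state = "encountering a problem"
        elif ev == "search" and state != "none":
            state = "searching for a solution"
        elif ev == "edit" and state == "searching for a solution":
            state = "applying a solution"
    return state


# Example session of one hour (3600 s) with three touched artifacts.
usages = [
    ArtifactUsage("Parser.java", duration_s=1800, age_s=60, frequency=12),
    ArtifactUsage("README.txt", duration_s=30, age_s=3300, frequency=1),
    ArtifactUsage("ParserTest.java", duration_s=900, age_s=300, frequency=5),
]
relevant = relevant_artifacts(usages, session_s=3600)
state = detect_state(["edit", "error_message", "search", "edit"])
```

In this example the briefly opened README.txt falls below the threshold and would be left out of a generated description. In practice the weights and threshold would have to be calibrated against real interaction data.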


Summary of the Talk

Information Entities
•  Information on activities, artifacts, problems, new work, and status is included for work management
•  Information on solutions, rationale, and experience is included to capture and share knowledge

Information Granularity
•  There are different levels of domain, object, and activity granularity
•  These are used consistently and with common patterns

Developers' Preference
•  Developers either think problem-centered or artifact-centered when describing their work
•  They use well-defined information patterns such as <activity concerns artifacts>

•  Most information entities can be created automatically by observing the developer's context
•  For that we propose a set of ontologies and heuristics to be used
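
As an illustration of how the <activity concerns artifacts> pattern could be filled from observed context, here is a small sketch. The function name `describe`, its parameters, and the output wording are assumptions for illustration, not the authors' implementation.

```python
def describe(activity, artifacts):
    """Fill the <activity concerns artifacts> pattern with observed values."""
    if not artifacts:
        raise ValueError("the pattern needs at least one artifact")
    if len(artifacts) == 1:
        joined = artifacts[0]
    else:
        # Join as "a, b and c" so the generated note reads like a manual one.
        joined = ", ".join(artifacts[:-1]) + " and " + artifacts[-1]
    return f"{activity} {joined}"


note = describe("Refactored", ["Parser.java", "ParserTest.java"])
```

The activity and the artifact list would come from the annotated interaction context; here they are passed in by hand.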


Feedback, Questions, Suggestions,
and Collaboration Are Welcome!

Walid Maalej, TUM, maalejw@cs.tum.edu
Hans-Jörg Happel, FZI, happel@fzi.de

