On Developers’ Personality in Large-scale Distributed Projects: The Case of the Apache Ecosystem
1. On Developers’ Personality in
Large-scale Distributed Projects:
The Case of the Apache Ecosystem
Fabio Calefato
Giuseppe Iaffaldano
Filippo Lanubile
University of Bari, Italy
Bogdan Vasilescu
CMU, USA
ICGSE 2018
29 May 2018 - Gothenburg, Sweden
2. Personality
“A dynamic and organized set of traits
that create the unique patterns of
behaviors, thoughts, and feelings of a
person"
[1] Ryckman, R. (2004). Theories of personality.
6. Findings from previous
research in psycholinguistics
• Strong connections between language use and
personality
• Personality traits can be successfully derived from
the analysis of written text (e.g., emails,…) [1, 2]
• Every trait in the Big-Five model strongly and
significantly associated with theoretically-
appropriate patterns of word use [3]
[1] Hirsh & Peterson, (2009) “Personality and language use in self-narratives”
[2] Shen et al. (2013) “Understanding email writers: personality prediction from email messages”
[3] Tausczik & Pennebaker (2010) “The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods”
7. Psychometric analysis tools
• Linguistic Inquiry and Word Count (LIWC - "Luke")
– Text classifier, counts the % of words falling into 80+ categories
(linguistic, psychological, topical)
– Used to demonstrate strong associations between linguistic
patterns and personality or psychological state
• IBM Personality Insights
– Web service to perform linguistic analytics to infer
individuals' personality from written communication
– Builds on Big 5 and LIWC
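The word-counting idea behind LIWC can be illustrated with a minimal sketch. The dictionary below is hypothetical and tiny; the real LIWC lexicon is proprietary and covers 80+ categories, and the function name `category_percentages` is an assumption made for this example only.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary, NOT the real LIWC lexicon.
CATEGORIES = {
    "positive_emotion": {"thanks", "great", "good", "happy"},
    "negative_emotion": {"bad", "broken", "worried"},
    "first_person": {"i", "me", "my"},
}


def category_percentages(text):
    """Return the percentage of words in `text` that fall into each category."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for name, vocabulary in CATEGORIES.items():
            if word in vocabulary:
                counts[name] += 1
    total = len(words) or 1  # avoid division by zero on empty input
    return {name: 100.0 * counts[name] / total for name in CATEGORIES}


email = "Thanks, the patch looks great. I am worried my build is broken."
print(category_percentages(email))
```

Tools such as LIWC report per-category percentages of this kind, which are then related to personality traits.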
8. Personality types of Apache developers
github.com/collab-uniba/personality
176 projects
mbox archives
106 mailing lists
~1.35M emails
~38K senders
git clones
56 projects
~206K commits
~5K developers
alias unmasking
unique email addresses
unique login / email addresses
~46K developer UIDs
Big Five (OCEAN) scores: Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism
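The slide does not say how alias unmasking was implemented. As a rough illustration only, one common approach is to merge identities that share an email address or a normalized name using union-find. The heuristic and the function name `unmask_aliases` are assumptions, not the authors' method.

```python
def unmask_aliases(identities):
    """Group (name, email) pairs that share a normalized name or an email.

    Returns a dict mapping each identity to an integer developer UID.
    This matching rule is a simplifying assumption for illustration.
    """
    parent = list(range(len(identities)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    seen = {}  # (kind, value) -> index of the first identity carrying it
    for idx, (name, email) in enumerate(identities):
        keys = (
            ("name", " ".join(name.lower().split())),
            ("email", email.lower()),
        )
        for key in keys:
            if key in seen:
                union(idx, seen[key])
            else:
                seen[key] = idx

    roots = sorted({find(i) for i in range(len(identities))})
    uid_of_root = {root: uid for uid, root in enumerate(roots)}
    return {identities[i]: uid_of_root[find(i)] for i in range(len(identities))}


people = [
    ("Jane Doe", "jane@apache.org"),
    ("jane doe", "jdoe@example.com"),
    ("J. Doe", "jdoe@example.com"),
    ("Bob Roe", "bob@example.com"),
]
print(unmask_aliases(people))
```

Real alias resolution usually needs fuzzier matching and manual checks; exact name matching like this can wrongly merge two different people with the same name.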
9. 5 Research questions
Does developers’ personality…
1. change over time?
2. vary between core vs. peripheral developers?
3. vary with the degree of development activity?
4. change after becoming a core member?
And finally...
5. What personality traits are associated with the
likelihood of becoming a contributor?
10. RQ1:
change over time
Our study
• Compared Big5
personality scores over
multiple years
• Apache developers
evolve as
– More open
– More agreeable
– More neurotic
Related work
• 400 active GitHub
developers evolve as
– More conscientious and
extroverted
– Less agreeable
11. RQ2:
core vs. peripheral developers
Our study
• Paired comparisons
between core and
peripheral developers
– Peripheral = devs w/o write
access
– Core = devs w/ write access
to repo
• No differences observed
Related work
• The 2 core developers
responsible for the 2007
releases more extroverted
and open than average
12. RQ3:
change with degree of contribution
Our study
• Refinement of RQ2
• Paired comparisons
between
– High vs. low commit authors
(peripheral)
– High vs. Low commit
integrators (core)
• No differences observed
Related work
• High contributors are
– more open, conscientious,
extraverted, and neurotic
– less agreeable
13. RQ4:
change with project membership
• Refinement of RQ2
• Devs’ personality scores grouped by before vs. after
gaining write access to project repos (i.e., becoming a
core project member)
• No differences found
Wilcoxon Signed-Rank test
Paired groups: Before vs After becoming core-team members

Trait               V    p-value   Cliff's delta
Openness            70   0.5       0.07
Conscientiousness   90   0.2       0.15
Extroversion        50   0.5       -0.12
Agreeableness       50   0.5       -0.16
Neuroticism         70   0.5       0.18
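For readers unfamiliar with the two statistics in the table, here is a minimal standard-library sketch of how they are commonly computed. It assumes V is the sum of ranks of the positive paired differences (the convention used by R's wilcox.test) and omits the p-value. The function names and the sample data in the test cases are illustrative, not taken from the study.

```python
def wilcoxon_v(before, after):
    """Sum of the ranks of positive differences (before - after).

    Zero differences are dropped; tied absolute differences share
    the average of the ranks they span.
    """
    diffs = [b - a for b, a in zip(before, after) if b != a]
    ordered = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(ordered):
        end = pos
        # Extend over ties in |difference| so they get the average rank.
        while (end + 1 < len(ordered)
               and abs(diffs[ordered[end + 1]]) == abs(diffs[ordered[pos]])):
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[ordered[k]] = average_rank
        pos = end + 1
    return sum(rank for rank, diff in zip(ranks, diffs) if diff > 0)


def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all cross pairs, in [-1, 1]."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))
```

Values of Cliff's delta near 0, like those in the table, mean the before and after score distributions overlap almost completely.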
14. RQ5:
Likelihood of becoming contributor
– Controls: word count + project size, age, and category
not significant
– Model performance: R2=.14, AUC=.82
Logistic regression
model
Openness
Conscientiousness
Agreeableness
Extraversion
Neuroticism
+
+++
15. Open questions
• How does devs’ personality mesh with
– Culture
(geographical bias?)
– Context
(projects, ecosystems, different tasks?)
• Findings differ from prior work
– Despite relying on the same personality theory
(assess the reliability of psychometric tools)
16. One answer
• Do we need psychologists to help us conduct GSE
research?
– I do believe so!
17. On Developers’ Personality in
Large-scale Distributed Projects:
The Case of the Apache Ecosystem
THANKS!
Fabio Calefato @fcalefato
Package: github.com/collab-uniba/personality
Pre-print: tiny.cc/icgse18
Editor's Notes
Psychology research offers many definitions of personality. Here is the one I like most among those used in studies related to software engineering: the one by Ryckman, which defines personality as...
Software projects are the result of collective efforts by multiple developers; the larger a project, the more people are involved, each with their own personality.
We argue that studying, and hence understanding, their personalities has the potential to explain some of the stakeholders' behaviors in various contexts (how they affect task execution and how they collaborate with each other), and in turn some intricacies of software development.
We're not the first to argue this, though: the subtle effects of personality have been studied for more than 40 years now. And yet there is no clear understanding of such effects on software development, whether distributed or co-located; there does not seem to be a baseline or acknowledged "truth" to build on.
One reason the road ahead on this research path is long and difficult is that many personality frameworks have been used in the studies so far:
Myers-Briggs Type Indicator (MBTI)
Keirsey Temperament Sorter (KTS)
Minnesota Multiphasic Personality Inventory (MMPI)
These are inventories of hundreds and hundreds of items; it is unrealistic to imagine administering them to open source devs, or even to paid devs from companies, since a profile takes hours over multiple days.
…
Big 5 Theory (Five-Factor Model)
Simplicity made it “actionable”
Describe the 5 traits
O
C
E
A
N
Extracting personality from written communication
Previous research in psycholinguistics has confirmed strong connections between language use and personality, and that personality traits can be successfully derived from the analysis of written text [13] such as emails [14].
In fact, Tausczik & Pennebaker [15] found that every trait in the Big-Five model is strongly and significantly associated with theoretically-appropriate patterns of word use.
Quantitative analysis of personalities of developers from communication traces
Mined ecosystem-level data from the Apache Software Foundation (ASF) projects
Apache SF project directory retrieved:
Project name
Status (active, incubating, retired)
Development language
Category
Repository URI (git, svn)
Mailing-list URIs (dev, user)
RQs borrowed from prior work that used LIWC and big5 theory
With project tenure, developers in the Apache ecosystem grow more open and agreeable (more cooperative) and more neurotic (more prone to worry and stress, but also more self-conscious)
They used LIWC and Big5
Our other findings also show a sort of ‘stability’ in developers’ personality traits, which are not affected by project membership
Agreeableness and openness => cooperative
Agreeableness: 61% of deviance in data explained
Deviance (a measure of the goodness of fit of the model; % of deviance explained is an effect size measure)
R-squared is low: the model does not explain all the variance in the data. However, the high AUC indicates that the model has quite high predictive power and can accurately predict whether an Apache developer will become a contributor or not (i.e., have 1+ contributions merged into a project repo).
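To make the AUC figure concrete: AUC can be read as the probability that a randomly chosen contributor receives a higher predicted score than a randomly chosen non-contributor, with ties counting half. A minimal sketch of that pairwise definition follows. The function name `auc` and the toy data are illustrative only and are not the study's model or data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` holds 1 for positives (e.g., became a contributor) and 0
    for negatives; `scores` holds the model's predicted probabilities.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


print(auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking. A model can rank well even when R-squared is low.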
Culture -> increase relevance for global sw eng
Context -> make research more actionable
Paid vs volunteers? Do they exhibit different personality profiles?
We found evidence that developers evolve as more open, agreeable, and neurotic. In other words, they tend to become, respectively: (i) more capable of expressing their feelings; (ii) more cooperative, altruistic, and prone to trust others; (iii) more prone to worry, self-conscious, and susceptible to stress. Rastogi & Nagappan [26] found that GitHub developers’ personality changed over time, too. However, in contrast with our results, they found developers to evolve as more conscientious and extroverted, and less agreeable over two consecutive years. While further investigations are needed to explain these differences, we note these claims are made despite some negligible to small effect sizes (Cohen’s d) observed.