Affective Trust as a Predictor of Successful Collaboration in Distributed Software Projects
1. Affective Trust as a Predictor of
Successful Collaboration in
Distributed Software Projects
Fabio Calefato, Filippo Lanubile
University of Bari, Italy
SEmotion @ ICSE’16
May 17, 2016 – Austin, TX, USA
2. A Bit of Theory on Trust
INTEGRITY
The adherence to intrinsic moral
norms which makes a trustee
reliable
BENEVOLENCE
The perceived level of
courtesy and positive
attitude
ABILITY
Capability of a trustee
(based on knowledge,
competence, skills) to
perform tasks within a
specific domain
PREDICTABILITY
The degree to which a person is
liable and accountable and meets
the expectation of another
person
Cognitive Trust
Affective Trust
Trustee’s
antecedents to trust
Trustor’s
antecedent to trust
PROPENSITY TO TRUST
A general, not experience-based inclination to
display faith and adopt a trusting attitude
toward others
3. Why We Care About Trust
• Trust fundamental to build “sense of teamness”
• Lack of trust can mean:
– Unwillingness to cooperate or share information
– Perception of being on separate teams
– Decreased goodwill towards others
• The detrimental effects of lack of trust indirectly
affect project performance (costs, speed, …)
– Especially in case of distributed collaboration (no
F2F interaction)
Al-Ani et al, “Trust in Virtual Teams: Theory and Tools.” CSCW (2013)
Bradner, & Mark, “Why distance matters: effects on cooperation, persuasion and deception.” CSCW (2002)
Sabherwal, “The role of trust in outsourced IS development projects.” CACM (1999)
Wilson et al, “All in due time: The development of trust in computer-mediated and face-to-face groups.” (2006)
4. Building Trust in Distributed
Software Teams
Previous works
• We proposed SocialCDE, a
socio-technical system that
links Sw artifacts to personal
info within the workspace
• Others observed socially-
oriented communication
finding its way in OSS
projects’ communication
channels:
– Wang & Redmiles
• Cheap-talk on IRC
channels
– Guzzi et al.
• Social interactions
occurring in mailing
lists
Formal
Communication
≈
≈
Socially-oriented
Communication
Remote
Conferencing
Distance
Calefato & Lanubile, “Augmenting social awareness in a collaborative development environment” CHASE (2012)
Wang & Redmiles, “Cheap talk, cooperation, and trust in global software engineering.” ESE (2015)
Guzzi et al. “Communication in open source software development mailing lists.” MSR (2013)
???
Build
Affective Trust
5. What We Propose
• No previous study has provided evidence
connecting affective trust to project
performance
– Project performance ≈ Successful collaborations
– Successful collaboration ≈ Pull Request (PR)
• Previous studies have typically relied on self-
reported data (questionnaires) to “measure”
trust
– Affective trust ≈ Amount of Affective lexicon in PRs
– Affect.Trust(Dev_i, Dev_j, PR_x) ≈ ∑_i Affect.Lexicon(PR_i)
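The proxy on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny `AFFECTIVE_LEXICON`, the `prs` data layout, and the reading of the sum as running over the PRs that precede the current one are all hypothetical and not taken from the study.

```python
import re

# Hypothetical, tiny affective lexicon; a real study would use a validated resource.
AFFECTIVE_LEXICON = {"thanks", "thank", "great", "awesome", "sorry", "appreciate", "nice"}

def affect_lexicon(comment):
    """Count affective-lexicon tokens in one PR comment."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    return sum(1 for token in tokens if token in AFFECTIVE_LEXICON)

def affective_trust(prs, dev_a, dev_b, current_pr):
    """Sum the affective lexicon exchanged between two developers in PRs before current_pr.

    prs is a list of dicts with a 'number' and a list of
    (author, addressee, text) comment tuples (an assumed layout).
    """
    pair = {dev_a, dev_b}
    total = 0
    for pr in prs:
        if pr["number"] >= current_pr:
            continue  # only prior PRs contribute
        for author, addressee, text in pr["comments"]:
            if {author, addressee} == pair:
                total += affect_lexicon(text)
    return total

# Made-up example data
prs = [
    {"number": 1, "comments": [("alice", "bob", "Thanks for the patch, great work!"),
                               ("bob", "alice", "Sorry for the delay.")]},
    {"number": 2, "comments": [("alice", "bob", "LGTM"),
                               ("carol", "bob", "Awesome, thanks!")]},
    {"number": 3, "comments": [("bob", "alice", "Thank you, I appreciate it")]},
]

print(affective_trust(prs, "alice", "bob", 3))  # 3
```

Counting raw tokens keeps the sketch simple; a length-normalized count would be an equally plausible reading of "amount of affective lexicon".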
6. Research Model &
Hypotheses
• H1 – In distributed projects, the amount of affective lexicon in
PR comments decreases over time, as affective trust
mutually develops between developers
• H2 – In distributed projects, the larger the amount of affective
lexicon exchanged in prior PR comments between the
contributor and the integration manager, the larger the
chances for the current PR to be accepted
H1 H2 Successful
collaborations
Affective trust
antecedents
Affective
trust develops
over time
Social
communication
between devs
Treinen & Miller-Frost
(IBM Syst. J, 2006)
Jarvenpaa & Leidner
(J Org. Sci., 1999)
Wang & Redmiles
(ESE J, 2015)
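One descriptive way to look at H1 for a single developer pair is the least-squares slope of affective-lexicon counts over their sequence of PRs. The sketch below is an assumption about how such a check could be done, not the analysis planned in the study; the `counts` values are made up, and a negative slope for one pair is not a statistical test of the hypothesis.

```python
def ols_slope(values):
    """Least-squares slope of values against their positions 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator

# Hypothetical affective-lexicon counts in five consecutive PRs of one developer pair
counts = [6, 5, 3, 2, 1]
slope = ols_slope(counts)

# A negative slope is consistent with H1 (affective lexicon decreasing over time)
print(slope < 0)  # True
```

H2 would need a model relating prior affective lexicon to PR acceptance while controlling for confounders such as project type and individual skills, which this sketch does not attempt.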
7. Sentiment Analysis in Software
Engineering
Timeline (2011, 2013, 2014, 2015):
• MSR ’15: Mining Successful Answers in Stack Overflow (Calefato et al.)
• PeerJ ’13: Happy software developers solve problems better (Graziotin et al.)
• ESEC/FSE ’13: Towards emotional awareness in software development teams (Guzman and Bruegge)
• ICSE ’15: Stuck and Frustrated or In Flow and Happy: Sensing Developers’ Emotions and Progress (Müller and Fritz)
• MSR ’14: Sentiment analysis of commit comments in GitHub: An empirical study (Guzman et al.)
• RE ’14: How Do Users Like this Feature? A Fine Grained Sentiment Analysis of App Reviews (Guzman and Maalej)
• Security and emotion: sentiment analysis of security discussions on GitHub (Pletea et al.)
• CHASE ’15: Exploring Causes of Frustration for Software Developers (Ford and Parnin)
• Towards emotion-based collaborative software engineering (Dewan)
• Do developers feel emotions? (Murgia et al.)
• CT&W ’11: Do moods affect programmers’ debug performance? (Khan et al.)
• XP ’15: Would you mind fixing this issue? (Ortu et al.)
8. Sentiment Analysis:
Tools
• Several tools available
• Classification of a text according to its positive,
negative or neutral semantic orientation
– NLTK
• Outputs probability for each orientation
• Trained on tweets and movie reviews
– Stanford Sentiment Analyzer
• Issues an overall orientation label + representation of the
sentence structure
• Trained on movie reviews
– SentiStrength
• Outputs a score for both positive and negative sentiment
orientation
• Designed for and validated on general purpose social media
NLTK, http://text-processing.com
Stanford Sentiment Analyser, http://nlp.stanford.edu/sentiment
SentiStrength, http://sentistrength.wlv.ac.uk
9. Domain-dependency in
Sentiment Analysis
• Sentiment-analysis
tools reliable only
when used on the
same corpora they are
trained on
– High risk of emotion
misclassification
• No publicly available
analyzer trained on
Software Engineering
corpora yet
• False positives in
negative sentiment
detection
– Domain lexicon
‘What is the best way
to kill a critical
process’
– Contextual semantics
‘I am missing a
parenthesis. But where?’
– Context of interaction
(Q&A)
‘I have a problem, […]
please explain what is
wrong’
Novielli et al. “The Challenges of Sentiment Detection
in the Social Programmer Ecosystem.” SSE (2015)
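The domain-lexicon problem can be reproduced with a toy lexicon-based detector. Everything in this sketch is hypothetical: `GENERAL_NEGATIVE` stands in for a general-purpose negative lexicon, and `SE_NEUTRAL` is an illustrative list of words treated as neutral jargon in software engineering; neither comes from any of the tools named in these slides.

```python
import re

# Hypothetical general-purpose negative lexicon (illustrative only)
GENERAL_NEGATIVE = {"kill", "critical", "missing", "hate", "awful"}

# Hypothetical list of words that are neutral jargon in software engineering
SE_NEUTRAL = {"kill", "critical", "missing"}

def negative_hits(text, domain_neutral=frozenset()):
    """Return the tokens flagged as negative, skipping domain-neutral jargon."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t in GENERAL_NEGATIVE and t not in domain_neutral]

example = "What is the best way to kill a critical process"

print(negative_hits(example))              # ['kill', 'critical'] -> false positive
print(negative_hits(example, SE_NEUTRAL))  # []
print(negative_hits("I hate this awful API", SE_NEUTRAL))  # ['hate', 'awful']
```

A stop-list like this only addresses the domain-lexicon case. Contextual semantics and the context of interaction need more than word filtering, which is one reason to train an analyzer on Software Engineering text.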
10. Developers’ Interaction
in Pull Requests
• The LGTM (“Looks
Good To Me”)
situation
• Building affective
trust through
interactions
– How many PRs
before enough trust
is established?
– How many PRs
before a change in
affective language
is observed?
11. Current & Future Work
Gold Standard &
Sentiment Analyzer for SE
Preliminary Qualitative
Analysis
• Exploring PRs in GitHub,
seeking explicative cases of
affective-language shift
– Picked from a dataset of
“highly discussed” PRs
Sentiment
Gold standard
Train a sentiment
analyzer for SE
(technical text)
Dump
May ‘15
Preprocessing
(removed code, urls..)
Manual
annotation
(multiple raters)
Tsay et al. “Let's talk about it: evaluating contributions
through discussion in GitHub.” FSE (2014)
10+M
questions
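The preprocessing step (removing code and URLs) might look like the sketch below. It assumes that post bodies arrive as HTML with code wrapped in `<pre>` or `<code>` tags; the regular expressions and the `preprocess` helper are illustrative assumptions, not the pipeline actually used.

```python
import re

# Assumed input: HTML post bodies with code inside <pre> or <code> elements
CODE_BLOCK = re.compile(r"<pre>.*?</pre>|<code>.*?</code>", re.DOTALL | re.IGNORECASE)
URL = re.compile(r"https?://[^\s<>\"]+")
HTML_TAG = re.compile(r"<[^>]+>")

def preprocess(post_html):
    """Strip code snippets, URLs, and remaining HTML tags, then normalize whitespace."""
    text = CODE_BLOCK.sub(" ", post_html)
    text = URL.sub(" ", text)
    text = HTML_TAG.sub(" ", text)
    return " ".join(text.split())

sample = ("<p>I get an error, see http://example.com/x for details.</p>"
          "<pre><code>int x = 1;</code></pre><p>Please help!</p>")

print(preprocess(sample))  # I get an error, see for details. Please help!
```

Code is removed before tags are stripped so that snippet contents never reach the annotators or the classifier.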
Here we present the preparatory design of an empirical study aimed at assessing whether affective trust established between developers is a predictor of successful collaboration in distributed projects.
Trust is a complex matter to study since it involves both interpersonal relationships (e.g., cultural issues between trustee and trustor) and facets of human behavior (e.g., personal traits). To date several definitions of trust have been given. A widely used and concise definition is provided by Jarvenpaa et al.
“The expectations of one (the trustor) that the others (the trustees) will behave as predicted” [1]
In other words, positive trust emerges when others’ actions meet our expectations; otherwise, negative trust, or mistrust, arises.
When we assess someone’s trustworthiness, we judge them along several dimensions, or personal characteristics, called antecedents of trust [8], that is, the properties of others that trigger the evaluation when one assesses their trustworthiness.
Therefore, sharing these properties/antecedents about someone fosters trust building, although this is also influenced by one’s intrinsic propensity to trust, a natural disposition towards trusting others in general.
Propensity to trust (i.e., a dispositional willingness to rely on others) is a general disposition to trust people in life. It is a consumer trait that is relatively stable. People differ in this trait: some are always distrusting others, whereas others believe that people can be trusted. Disposition to trust is a general, i.e., not situation-specific, inclination to display faith in humanity and to adopt a trusting stance toward others (Mayer et al. 1995; McKnight et al. 1998), and is perceived as a personality trait (Grabner-Kräuter et al., 2003). This tendency is not based upon experience with or knowledge of a specific trusted party (ref), but is the result of ongoing lifelong experience and socialization. As an antecedent of trust, disposition to trust is most effective in the initial phases of a relationship, when the parties are still mostly unfamiliar with each other (Mayer et al. 1995) and before extensive ongoing relationships provide the necessary background for the formation of other trust-building beliefs, such as integrity, benevolence and ability.
Ability (e.g., skills, knowledge), benevolence (e.g., courtesy, availability), integrity (e.g., honesty, fairness, loyalty, discretion), and predictability (e.g., reliability, consistent behaviors) are the personal characteristics of a trustee that facilitate the establishment of a trust relationship with a trustor. Integrity relates to a set of moral norms and trustee characteristics usually considered good; it is adherence to a set of principles (such as study/work habits) thought to make the trustee dependable and reliable, according to the trustor. More specifically, the ability and predictability dimensions are assessed by means of cognitive elaboration of personal and professional information. At the same time, affective-based appraisal leads to trust building along the dimensions of benevolence and integrity.
Other definitions of trust distinguish between cognitive (or rational) and affective (or social) perspectives.
Distinction between rational and social perspectives [2]
Affective trust
Trustees are pushed to meet the trustor’s expectations by emotional ties, concern, and care (a moral duty, a form of mutual care)
Cognitive trust
The trustor’s expectations are based on the trustees’ competence and reliability in performing important actions that he or she cannot do or even monitor
[1] Jarvenpaa et al. Is anybody out there? Antecedents of trust in global virtual teams. J. of Mgmt Inf. Systems, 14(4), 1998.
[2] Wilson et al. All in due time: The development of trust in computer-mediated and face-to-face teams, Organizational Behavior and Human Decision Processes, 99(1), 2006.
So what we are arguing here is that affective trust grows by exposing and accessing information that is typically shared through informal communication (which helps form social ties and connections).
Why is trust so important in SE and to SE researchers?
The amount of social interaction decreases over time, as trust is built, and conversations become more procedural and task-oriented.
Affective trust builds on F2F social interactions, for which we do not have “standard”, proven surrogate tools.
We proposed SocialCDE
Previous research has observed the exchange of social information through chat and mailing lists: it is so important that it finds its way into the communication channels used by OSS projects.
So, if affective trust is so important, we need to find evidence connecting trust, in particular AFFECTIVE trust, to performance.
Performance is intuitively understood as a software project that delivers a quality product while staying on budget and on schedule.
We approximate it with successful pull requests, that is, accepted PRs that cause the code base to grow by adding features and fixing bugs, moving one step closer to the project’s final goal.
Complement previous studies looking at interactions around code artifacts (PR comments)
One common limitation identified in prior empirical research findings is that there is no explicit measure of ‘how much’ improving trust contributes to a project’s performance.
Trust is difficult to measure; it is usually sensed through self-reported data (read: questionnaires).
We seek a more operational, quantitative measure: we approximate trust by means of the affective lexicon exchanged between a pair of developers.
Explain the paths of the research model and how the hypotheses were built
From Social Communication to Affective Trust
Jarvenpaa & Leidner observed trust evolution in global teams interacting only through computer-mediated communication [9]. The analysis indicated that teams with a low level of initial trust lacked social communication at the beginning of projects. Conversely, teams that had a high level of trust at the end of projects had an initial social focus in communication, which later diminished to make room for procedural and task-focused interactions. Consistently, we argue that disclosing personal and contextual information in the workspace can increase the feeling of similarity between distant teammates, thus increasing the amount of social communication, which manifests as a higher number of successful interactions, with new ties and stronger bonds established between members across sites. Finally, existing research by Jarvenpaa et al. [8] has shown that individuals’ perceived integrity, benevolence, and propensity to trust are the relevant antecedents of trust, in the sense that they facilitate affective trust building.
No interviews, like Russman’s TWAN schema
From Affective Trust to Project Performance intended as Successful collaboration
Treinen & Miller-Frost [13] observed that the development of mutual trust between distant sites at the beginning of a project was paramount. In fact, they observed that, during the early stage of a project, building personal knowledge about the team and mutual trust turned out to be more important than resolving technical issues, since trust would allow future issues to be resolved from afar (e.g., via conference call), thus resulting in increased overall efficiency. However, they did not distinguish between affective and cognitive trust.
As for testing H2, we acknowledge that establishing a cause/effect relationship between trust and project performance is a challenging task, as many other confounding factors (e.g., project type, individual skills) may interfere along the process.
Now that it is clear what our research goal is (assessing affective trust through socially-oriented communication), we can ask:
Are we going to find expressions of affective states (care, benevolence, etc.) when developers interact over the communication channels used in their projects? That is, do developers feel, and express, emotions?
Positive emotions – gratitude, appreciation, contentment (satisfaction)
Negative emotions – anger, sadness
Now that we know that developers do feel and report a variety of both positive and negative sentiments, the point is:
How do we analyze them, and are we sure that they matter? I.e., does emotion correlate with or affect performance in any way?
Previous research has already shown many ways to model and analyze emotions in the SE domain. Venues are quite diverse: it is a cross-domain problem, not just confined to our SEmotion niche.
Papers in green specifically mention relation between emotions and performance.
Do emotions affect individual performance?
Khan et al. 2011 - Do moods affect programmers’ debug performance? Cognition, Technology & Work
Positive mood improves debugging performance
Positive mood induced through physical exercises
Graziotin et al., 2013 - Happy software developers solve problems better: psychological measurements in empirical software engineering. Peer J
Emotion and cognition
Developers constantly face problem-solving tasks
Analysis and creativity skills are required
Emotions and moods deeply influence cognitive processing abilities and performance
Murgia et al., 2014 – Do developers feel emotions? An exploratory analysis of emotions in software artifacts. MSR 2014.
Emotions are associated with issue tracking activity
More comments and watchers for issues where Sadness is expressed
Higher fixing time when negative emotions are expressed
Ortu et al. 2015 – Would you mind fixing this issue?
Expressing politeness in help requests reduces fixing time
LOTS of ITALIAN researchers care about emotions in SE apparently :-)
How do we evaluate affective states in developers’ interactions?
Various alternatives, all widely used and validated, and yet all trained on non-technical domains and datasets
One problem with this study is…
As said, we are seeking evidence about PRs that complements what was found for IRC and mailing lists (sync and async communication), where enough socially-oriented communication is exchanged to let affective trust grow.
The other, say, «problem» with the study is that we have to consider that interactions on PRs are different from those happening in IRC channels and emails.
We are seeking confirmation of what was observed by Wang and Redmiles, Guzzi et al., and Treinen and Miller-Frost: that trust-building communication does happen (in IRC, mailing lists), especially at «the beginning», and that it slows down after enough trust is established, giving way to procedural, task-oriented communication.
So, how many PRs does it usually take before we observe a change in affective language?
If the amount of socially-oriented communication has been found to be higher in the beginning of collaboration and decrease over time:
How many PRs does it generally take before a sufficient amount of trust is established?
Also, how different are developers’ interactions within PR comments from what has been observed in mailing list and IRC chat messages?
An example of a successful collaboration: a PR that has been merged, with PR comments involving two senior project members of the Node.js project, who have clearly established enough overall trust, both cognitive and affective.
In conclusion, here is what we are already doing and are going to complete in this research effort:
- First, create a linguistic resource and a sentiment analyzer trained on technical text in the SE domain
We start from SO, a massive, ever-growing source of millions of questions covering diverse aspects of SE;
- A preliminary qualitative analysis of PRs from GitHub, seeking explicative cases that show a clear affective-language shift in the history of interactions between developer pairs
Retrieved a random subsample of PRs from a dataset containing “highly discussed” PRs in projects hosted on GitHub
Other than SO, in the future we might consider including other sources, such as the Stack Exchange platform, issue trackers like Jira, and GitHub itself