A Study of the Quality-Impacting Practices of Modern Code Review at Sony Mobile

Presented at ICSE 2015


  1. Junji Shimagaki, Yasutaka Kamei, Ahmed E. Hassan, Naoyasu Ubayashi, Shane McIntosh. A Study of the Quality-Impacting Practices of Modern Code Review at Sony Mobile
  2. Code review is an important software quality assurance practice. Diagram labels: Programmer, Code reviewer.
  3. Sony Mobile uses Gerrit Code Review tools: (1) commit message, (2) files under review.
  4. Sony Mobile uses Gerrit Code Review tools: (1) commit message, (2) files under review, (3) reviewers, (4) review scores.
  5. Code review context at Sony Mobile: “Code-Review” (e.g., syntax, grammar, logic).
  6. Code review context at Sony Mobile: “Code-Review” (e.g., syntax, grammar, logic). Test results on Application crashes? Reboot after 10 sec?
  7. Lax reviewing practices impact software quality (McIntosh et al., EMSE 2015). Diagram labels: Reviewer, Programmer, poorly reviewed code, code repository.
  8. Lax reviewing practices impact software quality (McIntosh et al., EMSE 2015). Diagram labels: Reviewer, Programmer, poorly reviewed code, code repository. How about at Sony Mobile?
  9. Approach. Quantitative study: replication of McIntosh et al., EMSE 2015. Qualitative study: developer surveys at Sony Mobile with 100+ people. Implications: better code review practices validated by stakeholders.
  10. Quick results: simple adaptation does not work! Review coverage: Un-review ratio ✗. Review participation: Self approval ✗, Discussion volume ✗.
  11. A software project for this. Target system: a smartphone product with a six-month release cycle. Sony's unique apps, HW.
  12. A software project for this. Target system: a smartphone product with a six-month release cycle. Sony's unique apps, HW; chipset and modem; Android OS. Strong dependencies on third-party systems.
  13. Why might Sony Mobile be different? Third-party dependencies; offline communication; embedded software development.
  14. Why might Sony Mobile be different? Third-party dependencies; offline communication; embedded software development. Previously studied systems are less impacted by third-party dependencies.
  15. Why might Sony Mobile be different? Third-party dependencies; offline communication; embedded software development. Previously studied systems rely on online communication methods.
  16. Why might Sony Mobile be different? Third-party dependencies; offline communication; embedded software development. Previously studied systems are applications at higher levels of the application stack.
  17. Do reviewing practices impact software quality? Review coverage: Un-review ratio ✗, Third-party ratio ✓. Components with a higher third-party codebase ratio are more defect-prone.
  18. Do reviewing practices impact software quality? Review participation: Discussion volume ✗, Patch update activity ✓. Components with high patch update activity are less defect-prone.
  19. Do reviewing practices impact software quality? Review participation: Self approval ✗, Self verify ✓. Components in which self-verification is prevalent are more defect-prone.
  20. Do reviewing practices impact software quality? Review coverage: Third-party ratio ✓. Review participation: Self verify ✓, Patch update activity ✓.
  21. Do reviewing practices impact software quality? Review coverage: Third-party ratio ✓. Review participation: Self verify ✓. Why are these metrics effective? Let's ask the developers!
  22. Qualitative Study Approach: Presentation; Initial survey (93 stakeholders); Interviewee's list.
  23. Qualitative Study Approach: Presentation; Initial survey (93 stakeholders); Interviewee's list; Semi-structured interviews (15 key engineers); Implications.
  24. Qualitative Study Approach: Presentation; Initial survey (93 stakeholders); Interviewee's list; Semi-structured interviews (15 key engineers); Implications; Validation survey (25 senior stakeholders); Confirmed implications ✓.
  25. Why does third-party ratio matter at Sony Mobile? Developers require more time and effort to understand, extend, or repair components with high third-party rates. “An external codebase takes more time from me to understand the code and to develop patches.” (Software engineer) ✓ 92% of stakeholders agreed.
  26. Why does self-verify rate matter at Sony Mobile? The self-verification practice is coloured by the author’s subjective perspective, which may bias the testing procedures and results. “I understand the architecture, and I am the one who can test my commit properly.” (Software engineer) ✓ 75% of stakeholders agreed.
  27. Why does patch update rate matter at Sony Mobile? “Patch updates rate” captures developer effort in a way that is not diminished by in-person discussion at Sony Mobile. “… it is much easier to work with direct communication rather than with the Gerrit tools.” (Software architect) ✓ 81% of stakeholders agreed.
  28. What is Sony Mobile doing to adjust their reviewing process? QA has a new focus on test coverage of external code. Discouraging the practice of self-verification. Investigating ways to encourage passive developers to participate more in code review.
  36. Backup slides
  37. Review coverage of a component. Review is performed on Sony Mobile's Gerrit.
  38. Code review participation metrics. Self review only? Number of commits approved by their own author. Enough code review effort? Discussion volume recorded in Gerrit.
  39. But, again, they do NOT share a relationship with defect-proneness: ✗ number of commits approved by their own author; ✗ discussion volume recorded in Gerrit.
  40. Adjusted review participation metrics share a relationship with defect-proneness: ✓ number of commits verified by their own author; ✓ patch updates activity. Lax reviewing practices are associated with defect-proneness.
  41. Review coverage: no significant link with defect-proneness. In-House ratio: shares a stronger relationship with defect-proneness at Sony Mobile. External components tend to be more defect-prone; defect-proneness declines as the In-House ratio increases.
  42. Self-verify shares an increasing relationship with defect-proneness. Plot axes: number of self_verify commits; defect-proneness (logit transformed).
  43. Patch updates activity shares a decreasing relationship with defect-proneness. Plot axes: patch updates activity; defect-proneness (logit transformed).
  44. Do reviewing practices impact software quality at Sony Mobile? Quantitative study: replication of McIntosh et al., EMSE 2015.
  45. Do reviewing practices impact software quality at Sony Mobile? Quantitative study: replication of McIntosh et al., EMSE 2015. Qualitative study: developer surveys at Sony Mobile with 100+ people.
  46. Do reviewing practices impact software quality at Sony Mobile? Quantitative study: replication of McIntosh et al., EMSE 2015. Qualitative study: developer surveys at Sony Mobile with 100+ people. Implications: better code review practices validated by stakeholders.
  47. A software project for this. Target system: a smartphone product with a six-month release cycle. 700 components ... 300 components ... We study the defect-proneness of those 1,000 components.
  48. Do reviewing practices impact software quality? RQ1: Review coverage. RQ2: Review participation.
  49. Review coverage of a component. Diagram labels: code repository of 1 component; 1 commit.
  50. Review coverage of a component: 8 commits, 4 reviewed; 5 commits, 1 reviewed; 2 commits, 2 reviewed. Review is performed on Sony Mobile's Gerrit.
  51. Review coverage of a component: 50%, 20%, 100%.
  52. However, it does NOT share a relationship with defect-proneness: ✗ 50%, ✗ 20%, ✗ 100%.
  53. At Sony Mobile, review status is equivalent to whether a commit is made 'In-House'. Diagram labels: Sony Mobile's internal patches; Linux kernel's baseline commits.
  54. But our defined bags look too small to represent the 'In-House' made ratio. Diagram labels: “Commits during development of”; “Slipped historic kernel commits”.
  55. We adjusted the definition of 'review coverage': proportion of 'In-House' commits in total → ??%
  56. Adjusted review coverage shares a relationship with defect-proneness: ✓ 0.1%, ✓ 1%, ✓ 100%. Externally originated components tend to be more defect-prone.
  57. Do reviewing practices impact software quality? RQ1: Review coverage: ✓ YES! (In-House ratio). RQ2: Review participation: ???
  58. OK, code-reviewed, but... A reviewed commit is no guarantee of active participation.
  59. Definitions of participation: Who approved this commit?
  60. Definitions of participation: Who approved this commit? Sufficient discussion (effort) made?
  61. Code review participation metrics. Who approved this commit? Number of commits approved by their own author. Sufficient discussion (effort) made? Discussion volume recorded in Gerrit.
  62. But they do NOT share a relationship with defect-proneness. Who approved this commit? ✗ Number of commits approved by their own author. Sufficient discussion (effort) made? ✗ Discussion volume recorded in Gerrit.
  63. We adjust self-approval. We only counted the number of self “Code-Review”. We also count the number of self “Verified”.
  64. Code review process at Sony Mobile (Programmer, Code reviewer, Gerrit server): 1. Code. 2. Upload. 3. Review. 4. Verify on HW. 5. Privileged. 6. Submit.
  65. Code review system at Sony Mobile
  66. We adjust effort. “… it is much easier to work with direct communication rather than with the Gerrit tools.” (Software architect) No longer assume discussion is online. We introduce a new “patch update ratio”.
  67. Adjusted review participation metrics share a relationship with defect-proneness. Who verified this commit? ✓ Number of commits verified by their own author. Sufficient discussion (effort) made? ✓ Patch updates activity.
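
The review coverage arithmetic on slides 50 and 51 can be sketched in a few lines of Python. This is a minimal illustration and not the authors' tooling: the `Component` class, its field names, and the component names A, B and C are hypothetical, and only the commit counts are taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical record of one component's commit history."""
    name: str
    total_commits: int
    reviewed_commits: int


def review_coverage(component: Component) -> float:
    """Proportion of a component's commits that went through review."""
    if component.total_commits == 0:
        raise ValueError("component has no commits")
    return component.reviewed_commits / component.total_commits


# Commit counts from the slide example: 8/4, 5/1 and 2/2.
components = [
    Component("A", total_commits=8, reviewed_commits=4),
    Component("B", total_commits=5, reviewed_commits=1),
    Component("C", total_commits=2, reviewed_commits=2),
]

coverage = {c.name: review_coverage(c) for c in components}

for name, value in coverage.items():
    print(f"{name}: {value:.0%}")
```

The adjusted metric on slide 55 has the same shape: under the assumption that each commit is flagged as 'In-House' or external, the same ratio could be computed with the count of 'In-House' commits in place of `reviewed_commits`.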

×
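
The adjusted participation metrics on slides 63 to 67 could be computed along the following lines. The slides do not give exact definitions, so this sketch rests on assumptions: the `Change` record and its fields are hypothetical, "self verify" is taken to mean that the author also gave the “Verified” score, and patch update activity is approximated as the mean number of patch revisions after the first upload. The logit helper reflects the "logit transformed" axis label on slides 42 and 43.

```python
import math
from dataclasses import dataclass


@dataclass
class Change:
    """Hypothetical summary of one reviewed change (not a Gerrit API type)."""
    author: str
    verified_by: str
    patch_sets: int  # number of uploaded revisions, at least 1


def self_verify_count(changes: list[Change]) -> int:
    """Count changes whose author also verified them (assumed definition)."""
    return sum(1 for ch in changes if ch.verified_by == ch.author)


def patch_update_rate(changes: list[Change]) -> float:
    """Mean number of re-uploads per change (assumed definition)."""
    if not changes:
        return 0.0
    return sum(ch.patch_sets - 1 for ch in changes) / len(changes)


def logit(p: float) -> float:
    """Logit transform of a proportion strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


# Made-up example data for one component.
changes = [
    Change(author="alice", verified_by="alice", patch_sets=1),
    Change(author="bob", verified_by="carol", patch_sets=3),
    Change(author="alice", verified_by="alice", patch_sets=2),
]

print(self_verify_count(changes))
print(patch_update_rate(changes))
```

Counting patch revisions rather than comments matches the motivation on slide 66: effort that was discussed in person still shows up as a re-uploaded patch, even when no discussion is recorded in Gerrit.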