Approximating Change Sets at Philips Healthcare: A Case Study

Talk presented on March 4, 2011 at the 15th European Conference on Software Maintenance and Reengineering in Oldenburg, Germany.

Abstract: A single development task such as solving a bug or implementing a new feature often involves changing a number of entities, together known as a change set. Change sets can be approximated from the version control system. Architects and developers then use them to make important decisions, so change sets need to be approximated carefully. It is common to assume that two entities checked in within a small time interval of each other, and having the same meta-data associated with them, belong to the same transaction. Transactions may be good approximations of change sets if developers commit change sets in one go and if the required meta-data is available. This is, however, not the case in the industrial environment (Philips Healthcare) we study. Our paper presents a case study in which we investigated how change sets can be approximated in an environment with a complex workflow and limited meta-data in the version repositories. We found that, depending on the commit practices used, a suitable time interval between check-in timestamps of files has to be determined and leveraged to reliably approximate change sets.

  • Increase efficiency; reduce overheads; reduce costs
  • Few changes in files across different subsystems
  • Series of intermediate commits; protection against data loss; parallelization of tasks (develop an interface so everyone can keep working)
  • Not possible to learn the reason for change
  • Time slots chosen in consultation with developers and managers
  • Explained PR
  • For the 1-month interval, a maximum of 100% means some tasks really are long
  • Not change-set specific; incomplete changes; generic reasons
  • Transcript

    • 1. Approximating Change Sets at Philips Healthcare: A Case Study. Adam Vanya, Rahul Premraj, Hans van Vliet (VU University Amsterdam)
    • 2. some healthcare products that are subjects of this study...
    • 3. Achieva 3.0T TX
    • 4. Intera 1.5T MRI
    • 5. Panorama HFO
    • 6. Philips MRI Systems
      • Eight million lines of code across 34,000 files.
      • C, C++, and C# used.
      • Hundreds of developers across 3 sites.
      • An old version of IBM ClearCase used for version control.
      • Nine years of version control data available.
    • 9. “Re-engineering in the large...” Jens Borchers, “Reengineering from a Practitioner’s View – A Personal Lesson’s Learned Assessment”, invited talk at CSMR 2011
    • 10. Re-engineering in the large: by functionality, by team location, by other criteria
    • 12. We need data!
      • Which developers change which files?
      • Which functionality is implemented in a file?
      • Which sub-systems are often changed together?
      • ...
      Change Sets
    • 13. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 31, NO. 6, JUNE 2005 429 Mining Version Histories to Guide Software Changes Thomas Zimmermann, Student Member, IEEE, Peter Weißgerber, Stephan Diehl, and Andreas Zeller, Member, IEEE Computer Society Abstract—We apply data mining to version histories in order to guide programmers along related changes: “Programmers who changed these functions also changed....” Given a set of existing changes, the mined association rules 1) suggest and predict likely further changes, 2) show up item coupling that is undetectable by program analysis, and 3) can prevent errors due to incomplete changes. After an initial change, our ROSE prototype can correctly predict further locations to be changed; the best predictive power is obtained for changes to existing software. In our evaluation based on the history of eight popular open source projects, ROSE’s topmost three suggestions contained a correct location with a likelihood of more than 70 percent. Index Terms—Programming environments/construction tools, distribution, maintenance, enhancement, configuration management, clustering, classification, association rules, data mining.
    • 14. Identifying Transactions: A:1.3 B:1.6 C:1.1 D:1.3 E:1.5
    • 15. Identifying Transactions: A:1.3 B:1.6 C:1.1 D:1.3 E:1.5 (developer: hugo; log msg.: Fixed bug #13463; timestamp: Jul 23 2005 02:16:57)
    • 16. Identifying Transactions: A:1.3 B:1.6 C:1.1 D:1.3 E:1.5 (developer: hugo; log msg.: Fixed bug #13463; timestamp: Jul 23 2005 02:16:57). Same author + same log message.
    • 17. Identifying Transactions: A:1.3 B:1.6 C:1.1 D:1.3 E:1.5 (developer: hugo; log msg.: Fixed bug #13463; timestamp: Jul 23 2005 02:16:57). Same author + same log message, 200 seconds.
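The transaction heuristic on these slides (same author, same log message, check-ins at most 200 seconds apart) can be sketched as follows. This is a minimal illustration, not the authors' tooling: the `CheckIn` record, the `group_transactions` function, and the sliding-window reading of "200 seconds" (each check-in is compared with the previous one, not with the first) are assumptions, and the 10-second spacing in the sample data is invented.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import groupby


@dataclass(frozen=True)
class CheckIn:
    """One file revision as recorded by the version control system (hypothetical shape)."""
    file: str
    revision: str
    author: str
    message: str
    timestamp: datetime


def group_transactions(check_ins, delta=timedelta(seconds=200)):
    """Group check-ins that share author and log message and whose
    consecutive timestamps are less than `delta` apart."""
    def author_and_message(c):
        return (c.author, c.message)

    ordered = sorted(check_ins, key=lambda c: (c.author, c.message, c.timestamp))
    transactions = []
    for _, same_meta in groupby(ordered, key=author_and_message):
        current = []
        for check_in in same_meta:
            # A gap of `delta` or more closes the current transaction.
            if current and check_in.timestamp - current[-1].timestamp >= delta:
                transactions.append(current)
                current = []
            current.append(check_in)
        transactions.append(current)
    return transactions


# Sample data modelled on the slide; the spacing between check-ins is invented.
start = datetime(2005, 7, 23, 2, 16, 57)
revisions = [("A", "1.3"), ("B", "1.6"), ("C", "1.1"), ("D", "1.3"), ("E", "1.5")]
check_ins = [
    CheckIn(f, r, "hugo", "Fixed bug #13463", start + timedelta(seconds=10 * i))
    for i, (f, r) in enumerate(revisions)
]
# Same metadata, but ten minutes after E: this starts a second transaction.
check_ins.append(
    CheckIn("A", "1.4", "hugo", "Fixed bug #13463", start + timedelta(minutes=10, seconds=40))
)

transactions = group_transactions(check_ins)
labels = [[f"{c.file}:{c.revision}" for c in t] for t in transactions]
print(labels)
```

Requiring identical author and log message is what makes a short window safe here; the later slides explain why that metadata is mostly missing at Philips.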
    • 20. Change Sets: Task 1, Task 2. A:1.3 B:1.6 C:1.1 D:1.3 E:1.5 A:1.4 J:1.2 E:1.6
    • 21. Environment at Philips
      • Developers often commit files associated with more than one task.
      • No buildable system required for commit.
      • Multiple developers may work together on complex tasks.
    • 22. Environment at Philips: Developers rarely add commit messages! A:1.3 B:1.6 C:1.1 D:1.3 E:1.5
    • 23. Environment at Philips: Developers rarely add commit messages! A:1.3 B:1.6 C:1.1 D:1.3 E:1.5 (developer: hugo; log msg.: Fixed bug #13463; timestamp: Jul 23 2005 02:16:57)
    • 24. Identifying Change Sets
    • 25. Identifying Change Sets: 200 seconds
    • 27. Identifying Change Sets: 200 seconds, 1 hour
    • 28. Identifying Change Sets: 200 seconds, 1 hour, 1 day, 1 week, 1 month
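Without commit messages, the only usable signals are the developer and the check-in timestamp, so the approximation reduces to choosing the interval δ. The sketch below shows how one might approximate change sets for each of the five intervals and summarize their sizes. It is an illustration under stated assumptions: grouping per developer, treating "1 month" as 30 days, and the names `INTERVALS`, `approximate_change_sets`, and `summarize` are not taken from the slides, and the sample check-ins are invented.

```python
from datetime import datetime, timedelta
from statistics import mean, median

# The five intervals from the slides; "1 month" is assumed to mean 30 days.
INTERVALS = {
    "200 seconds": timedelta(seconds=200),
    "1 hour": timedelta(hours=1),
    "1 day": timedelta(days=1),
    "1 week": timedelta(weeks=1),
    "1 month": timedelta(days=30),
}


def approximate_change_sets(check_ins, delta):
    """check_ins: iterable of (developer, file, timestamp) tuples.
    A check-in joins the developer's current set when it follows that
    developer's previous check-in by less than `delta`."""
    change_sets = []
    last_seen = {}  # developer -> (timestamp of last check-in, current set)
    for developer, file, timestamp in sorted(check_ins, key=lambda c: c[2]):
        previous = last_seen.get(developer)
        if previous and timestamp - previous[0] < delta:
            current = previous[1]
        else:
            current = []
            change_sets.append(current)
        current.append(file)
        last_seen[developer] = (timestamp, current)
    return change_sets


def summarize(change_sets):
    """Return (#sets, min, max, median, average) of check-ins per set."""
    sizes = [len(s) for s in change_sets]
    return len(sizes), min(sizes), max(sizes), median(sizes), mean(sizes)


# Invented sample: four check-ins by one developer.
day = datetime(2006, 1, 3, 9, 0, 0)
check_ins = [
    ("anna", "A", day),
    ("anna", "B", day + timedelta(minutes=2)),
    ("anna", "C", day + timedelta(minutes=40)),
    ("anna", "D", day + timedelta(days=1, hours=1)),
]

counts = {
    label: summarize(approximate_change_sets(check_ins, delta))[0]
    for label, delta in INTERVALS.items()
}
print(counts)
```

A larger δ merges more check-ins into fewer, bigger sets, which is the trade-off the precision and recall figures later in the talk quantify.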
    • 29. Approximated Change Sets. Table I: The Approximated Change Sets (ACS)

                            #Check-ins per ACS
      δ          #ACSs    Min.  Max.   Med.  Avg.
      200 sec    115487   1     14002  2     8
      1 hour     82571    1     14551  2     11
      1 day      42447    1     14551  4     22
      1 week     13568    1     19404  9     69
      1 month    3408     1     27502  27    275
    • 31. Evaluating Change Sets
    • 39. Developer survey
      • Ten most active developers invited to participate. Eight responded.
      • Participants briefed on purpose of survey, how change sets were approximated, and possibility to discontinue survey.
      • Randomly drawn change sets presented.
      • Developers evaluated 75 change sets.
    • 40. Precision from survey. Table II: Precision estimated with help of developers

      Time interval   Precision (in %)     #ACS*
      (δ)             Max.  Min.  Avg.     Analyzed  Skipped
      200 seconds     100   50    91       19        3
      1 hour          100   33    91       15        4
      1 day           100   40    78       21        4
      1 week          100   6     66       14        7
      1 month         100   2     36       6         8

      * ACS stands for Approximated Change Sets
    • 42. Evaluating Change Sets
    • 44. Example Postlist
      Unique ID: <POSTLIST_NAME>nly95872_RFAmpSim_20060103T150114
      Stream information: <DEVSTREAM>FEMAIN
      Developer: <USER>Anna
      Date: <DATE_DD_MMM_YYYY>3 JAN 2006
      Is it a problem report? <PR_SECTION>Y
      Problem report number: <SOLVED_PR>MR00035599
      Rationale behind change: <REASON_TEXT>Improve simulation of the RF-Amplifier
      <CODING_STANDARD>N
      <PNS_SAR>N
      Developers playing a role in the review process:
        <BBLOCK_OWNER>James
        <REVIEWER>Robert
        <TEAMLEADER>David
      <DOC_SECTION>N
      Changed files submitted for review:
        <POSTLIST_FILE>path.to.file.oneBDRFAmplifierNorf.h@@mainfemain5
        <POSTLIST_FILE>path.to.file.twobdtransmittercsinterfaces.hcf@@main3
        <POSTLIST_FILE>path.to.file.threeBDRFAmplifierNorf.cpp@@mainfemain5
      <TEST_SECTION>Y
      <TEST_DONE>@OTM6
      Last versions of files used to build system:
        <PREVIOUSLY_CONSOLIDATED_FILE>path.to.file.oneBDRFAmplifierNorf.h@@mainfemain2
        <PREVIOUSLY_CONSOLIDATED_FILE>path.to.file.twobdtransmittercsinterfaces.hcf@@main1
        <PREVIOUSLY_CONSOLIDATED_FILE>path.to.file.threeBDRFAmplifierNorf.cpp@@mainfemain4
    • 49. Results from postlists. Table III: Precision estimated using postlists. Table IV: Recall estimated using postlists.

      Time interval   Precision (in %)     Recall (in %)
      (δ)             Max.  Min.  Avg.     Max.  Min.  Avg.
      200 seconds     100   50    93       100   20    74
      1 hour          100   20    89       100   20    84
      1 day           100   1     69       100   22    92
      1 week          100   <1    31       100   24    94
      1 month         95    <1    8        100   25    94
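The slides do not spell out how precision and recall were computed, so the following is only a sketch using the standard set-based definitions: precision is the share of files in an approximated change set that also appear in the reference (for example, the files listed in a postlist), and recall is the share of reference files that the approximated set captured. The function name `precision_recall` and the file names are hypothetical.

```python
def precision_recall(approximated, reference):
    """Compare one approximated change set with a reference change set.
    Both arguments are collections of file names."""
    approximated, reference = set(approximated), set(reference)
    if not approximated or not reference:
        raise ValueError("both change sets must be non-empty")
    common = approximated & reference
    precision = len(common) / len(approximated)
    recall = len(common) / len(reference)
    return precision, recall


# Hypothetical example: the approximation caught two of the three
# reference files and pulled in two unrelated ones.
approximated = {"Norf.h", "Norf.cpp", "other.c", "misc.h"}
reference = {"Norf.h", "Norf.cpp", "interfaces.hcf"}

precision, recall = precision_recall(approximated, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under these definitions a longer interval can only add files to an approximated set, which matches the pattern in the tables: recall rises with δ while precision falls.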
    • 51. So, what works?
      • The one hour time interval works best for our environment.
      • Optimal time interval may differ from one environment to another.
    • 52. Threats to validity
      • Assumed developers have a good recall of their own change sets.
      • Postlists were carefully selected, but in rare cases may relate to more than one change set.
      • Change sets with multiple developers involved were not captured completely.
    • 55. Summary: Evaluate your pre-processed data before use!