Stop the Line practice in SW development


Stop the Line + Stop Feature Development, Lean practices for Software Product Development - F-Secure's experience report at LESS2011

Published in: Technology


  1. 1. Stop the Line + Stop Feature Development
     Lean practices for Software Product Development
     Gabor Gunyho (Improvement Coach) | Juan Gutierrez Plaza (Senior Manager, Agile Practices) | Régis Déau (Manager, Testing Practices)
     2011-11-01
     Protecting the irreplaceable
  2. 2. F-Secure – the company
     • Founded in 1988, listed on NASDAQ OMX Helsinki
     • Market cap ca. 350 m€, annual revenue ca. 130 m€ (2010)
     • Headquartered in Helsinki, 18 country offices, presence in more than 100 countries
     • 812 people, 300+ in R&D, 5 R&D offices in 4 countries (2010)
     2011-11-01 © F-Secure Public
  3. 3. Products and Services
     Win, Mac, Linux, Android, iOS, RIM, Symbian, 20+ language versions
  4. 4. Customers: 200+ operator partners globally
     [Chart: operator revenue (mEur/quarter), vertical axis from 0 to 22]
  5. 5. About the Authors
     • Gabor Gunyho: Improvement Coach with the “R&D Global Methods” team at F-Secure, experienced Agile and Lean product development expert, contributor and reviewer of books on scaling Agile and Lean SW development
     • Regis Déau: Testing practices Manager at F-Secure SDC unit, focusing on developing an agile testing culture and improve the quality engineering practices for continuously improving the R&D standards and process
     • Juan Gutierrez Plaza: Currently “Agile Practices Manager” at F-Secure's SDC unit, focusing on R&D transformation of the site. Experienced coach who has helped different teams to improve in eng. practices
  6. 6. What is this presentation all about?
     • No “recipe”
     • Just to share how we did it
  7. 7. The Project
  8. 8. Project setup
     • Between 10 and 12 teams (about 100 people)
       • Mostly in Helsinki, some in Kuala Lumpur, later also one in Poland
       • Mostly feature teams
       • Fairly mature in basic Scrum [1] and Agile engineering practices
       • Some experience in multi-team projects [2][3] but not on this scale
     • Major new product, significant changes in
       • Business model
       • Architecture
       • Longer-Term Planning [4][5], including new backlog tooling
  9. 9. Project timeline
     • Started: Dec 2009
       • This presentation counts data from March 2010
     • Project Split and Spin-off: March 2010
     • Intermediate Public Release: Sept 2010
       • Limited scope
     • Stop the Line practice: since Sept 2010 (1st draft in June)
       • Simplification of the practice: Oct 2010
       • StL enforcer added: March 2011
     • Stop Feature Development practice: since Sept 2010
     • Two-week sprints: 46 so far
       • Most resulted in a public Technology Preview release
     • Public Release: Oct 2011
  10. 10. Stop the Line
  11. 11. What is it?
      A Lean practice that originated in the Toyota Production System (TPS) [6]
      Stop-the-Line: Work is stopped if an abnormality is found. Work continues only when the problem is fixed.
  12. 12. What is it? – The Line
      “Line” refers to the production/assembly lines of the automobile industry, where one station takes the output of the previous station as its input
  13. 13. What is it? - Stopping
      • If a problem is found, anyone can “pull the cord” that:
        • Stops the line from moving ahead
        • Signals the problem to everyone on the line, pointing to the station in trouble
  14. 14. Fixing once and for all
      Why did it happen? How to avoid it?
      • The problem is fixed immediately
      In addition
      • To get all the benefits of the Stop the Line practice, a root cause analysis is done to find what caused the problem
      • To prevent recurrence of the same problem, the root cause is fixed too
  15. 15. Why use it?
      • Focus on quality at all times
      • Avoid burying problems deep in the product, where they are more difficult to fix and where more problems may pile up on top of the identified ones
      • Everybody is aware of the problem, so anyone who can help can contribute to fixing it
      • Identify recurrent (systemic) problems so they are solved once and for all
  16. 16. … and for us in SW development? (1/3)
      • Detection
        • A Stop-the-Line is raised when
          • A build is failing (e.g. it doesn't compile or pass unit testing)
          • The automated smoke test fails more than 2 consecutive times
          • A problem prevents manual testing from being performed
      • Signals and automated actions
        • The Stop-the-Line radiator raises the Stop-the-Line flag for the “line”, i.e., the product area
        • The Stop-the-Line commit hook prevents commits to the source repository for the affected line, except those fixing the StL case
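The three detection rules above can be sketched as a small check. This is only an illustration in Python: the slides do not show how the build system evaluates these rules, so `LineStatus`, `stop_the_line_reasons` and `SMOKE_FAILURE_LIMIT` are hypothetical names.

```python
from dataclasses import dataclass

# Hypothetical constant: the slides say "more than 2 consecutive times".
SMOKE_FAILURE_LIMIT = 2


@dataclass
class LineStatus:
    """Latest observations for one "line" (product area). Hypothetical type."""
    build_ok: bool                   # build compiles and passes unit testing
    consecutive_smoke_failures: int  # automated smoke test failures in a row
    manual_testing_blocked: bool     # a problem prevents manual testing


def stop_the_line_reasons(status: LineStatus) -> list[str]:
    """Return the reasons to raise Stop-the-Line; an empty list means no StL."""
    reasons = []
    if not status.build_ok:
        reasons.append("build failing")
    if status.consecutive_smoke_failures > SMOKE_FAILURE_LIMIT:
        reasons.append("smoke test failed more than 2 consecutive times")
    if status.manual_testing_blocked:
        reasons.append("manual testing blocked")
    return reasons
```

A radiator could raise the flag for a product area whenever this list is non-empty and clear it when the list becomes empty again.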
  17. 17. … and for us in SW development? (2/3)
      • Notification
        • E-mail (first approach, issued manually)
        • Stop-the-Line Radiator (since March 2010, automatically, from the build system, with automated scripts)
  18. 18. … and for us in SW development? (3/3)
      • Fixing
        • A team or person claims the issue using the claim functionality in the radiator and then starts investigating it
        • The same or another team or person starts fixing the problem
        • Issues not claimed before the next day are handled in the daily Scrum of Scrums and picked up by some team
        • The team works on the Stop-the-Line case as a high-priority item until it is handled
        • When the radiator no longer declares Stop-the-Line, the team is freed from this responsibility
        • Other teams not affected by the StL case can continue working on their area
      • Prevention
        • The team that worked on the Stop-the-Line case conducts a root cause analysis for selected cases, records the findings, and sends a note to the project mailing list
  19. 19. In short…
      Flow: Problem is found → Stop the Line → Fix the problem immediately → Root cause analysis → Fix the root cause
      • Detection & visualization
        • Detected in Test Automation
        • Visualization by the radiator
      • Reaction
        • Team claims the StL case
        • Team investigates and fixes issue
      • Prevention
        • Root cause analysis done by the team (or multiple teams, if needed)
        • Root causes are documented and records made available
        • Fixing of root causes is initiated (fixing root causes may take significant effort and time; ROI analysis and planning take place for bigger initiatives)
  20. 20. An Implementation Detail
      • Rule #1: when the StL is on, do not commit new feature development code to the module that has the StL; only commit bug fixes
      • Unfortunately, not everyone followed this rule systematically, so some commits unrelated to fixing the StL problem were made while the StL was still on
      • To prevent such human errors, an automated tool, the “StL enforcer”, was introduced to enforce rule #1
        • A hook was added to the repository that checks whether a commit is made during a StL event; if so, the commit is rejected unless it targets fixing the StL case
        • Introduced in the middle of the project (March 2011, per the project timeline)
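The decision the StL enforcer makes can be sketched as below. The slides do not say how the hook recognizes a commit that targets the StL case, nor which version control system was used, so this Python sketch assumes a commit-message marker; `STL_FIX_MARKER` and `commit_allowed` are hypothetical names.

```python
# Hypothetical convention: a fix for the StL case carries this marker
# in its commit message. The slides do not describe the real mechanism.
STL_FIX_MARKER = "[StL-fix]"


def commit_allowed(stopped_lines: set[str], touched_lines: set[str], message: str) -> bool:
    """Apply rule #1 to one commit.

    stopped_lines: product areas that currently have an active StL.
    touched_lines: product areas the commit modifies.
    """
    if not (stopped_lines & touched_lines):
        return True  # the commit does not touch any stopped line
    # The commit touches a stopped line: only StL fixes may pass.
    return STL_FIX_MARKER in message
```

A repository hook would call such a function with the radiator's current state and reject the commit when it returns `False`. Commits to unaffected areas pass, which matches the rule that other teams can continue working on their area.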
  21. 21. Bug Handling & Stop Feature Development
  22. 22. Bug Handling - the Old Model
      • High level concept:
        • A list of bugs (and a long one, i.e., “bug warehouse”)
      • Decision making order:
        1. Release Quality Engineer or Project level bug review
        2. Team bug review
        3. Team member
      • Using the bug count metric:
        • A way to measure quality
        • Release Quality Engineer follows, reports and escalates (no real process to react)
      • Bug life cycle:
        • Store all, prioritize continuously
        • Only high priority bugs get fixed
        • Rest remains on the list (>95% of all)
        • Maintenance gets all bugs that the development project did not have time to fix before the release date
        • Maintenance never fixes these
  23. 23. Redefining bug handling: Our Goal
      • Very fast track in closing new cases
        • Get all new cases closed in less than 4 weeks (2 sprints)
        • Make decisions quickly, closest to the actual place of work
      • Avoid building a big inventory (warehouse) of bugs by all means
        • To reduce the recurring effort of prioritizing a long list
  24. 24. Bug Handling - The New Model
      • Reversing the old decision-making order, the new order:
        1. Team members
        2. Team bug review
        3. Team bug review with Product Owner (+ other stakeholders if needed)
        4. Project bug review
  25. 25. Bug Handling - The New Model
      • Using bug count limits – Stop Feature Development
        • Work guidance:
          • X bugs / team → STOP new development in team
          • Y bugs / project → STOP the new development in whole project
      • Bug life cycle:
        • Extremely fast handling cycle:
          • Fix in this sprint
          • Fix in next sprint
          • Trash (with “reason” category)
        • For maintenance
          • Yes we fix
          • Trash (with “reason” category)
  26. 26. Stop Feature Development (SFD)
      • What is it, then?
        • An enhancement for StL
        • The line is stopped not only when tests are not passing but also when the number of non-critical bugs goes over a threshold:
          • Per team
          • Per project
          • Later: per Product Area
      • Why?
        • To control another dimension of the system dynamics
  27. 27. Stop Feature Development (SFD)
      • When/how to invoke it? (examples)
        • 10+ cases / team → Stop Feature Development for the team
        • 100+ cases / project → Stop Feature Development for the whole project
      • Later another dimension was added:
        • X+ cases / product area → Stop Feature Development for the product area
          • Product Area A limit: 60 bugs
          • Product Area B limit: 30 bugs
          • Product Area C1 limit: 20 bugs
          • Product Area C2 limit: 30 bugs
  28. 28. Stop Feature Development (SFD)
      • When/how to “resume the line”?
        • Hysteresis was added to the system to avoid unwanted rapid switching of state, e.g.,
          • Product Area C2 limit for SFD: bug count > 30
          • Product Area C2 limit for Resume-the-Line: bug count < 27
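The hysteresis rule can be sketched as a two-threshold switch. This is a minimal Python illustration using the Product Area C2 limits from the slide; the class name `SfdSwitch` and its fields are hypothetical, and the slides do not show how the rule was actually implemented.

```python
from dataclasses import dataclass


@dataclass
class SfdSwitch:
    """Stop Feature Development state for one scope (hypothetical type)."""
    stop_above: int        # SFD starts when bug count > stop_above
    resume_below: int      # work resumes when bug count < resume_below
    stopped: bool = False  # current state

    def update(self, bug_count: int) -> bool:
        """Feed the latest open-bug count and return whether SFD is active."""
        if not self.stopped and bug_count > self.stop_above:
            self.stopped = True
        elif self.stopped and bug_count < self.resume_below:
            self.stopped = False
        return self.stopped


# Product Area C2 limits from the slide: stop above 30, resume below 27.
area_c2 = SfdSwitch(stop_above=30, resume_below=27)
states = [area_c2.update(n) for n in (29, 31, 28, 27, 26, 30)]
# states == [False, True, True, True, False, False]
```

With a single threshold of 30, a count hovering around 30 and 31 would flip the state on every change. With the gap, the area stays stopped at 28 and 27 and resumes only at 26.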
  29. 29. The new bug handling process - overview
      Some valid bugs will get trashed, but that is OK in this process!
  30. 30. The new bug handling process - summary
      • Bug is created: Choose the correct product area and prioritize the bug with your best guess.
      • Decision: A team decides whether the bug is fixed in this sprint, fixed in the next sprint, or trashed.
      • Fix and test: A team fixes the bug and tests their fix.
      • Closing: All new bugs are closed in no more than 2 sprints (4 weeks).
      • Inside the teams:
        • Bugs are not tracked if…
          • The bug is fixed and tested by the team within the sprint
          • The bug does not cross sprint or team boundaries
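The three possible decisions and the 4-week closing target can be sketched as follows. This is a Python illustration only; `Decision`, `MAX_OPEN` and `is_overdue` are hypothetical names, not part of the bug tracking tool the project used.

```python
from datetime import date, timedelta
from enum import Enum

# Closing target from the slide: 2 two-week sprints.
MAX_OPEN = timedelta(weeks=4)


class Decision(Enum):
    """The three outcomes a team can choose for a new bug."""
    FIX_THIS_SPRINT = "fix in this sprint"
    FIX_NEXT_SPRINT = "fix in next sprint"
    TRASH = "trash"


def is_overdue(created: date, today: date) -> bool:
    """True when a still-open bug has exceeded the 4-week closing target."""
    return today - created > MAX_OPEN
```

A periodic report of overdue bugs would show whether the "no more than 2 sprints" rule is being kept.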
  31. 31. [No text on this slide]
  32. 32. Statistics
  33. 33. StL+SFD Events vs. Releases
      [Chart: StL and SFD events against releases on a quarterly timeline from April 2010 to “Now”, with the introduction of StL + SFD and of the StL enforcer marked]
      Note: StL event data is not available for Sprints 31 - 43
  34. 34. Sprint 29 – StL Root Cause Analysis Findings
      • 11 StL cases were tracked for Sprint 29
      • Frequent root cause categories:
        1. Blind commits (or insufficiently tested commits)
        2. Code/environment changes broke Test Automation
        3. Large commits
      • Actions that can prevent similar cases in the future:
        1. Ensure sufficient testing before commit; commit to a branch if needed
        2. Monitor the radiator for smoke test results after commit (delay the commit to the next day if you plan to leave the office soon)
        3. Developers should test final builds manually more often
        4. Make smaller and incremental commits
  35. 35. Detailed case breakdown
      [Table: Cases 1 to 10 marked against the categories below; only the per-category totals are recoverable from the transcript]
      • Root cause category (total cases):
        • Code / environment changes broke TA: 3
        • Large commits: 2
        • Half implemented features: 1
        • Blind commits: 4
        • Test Environment Changes (IT): 1
      • How can this be prevented in the future? (total cases):
        • Hard prevention / Not worth effort: 3
        • Monitor Radiator: 3
        • Smaller & incremental commits: 2
        • Faster TA / Hardware to shorten TA cycle: 1
        • Complete feature implementation: 1
        • Developers should test builds more often: 3
        • Ensure sufficient testing before commit: 5
        • Recommend / Educate developer on what to monitor: 1
  36. 36. Conclusions
  37. 37. Conclusions
      • Overall quality of the product improved
      • Number of StL events decreased over time
        • The StL enforcer helped to avoid mistakes
      • Not releasing every two weeks BECAME AN EXCEPTION, not the rule
      • The new bug handling process helped teams focus on important bugs
      • SFD keeps the number of open bugs at a manageable level
      • After a settle-down period, these practices change people's mindset to be more quality focused
      • Next step:
        • StL and SFD are “brakes” to avoid accidents; now we are learning how to drive at high speed safely (i.e., avoid making so many bugs in the first place)
  38. 38. Questions?
  39. 39. Acknowledgements
      The authors would like to thank the whole project team and the whole R&D organization of F-Secure and its management for making this presentation possible and for supporting the data collection and publishing.
      We'd especially like to thank
      • Petri Kuikka
      • Risto Kumpulainen
      • Pekka Kiviniemi
      • Ferrix Hovi
      for their contribution to the bug handling process, the Continuous Integration and Test Automation system, the radiator design and implementation, and the data visualization.
  40. 40. References
      [1] Schwaber, K., Beedle, M.: “Agile Software Development with Scrum”, Prentice Hall (2001)
      [2] Larman, C., Vodde, B.: “Scaling Lean & Agile Development: Thinking and Organizational Tools for Large-Scale Scrum”, Addison-Wesley Professional (2008)
      [3] Larman, C., Vodde, B.: “Practices for Scaling Lean & Agile Development: Large, Multisite, and Offshore Product Development with Large-Scale Scrum”, Addison-Wesley Professional (2010)
      [4] Leffingwell, D.: “Scaling Software Agility: Best Practices for Large Enterprises”, Addison-Wesley Professional (2007)
      [5] Leffingwell, D.: “Agile Software Requirements: Lean Requirements Practices for Teams, Programs, and the Enterprise”, Addison-Wesley Professional (2011)
      [6] Womack, J.P., Jones, D.T., Roos, D.: “The Machine That Changed the World” (1990, 2007)
      [7] Poppendieck, M., Poppendieck, T.: “Implementing Lean Software Development: From Concept to Cash”, Addison-Wesley (2007)
  41. 41. Contact Information
      The authors did their best to attribute the authors of texts and images and to recognize any copyrights; see more details of copyrights, license terms and conditions for each source under the reference link provided. If you think that anything in this material should be changed, added or removed, please contact the authors at the addresses above.