Developing Software at Scale (CS 394, May 2011)


I gave a guest lecture in software engineering to Chris Riesbeck's CS394 class at Northwestern in spring 2011. See my related blog post at

  • Stories: Q1: David & testing on emulator. Q2: “We’re at 89%, how is that different than 90%?” Q3: QFEs. Q4: Times that I held the line, and times that I didn’t.

    1. 1. Todd Warren – CS 394 Spring 2011<br />Developing Software at Scale: Lessons from 20+ Years at Microsoft<br />
    2. 2. Today<br />Team structure at Microsoft<br />Product Complexity and Scheduling<br />Quality and software testing<br />Knowing when to ship<br />
    3. 3. Programs vs. Software Products<br />A programming product takes ~3x the effort of a program; a programming system ~3x; a programming systems product ~9x<br />Source: Fred Brooks Jr., Mythical Man Month, “The Tar Pit”<br />
    4. 4. Software Products vs. Custom Software Development<br />Source: Hoch, Roeding, Purkert, Lindner, “Secrets of Software Success”, 1999<br />
    5. 5. Roles<br />Product Manager<br />User-Interface designer<br />End-user liaison<br />Project Manager<br />Architect<br />Developers<br />Tool Smith<br />QA/Testers<br />Build Coordinator<br />Risk Officer<br />End User Documentation<br />Program Management<br />Software Development Engineers<br />Test and Quality Assurance<br />User Assistance / Education<br />Source: McConnell<br />
    6. 6. Resources<br />Size Matters!<br />Different Methodologies and Approaches<br />Scope of Feature and Quality Matters<br />Affects Level of Process needed and overhead<br />5 person teams: Moderate Process, Shared Roles<br />24 person teams (PMC): Moderate Process, Lifecycle oriented roles and specialization—good for “Extreme” style process<br />60-100 (MS Project): Moderate Process, some loose functional specialization and lifecycle<br />100-200 (Windows CE) person teams: Medium to Heavy Process, Lifecycle roles and functional specialization<br />1000+ Person Teams (Windows Mobile): Heavy Process, Multiple Methodologies, Formal Integration Process<br />Higher Quality==more rigorous process<br />True also for open source, online projects<br />Apache is best example of very specified culture of contribution<br />
    7. 7. Organization and its effect on Products<br />Casual<br />More Formal<br />Very Formal<br />
    8. 8. Project Interdependency Matters: “Star” or “Mesh”<br />[Diagram: Office and Windows dependency graphs, each showing core components surrounded by many edge components]<br />
    9. 9. A ‘Typical’ Product Group <br />25% Developers<br />45% Testers<br />10% Program Management<br />10% User Education / Localization<br />7% Marketing<br />3% Overhead<br />
    10. 10. Small Product: Portable Media Center <br />1 UI Designer<br />5 Program managers<br />8 Developers<br />10 testers<br />
    11. 11. Microsoft Project<br />30 Developers (27%)<br />36 Testers (33%)<br />15 Program Mgrs (14%)<br />20 UA/Localization (18%)<br />6 Marketing (5%)<br />3 Overhead (3%)<br />
    12. 12. Exchange Numbers (circa 2000)<br />112 Developers (25.9%)<br />247 Testers (57.3%)<br />44 Program Mgrs. (10.2%)<br />12 Marketing (2.7%)<br />16 Overhead (3.7%)<br />
    13. 13. Windows CE (circa 2002)<br />
    14. 14. Windows Mobile (circa 2009)<br />
    15. 15. Amount of Time <br />3 month maximum is a good rule of thumb for a stage/milestone.<br />Hard for people to focus on anything longer than 3 months.<br />Never let things go un-built for longer than a week<br />
    16. 16. Sample Staged Timeline (Project 2000)<br />
    17. 17. How Long?<br />216 days development (truthfully probably more like 260d)<br />284 days on “testing” in example<br />Component Tests: 188d<br />System-wide tests: ~97d<br />50/50 split between design/implement and test/fix<br />Some projects (e.g. operating systems, servers) have a longer integration period (more like 2:1)<br />Factors: how distributed, number of “moving parts”<br />Shows why some of the Extreme methodology is appealing.<br />
    18. 18. Fred Brooks OS/360 Rules of thumb<br />1/3 planning<br />1/6 coding<br />1/4 component test and early system test<br />1/4 system test, all components in hand<br />
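Brooks's fractions are simple arithmetic, and it can help to see them applied to a concrete schedule. A minimal sketch; the fractions are from the slide, while the day totals passed in are hypothetical examples:

```python
# Applying Brooks's OS/360 rules of thumb to a total schedule length.
from fractions import Fraction

BROOKS_SPLIT = {
    "planning": Fraction(1, 3),
    "coding": Fraction(1, 6),
    "component test and early system test": Fraction(1, 4),
    "system test, all components in hand": Fraction(1, 4),
}

def schedule(total_days):
    """Allocate total_days across phases using Brooks's fractions."""
    return {phase: float(total_days * frac) for phase, frac in BROOKS_SPLIT.items()}

# The fractions cover the whole schedule exactly.
assert sum(BROOKS_SPLIT.values()) == 1
```

Note that coding is only one sixth of the total: on a 600-day project that is 100 days of coding against 300 days of testing, which matches the roughly even design/implement vs. test/fix split on the previous slide.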
    19. 19. Office 2000 Schedule<br />
    20. 20. A few projects compared to Brooks<br />
    21. 21. Quality and Testing<br />Design in Scenarios up front<br />What is necessary for the component<br />UI is different than API<br />Server is different than client<br />Set Criteria and usage scenarios<br />Understanding (and controlling if possible) the environment in which the software is developed and used<br />“The last bug is found when the last customer dies”<br />-Brian Valentine, SVP eCommerce, Amazon<br />
    22. 22. Example of Complexity: Topology Coverage<br />Exchange versions: 4.0 (latest SP), 5.0 (latest SP) and 5.5<br />Windows NT versions: 3.51, 4.0 (latest SPs)<br />Langs. (Exchange and Windows NT): USA/USA, JPN/JPN, GER/GER, FRN/FRN, JPN/Chinese, JPN/Taiwan, JPN/Korean<br />Platforms: Intel, Alpha, (MIPS, PPC 4.0 only)<br />Connectors X.400: Over TCP, TP4, TP0/X.25<br />Connectors IMS: Over LAN, RAS, ISDN<br />Connectors RAS: Over NetBEUI, IPX, TCP<br />Connector interop: MS Mail, MAC Mail, cc:Mail, Notes<br />News: NNTP in/out<br />Admin: Daily operations<br />Store: Public >16GB and Private Store >16GB<br />Replication: 29 sites, 130 servers, 200,000 users, 10 AB views<br />Client protocols: MAPI, LDAP, POP3, IMAP4, NNTP, HTTP<br />Telecommunication: Slow Link Simulator, Noise Simulation<br />Fault tolerance: Windows NT Clustering<br />Security: Exchange KMS server, MS Certificate Server<br />Proxy firewall: Server-to-Server and Client-to-Server<br />
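The point of a matrix like this is that the full cross-product is far too large to test exhaustively. A rough sketch, using dimension sizes read off the slide (several dimensions omitted, so the real number is far larger):

```python
# Why the Exchange test matrix cannot be covered exhaustively:
# even six of its dimensions multiply into thousands of configurations.
from itertools import product
from math import prod

dimensions = {
    "exchange_version": 3,    # 4.0, 5.0, 5.5
    "windows_nt_version": 2,  # 3.51, 4.0
    "language": 7,
    "platform": 4,            # Intel, Alpha, MIPS, PPC
    "x400_transport": 3,      # TCP, TP4, TP0/X.25
    "client_protocol": 6,     # MAPI, LDAP, POP3, IMAP4, NNTP, HTTP
}

total = prod(dimensions.values())  # size of the full cross-product
combos = list(product(*(range(n) for n in dimensions.values())))
assert len(combos) == total        # 3,024 configs from just six dimensions
```

This is why test teams pick representative topologies and pairwise combinations rather than enumerating every cell.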
    23. 23. Complexity 2: Windows CE<br />5m lines of code<br />4 processor architectures<br />ARM/Xscale, MIPS, x86, SH<br />20 Board Support Packages<br />Over 1000 possible operating system components<br />1000’s of peripherals<br />
    24. 24. Complexity 3: Windows Mobile 6.x<br />2 code instances (“standard” and “pro”)<br />4 ARM Chip Variants<br />3 memory configuration variations<br />8 Screen sizes (QVGA, VGA, WVGA, Square..)<br />60 major interacting software components<br />3 network technologies (CDMA, GSM, WiFi)<br />Some distinct features for 7 major vendors<br />100 dependent 3rd party apps for a complete “phone”<br />
    25. 25. Bugs over the lifecycle<br />
    26. 26. Bugs over the lifecycle<br />
    27. 27. Flow of tests during the cycle<br />Feature is Specified<br />Feature Implemented<br />Unit Tests Implemented<br />Test Design is written<br />Test Release Document<br />Component Testing<br />Specialized Testing<br />System Test<br />Bug Fix<br />Regression Tests<br />
    28. 28. Ways of Testing<br />Types of Tests<br />Black Box<br />White Box<br />“Gray” Box<br />Stage of Cycle<br />Unit Test / <br />Verification Test<br />Component<br />Acceptance Test<br />System Test<br />Performance Test<br />Stress Test<br />External Testing (Alpha/Beta/”Dogfood”)<br />Regression Testing<br />
    29. 29. Four Rules of Testing<br />Guard the Process<br />Catch Bugs Early<br />Test with the Customer in Mind<br />Make it Measurable<br />[Timeline: M0 → M1 → M2 → RTM, with Ship Requirement]<br />ProOnGo LLC – May 2009<br />
    30. 30. Inside the Mind of a Tester<br />How close are we to satisfying agreed upon metrics/criteria?<br />Are the criteria passing stably, every time we test?<br />What are we building, and why?<br />What do our bug trends say about our progress?<br />How risky is this last-minute code check-in?<br />What metrics and criteria summarize customer demands?<br />Based on current trends, when will we pass all criteria?<br />Do we pass all criteria? If not: what, why, how?<br />Can we reliably measure these metrics and criteria?<br />RTM Milestone<br />Confirm, Ship<br />M0<br />Specs & Test Plans<br />M1 .. Mn<br />Development & Test<br />
    31. 31. What are we building, and why? Any problems with this?<br />What metrics and criteria summarize customer demands?<br />Can we reliably measure these metrics and criteria?<br />M0: Specs & Test Plans<br />// An API that draws a line from x to y<br />VOID LineTo(INT x, INT y);<br />
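One way a tester attacks an underspecified API like LineTo is boundary-value analysis on each parameter. A minimal sketch, assuming 32-bit signed coordinates (the boundary picks are illustrative; a real plan would come from the agreed coordinate range in the spec):

```python
# Boundary-value picks for the two integer parameters of LineTo.
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def boundary_cases(lo, hi):
    """Classic boundary-value picks for one integer parameter."""
    return [lo, lo + 1, -1, 0, 1, hi - 1, hi]

# Cross the picks for x and y: 7 x 7 = 49 coordinate pairs.
cases = [(x, y)
         for x in boundary_cases(INT_MIN, INT_MAX)
         for y in boundary_cases(INT_MIN, INT_MAX)]
```

Even with 49 cases in hand, the spec still doesn't say where the line starts, what surface it draws on, or how errors are reported, which is the kind of gap the slide's question is pointing at.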
    32. 32. Balance: Fast vs. Thorough<br />From most frequent / shallow coverage to least frequent / complete coverage:<br />Canary<br />Build Verification Tests<br />Automated Test Pass<br />Manual Test Pass<br />
    33. 33. Canary & Check-In Tests<br />Fast tests that can automatically run at check-in time<br />Static code analysis (like lint)<br />Trial build, before the check-in is committed to SCM<br />Form-field tests:<br />Does the check-in cite a bug number?<br />Is the code-reviewer field filled out?<br />
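The check-in gate above amounts to a few cheap predicates evaluated before a change is accepted. A minimal sketch; the field names (`bug_number`, `code_reviewer`, `trial_build_passed`) are assumptions, not the names of any real SCM system:

```python
# A check-in gate: reject a change unless its form fields are filled in
# and the trial build passed. Field names are hypothetical.
def checkin_allowed(change):
    """Cheap form-field tests run before a check-in is committed."""
    reasons = []
    if not change.get("bug_number"):
        reasons.append("check-in does not cite a bug number")
    if not change.get("code_reviewer"):
        reasons.append("code-reviewer field not filled out")
    if not change.get("trial_build_passed"):
        reasons.append("trial build has not passed")
    return (not reasons, reasons)
```

Because these checks take seconds, they can run on every check-in, leaving the slower test passes further down the pyramid to run on a schedule.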
    34. 34. Build Verification Test<br />Goal: find bugs so heinous that they could…<br />Block ability to dogfood<br />Derail a substantial portion of test pass (5%?)<br />Unwritten contract:<br />You break the build, you fix it within an hour<br />Day or night<br />Holds up productivity of entire team<br />
    35. 35. Automated Test Pass<br />Example on a Microsoft product:<br />Number of test cases: 6 digits<br />Number of test runs: 7 digits<br />14 different target device flavors<br />Runs 24/7, results available via web<br />Automatic handling of device resets / failsafe<br />Requires creativity:<br />How would you automate an image editor?<br />A 3D graphics engine?<br />
    36. 36. Manual Test Pass<br />Cost of automating vs. cost of running manually<br />A rational, quantitative way to decide whether to automate<br />Reality: few organizations maximize benefits of automation<br />Therefore, manual testing lives on<br />Tough “automated vs. manual” decisions:<br />Testing for audio glitches (does the audio crackle?)<br />Does the UI feel responsive enough?<br />
    37. 37. Tracking Bugs<br />Who found it<br />When<br />What, and its severity<br />How to reproduce<br />What part of the product<br />Where fixed and by whom<br />State<br />Open, Resolved, Closed<br />Disposition<br />Fixed, Not Fixed, Postponed, “By Design”<br />
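The fields on this slide map directly onto a bug record. A minimal sketch, assuming nothing beyond what the slide lists; the states and dispositions are the slide's own values, while field names and the `resolve` helper are illustrative:

```python
# A bug record with the slide's fields, states, and dispositions.
from dataclasses import dataclass
from typing import Optional

STATES = {"Open", "Resolved", "Closed"}
DISPOSITIONS = {"Fixed", "Not Fixed", "Postponed", "By Design"}

@dataclass
class Bug:
    found_by: str                    # who found it
    found_on: str                    # when
    description: str                 # what
    severity: int
    repro_steps: str                 # how to reproduce
    area: str                        # what part of the product
    state: str = "Open"
    disposition: Optional[str] = None
    fixed_in: Optional[str] = None   # where fixed...
    fixed_by: Optional[str] = None   # ...and by whom

    def resolve(self, disposition, fixed_in=None, fixed_by=None):
        """Move an Open bug to Resolved with one of the slide's dispositions."""
        assert disposition in DISPOSITIONS
        self.state = "Resolved"
        self.disposition = disposition
        self.fixed_in, self.fixed_by = fixed_in, fixed_by
```

Keeping state ("Resolved") separate from disposition ("Postponed", "By Design") is what lets a bug be closed without being fixed, which the next slide's release criteria rely on.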
    38. 38. Release Criteria<br />What must be true for a release to be done or complete<br />Includes a mix of criteria<br />All features implemented and reviewed<br />Documentation complete<br />All Bugs Closed (not necessarily fixed)<br />Performance Criteria met<br />Video<br />
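Release criteria only work if they are measurable, so each one reduces to a predicate over project status. A minimal sketch of the criteria on this slide; the status field names and thresholds are illustrative assumptions:

```python
# Release criteria as measurable predicates over project status.
def ready_to_ship(status):
    """Evaluate the release criteria; returns (ok, list of failing criteria)."""
    criteria = {
        "all features implemented and reviewed":
            status["features_done"] == status["features_total"],
        "documentation complete": status["docs_complete"],
        # "closed" includes Postponed and By Design, not only Fixed
        "all bugs closed": status["open_bugs"] == 0,
        "performance criteria met": status["perf_ms"] <= status["perf_goal_ms"],
    }
    failing = [name for name, ok in criteria.items() if not ok]
    return len(failing) == 0, failing
```

Returning the list of failing criteria, rather than a bare yes/no, is what makes the criteria usable in triage: the team can see exactly which gates a proposed fix or postponement affects.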
    39. 39. Bug “Triage”<br />Late in the cycle, a process for determining what to fix<br />Getting people together and prioritizing impact on release criteria and overall stability goals<br />Even Known Crashing bugs are postponed depending on criteria<br />
    40. 40. Summary<br />With software products, know what to build for the customer<br />Have checkpoints for progress (Milestones)<br />Many types of testing and structure: right tool for the job<br />Determine and measure ship criteria<br />