Rapid Software Testing: Reporting


Test reporting is something few testers take time to practice. Nevertheless, it's a fundamental skill, vital for your professional credibility and your own self-management. Many people think management judges testing by bugs found or test cases executed. Actually, testing is judged by the story it tells. If your story sounds good, you win. A test report is the story of your testing. It begins as the story we tell ourselves, each moment we are testing, about what we are doing and why. We use the test story within our own minds, to guide our work. James Bach explores the skill of test reporting and examines some of the many different forms a test report might take. As in other areas of testing, context drives good reporting. Sometimes we make an oral report; occasionally we need to write it down. Join James for an in-depth look at the art of reporting.

Published in: Technology

  1. 1. MP PM Tutorial 9/30/2013 1:00:00 PM "Rapid Software Testing: Reporting" Presented by: James Bach Satisfice Inc Brought to you by: 340 Corporate Way, Suite 300, Orange Park, FL 32073 888-268-8770 ∙ 904-278-0524 ∙ sqeinfo@sqe.com ∙ www.sqe.com
  2. 2. James Bach Satisfice, Inc. James Bach is founder and principal consultant of Satisfice, Inc., a software testing and quality assurance company. In the eighties, James cut his teeth as a programmer, tester, and SQA manager in Silicon Valley in the world of market-driven software development. For nearly ten years, he has traveled the world teaching rapid software testing skills and serving as an expert witness on court cases involving software testing.
  3. 3. Rapid Software Testing: Reporting James Bach, Satisfice, Inc. james@satisfice.com www.satisfice.com Rapid Testing Rapid testing is a mind-set and a skill-set of testing focused on how to do testing more quickly, less expensively, with excellent results. This is a general testing methodology. It adapts to any kind of project or product.
  4. 4. The Premises of Rapid Testing
     1. Software projects and products are relationships between people, who are creatures both of emotion and rational thought.
     2. Each project occurs under conditions of uncertainty and time pressure.
     3. Despite our best hopes and intentions, some degree of inexperience, carelessness, and incompetence is normal.
     4. A test is an activity; it is performance, not artifacts.
     5. Testing’s purpose is to discover the status of the product and any threats to its value, so that our clients can make informed decisions about it.
     6. We commit to performing credible, cost-effective testing, and we will inform our clients of anything that threatens that commitment.
     7. We will not knowingly or negligently mislead our clients and colleagues.
     8. Testers accept responsibility for the quality of their work, although they cannot control the quality of the product.
     What is a test report?
     • A test report is any description, explanation, or justification of the status of a test project.
     • A comprehensive test report is all of those things together.
     • A professional test report is one competently, thoughtfully, and ethically designed to serve your clients in that context.
     • A test report isn’t “just the facts.” It’s a story about facts.
     Learn to tell the testing story!
  5. 5. Advice for Test Reporting
     • Build credibility (by being credible).
     • Know the context of your tests (test framing).
     • Never use a number out of context (e.g. no test case counts).
     • Highlight general test activities (put tests in context).
     • Highlight product risk (put bugs in context).
     • Practice “safety language” (avoid misleading speech).
     • Tell a three-level testing story (status, testing, value).
     • Don’t waste people’s time (fit the report to the context).
     The First Law of Reporting: Be Credible!
     • They won’t listen to uncomfortable information, unless you are credible.
     • They’ll assume you’re mistaken about surprising information, unless you are credible.
     • They’ll assume you’re exaggerating about risks, unless you are credible.
     • They’ll micro-manage your reporting, unless you are credible.
  6. 6.
     • Actually care about the project.
     • Actually care about people on the project.
     • Actually know how to do your job.
     • Do not tell lies or exaggerate.
     • Sweat the details in your own work.
     • Gain experience.
     • Study the technology.
     • Read all documents carefully.
     • Find things to appreciate about the work of others.
     • Acknowledge mistakes, correct them and learn from them.
     • Keep a journal and become the historian of your project.
  7. 7. A Narrative Model of Testing
     This is a map of the Rapid Testing methodology that I teach. It is organized in the structure of a story, because story construction is at the heart of what it means to test.
  8. 8. Let’s Count Unicorns!
     Do you know what a unicorn is? Okay. Answer this question: How many unicorns will fit into your cubicle?
     In the absence of context… test case counts mean NOTHING!
     • How much testing is 40 test cases?
     • How much is 400?
     • How about 40,000 test cases?
  9. 9. “Pass Rate” is a Stupid Metric.
     [Chart: pass rate (0 to 1) by date, 2/1 through 3/5]
     You shouldn’t take test case counts seriously because…
     • Test cases are not independent.
     • Test cases are not interchangeable.
     • Test cases vary widely in value from case to case, tester to tester, product to product, project to project, test technique to test technique, and over time.
     • Test case design is subjective, so counts are easy to inflate.
     • Test cases do not, and cannot, capture all the testing that occurs (example: bug investigation).
     • Testers often don’t follow the test cases, anyway.
     • Automated test cases are fundamentally different from sapiently executed tests.
     • Test cases represent what’s easy to put into a test case.
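The point about pass rate can be made concrete with a small sketch. The two suites below are hypothetical, invented only for illustration; `Case` and `pass_rate` are names made up for this example. Both suites produce the same number, yet the stories behind them differ sharply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    name: str
    passed: bool


def pass_rate(cases):
    """Fraction of cases that passed; says nothing about what they covered."""
    return sum(c.passed for c in cases) / len(cases)


# Hypothetical suites, invented for illustration only.
suite_a = [
    Case("open file", True),
    Case("save file", True),
    Case("print file", True),
    Case("tooltip spelling", False),
]
suite_b = [
    Case("tooltip spelling", True),
    Case("icon alignment", True),
    Case("about box text", True),
    Case("save file without data loss", False),
]

for label, suite in (("A", suite_a), ("B", suite_b)):
    failed = [c.name for c in suite if not c.passed]
    print(f"Suite {label}: pass rate {pass_rate(suite):.2f}, failed: {failed}")
```

The metric is identical for both suites; only the names of the failed cases, which the metric discards, reveal that one failure is cosmetic and the other threatens user data.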
  10. 10. Testing Dashboard
      Updated: 2/21    Build: 38

      Area             Effort        C.
      file/edit        high          1
      view             low           1+
      insert           low           2
      format           low           2+
      tools            blocked       1
      slideshow        low           2
      online help      blocked       0
      clipart          none          1
      converters       none          1
      install          start 3/17    0
      compatibility    start 3/17    0
      general GUI      low           3

      [Q. column: no values in transcript]
      Comments [row alignment lost in transcript]: 1345, 1363, 1401; automation broken; crashes: 1406, 1407; animation memory leak; new files not delivered; need help to test...; need help to test...; lab time is scheduled
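A dashboard like this is easy to keep as plain data. The following is a minimal sketch, using the area, effort, and "C." values from the slide. Reading "C." as a coverage level, and treating 0 as "no coverage yet", are assumptions about the scale; `Row`, `coverage_level`, and `needs_attention` are names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    area: str
    effort: str    # e.g. "high", "low", "blocked", "none", "start 3/17"
    coverage: str  # the "C." column, e.g. "0", "1", "1+", "2+", "3"


# Area, effort, and C. values as they appear on the slide.
DASHBOARD = [
    Row("file/edit", "high", "1"),
    Row("view", "low", "1+"),
    Row("insert", "low", "2"),
    Row("format", "low", "2+"),
    Row("tools", "blocked", "1"),
    Row("slideshow", "low", "2"),
    Row("online help", "blocked", "0"),
    Row("clipart", "none", "1"),
    Row("converters", "none", "1"),
    Row("install", "start 3/17", "0"),
    Row("compatibility", "start 3/17", "0"),
    Row("general GUI", "low", "3"),
]


def coverage_level(row):
    """Numeric part of the coverage mark, ignoring a trailing '+'."""
    return int(row.coverage.rstrip("+"))


def needs_attention(rows):
    """Areas that are blocked or (by assumption) have no coverage yet."""
    return [r.area for r in rows if r.effort == "blocked" or coverage_level(r) == 0]


print(needs_attention(DASHBOARD))
```

The filter only picks out rows to talk about. The explanation of why an area is blocked still has to come from the tester, which is what the comments column is for.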
  11. 11. Activity-based test management is designed to facilitate reporting
      • Thread-based Test Management: This means organizing your whole test effort around test activities that comprise your testing story. You manage testing AND report status from a mind-map.
      • Session-based Test Management: This means organizing testing into “sessions” which are normalized units of uninterrupted test time. You can count these more safely.
      Visualizing Test Progress
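Because sessions are normalized units of time, they can be tallied in a way test cases cannot. The sketch below shows one possible tally. The 90-minute nominal session length is an assumption (the slides do not state one), the log entries are hypothetical, and `Session` and `normalized_sessions` are names invented for this example.

```python
from collections import Counter
from dataclasses import dataclass

SESSION_MINUTES = 90  # assumed nominal session length; not stated in the slides


@dataclass(frozen=True)
class Session:
    area: str
    charter: str
    minutes: int  # uninterrupted test time actually spent


def normalized_sessions(log):
    """Sum test time per area and express it in nominal sessions."""
    minutes = Counter()
    for s in log:
        minutes[s.area] += s.minutes
    return {area: total / SESSION_MINUTES for area, total in minutes.items()}


# Hypothetical log entries, invented for illustration.
log = [
    Session("install", "Explore upgrade from previous build", 90),
    Session("install", "Check uninstall cleanup", 45),
    Session("file/edit", "Stress open/save with large files", 90),
]

for area, count in normalized_sessions(log).items():
    print(f"{area}: {count:.1f} sessions")
```

Even this count needs context to mean anything. Keeping the charter next to each session is what lets the number be traced back to the activity it stands for.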
  12. 12. Visualizing Test Progress
  13. 13. Risk-Based Testing Makes Reporting More Relevant
      Risk Area 1: Status of the product and what we did to test it…
      Risk Area 2: Status of the product and what we did to test it…
      Risk Area 3: Status of the product and what we did to test it…
      (I rarely make a grid like this with a written report, because the artifacts I use to manage testing, day-to-day, are focused on activities, not risks, and I would have to create a special document to do a risk-based report.)
  14. 14. Safety Language (aka “epistemic modalities”)
      “Safety language” in software testing means qualifying or otherwise drafting statements of fact so as to avoid false confidence.
      • Examples: I think… It appears… So far… I infer… It seems… apparently… I assumed…
      • Instead of “The feature worked,” say “I have not yet seen any failures in the feature…”
  15. 15. Safety Language In Action
  16. 16. To test is to construct three stories (plus a bit more)
      Level 1: A story about the status of the PRODUCT…
        …about how it failed, and how it might fail...
        …in ways that matter to your various clients.
      Level 2: A story about HOW YOU TESTED it…
        …how you configured, operated and observed it…
        …about what you haven’t tested, yet…
        …and won’t test, at all…
      Level 3: A story about the VALUE of the testing…
        …what the risks and costs of testing are…
        …how testable (or not) the product is…
        …things that make testing harder or slower…
        …what you need and what you recommend…
        Why should I be pleased with your work?
      (Level 3+: A story about the VALUE of the stories.)
        …do you know what happened? can you report? does report serve its purpose?
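The three levels can serve as a skeleton for a written report. This is a minimal sketch of one way to hold and render such a story; the `Story` class, its field names, and the sample statements are all hypothetical and are not a format prescribed by the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    product_status: list = field(default_factory=list)  # Level 1
    how_tested: list = field(default_factory=list)      # Level 2
    not_tested: list = field(default_factory=list)      # Level 2, the gaps
    testing_value: list = field(default_factory=list)   # Level 3

    def render(self):
        sections = [
            ("Product status", self.product_status),
            ("How we tested", self.how_tested),
            ("Not tested", self.not_tested),
            ("Value of the testing", self.testing_value),
        ]
        lines = []
        for title, items in sections:
            lines.append(title + ":")
            # An empty section is shown explicitly rather than dropped.
            shown = items or ["(nothing reported)"]
            lines.extend(f"  - {item}" for item in shown)
        return "\n".join(lines)


# Hypothetical content, phrased in safety language.
story = Story(
    product_status=["So far, I have not seen any failures in save."],
    how_tested=["Two sessions exploring open and save with small files."],
    testing_value=["Large-file testing is slow; a faster test machine would help."],
)
print(story.render())
```

Printing an empty section as "(nothing reported)" is a deliberate choice in this sketch: a missing "Not tested" section should stand out to the reader instead of vanishing.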
  17. 17. Example Reports
  18. 18. James Bach and Chris Ojaste 12/25/08 james@satisfice.com godai92@live.com
Incident Report
Analysis and Repair of Kraft “Grate-It Fresh” Parmesan Cheese Dispenser
Overview
We fixed a broken Kraft “Grate-It Fresh” self-contained disposable parmesan cheese dispensing unit. This report details the incident, including the problem as it presented to us, analysis of the problem, and corrective action we took.
Situation and Problem
The investigators (Chris and James) were attending a Christmas banquet at which was served pasta along with grated parmesan cheese. The cheese was dispensed from a self-contained disposable unit, inside of which there appeared to be a block of cheese.
“KRAFT Grate-It-Fresh Parmesan Cheese is the easy way to get the bold flavor of freshly grated Parmesan cheese. This unique and convenient all-in-one package, with 100% pure Parmesan cheese and a built-in grater, dispenses freshly grated Parmesan cheese with each easy turn. It’s the most convenient way to top off all your favorite dishes with the dynamic flavor of freshly grated Parmesan.” (http://brands.kraftfoods.com/KraftParm/parmProducts.htm)
By rotating the dial on the bottom of the unit in a clockwise fashion, the cheese is shaved off the block and delivered to the plate by means of gravity. However, our cheese dispenser was not working. Multiple rotations of the dial delivered no cheese at all. Someone had to save Christmas! We resolved to investigate and repair the problem if possible.
Analysis and Repair Process
1. External physical inspection ruled out the possibility of cheese exhaustion as a cause of the problem. By the weight of the unit and by visual inspection through the plastic case, we determined that about 1/3 of a block of cheese remained to be grated.
2. Also by visual inspection we determined the apparent mechanism by which the grater works is consistent with the cheese grater described in US patent 6,412,717.
Specifically, a rotatable grating plate is attached to a threaded spindle that passes through the cheese and through a pressure plate on the opposite side of the cheese. By rotating the grating plate, the pressure plate is forced toward the grating plate by the threads on the spindle. This pushes the cheese into the blades of the grating plate. The grating plate and blades are plastic. The spindle and the pressure plate are also plastic. The spindle seems to be made of a softer plastic than that of the pressure plate.
3. Experimentation established that the mechanism was functioning at least at a minimal level by turning the grater in reverse and observing that the pressure plate pulled away from the cheese. Turning the grater in the correct direction (clockwise) brought the pressure plate back into contact with the cheese, pushing it into the grater. We then noted an increased resistance to turning
  19. 19. consistent with the pressure being placed on the cheese. However, the pressure approached a maximum, then eased, as if the pressure plate was slipping on the threads of the spindle. We conjectured that the threads were stripped.
4. Our first repair strategy was to push the cheese into the grater by hand. We thought that might move the pressure plate past the point where the spindle threads were stripped (assuming that the pressure plate itself was not damaged). To get at the cheese, we removed the grating cap with brute force (surprisingly this did not appear to damage it), which freed the entire mechanism from the enclosing plastic case. This allowed us to provide a great deal of pressure to the pressure plate, in addition to that of the damaged threads on the spindle. This strategy failed. No matter how much pressure we applied, very little cheese came through the grater.
5. This led us to a systematic examination of possible failure mechanisms. Here’s what we came up with:
• The grater blades may be damaged.
• The grating plate may be warped so that the grater blades fail to engage.
• The shape of the cheese face may cause the grater blades to fail to engage.
6. Visual inspection of the blades and grating plate failed to corroborate the hypothesis that the problem lay with the grater mechanism, whereas examination of the cheese block revealed grooves in the cheese face that perhaps could account for the blades failing to get any bite.
7. Our second repair strategy was to remove the cheese from the spindle, flip it over, and replace it so that the grater engaged a pristine face of cheese. This improved the grating by a little bit. At this point we returned to our first strategy and applied manual force to the pressure plate. This improved grating effectiveness dramatically, and slowly moved the pressure plate past the damaged portion of the spindle. We then reassembled the unit.
Outcome
The grater appeared to work.
Subsequent web searches on the product name suggested the probable cause of the initial failure: The downward facing part of the cheese block dried out and became too hard to grate. (Interesting that we did not consider the possibility of dried out cheese in our list of failure modes, in step 5. However, our repair strategy coincidentally worked, even though we misunderstood the root cause.) Other people online have experienced this. Apparently, the cheese is meant to be used within 14 days of breaking the seal. This seems like an unrealistic requirement.
[Figure: Contrast-enhancement of low-res photo of spindle we were examining, showing healthy threads below the region of stripped threads. The pressure plate (at bottom) now rests on healthy threads.]
  20. 20. Development Notes on “Incident Report” By James Bach
Overview
I wrote this report as an exercise to help teach the art of performing an investigation and reporting upon it. Maybe you are young, inexperienced, or a self-taught thinker. Maybe you’d like to compete better to get a job doing something that involves problem-solving or rapid learning. If so, then look for opportunities in your own daily experience to perform an investigation such as this, and write a report about it. Do several of them, and you will have a portfolio of your work to show prospective employers. Regardless of your formal educational background, showing examples of your work speaks boldly about what you can do. Although this report describes an investigation, the general approach I’ve taken here can be applied to many kinds of reports.
General Approach to Reporting
I begin with the question “who am I serving with this report?” and then “what is my goal in making this report?” Usually, I am serving a paying client and my primary goal has to do with helping them solve some specific problem. That’s a start. In this case, however, my clients are my students and colleagues. My goal, here, is to successfully tell the story of a thought process. Success means several things:
• The reader obtains a clear picture of the investigation.
• The reader obtains a useful example of a report.
• The reader feels able to contribute to or criticize the investigation, based on the report.
• The reader learns how a simple event might become a showcase for scientific thinking.
In writing reports, there is nearly always another goal. The author may not be aware of this goal, but here it is: The author’s own reputation as a thinker is enhanced and not diminished. Remember: every report you write affects how people think about you. Your ability to reason, your eye for detail, your commitment, your professionalism, your care for others: all of these qualities and more are being evaluated in the minds of your readers.
I want to write a clean, simple report. I try to minimize clutter and text. I want it to be short, punchy, and readable. I use formatting to help the reader’s eye find relevant information quickly, but I try to reduce the number of formatting elements in the document to avoid slipping into visual confusion. I’m not always sure if I succeed, but that’s my goal. Speaking of formatting, I used the “modern report” template, from Microsoft Word, as a base. Then I changed the fonts to Cambria and Calibri. I use Calibri for bold facing, since a non-serif font looks better in bold and helps to distinguish text from the un-bolded serifed text in the body of paragraphs. Also notice that when I bold text inside a paragraph, I increase the size by one point. I use bolding for emphasis, occasionally italics, but never underlining. Underlining is messy and old-fashioned. I often highlight key ideas with bolding, so that the body of the text will not look like a big gray mass. This improves readability and browsability. I want to help the reader come to his own conclusions even if they might differ from mine. To do that, I include not only my observations, but also information about how my observations were obtained and how they might be mistaken. I separate my inferences from the observations on which they are based (example: “by visual inspection and weight…”) and show how one follows from the other. I
  21. 21. also consider including background information that will help the reader make a better assessment of what I did, such as the references to the patent and to the Kraft website. The structure of the report should support the thinking the reader needs to do. As I design the report, I anticipate the questions the reader will have, and arrange for the answers to those questions to “pop out” from the text. In this case, I felt that a play-by-play narrative of the investigation would serve that need best. I want to use professional vocabulary. Although it can be perfectly fine to write a report in an informal tone, I felt in this case it would be amusing to apply a more formal writing style to this trivial investigation. I was going for something like the rhetorical tone of an NTSB accident investigation. Aside from tone, I also wanted to practice “talking like a tester.” That means speaking with extra precision and objectivity, as compared to casual conversation. Walkthrough Let me walk you through the report to show you how I did it and why I did it that way. This is the masthead that comes with the “modern report” template in Microsoft Word. I like to use a minimalist approach: author, contact information, and date. In some situations I may include more information, such as who commissioned the report, or the version number of the report. Sometimes I struggle with the title of a report. The title is important, because the report may be sitting on a desk with lots of other papers. The title will be the part that catches the eye first. One way to title the report would be to make it quite specific. This can be fine for a one-off report, but usually a report I write is part of a series, or one example from a category of reports. So, I generally prefer a short title that identifies the type of report this is, then I provide specific information in the sub-title. Incident report is an okay title. But I fear it’s a little too generic. Incident could mean anything. 
Investigation report might be better. I chose “incident” because one typical investigative situation is a customer coming to a technical support organization with a problem. These are often called incidents. The purpose of the overview is to communicate the essence of the whole report so that the reader may decide if it’s worth reading at all. The essence of the report is that we found a problem and fixed it. But I can’t just write “Overview: we found a problem and fixed it.” I don’t want my report to sound generic— as if I’ve simply copied the text from another report. Anytime I write something that seems generic, I want to replace it with something that gives at least a bit of detail that is specific to the situation at hand. That’s why I named and described the object that we fixed.
  22. 22. Also notice that there is no table of contents in this report. The biggest problem with a table of contents in a short document is that it conveys the subtle message to the reader that the report is full of fluff that must be puffed up as much as possible to make it look more impressive. I’m annoyed with tables of contents in reports that are less than about fifty pages long. I think they are a waste of space. If the report is more than about 7 or 8 pages long, then I will list the sections of the report in the overview, but I won’t give page numbers. It’s a simple matter for the reader to find the sections in a short document. The reason I describe the situation and problem is to show the focus and motivation of the investigation. This creates a tension that is resolved in the meat of the report. At the end of the report I go back to the top and ask myself if I have answered the questions or dealt with the challenges posed in the situation and problem section. I initially expected to have separate sections to describe the situation and the problem, but there seemed to be too little to say about each of those things, individually. Combining them created a better flow and a critical mass of content. One of the little challenges in writing this was to describe the object we repaired. After trying to describe it in original words, I realized that I could use an official description of it, and a few moments of web searching brought me to the Kraft site. The description was brief enough that I could include it handily in the report. Anything included must be properly attributed, of course. In this case, the full link to the web page makes sense to include, so the reader can look up more information. I took a cell phone picture of the actual unit we repaired, but when I discovered the Kraft website had a handsome official picture, I used that one instead.
I initially expected to have separate sections for analysis and repair, but as in the case of situation and problem, I ended up combining them. In this case, analysis and repair activities were intertwined. I didn’t see a graceful way of detangling them. I numbered the paragraphs to convey a sense of step-by-step order. In fact, the investigation bounced around a lot and branched. Reality is complicated, but part of the reporting process is to organize what happened into a comprehensible narrative. That means the flow of events I report is going to be a bit simpler than what happened in real life. In a complicated investigation I will often film it or take detailed notes to preserve the sequence of events.
  23. 23. In a narrative style of reporting, I strive to create anticipation and interest in the mind of the readers. That keeps them reading and thinking. I want them to follow along and get a sense of the things I considered, and the false steps I made as well as the productive steps. The highlight of the first step of the investigation is the method we used to examine the grater. I wrote “external physical inspection” to distinguish what we did from plausible alternatives such as disassembling the unit, or reading about the unit online. Note on phrasing: See the words “cheese exhaustion.” I suppose I could have written “…the unit had run out of cheese.” That would have been simpler and more accessible, but I was going for a more scientific tone. I once saw an NTSB report refer to “fuel exhaustion” as a cause of an airplane accident, so I emulated that. In order to report credibly about the investigation and repair of a mechanical problem, I needed to describe the mechanism with sufficient detail to allow the reader to appreciate the situation. As I tried to do that, I found myself making up my own terms to describe the various parts of the grater. After a few attempts writing in my own words, I realized that there might be a patent associated with the grater. That patent may include exactly the description I needed. I went to Google patent search and quickly discovered a cheese grater from 1978 (patent 4,082,230) that looked something like the one we had repaired. I thought I would use that patent, until a few minutes later I thought perhaps I should search for “food grater” or just “grater” instead of specifying a cheese grater. This is because patents are sometimes written from the most general standpoint possible in order to maximize the scope of the patent. That search turned up exactly the invention I was looking for. I considered pasting the exact description of the invention from the patent into the report. That didn’t work well. 
The text was too long and complicated. Therefore, I settled for summarizing it using technical terms drawn from the two patents. In making my description, I referred to the patent. That way I have a good reason not to explain the mechanism in any great detail, since the details are implicitly included by reference. I tried to make the steps consistent by putting the action first in each step. Each step begins with some variation of “we did this.” Here the experiment is briefly described. Just enough to create a reasonably detailed mental image in the minds of readers.
  24. 24. The first repair strategy failed. In a report that seeks to describe only the problem and the solution, it is not necessary to describe failed strategies. I included it because this report is also concerned with demonstrating the investigative process itself. In real life, we did not say “let’s systematically examine all possible failure mechanisms.” What we did was bat around some ideas while each of us tried to force the cheese through the grater. In retrospect, however, our chatter seemed equivalent to an open brainstorm of reasons why the product was failing. The narrative would be incomplete unless I show how we ruled out the various possible causes of the problem. That’s done in paragraph 6, which leads into the second, successful repair strategy. The picture of the spindle is crude. It was based on the photo, below, taken with my Blackberry. I should have photographed the spindle outside of the plastic case. It would have been much sharper. I didn’t realize I was going to be writing a full report on this incident at the time of the investigation, or I would have taken (and included) many more photos. Photographs, diagrams, and video bring a wonderful dimension to investigative reports. Because the photo of the spindle was so blurry, I used an image enhancement program to play with the contrast and color balance until I was able to see the threads. Then I added annotation using Microsoft Paint.
  25. 25. In the first draft of the report, I forgot to include the simplest information about the outcome: that the grater appeared to be working. On reading through the draft several times, I fixed that.
Only as I was finishing up the report did it occur to me that I could use Google to discover whether anyone else had been experiencing problems with the Kraft grater. Sure enough there are several reports online. My first reaction to these was “don’t people have better things to do than to complain about a trivial food product on their blogs?” and then I remembered that is sort of what I’m doing by writing this report. Heh heh. People are motivated by lots of different things, I guess.
A troublesome element of the report is that it reveals a major oversight of the investigators: we failed to consider over-dried cheese as the cause of the problem. This makes us look bad, in a way, but in another way, including that information as a post-script shows that we might accept our mistakes and learn from them.
Potential Improvements to the Investigation
It can be difficult to decide how much investigation is enough. We felt satisfied with achieving the repair of the unit, but we hardly exhausted the possible branches of exploration and learning. Here are some ideas for what we could have done:
• We could attempt to measure the properties of the cheese block to quantify the amount of drying that has occurred.
• We could perform experiments to track the drying process.
• We could attempt to develop home-spun countermeasures to prevent the drying from taking place or reverse the drying process, then report on their efficacy.
• We could interview the homeowners to determine the history and provenance of the cheese grating unit. How long had they owned it? When did they first open it?
• We could search for more information online about the properties of the product and its reported problems.
• We could contact Kraft directly and ask about the product.
• We could try the dried cheese with traditional metal graters to see if part of the blame lies with the plastic grating plate.
• We could have consulted other guests at the dinner.
• We could have purchased several units and tested them in parallel.
  26. 26. OEW Case Tool QA Analysis, 8/26/94
Summary
OEW is a complex application that is fairly stable, although not up to our standards for fit and finish. There are no existing tests for the product, only a rudimentary test outline that will need to be translated from German. One full-time and one part-time tester work on the project. Those testers are neither trained nor particularly experienced. The vendor’s primary strategy for quality assurance is a fairly extensive beta test program. We suggest a minimum of one tester to validate the changes to OEW. We also suggest that the developer of OEW work onsite with our test team under our supervision.
Feature Analysis
Complexity: This is a complex application.
• 8 interesting menus
• 68 interesting menu items
• 40 obvious dialogs
• 5 kinds of windows
• 27 buttons on the speedbar
• 120 thousand lines of code
Functionality: This application has substantial functionality.
• Code Generation
• Code Parsing
• Code Diagramming
• Build Invocation
Volatility: The changes in the codebase will be minor.
• Bug fixes.
• Smallish U.I. tweaks.
• Disable support for various things, including build invocation.
Operability: The application is ready for testing immediately. It operates like a late beta or shipping application. The proposed changes will be unlikely to destabilize the app.
Customers: We expect that large codebases will be generated, parsed or diagrammed with this application. About 25% of our beta testers have codebases larger than 200,000 lines. The parsing capability will encourage customers to import their apps.
  27. 27. Risk Analysis
• The risk of catastrophes occurring due to changes in the codebase is small.
• The risk that the much larger and probably more demanding Borland market will be dissatisfied with OEW is significant.
QA Strategies
• Get this into beta 2, or send a special beta 2B to our testers who have large codebases.
• Find beta bangers with large codebases and have them import into OEW.
• Perform rudimentary performance analysis with big codebases.
• Bring the existing OEW testers from Germany onsite.
• Hire a dedicated OEW tester (contractor, perhaps).
• Participate in a doc. and help review.
• Translate existing test outline from German.
• Perform at least one round of compatibility testing.
Schedule
• The QA schedule will track the development schedule.
• It may take a little while to recruit a tester.
Issues
• Are there international QA issues?
  28. 28. [Mind map: PCE Security (PCE Security.mmap, 4/9/2011)]
Prioritizing Security Problems
• Levels of Required Access: 1. No access to LAN, access to PCE over Internet; 2. Access to LAN, but no account on PCE; 3. PCE Account, but no rights within account; 4. Rights to some projects, not others; 5. No rights for particular action within project; 6. All rights and access
• Level of Attack: 1. No special knowledge/accidental; 2. Casual hacker knowledge; 3. User level knowledge of PCE; 4. Special hacker knowledge; 5. Developer level knowledge of PCE
• Levels of Damage
• Levels of Responsibility
Attack Vectors
• Web Client; API; Server-to-Server Communication; Database Direct Attack; MS Project Attack; LDAP attack; Man-in-the-middle; DNS Poisoning; Shoulder Surfing; Social Engineering; Keyloggers and Malware
"Blood in the Water"
• Unconstrained input; Obscure functions; Low level error messages; Technically informative error messages; Third-party components and interfaces; Generic O/S features and interfaces; Default configurations; Source Code; Security based on assumption of no malice; Security Observed; Degrees of freedom in input; Recent vulnerability disclosure in platform component
Testing Activities
• Sniffing/Man-in-Middle Attack; Documentation Review; Whitebox Hazard Analysis; Fingerprinting; Google Hacking; Vulnerability Scanning/Lookup; SQL Injection; Directory Traversal; Cross-site Scripting; Input Constraint Attacks; HTTP Manipulation; Session Hijacking; Permissions Testing
Problems Found
Efforts Going Forward
• Testing and Analysis Activities: Testers must learn security testing basics; Produce a security-specific test coverage outline; Document a concise security-specific test strategy; Consider security implications for testing of each fix and enhancement; Periodically perform general security regression testing; Monitor and apply patches to platform elements
• Development Activities: Create installation notes that clearly delineate security issues; Explain security architecture to testers. Make finding obscure problems easier; Consider reviewing Microsoft security design checklists; Review internal permissions architecture
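The two numbered scales in the mind map (Levels of Required Access and Level of Attack) could be used to rank security problems. The sketch below is one hypothetical way to do that, not the method used on the PCE project: it assumes a problem is more urgent when it needs less access and less attacker knowledge, and it sorts on those two numbers only. The names `SecurityProblem` and `prioritize` are invented for illustration, and the Levels of Damage and Levels of Responsibility scales are left out because the mind map does not list their values.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Scales copied from the mind map. A lower number means the attacker needs less.
ACCESS_LEVELS = {
    1: "No access to LAN, access to PCE over Internet",
    2: "Access to LAN, but no account on PCE",
    3: "PCE Account, but no rights within account",
    4: "Rights to some projects, not others",
    5: "No rights for particular action within project",
    6: "All rights and access",
}
ATTACK_LEVELS = {
    1: "No special knowledge/accidental",
    2: "Casual hacker knowledge",
    3: "User level knowledge of PCE",
    4: "Special hacker knowledge",
    5: "Developer level knowledge of PCE",
}


@dataclass(frozen=True)
class SecurityProblem:
    """Hypothetical record for one reported security problem."""
    title: str
    required_access: int  # key into ACCESS_LEVELS
    attack_level: int     # key into ATTACK_LEVELS

    def __post_init__(self):
        # Reject ratings that are not on the published scales.
        if self.required_access not in ACCESS_LEVELS:
            raise ValueError(f"unknown access level: {self.required_access}")
        if self.attack_level not in ATTACK_LEVELS:
            raise ValueError(f"unknown attack level: {self.attack_level}")


def prioritize(problems: Iterable[SecurityProblem]) -> List[SecurityProblem]:
    """Order problems so the easiest to exploit come first.

    Assumption (not from the mind map): required access is compared first,
    and attacker knowledge breaks ties.
    """
    return sorted(problems, key=lambda p: (p.required_access, p.attack_level))
```

A team might reasonably weight the scales differently, or fold in damage and responsibility once those levels are defined; the point is only that the ratings become sortable once each problem carries them.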
  29. 29. Spot Check Test Report
Prepared by James Bach, Principal Consultant, Satisfice, Inc. 8/14/11
1. Overview
This report describes one day of a paired exploratory survey of the Multi-Phasic Invigorator and Workstation. This testing was intended to provide a spot check of the formal testing already routinely performed on this project. The form of testing we used is routinely applied in court proceedings and occasionally by 3rd-party auditors for this purpose.
Overall, we found that there are important instabilities in the product, some of which could impair patient safety, and many of which pose a business risk of product recall. The product has new capabilities since August, but it has not advanced much in terms of stability since then. The nature of the problems we found, and the ease with which we found them, suggest that these are not just simple and unrelated mistakes. It is my opinion that:
• The product has not yet been competently tested (or if it has been tested, many obvious problems have not been reported or fixed).
• The developers are probably not systematically anticipating the conditions and orientations and combinations of conditions that the product may encounter in the field. Error handling is generally weak and brittle. It may be that the developers are too rushed for methodical design and implementation.
• The requirements are probably not systematically being reviewed and tested by people with good competency in English (e.g. the “Pulse Transmitter” checkbox works in a manner that is exactly opposite to that specified in the requirements; error messages are not clearly written).
These are fixable issues. I recommend:
• Pair up the developers and testers periodically for intensive exploratory testing and fixing sessions lasting at least one full day.
• Require the testers to be continuously on guard for anomalies of any kind, regardless of the test protocol they are following at any given moment. Testers should be encouraged to use their initiative, vary their use of the product, and speak up about what they see. Do not postpone the discovery or reporting of any defect, even a small one, or else defects will build up and the processes creating them will not be corrected.
• The requirements should be reviewed by testers who are fluent in English.
• The developers should carefully diagram and analyze the state model of the product, and redesign the code as necessary to assure that it faithfully implements that state model.
• The developers should perform unit-level testing and systematic code inspection, as per FDA guidance.
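The recommendation to diagram the state model and make the code implement it faithfully can be illustrated with a small sketch. It is hypothetical throughout: the names `DeviceState`, `Screen`, `start_blockers`, and `press_start` are invented, and the guard conditions are assumptions about what a sensible state model for such a product might require, not a description of the actual product code. The idea shown is that every reason for refusing a start is checked in one place, no matter how the user reached the current state.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Set


class Screen(Enum):
    MAIN = auto()
    EXIT_MENU = auto()
    SETTINGS = auto()


@dataclass
class DeviceState:
    """Hypothetical, simplified model of the workstation state."""
    screen: Screen = Screen.MAIN
    exfoliating: bool = False
    orientation_mode: bool = False
    flip_flop_cathodes: Set[int] = field(default_factory=set)
    uncleared_errors: List[str] = field(default_factory=list)


def start_blockers(state: DeviceState) -> List[str]:
    """Return every reason the start button must be refused (empty if none)."""
    reasons = []
    if state.exfoliating:
        reasons.append("exfoliation already running")
    if state.screen is not Screen.MAIN:
        reasons.append("start is only allowed from the main screen")
    if state.uncleared_errors:
        reasons.append("error messages have not been cleared")
    if state.orientation_mode and not state.flip_flop_cathodes:
        reasons.append("orientation mode needs at least one flip-flop cathode")
    return reasons


def press_start(state: DeviceState) -> List[str]:
    """Start only when no guard objects; report the blockers otherwise."""
    blockers = start_blockers(state)
    if not blockers:
        state.exfoliating = True
    return blockers
```

Because the guards are a pure function of the state, they can be unit tested directly and compared line by line against the state diagram.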
  30. 30. 2. Test Process The test team consisted of consulting tester James Bach (who led the testing) and Satisfice, Inc. intern Oliver Bach. The test session itself spanned about seven hours, most of which consisted of problem investigation. Finding the problems listed below took only about two hours of that time. The process we used was a paired exploratory survey (PES). This means two testers working on the same product at the same time to discover and examine the primary features and workflows of the product while evaluating them for basic capability and stability. One tester “plays” while the other leads, organizes and records the work. A PES session is a good way to find a lot of problems quickly. I have used this method on court cases and other consulting assignments over the years to evaluate the quality of testing. The process is similar to that published by Microsoft as the General Functionality and Stability Test Procedure (1999). In this method of testing, we walk through the features of the product that are readily accessible, learning about them, studying their states and interactions, while continuously applying consistency heuristics as test oracles in our search for bugs. Ten such heuristics in particular are on our minds. These ten have been published as the “HICCUPP” model in the Rapid Software Testing methodology. (See http://www.satisfice.com/rst.pdf for more on that.) We filmed most of the testing that we did, and delivered those videos to Antoine Rubicam. We did not test the entire product during our one-day session. However, we sampled the product broadly and deeply enough to get a good feel for its quality. 3. Test Results The severe problems we found were as follows: 1. System crash after switching probes. 
If the orientation mode is improperly configured with the circular probe such that there are no flip-flop mode cathodes active, and the probe is then switched to “dissipated”, the application will crash at the end of the very next exfoliation performed. (This is related to problems #6 and #7) Risk: delay of procedure, loss of user confidence, potential violation of essential performance standard of IEC60601, product recall Implications: The developer may not have anticipated all the necessary code modifications when dissipated mode probe support was added. Testers may not be doing systematic probe swap testing. 2. No error displayed after ion transmitter failure during exfoliation. By pressing the start button more than once in quick succession after an ion transmitter error is cleared, an exfoliation may begin even though the transmitter was not in the correct pulse mode. The system is now in a weird state. After that point, manually stopping the transmitter, changing the pulse rate, or cutting power to the transmitter will not result in any error message being displayed.
  31. 31. Risk: patient death from skin abrasions formed due to unintentionally intensified exfoliation, loss of user confidence, violation of IEC60601-1-8 and 60601-1-6, product recall Implications: There seems to be a timing issue with error handling. The product acts differently when buttons are pressed quickly than when buttons are pressed slowly. Testers may not be varying their pace of use during testing. 3. Error message that SHOULD put system in safe mode does NOT. Ion transmitter error messages can be ignored (e.g. "Exfoliation stopped. Ion flow is not high!"). After two or three presses of the start button, exfoliation will begin even though multiple error messages are still on the screen. Risk: Requirements violation, violation of IEC 60601-1-8 and 60601-1-6, product recall. Implications: Suggests that the testers may not be concerned with usability problems. 4. Can start exfoliation while exit menu is active (and subsequently exit during exfoliation). It should not be possible to press the exit button while exfoliating. However, if you press the exit button before exfoliating and the exit menu appears, the start button has not been disabled, and the exfoliation will begin with the exit menu active. The user may then exit. Risk: unintentional exfoliation, loss of user confidence, violation of IEC60601-1-6, product recall Implications: Problems like this are why a careful review of the product state model and redesign of the code would be a good idea. The bug itself is not likely to cause trouble, but the fact that this bug exists suggests that many more similar bugs also exist in the product. 5. Probe menu freezes up after visiting settings screen (and at other apparently random times). Going to settings screen, then returning, locks the probe mode menu until an exfoliation is started, at which point the probe mode frees up again. We found that the menu may also lock at apparently random intervals. 
Risk: loss of user confidence Implications: Indicates state model confusion; variables not properly initialized or re-initialized. 6. Partial system freeze after orientation mode failure. When in orientation mode with no cathodes selected for flip-flop, an exfoliation session can be started, which is allowed to proceed until flip-flop phase is activated. At that point, an error message displays and system is locked with "orientation and flip-flop" modes both selected on the exfoliation mode menu. The settings and exit buttons are also inoperative at that point. (This state can also be created by switching probes. It is related to problems #1 and #7.) Risk: Procedure delay, loss of user confidence, product recall Implications: Indicates state model confusion; variables not properly initialized or re-initialized. 7. No error is displayed when orientation session begins and flip-flop cathodes are not activated. When in orientation mode with no cathodes selected for flip-flop, an exfoliation session can be started. Instead, an error message should be generated. (This is related to problems #1 and #6.)
  32. 32. Risk: loss of user confidence, creates opportunity for worse problems Implications: Suggests the need for a deeper analysis of required error handling. Testers may not be reviewing error handling behaviors. 8. Cathode 10 active in standing mode after deactivating all cathodes in flip-flop mode. De-selection of cathodes in flip-flop or standing mode should cause de-selection of corresponding cathodes in the other mode. However, de-selecting all flip-flop cathodes leaves cathode 10 still active in standing mode. It’s easy to miss that cathode 10 is still active. Risk: creates opportunity for confusion, possible inadvertent exfoliation with cathode 10, possible violation of IEC60601-1-6 Implications: Suggests that the testers may not be concerned with usability problems. 9. Error message box can be shown off-screen. Error message boxes display at the location where the previous box was dragged. This memory effect means that a message box may be dragged to the side, or even off the screen, and thus the next occurrence of an error may be missed by the operator. Risk: creates opportunity for confusion, possible for operator to miss an error, violation of IEC60601-1-8 and 60601-1-6; when combined with bug #3, it could result in harm to the patient. Implications: Suggests that the testers may not be concerned with usability problems. 10. Behavior of the "Pulse Transmitter" checkbox is the opposite of that specified in the FRS. The FRS states “By selecting Pulse Transmitter checkbox application shall allow to perform exfoliation session with manual controlled transmitter.” However, it is actually de-selecting the checkbox that allows manual control. Risk: business risk of failing an audit. It is potentially dangerous, as well as illegal, for the product to behave in a manner that is the opposite of its Design Inputs and Instructions for Use. 
Implications: This is a common and understandable problem in cases where the specifications are written by someone not fluent in English. It is vital, however, to word requirements precisely and to test the product against them. Bear in mind that the FDA personnel probably will be native English speakers. 11. Setting power to zero on a cathode does not cause the power to be less than 10 watts. According to the log file, the power is well above the standard for “0” laid out in IEC60601. (Also, displaying “---” instead of “0” does not get around the requirement laid out in the standard. This is true not only because it violates the spirit of the standard, but also because the target value is displayed as “0” and the log file lists it as “0”.) Risk: violation of IEC60601, product recall Implications: The testers may not be familiar with the requirements of IEC60601. They may not be testing at zero power because the formal test protocol does not require it. Here are the lower severity problems we found:
  33. 33. 12. "Time allocated for cathode 10 is too short" message displays when time is rapidly dialed down. The message only displays when the time is dialed down rapidly, and we were not able to get it to display for any cathode other than 10. 13. Pressing ctrl key from exit menu causes immediate exit. 14. Exfoliation tones mysteriously change when only one cathode is active in standing mode. The exfoliation tone for flip-flop mode is sounded for standing mode when all but one cathode is deactivated. 15. Power can be set to zero during exfoliation without cancelling exfoliation. Since an exfoliation cannot be started without at least one cathode set to a power greater than 0, and since deactivating a cathode during an exfoliation session prevents it from being re-activated, it is inconsistent to allow cathodes to be set to “0” power during an exfoliation unless they are subsequently de-activated. 16. Power can be set to 1, which is unstable. Does it make sense to allow a power level of 1? The display keeps flickering between 1 and “---”. 17. If orientation is used, the user may inadvertently fail to set temperature limit on one of the exfoliation modes. Flip-flop and standing have different temperature limit settings. In our testing, we found it difficult to remember to set the limit on both modes before beginning the exfoliation session. This is a potential usability issue. 18. "Error-flow in standby mode should be low" message displayed at the same time as "Exfoliation stopped. Transmitter flow is not high!" This is a confusing pair of messages, which seem to require that the transmitter be in low flow and high flow at the same time. 19. Error messages stack on top of each other. If you press start with 0 power more than once, then more than one error message is displayed. Each additional press displays another copy of the message.
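Problems 9 and 19 both concern how error message boxes are placed and stacked. The sketch below shows one possible way to handle both, under assumptions of my own: it is not the product's code, the class name `ErrorDialogs` and its methods are invented, and the box size defaults are arbitrary. A remembered drag position is clamped back onto the screen before the next box is shown, and a message that is already open is not opened a second time.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Position = Tuple[int, int]


@dataclass
class ErrorDialogs:
    """Hypothetical manager for error message boxes on one screen."""
    screen_w: int
    screen_h: int
    box_w: int = 300  # assumed box size in pixels
    box_h: int = 120
    last_pos: Position = (0, 0)
    open_boxes: Dict[str, Position] = field(default_factory=dict)

    def _clamp(self, x: int, y: int) -> Position:
        # Keep the whole box inside the visible screen area.
        max_x = max(0, self.screen_w - self.box_w)
        max_y = max(0, self.screen_h - self.box_h)
        return (min(max(x, 0), max_x), min(max(y, 0), max_y))

    def show(self, message: str) -> Position:
        # Re-showing an open message reuses its box instead of stacking another.
        if message not in self.open_boxes:
            self.open_boxes[message] = self._clamp(*self.last_pos)
        return self.open_boxes[message]

    def drag(self, message: str, x: int, y: int) -> None:
        # The operator may drag a box anywhere; the position is remembered.
        if message in self.open_boxes:
            self.open_boxes[message] = (x, y)
            self.last_pos = (x, y)

    def dismiss(self, message: str) -> None:
        self.open_boxes.pop(message, None)
```

Whether a remembered position should be kept at all is a design question for a safety-related alarm; clamping is simply the smallest change that stops a new error from appearing off-screen.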