Vendor Meets User

The Hexawise Test Design Tool and Two
Testers Who Tried to Use it in Real Life

            Presented at CAST, 2011
                  Justin Hunter
                Lanette Creamer
              Ajay Balamurugadas
Topics

                    Introduction to Hexawise
                    “Inside the Mind of the Vendor”
                              Justin Hunter

                          Experiences
                   “Inside the Minds of Testers”
              Lanette Creamer & Ajay Balamurugadas


Hexawise is a test case design tool used by testers to design their tests. In a context
where test scripts are used, Hexawise can design detailed test scripts. In an
Exploratory Testing context, Hexawise is used to generate “test ideas” that encourage
the tester to explore, and even to design mini tests on the fly.


                                            2
My Dad - William G. Hunter




Why I created Hexawise has a whole lot to do with who my father was. He was a leading
applied statistician who specialized in how to make experiments more efficient & effective.




                                             3
1960’s




In the 1960’s my dad brought my family to Singapore where he taught at a university and
worked with local companies.




                                            4
1970’s




In the 1970’s my dad brought my family to Nigeria where, again, he taught at a university
and worked with local companies. That’s me in the lower right hand corner. We were
visiting a factory my father was helping.




                                             5
1980’s




In the 1980’s he did something that his colleagues thought was pretty crazy because they
thought his expertise and lessons learned probably wouldn’t transfer into the government
sector. While a professor in Madison, Wisconsin, he started collaborating with local and
state government agencies.




                                            6
Design of Experiments




Why did he uproot our family every few years and then start collaborating with
government agencies?

It was because he passionately believed that sharing his expertise in Design of
Experiments (a specialized field of applied statistics), would really help people - by giving
them skills that would improve both quality and productivity. This is a book cover from a
book he co-wrote with George Box that has helped to increase awareness of what Design
of Experiments is and how practitioners should use it.
                                              7
Design of Experiments




So what is Design of Experiments? It’s a specialized field of applied statistics that has
been around since the 1930’s. Simply put, it is focused on answering this question.




                                              8
Design of Experiments




Where is Design of Experiments used?

It is used extensively in manufacturing, among other industries. If you’re a manufacturer
trying to create a widget for a car part, for example, you don’t want to have to build
100,000 different prototypes of the widget before you stumble across a combination of
heat, pressure, temperature, and ingredients that achieves the desired
characteristics. You’d want to build a small handful of prototypes, with the variables
going into each one carefully varied from prototype to prototype so that you
learn as much as possible in as few experiments as possible. That’s what auto manufacturers
regularly do.
                                            9
Design of Experiments




Where Else is Design of Experiments used?

Design of Experiments-based methods have also been commonly used in agriculture for
decades. If you’re Monsanto and you want to grow a hardier seed that will grow in colder
temperatures and mature more quickly, you’re going to use Design of Experiments
methods to identify the combinations of variables to test together in each test you
execute.


                                          10
Design of Experiments




Where Else is Design of Experiments used?

Many marketers also use Design of Experiments methods extensively. YouTube
recently ran an experiment involving 1,024 different combinations of fonts, colors,
messages, and button sizes and shapes to find an optimal combination that increased
their sign-up rate by more than 15%. Jeff Fry, a tester at Microsoft, wrote a good article
about this and posted a phenomenal video by a Design of Experiments expert who
worked at Amazon before moving to Microsoft.

A/B testing is a very simple “watered down” DoE approach. Multi-Variate Testing (MVT)
is “full-blown” DoE-based marketing.
                                            11
Design of Experiments




            What about
            in Software
              Testing?
This is the question I’ve been focused on for the last 5 years. Seventeen pilot projects I
conducted at my prior company convinced me that DoE-based methods consistently
deliver improvements in efficiency and effectiveness over teams’ business-as-usual test
cases.


                                            12
What is Hexawise?


           Challenges Hexawise Addresses


  Problems During                              Impact Felt During
   Test Design...                                Test Execution
  Manual Documentation                             Delayed Start

  Largely Repetitive Tests                         Inefficient

  Gaps in Coverage                                 Missed Defects
Hexawise was created to address these common testing challenges.
                                        13
Mortgage Application Example




Let’s use this simple example to demonstrate how pairwise testing works. I’ve
borrowed this idea from a presentation that Bernie Berger gave at StarEast.

Imagine you’re testing a mortgage application that has several sets of details. This is
an “executive summary” view of the different options that could be selected for
application. We could make this example more complicated by including hardware and
software configuration options, user types, etc. We’re intentionally keeping it simple
here.
                                           14
Mortgage Application Example




If we just examine one of those three branches, we see that it has 27 possible test
combinations associated with it. For example, 1 of the 27 possible tests would include:

One example: Income = Low & Credit Rating = Medium & Customer Status = VIP

There are 26 other similar combinations.
                                           15
Mortgage Application Example




When we examine all three branches, we see they have equal complexity. Each of the
three branches has 27 total possible combinations. How many total combinations are
there? Hint: It is not 81.
                                         16
Which Tests Should You Choose?




                       27 X 27 X 27 =

                19,683 Possible Tests


There are almost 20,000 possible combinations to choose from.
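The arithmetic on this slide can be checked with a short Python sketch. It assumes, as the 27-per-branch figure implies, that each of the three branches holds three inputs with three values each. Only Income, Credit Rating, and Customer Status are named in the notes; the other six input names are placeholders.

```python
from itertools import product
from math import prod

# Hypothetical model: 3 branches x 3 inputs x 3 values each.
# Only the first branch's input names come from the slides; the rest are placeholders.
branches = {
    "branch_1": {"Income": 3, "Credit Rating": 3, "Customer Status": 3},
    "branch_2": {"input_4": 3, "input_5": 3, "input_6": 3},
    "branch_3": {"input_7": 3, "input_8": 3, "input_9": 3},
}

# Combinations inside each branch, then across all branches.
per_branch = {name: prod(inputs.values()) for name, inputs in branches.items()}
total = prod(per_branch.values())

# Cross-check by brute-force enumeration of every combination.
value_counts = [n for inputs in branches.values() for n in inputs.values()]
enumerated = sum(1 for _ in product(*(range(n) for n in value_counts)))

print(per_branch)
print(total, enumerated)
```

The branch totals multiply rather than add, which is why the answer is 27 x 27 x 27 and not 27 + 27 + 27 = 81.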

                                         17
Prioritization

                     How many test inputs are needed
                     to trigger defects in production?

[Chart. Categories: 1; 2 (“pairwise”); 3; 4, 5, or 6. Slice values (order not recoverable): 5%, 11%, 51%, 33%.]

In order to prioritize which specific combinations should be selected as high priority tests from those ~20,000 possible tests, it is
extremely important to understand that the vast majority of defects in production can be triggered by just two test inputs tested together
(e.g., a test that includes Income = Low as the first test input and also includes Credit Rating = High as the second test input).

This fact has extremely important implications for software testers. Unfortunately, very few software testers are aware of (a) this fact, or
(b) the implications. The implication for software testers is that small sets of tests that ensure every possible PAIR of values gets tested
together in at least one test case are extremely efficient and effective at finding defects.

• Medical Devices: D.R. Wallace, D.R. Kuhn, Failure Modes in Medical Device Software: an Analysis of 15 Years of Recall Data, International Journal of Reliability, Quality, and Safety Engineering, Vol. 8, No. 4, 2001.
• Browser, Server: D.R. Kuhn, M.J. Reilly, An Investigation of the Applicability of Design of Experiments to Software Testing, 27th NASA/IEEE Software Engineering Workshop, NASA Goddard SFC, 4-6 December, 2002.
• NASA database: D.R. Kuhn, D.R. Wallace, A.J. Gallo, Jr., Software Fault Interactions and Implications for Software Testing, IEEE Trans. on Software Engineering, vol. 30, no. 6, June, 2004.
• Network Security: K.Z. Bell, Optimizing Effectiveness and Efficiency of Software Testing: a Hybrid Approach, PhD Dissertation, North Carolina State University, 2006.

                                                                                                              18
Mortgage Application Example




Select a couple of pairs of test inputs from this mind map. Possible pairs of inputs could
include pairs like the ones shown below. Select your own two pairs, though.

First Example of a pair of values: Income = Low & Credit Rating = Medium

Second Example of a pair of values: Income = High & Customer Status = Employee

Third Example: Income = High & Credit Rating = Low
                                       19
These are the same test inputs that have been imported from the mind map into the
Hexawise tool. When you click on the “Create Tests” icon at the top of the screen, you
will see a pairwise testing solution. Each and every pair of values you might have
selected will be included in a surprisingly small number of tests.




                               20
Only 17 tests are required (out of 19,683 possible tests) to test every single possible pair
of test conditions together in the same test case at least once.

The pair of conditions we selected at random (Income = Low, Credit Rating = Medium)
appears in test number 8. All the other pairs are also tested together at least once.

If we have done a thorough job of identifying test inputs, the vast majority of defects
will be triggered by these 17 tests out of almost 20,000 possible tests. This is the lesson from
Design of Experiments that has been learned and applied in so many other industries
since the 1930’s and is now being applied by an increasing number of software testers.
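To make “every pair tested together at least once” concrete, here is a minimal greedy sketch in Python. It is an illustration only and not the algorithm Hexawise uses, which the slides do not describe. The function names are made up for this example, and the third Customer Status value is a placeholder.

```python
from itertools import combinations, product

def all_pairs(params):
    """Every pair of (input, value) assignments that a pairwise set must cover."""
    names = list(params)
    return {
        ((a, va), (b, vb))
        for a, b in combinations(names, 2)
        for va in params[a]
        for vb in params[b]
    }

def pairs_in(test, names):
    """The pairs exercised by one test, given as a tuple of values in input order."""
    assigned = list(zip(names, test))
    return set(combinations(assigned, 2))

def greedy_pairwise(params):
    """Repeatedly pick the candidate test that covers the most still-uncovered pairs."""
    names = list(params)
    candidates = list(product(*(params[n] for n in names)))
    uncovered = all_pairs(params)
    tests = []
    while uncovered:
        best = max(candidates, key=lambda t: len(pairs_in(t, names) & uncovered))
        tests.append(dict(zip(names, best)))
        uncovered -= pairs_in(best, names)
    return tests

# Small illustrative model: one branch of the mortgage example.
params = {
    "Income": ["Low", "Medium", "High"],
    "Credit Rating": ["Low", "Medium", "High"],
    "Customer Status": ["VIP", "Employee", "Other"],  # "Other" is a placeholder value
}
tests = greedy_pairwise(params)
for number, test in enumerate(tests, start=1):
    print(number, test)
```

The same function works on a nine-input model, but it rescans all 19,683 candidates on every pass, and the number of tests it returns need not match the 17 that Hexawise produces.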
                                           21
This is the same set of 17 tests, shown again in case the speaker notes on the last slide
covered a pair of values you were trying to confirm were tested together.




                                            22
The tests “front-load” coverage. 87% of the pairs of values have already been tested
together by the end of the 9th test.

This is simply a coverage chart showing what percentage of test input pairs have been
tested together so far as a percentage of the total number of possible pairs that could
be tested together in the System Under Test.
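A coverage curve like this one can be computed from any ordered list of tests. The sketch below is one way to do it in Python, assuming every test assigns a value from the model to every input. The function name and the tiny two-value model are made up for illustration.

```python
from itertools import combinations

def cumulative_pair_coverage(tests, params):
    """Percent of all possible value pairs covered after each test, in order."""
    names = list(params)
    total = sum(len(params[a]) * len(params[b]) for a, b in combinations(names, 2))
    seen = set()
    curve = []
    for test in tests:
        assigned = [(n, test[n]) for n in names]
        seen.update(combinations(assigned, 2))
        curve.append(100.0 * len(seen) / total)
    return curve

# Hypothetical model: three inputs with two values each (12 possible pairs).
params = {"A": [0, 1], "B": [0, 1], "C": [0, 1]}
tests = [
    {"A": 0, "B": 0, "C": 0},
    {"A": 0, "B": 1, "C": 1},
    {"A": 1, "B": 0, "C": 1},
    {"A": 1, "B": 1, "C": 0},
]
curve = cumulative_pair_coverage(tests, params)
print(curve)  # [25.0, 50.0, 75.0, 100.0]
```

In this toy model every test adds three new pairs, so the curve is a straight line. In larger models the early tests add the most new pairs, which is the “front-loading” described above.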



                                           23
User Experiences




             Lanette Creamer’s Experiences



Now let’s hear from Lanette and Ajay...




                                          24
HEXAWISE
First used on June 08, 2009
FIRST STEPS

Interested to try new software
Aware of allpairs.exe

Problem Statement
    Multiple Printers
    Printer Specific Charts
    Chart 1 & Chart 2
    Other Settings

     Sl. No     Printer   Chart 1        Chart 2        Settings
     1          ABC       16 strips      64 strips      Borderless
     15         LMN       64 strips      -------        Auto cut
     45         XYZ       -------        16 strips      -------
TESTED WITH ALLPAIRS

Excel to Notepad to Excel
Very useful when all pairs are valid

Unable to mention invalid pairs

Steps to be repeated based on cases

Maintained a common repository
HEXAWISE
Easy to use
One user account - anytime accessible
Can specify invalid pairs
Multiple strength cases
HEXAWISE WISH LIST
Desktop version - useful without internet too
Able to define invalid pairs after the cases are generated
Easy method to define invalid pairs
Need to try project sharing & Excel import
Disclaimer: Thinking Req’d




This is a photo I saw Lanette use in earlier presentations. It is absolutely spot on in this
context. Designing DoE-based software tests is not a paint-by-numbers approach.
You need to use your critical thinking skills. Without using them, there will be a
garbage in / garbage out problem.

                                             30
Disclaimer - Imperfect Models




When you use a Design of Experiments-based test design tool, you effectively create a
model that will generate your tests. Whenever you do so, there will be parts of the
System Under Test you will miss. Perhaps (probably?) you will miss important parts.
                                          31
Disclaimer - Which Inputs?




When creating DoE-based software tests, you will face the same kinds of test design
considerations you always have... as well as new, DoE-specific considerations.

                                          32
“The Truffle Pig Problem”
Design of Experiments-based test design methods face a “truffle pig problem.” If
software bugs were like leaves on your lawn that you wanted to get rid of, DoE-based
test design methods would be much more popular than they are now. They would be
the equivalent of a leaf blower: you’d be able to instantly see your productivity increase.

Unfortunately, bugs are not visible the way leaves are. They’re hiding, unseen, like truffles. In
my experience, DoE-based test design methods are like an especially thorough
and efficient truffle pig. The problem is, of course, that if someone gave you a super
truffle pig that was twice as good at finding truffles as your regular truffle pig, you
would probably have a hard time assessing how good it was. DoE-based test design
methods face this same challenge.




                                             33
How Can You Know?




Here’s the best approach I’ve come up with to answer the question of “how can you
know DoE-based test design methods are better than manually-selected test cases?”



                                         34
“Let’s test this hypothesis.”




Even though we can complete a meaningful “bake-off” pilot project within just a couple
of man-days of effort, this is the typical reaction I get from test teams to whom I propose this
approach! (brief video of office mates diving under desks, hiding under plants, etc.)

[Slide labels: Cereal Box; Toyota - Entering the Truck market in the U.S]

It is amazing how quickly people tend to run and hide when they are given the
opportunity to learn something that could fundamentally change their software testing
effectiveness.

The irony is that teams will say “we’re too busy to execute a one or two day pilot project
now.” Hello? In my experience, pilots show that the approach, on average, more than
doubles the number of defects found per tester hour. The entire point of learning about
Design of Experiments-based test design techniques, like pairwise and 3-way, and
orthogonal array-based / OA testing is to improve your efficiency and effectiveness... so
you can get much more done with fewer resources. Saying “I’m too busy to learn how
to do that” is... shortsighted is probably the most diplomatic word.


                                          35
Results: Less Test Design Time

                            Different Test
                                                           Different
            Same                Design
                                                            Results
                              Approach
             System                                      Time to Design
            ~ 30 Test
           Under     - 40% Less Time                            Tests
                                 Identify tests
                                  manually vs.            Combinatorial
            Test Ideas Test Generation
            (b/c Many generate tests
                                                             Coverage
               Steps are Automated) of
                              using a Design
                                Experiments-           No. of Bugs Found
               Time               based tool             Time to Execute
In my experience, teams that have agreed to pilot projects have seen these results. It
                                                                Tests
takes far less time to generate tests using this approach because many steps in the test
case selection process and test case documentation process get automated.


                                                              ?
                                           36
Results: Better Coverage

[Diagram]
Same: System Under Test; Test Ideas; Time
Different Test Design Approach: identify tests manually vs. generate tests using a Design of Experiments-based tool
Different Results: Time to Design Tests; Combinatorial Coverage; No. of Bugs Found; Time to Execute Tests

In my experience, it is easy to show that combinatorial coverage (e.g. how many pairs of
values, how many triples of values, etc. are tested together) is far superior with this
approach. In this actual example from a couple of months ago, we showed that 51
business as usual tests that were put together manually left more than
1,400 pairs of values untested.
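A gap analysis of this kind can be sketched in a few lines of Python: list every pair the model allows, then subtract the pairs that the existing tests exercise. The function name, the small model, and the seven “manual” tests below are all hypothetical. The third Customer Status value is a placeholder.

```python
from itertools import combinations

def missing_pairs(tests, params):
    """Return the value pairs that no test in the set exercises together."""
    names = list(params)
    required = {
        ((a, va), (b, vb))
        for a, b in combinations(names, 2)
        for va in params[a]
        for vb in params[b]
    }
    covered = set()
    for test in tests:
        covered.update(combinations([(n, test[n]) for n in names], 2))
    return required - covered

params = {
    "Income": ["Low", "Medium", "High"],
    "Credit Rating": ["Low", "Medium", "High"],
    "Customer Status": ["VIP", "Employee", "Other"],  # "Other" is a placeholder value
}

# Hypothetical manual set: start from a baseline and change one input at a time.
rows = [
    ("Low", "Low", "VIP"),
    ("Medium", "Low", "VIP"),
    ("High", "Low", "VIP"),
    ("Low", "Medium", "VIP"),
    ("Low", "High", "VIP"),
    ("Low", "Low", "Employee"),
    ("Low", "Low", "Other"),
]
manual_tests = [dict(zip(params, row)) for row in rows]

gaps = missing_pairs(manual_tests, params)
print(len(gaps), "pairs never tested together")
```

In this toy case the seven tests leave 12 of the 27 possible pairs untested, for example Income = Medium with Credit Rating = Medium.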
A skeptic will probably say... “OK. Interesting, but what does that translate to in terms
of actual defects found?”
                                            37
Results: More Bugs Found

[Diagram]
Same: System Under Test; Test Ideas; Time
Different Test Design Approach: identify tests manually vs. generate tests using a Design of Experiments-based tool
Different Results: Time to Design Tests; Combinatorial Coverage; No. of Bugs Found; Time to Execute Tests

Callout: Will depend upon: (1) the System Under Test, (2) Test Designer skill, and
(3) the coverage strength of the DoE-based tests. My experience from dozens of
projects: 2-way DoE-based tests consistently find more.

In my experience, 2-way (or pairwise) tests - using the same test ideas as used in
business as usual tests - have consistently found more defects than the business as
usual tests. This is true even when the business as usual tests are far higher in number
than the pairwise set of tests.

If you used 3-way or 4-way sets of tests, the number of defects found by this Design of
Experiments-based test design approach would be far higher than found using the
business as usual approach.
                                           38
Results: ~2x Bugs / Hour

[Diagram]
Same: System Under Test; Test Ideas; Time
Different Test Design Approach: identify tests manually vs. generate tests using a Design of Experiments-based tool
Different Results: Time to Design Tests; Combinatorial Coverage; No. of Bugs Found; Time to Execute Tests

Callout: Will depend upon: (1) the System Under Test, (2) Test Designer skill, and
(3) the coverage strength of the DoE-based tests. My experience from dozens of
projects: 2-way DoE-based tests consistently find MANY more bugs / hour (often double).

The number of defects is higher using Hexawise. The number of tests executed is
lower using Hexawise. On average, in the dozens of pilot projects I have seen, the
number of defects found per tester hour is often double the number of defects found
per tester hour from business as usual sets of tests.
                                            39
How Can You Know?




I would strongly encourage you to try a simple one or two day pilot project. In fact, I’ll
help you do it if you agree to publish the results (whether good or bad).
                                           40
Additional Information




                  James Bach - Pairwise Testing: A Best Practice that Isn’t




We’ve barely scratched the surface on the topic of what Design of Experiments-based
test design is and how you could get started using it. Here are some good sources to
find out more about it. I am happy to talk to you about it if you have any questions.
                                         41
Questions?




Thank you all for your time. Any questions?




                                         42
Appendix Slides



           43
Select Your Thoroughness Goal

Testing for every pair of input values is just a start. The test designer can
generate plans with very different levels of testing thoroughness.
The 2-way test cases Hexawise generates have been consistently shown to be more thorough than standard test cases
created by most test teams at Fortune 500 firms. Even so, Hexawise allows users to “turn up the coverage dial” dramatically
and generate other, extraordinarily thorough, sets of tests. In this case, we see Hexawise can generate test set solutions for
this simple insurance ratings engine example ranging in size from 28 test cases (for users who prioritize speed to market) all
the way up to 3,925 test cases (for users who desire extremely thorough testing).
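One way to see why the test counts grow so quickly as the “coverage dial” is turned up is to count how many t-way combinations of values there are to cover at each strength. The Python sketch below does that count. The insurance ratings engine model is not given in the slides, so the example reuses the size of the mortgage model (nine inputs with three values each) as an assumption. The function name is made up for this example.

```python
from itertools import combinations
from math import prod

def t_way_targets(value_counts, t):
    """Number of distinct t-way value combinations across all inputs."""
    return sum(prod(subset) for subset in combinations(value_counts, t))

# Assumed model size: nine inputs with three values each.
value_counts = [3] * 9

targets = {t: t_way_targets(value_counts, t) for t in (2, 3, 4)}
for t, count in targets.items():
    print(f"{t}-way combinations to cover: {count}")
```

Each increase in strength multiplies the number of targeted combinations several times over, which is why higher-strength plans need many more tests.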




                                                               44
How Much is Enough Testing?

The “Analyze Coverage” screen shows you how much coverage is achieved
at each point in the set of tests. In other words, what percentage of the
targeted combinations have been tested for after each test?




                      This chart gives teams the ability to make fact-based decisions about “how
                      much testing is enough?” Here, for example, 83% of the pairs of test inputs
                      entered into this plan have been tested together after only 12 tests (out of
                      295,000 possible tests).
                                             45
Better Than Hand-Selected Tests

If you take a close look at any set of Hexawise-generated test cases you
will notice that there is an enormous amount of variation from test case
to test case (and the smallest amount of repetition that is mathematically
possible to achieve).
In contrast, if you were to translate your existing manually-selected test cases
into a similar format and analyze them, you would find that the manually-selected
test cases have far more repeated test combinations and far less variation from
test case to test case. This is a big part of the reason why Hexawise generates
dramatic efficiency improvements.

In addition, if you were to graph the percent of the targeted 2-way combinations
achieved by your existing manually-selected test cases, you would find that there
are many pairs of test inputs that were never covered by your tests. The fact that
Hexawise will ensure every pair of test inputs gets tested in at least one test case
is a big part of the reason why Hexawise-generated tests result in superior
coverage and more defects found during test execution.


                                     46
What is DoE-based testing?


Topic        Details

Definition   Design of Experiments-based testing is a test design approach used
             to identify a small subset of tests (from many possible ones) in
             order to find as many defects as possible in as few tests as possible.

Why it       Test conditions are constructed to ensure:
Works        • No combinations of conditions get accidentally omitted
             • Unproductive repetition is minimized

“AKA”        “Design of Experiments-based testing” covers several closely
             related subjects:
             • Pairwise / AllPairs
             • Orthogonal Array / OA / OATs
             • 2-way, 3-way, ... t-way

                                    47
Software Testing Challenges


• Software applications are very complex; it is impossible to test every possibility

• Extraordinarily smart, pragmatically-oriented applied statisticians created the field
  of “Design of Experiments” to solve exactly this challenge; for the last 40+ years
  they have developed highly effective math-based covering array techniques and
  similar strategies which are now broadly used in many areas including
  manufacturing, advertising, and agriculture

• These proven Design of Experiments techniques, which are designed to find out
  as much information as possible in as few test cases as possible, also have direct
  applicability to the software testing field

• Unfortunately, the vast majority of software testers in the relatively young field of
  software testing have never heard of any Design of Experiments concepts like
  MFAT vs. OFAT, Orthogonal Array coverage, pairwise coverage, or even the
  existence of the “Design of Experiments” field

• Instead of using 40+ years of Design of Experiments-based knowledge to design
  tests that are as effective as possible, testers almost always manually select the
  combinations of test conditions they use in their tests, and as a result...
                                               48
Results without DoE / Hexawise


... the results from manual test case selection efforts are consistently far
from optimal:

        Missed combinations                      Wasteful repetition




                                     49
Results with DoE / Hexawise


In contrast, Hexawise algorithms use Design of Experiments-based
methods to generate tests. The result is that Hexawise-generated
tests consistently find more defects in fewer tests. Hexawise-
generated tests pack more coverage into each test.




                                   50

Hexawise Software Test Design Tool - "Vendor Meets User" at CAST Software Testing Conference 2011 - with speaker notes

  • 1. Vendor Meets User The Hexawise Test Design Tool and Two Testers Who Tried to Use it in Real Life Presented at CAST, 2011 Justin Hunter Lanette Creamer Ajay Balamurugadas
  • 2. Topics Introduction to Hexawise “Inside the Mind of the Vendor” Justin Hunter Experiences “Inside the Minds of Testers” Lanette Creamer & Ajay Balamurugadas Hexawise is a test case design tool used by testers to design their tests. In a context where test scripts are used, Hexawise can design detailed test scripts. In an Exploratory Testing context, Hexawise is used to generate “test ideas” that encourage the tester to explore, and even design mini tests on the fly. 2
  • 3. My Dad - William G. Hunter Why I created Hexawise has a whole lot to do with who my father was. He was a leading applied statistician who specialized in how to make experiments more efficient & effective. 3
  • 4. 1960’s In the 1960’s my dad brought my family to Singapore where he taught at a university and worked with local companies. 4
  • 5. 1970’s In the 1970’s my dad brought my family to Nigeria where, again, he taught at a university and worked with local companies. That’s me in the lower right hand corner. We were visiting a factory my father was helping. 5
  • 6. 1980’s In the 1980’s he did something that his colleagues thought was pretty crazy because they thought his expertise and lessons learned probably wouldn’t transfer into the government sector. While a professor in Madison, Wisconsin, he started collaborating with local and state government agencies. 6
  • 7. Design of Experiments Why did he uproot our family every few years and then start collaborating with government agencies? It was because he passionately believed that sharing his expertise in Design of Experiments (a specialized field of applied statistics), would really help people - by giving them skills that would improve both quality and productivity. This is a book cover from a book he co-wrote with George Box that has helped to increase awareness of what Design of Experiments is and how practitioners should use it. 7
  • 8. Design of Experiments So what is Design of Experiments? It’s a specialized field of applied statistics that has been around since the 1930’s. Simply put, it is focused on answering this question. 8
  • 9. Design of Experiments Where is Design of Experiments used? It is used extensively in manufacturing, among other industries. If you’re a manufacturer trying to create a widget for a car part, for example, you don’t want to have to build 100,000 different prototypes of the widget before you stumble across a combination of heat and pressure and temperature and ingredients that achieves the desired characteristics. You’d want to build a small handful of prototypes, and have the variables going into each of the prototypes carefully varied from prototype to prototype to allow you to learn as much as possible in as few experiments as possible. That’s what auto-manufacturers regularly do. 9
  • 10. Design of Experiments Where Else is Design of Experiments used? Design of Experiments-based methods have also been commonly used in agriculture for decades. If you’re Monsanto and you want to grow a hardier seed that will grow in colder temperatures and mature more quickly, you’re going to use Design of Experiments methods to identify the combinations of variables to test together in each test you execute. 10
  • 11. Design of Experiments Where Else is Design of Experiments used? Many marketers also use Design of Experiments methods extensively. YouTube recently ran an experiment involving 1,024 different combinations of fonts, colors, messages, and button sizes and shapes to find an optimal combination that increased their sign-up rate by more than 15%. Jeff Fry, a tester at Microsoft, wrote a good article about this and posted a phenomenal video by a Design of Experiments expert who worked at Amazon before moving to Microsoft. A/B testing is a very simple “watered down” DoE approach. Multi-Variate Testing (MVT) is “full-blown” DoE-based marketing. 11
  • 12. Design of Experiments What about in Software Testing? This is the question I’ve been focused on for the last 5 years. Seventeen pilot projects I conducted at my prior company convinced me that DoE-based methods consistently deliver improvements in efficiency and effectiveness over teams’ business as usual test cases. 12
  • 13. What is Hexawise? Challenges Hexawise Addresses. Problems During Test Design... and Impact Felt During Test Execution: Manual Documentation → Delayed Start; Largely Repetitive Tests → Inefficient; Gaps in Coverage → Missed Defects. Hexawise was created to address these common testing challenges. 13
  • 14. Mortgage Application Example Let’s use this simple example to demonstrate how pairwise testing works. I’ve borrowed this idea from a presentation that Bernie Berger gave at StarEast. Imagine you’re testing a mortgage application that has several sets of details. This is an “executive summary” view of the different options that could be selected for application. We could make this example more complicated by including hardware and software configuration options, user types, etc. We’re intentionally keeping it simple here. 14
  • 15. Mortgage Application Example If we just examine one of those three branches, we see that it has 27 possible test combinations associated with it. For example, 1 of the 27 possible tests would be: Income = Low & Credit Rating = Medium & Customer Status = VIP. There are 26 other similar combinations. 15
  • 16. Mortgage Application Example When we examine all three branches, we see they have equal complexity. Each of the three branches has 27 total possible combinations. How many total combinations are there? Hint: It is not 81. 16
  • 17. Which Tests Should You Choose? 27 X 27 X 27 = 19,683 Possible Tests There are almost 20,000 possible combinations to choose from. 17
  • 18. Prioritization How many test inputs are needed to trigger defects in production? 5% 1 11% 2 (“pairwise”) 51% 3 33% 4, 5, or 6 In order to prioritize which specific combinations should be selected as high priority tests from those ~20,000 possible tests, it is extremely important to understand that the vast majority of defects in production can be triggered by just two test inputs tested together (e.g., a test that includes Income = Low as the first test input and also includes Credit Rating = High as the second test input). This fact has extremely important implications for software testers. Unfortunately, very few software testers are aware of (a) this fact, or (b) the implications. The implication for software testers is that small sets of tests that ensure every possible PAIR of values gets tested together in at least one test case are extremely efficient and effective at finding defects. • Medical Devices: D.R. Wallace, D.R. Kuhn, Failure Modes in Medical Device Software: an Analysis of 15 Years of Recall Data, International Journal of Reliability, Quality, and Safety Engineering, Vol. 8, No. 4, 2001. • Browser, Server: D.R. Kuhn, M.J. Reilly, An Investigation of the Applicability of Design of Experiments to Software Testing, 27th NASA/IEEE Software Engineering Workshop, NASA Goddard SFC, 4-6 December, 2002. • NASA database: D.R. Kuhn, D.R. Wallace, A.J. Gallo, Jr., Software Fault Interactions and Implications for Software Testing, IEEE Trans. on Software Engineering, vol. 30, no. 6, June, 2004. • Network Security: K.Z. Bell, Optimizing Effectiveness and Efficiency of Software Testing: a Hybrid Approach, PhD Dissertation, North Carolina State University, 2006. 18
  • 19. Mortgage Application Example Select a couple of pairs of test inputs from this mind map. Possible pairs of inputs could include pairs like those shown below. Select your own two pairs though. First Example of a pair of values: Income = Low & Credit Rating = Medium Second Example of a pair of values: Income = High & Customer Status = Employee Third Example: Income = High & Credit Rating = Low 19
  • 20. These are the same test inputs that have been imported from the mind map into the Hexawise tool. When you click on the “Create Tests” icon at the top of the screen, you will see a pairwise testing solution. Each and every pair of values you might have selected will be included in a surprisingly small number of tests. 20
  • 21. Only 17 tests are required (out of 19,683 possible tests) to test every single possible pair of test conditions together in the same test case at least once. In other words, every single pair of test conditions will be tested together at least once. The pair of conditions we selected at random (Income = Low, Credit Rating = Medium) appears in test number 8. All the other pairs are also tested together at least once. If we have done a thorough job of identifying test inputs, the vast majority of defects will be triggered by these 17 tests out of almost 20,000 tests. These are the lessons from Design of Experiments that have been learned and applied in so many other industries since the 1930’s and are now being applied by an increasing number of software testers. 21
  • 22. This is the same set of 17 tests shown in case the speaker notes on the last slide covered a pair of values you were trying to confirm were tested together. 22
  • 23. The tests “front-load” coverage. 87% of the pairs of values have already been tested together by the end of the 9th test. This is simply a coverage chart showing what percentage of test input pairs have been tested together so far as a percentage of the total number of possible pairs that could be tested together in the System Under Test. 23
  • 24. User Experiences Lanette Creamer’s Experiences Now let’s hear from Lanette and Ajay... 24
  • 25. HEXAWISE First used on June 08, 2009
  • 26. FIRST STEPS Interested to try new software; Aware of allpairs.exe. Problem Statement: Multiple Printers; Printer Specific Charts; Chart 1 & Chart 2; Other Settings. Sl. No | Printer | Chart 1 | Chart 2 | Settings: 1 | ABC | 16 strips | 64 strips | Borderless; 15 | LMN | 64 strips | ------- | Auto cut; 45 | XYZ | ------- | 16 strips | -------
  • 27. TESTED WITH ALLPAIRS Excel to Notepad to Excel; Very useful when all pairs are valid; Unable to mention invalid pairs; Steps to be repeated based on cases; Maintained a common repository
  • 28. HEXAWISE Easy to use; One user account – anytime accessible; Can specify invalid pairs; Multiple strength cases
  • 29. HEXAWISE WISH LIST Desktop version – useful without internet too; Able to define invalid pairs after the cases are generated; Easy method to define invalid pairs; Need to try project sharing & Excel import
  • 30. Disclaimer: Thinking Req’d This is a photo I saw Lanette use in earlier presentations. It is absolutely spot on in this context. Designing DoE-based software tests is not a paint-by-numbers approach. You need to use your critical thinking skills. Without using them, there will be a garbage in / garbage out problem. 30
  • 31. Disclaimer - Imperfect Models When you use a Design of Experiments-based test design tool, you effectively create a model that will generate your tests. Whenever you do so, there will be parts of the System Under Test you will miss. Perhaps (probably?) you will miss important parts. 31
  • 32. Disclaimer - Which Inputs? When creating DoE-based software tests, you will face the same kinds of test design considerations you always have... as well as new, DoE-specific considerations. 32
  • 33. “The Truffle Pig Problem” Design of Experiments-based test design methods face a “truffle pig problem.” If software bugs were like leaves on your lawn that you wanted to get rid of, DoE-based test design methods would be much more popular than they are now. DoE-based methods would be the equivalent of a leaf blower: you’d be able to instantly see your productivity increase. Unfortunately bugs are not visible, like leaves. They’re hiding, unseen, like truffles. It is my experience that DoE-based test design methods are like an especially thorough and efficient truffle pig. The problem is, of course, that if someone gave you a super truffle pig that was twice as good at finding truffles as your regular truffle pig, you would probably have a hard time assessing how good it was. DoE-based test design methods face this same challenge. 33
  • 34. How Can You Know? Here’s the best approach I’ve come up with to answer the question of “how can you know DoE-based test design methods are better than manually-selected test cases?” 34
  • 35. “Let’s test this hypothesis.” Cereal Box. Toyota - Entering the Truck market in the U.S. Even though we can complete a meaningful “bake-off” pilot project within just a couple man days of effort, this is the typical reaction I get from test teams who I propose this approach to! (brief video of office mates diving under desks, hiding under plants, etc.) It is amazing how quickly people tend to run and hide when they are given the opportunity to learn something that could fundamentally change their software testing effectiveness. The irony is that teams will say “we’re too busy to execute a one or two day pilot project now.” Hello? In my experience, the findings from the pilot - on average - more than double the number of defects found per tester hour. The entire point of learning about Design of Experiments-based test design techniques, like pairwise and 3-way, and orthogonal array-based / OA testing is to improve your efficiency and effectiveness... So you can get much more done with fewer resources. Saying “I’m too busy to learn how to do that” is... shortsighted is probably the most diplomatic word. 35
  • 36. Results: Less Test Design Time Same: System Under Test; Test Ideas; Time. Different Test Design Approach: identify tests manually vs. generate tests using a Design of Experiments-based tool. Different Results: Time to Design Tests; Combinatorial Coverage; No. of Bugs Found; Time to Execute Tests. ~30 - 40% Less Time for Test Generation (b/c Many Steps are Automated). In my experience, teams that have agreed to pilot projects have seen these results. It takes far less time to generate tests using this approach because many steps in the test case selection process and test case documentation process get automated. 36
  • 37. Results: Better Coverage Same: System Under Test; Test Ideas; Time. Different Test Design Approach: identify tests manually vs. generate tests using a Design of Experiments-based tool. Different Results: Time to Design Tests; Combinatorial Coverage; No. of Bugs Found; Time to Execute Tests. In my experience, it is easy to show that combinatorial coverage (e.g. how many pairs of values, how many triples of values, etc. are tested together) is far superior with this approach. In this actual example from a couple months ago, we showed that 51 more than business as usual tests that were put together manually did not test for 1,400 pairs of values. A skeptic will probably say... “OK. Interesting, but what does that translate to in terms of actual defects found?” 37
  • 38. Results: More Bugs Found Same: System Under Test; Test Ideas; Time. Different Test Design Approach: identify tests manually vs. generate tests using a Design of Experiments-based tool. Different Results: Time to Design Tests; Combinatorial Coverage; No. of Bugs Found; Time to Execute Tests. Will depend upon: (1) the System Under Test, (2) Test Designer skill, and (3) the coverage strength of the DoE-based tests. My Experience from dozens of projects: 2-way DoE-based tests consistently find more. In my experience, 2-way (or pairwise) tests - using the same test ideas as used in business as usual tests - have consistently found more defects than the business as usual tests. This is true even when the business as usual tests are far higher in number than the pairwise set of tests. If you used 3-way or 4-way sets of tests, the number of defects found by this Design of Experiments-based test design approach would be far higher than found using the business as usual approach. 38
  • 39. Results: ~2x Bugs / Hour. Will depend upon: (1) the System Under Test, (2) Test Designer skill, and (3) the coverage strength of the DoE-based tests. My experience from dozens of projects: 2-way DoE-based tests consistently find MANY more bugs / hour (often double). The number of defects is higher using Hexawise. The number of tests executed is lower using Hexawise. On average, in the dozens of pilot projects I have seen, the number of defects found per tester hour is often double the number of defects found per tester hour from business as usual sets of tests.
  • 40. How Can You Know? I would strongly encourage you to try a simple one or two day pilot project. In fact, I’ll help you do it if you agree to publish the results (whether good or bad).
  • 41. Additional Information. James Bach - “Pairwise Testing: A Best Practice that Isn’t”. We’ve barely scratched the surface of what Design of Experiments-based test design is and how you could get started using it. The sources listed here are good places to find out more. I am happy to talk to you about it if you have any questions.
  • 42. Questions? Thank you all for your time. Any questions?
  • 43. Appendix Slides
  • 44. Select Your Thoroughness Goal. Testing for every pair of input values is just a start. The test designer can generate plans with very different levels of testing thoroughness. The 2-way test cases Hexawise generates have been consistently shown to be more thorough than standard test cases created by most test teams at Fortune 500 firms. Even so, Hexawise allows users to “turn up the coverage dial” dramatically and generate other, extraordinarily thorough, sets of tests. In this case, we see Hexawise can generate test set solutions for this simple insurance ratings engine example ranging in size from 28 test cases (for users who prioritize speed to market) all the way up to 3,925 test cases (for users who desire extremely thorough testing).
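To see why turning up the coverage dial makes test sets grow so quickly, it helps to count what a t-way test set has to cover. The sketch below is a rough illustration under stated assumptions: the four parameters are hypothetical (they are not the insurance ratings engine model from the slide), and `minimum_tests` is only a simple lower bound, not the size Hexawise would actually produce.

```python
from itertools import combinations
from math import prod

def t_way_targets(parameters, t):
    """Number of distinct t-way value combinations a t-way test set must cover."""
    sizes = [len(values) for values in parameters.values()]
    return sum(prod(group) for group in combinations(sizes, t))

def minimum_tests(parameters, t):
    """Lower bound on test count: the t largest parameters must be fully crossed."""
    sizes = sorted((len(values) for values in parameters.values()), reverse=True)
    return prod(sizes[:t])

# Hypothetical model, for illustration only.
parameters = {
    "age_band": ["18-25", "26-60", "61+"],
    "vehicle": ["car", "truck", "motorcycle"],
    "coverage": ["basic", "full"],
    "prior_claims": ["no", "yes"],
}

for t in (2, 3, 4):
    print(f"{t}-way: {t_way_targets(parameters, t)} combinations to cover, "
          f"at least {minimum_tests(parameters, t)} tests")
```

Even in this tiny model the minimum number of tests doubles with each step in strength, which is the same effect, at a much smaller scale, as the jump from 28 to 3,925 test cases on the slide.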
  • 45. How Much is Enough Testing? The “Analyze Coverage” screen shows you how much coverage is achieved at each point in the set of tests. In other words, what percentage of the targeted combinations have been tested for after each test? This chart gives teams the ability to make fact-based decisions about “how much testing is enough?” Here, for example, 83% of the pairs of test inputs entered into this plan have been tested together after only 12 tests (out of 295,000 possible tests).
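A coverage-after-each-test curve like the one on this slide can be approximated as follows. This is a minimal sketch, assuming pairwise (2-way) coverage as the target; the model and tests are hypothetical, and the function is not the “Analyze Coverage” implementation.

```python
from itertools import combinations
from math import prod

def coverage_curve(parameters, tests):
    """Percent of all value pairs covered after each test, in execution order."""
    names = list(parameters)
    total = sum(len(parameters[a]) * len(parameters[b])
                for a, b in combinations(names, 2))
    seen = set()
    curve = []
    for test in tests:
        for a, b in combinations(names, 2):
            seen.add((a, test[a], b, test[b]))
        curve.append(100.0 * len(seen) / total)
    return curve

# Hypothetical model and test order, for illustration only.
parameters = {
    "browser": ["Chrome", "Firefox", "Safari"],
    "os": ["Windows", "macOS", "Linux"],
    "account": ["guest", "member"],
}
tests = [
    {"browser": "Chrome", "os": "Windows", "account": "guest"},
    {"browser": "Chrome", "os": "Windows", "account": "member"},
    {"browser": "Firefox", "os": "Windows", "account": "guest"},
    {"browser": "Safari", "os": "macOS", "account": "member"},
]

possible = prod(len(values) for values in parameters.values())
curve = coverage_curve(parameters, tests)
for number, percent in enumerate(curve, start=1):
    print(f"after test {number} of {possible} possible: {percent:.1f}% of pairs covered")
```

Plotting `curve` against the test number gives the chart a team would use to decide where to stop: the flatter the curve becomes, the less each additional test adds.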
  • 46. Better Than Hand-Selected Tests. If you take a close look at any set of Hexawise-generated test cases you will notice that there is an enormous amount of variation from test case to test case (and the smallest amount of repetition that is mathematically possible to achieve). In contrast, if you were to translate your existing manually-selected test cases into a similar format and analyze them, you would find that the manually-selected test cases have far more repeated test combinations and far less variation from test case to test case. This is a big part of the reason why Hexawise generates dramatic efficiency improvements. In addition, if you were to graph the percent of the targeted 2-way combinations achieved by your existing manually-selected test cases, you would find that there are many pairs of test inputs that were never covered by your tests. The fact that Hexawise will ensure every pair of test inputs gets tested in at least one test case is a big part of the reason why Hexawise-generated tests result in superior coverage and more defects found during test execution.
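The guarantee that every pair of test inputs appears in at least one test can be illustrated with a naive greedy generator. To be clear about assumptions: this is not Hexawise's algorithm, it does not guarantee the mathematically smallest set, and it enumerates every possible test, so it only suits toy models like the hypothetical one below.

```python
from itertools import combinations, product

def pairs_of(test, names):
    """The value pairs that a single test covers."""
    return {(a, test[a], b, test[b]) for a, b in combinations(names, 2)}

def greedy_pairwise(parameters):
    """Repeatedly pick the candidate test that covers the most uncovered pairs."""
    names = list(parameters)
    candidates = [dict(zip(names, values))
                  for values in product(*(parameters[n] for n in names))]
    uncovered = set().union(*(pairs_of(c, names) for c in candidates))
    chosen = []
    while uncovered:
        best = max(candidates, key=lambda c: len(pairs_of(c, names) & uncovered))
        chosen.append(best)
        uncovered -= pairs_of(best, names)
    return chosen

# Hypothetical model, for illustration only.
parameters = {
    "browser": ["Chrome", "Firefox", "Safari"],
    "os": ["Windows", "macOS", "Linux"],
    "account": ["guest", "member"],
}

pairwise_tests = greedy_pairwise(parameters)
print(f"{len(pairwise_tests)} tests cover every pair (exhaustive testing would need 18)")
for test in pairwise_tests:
    print(" ", test)
```

Because each step picks the test with the most not-yet-covered pairs, consecutive tests differ from one another as much as possible, which is the "variation from test case to test case" the slide describes.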
  • 47. What is DoE-based testing? Definition: Design of Experiments-based testing is a test design approach used to identify a small subset of tests (from many possible ones) in order to find as many defects as possible in as few tests as possible. Why it Works: Test conditions are constructed to ensure: • No combinations of conditions get accidentally omitted • Unproductive repetition is minimized. “AKA”: “Design of Experiments-based testing” covers several closely-related subjects: • Pairwise / AllPairs • Orthogonal Array / OA / OATs • 2-way, 3-way, ... t-way
  • 48. Software Testing Challenges • Software applications are very complex; it is impossible to test every possibility • Extraordinarily smart, pragmatically-oriented applied statisticians created the field of “Design of Experiments” to solve exactly this challenge; for the last 40+ years they have developed highly effective math-based covering array techniques and similar strategies which are now broadly used in many areas including manufacturing, advertising, and agriculture • These proven Design of Experiments techniques, which are designed to find out as much information as possible in as few test cases as possible, also have direct applicability to the software testing field • Unfortunately, the vast majority of software testers in the relatively young field of software testing have never heard of any Design of Experiments concepts like MFAT vs. OFAT, Orthogonal Array coverage, pairwise coverage, or even the existence of the “Design of Experiments” field • Instead of using 40+ years of Design of Experiments-based knowledge to design tests that are as effective as possible, testers almost always manually select the combinations of test conditions they use in their tests, and as a result...
  • 49. Results without DoE / Hexawise ... the results from manual test case selection efforts are consistently far from optimal: • Missed combinations • Wasteful repetition
  • 50. Results with DoE / Hexawise. In contrast, Hexawise algorithms use Design of Experiments-based methods to generate tests. The result is that Hexawise-generated tests consistently find more defects in fewer tests. Hexawise-generated tests pack more coverage into each test.