Model-Based Testing:
       Theory and Practice
                                    Wolfgang Grieskamp
Staff Software Engineer, Cloud Division, Google Corp, USA
About Me
 < 2000: Formal Methods Research Europe (TU Berlin)

 2001-2007: Microsoft Research: Languages and Model-Based
  Testing Tools

 2007-2011: Microsoft Windows Interoperability Program: Model-
  Based Protocol Testing and Tools

 Since 4/2011: Google Cloud Division


 DISCLAIMER: This talk does not represent Google’s opinion or direction but
  my personal one.

 DISCLAIMER: Some of the material presented here stems from my academic
  and practical work on model-based testing at Microsoft. All such material
  has been published via numerous articles and presentations.
Model-Based Testing: the High-Level Picture

[Diagram: from Requirements, an author writes a Model; from the model, Inputs and Expected Outputs are generated; the inputs control the System-under-Test while its outputs are observed, and a Verdict is issued. Feedback flows back from each stage to the requirements, the model, and the SUT.]
Content of this talk
1. Twenty years of MBT research: a short survey

2. Ten years MBT at Microsoft - an example of industrial
   adoption?

3. Some Lessons Learnt

4. MBT and the Cloud: New opportunities?
I

Twenty Years of MBT Research: a Short Survey
          With emphasis on historical accuracy
The Three Major Schools
 Axiomatic Approaches


 Finite State Machine (FSM) Approaches


 Labeled-Transition-System (LTS) Approaches
Axiomatic Approaches to MBT
Gaudel et al. ‘86: Test set generation from algebraic specifications

Given conditional equation

        p(x) => f(g(x),a) = h(x) where f, g, h SUT functions

… find x s.t. the equation is sufficiently tested.

Approach: break p down into its DNF; find a value for x for each
partition of the DNF.

Justification: Uniformity Hypothesis

  If p represents the way the program discriminates x, most errors should
  be detected by trying one value each for the partitions of p.
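As a toy illustration of this approach, the sketch below picks one representative value per DNF partition of p and checks the equation once for each. The functions f, g, h, the constant, and the partitions are made-up stand-ins, not taken from the talk:

```python
# Illustrative sketch of partition testing under the uniformity hypothesis.
# f, g, h, A and the partitions are hypothetical examples.

def f(u, v):
    return u + v

def g(x):
    return 2 * x

def h(x):
    return 2 * x + 1  # chosen so that f(g(x), A) == h(x) holds for every x

A = 1  # the constant 'a' in the conditional equation

# p(x) broken into its DNF: one representative value per disjoint partition
partitions = {"x < 0": -5, "x == 0": 0, "x > 0": 7}

def uniformity_test():
    """Check the equation once per partition of p (uniformity hypothesis)."""
    return {name: f(g(x), A) == h(x) for name, x in partitions.items()}
```

Under the uniformity hypothesis, one passing check per partition counts as evidence for the whole partition.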
Axiomatic Approaches to MBT
Dick and Faivre ‘93: Automating the generation and sequencing of
test cases from model-based specifications

Given a VDM specification of operations with pre/post conditions
take

        pre(op1) | post(op1) | … | pre(opn) | post(opn)

… and build the DNF.

Consider each member of the DNF a disjoint state of a state
machine. Draw a transition

        si -opl-> sj       iff si => pre(opl) & post(opl) => sj

Generate sequences from the state machine (see below …)
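The construction can be sketched as follows. The counter example, and the shortcut of probing each DNF state through a single representative value, are illustrative assumptions; the representative shortcut under-approximates, since e.g. dec from a large and from a small positive counter may land in different states:

```python
# Toy sketch of the Dick & Faivre idea: disjoint DNF members become states,
# and transitions are derived from pre/post conditions. Hypothetical example.

# Disjoint "DNF states" over a counter, each with one representative value
states = {"zero": lambda n: n == 0, "pos": lambda n: n > 0}
reps = {"zero": 0, "pos": 3}

# Operations as (pre-condition, state update)
ops = {
    "inc":   (lambda n: True,  lambda n: n + 1),
    "reset": (lambda n: True,  lambda n: 0),
    "dec":   (lambda n: n > 0, lambda n: n - 1),
}

def classify(n):
    """Map a concrete value to the unique DNF state it satisfies."""
    return next(name for name, pred in states.items() if pred(n))

def build_transitions():
    """Draw s -op-> s' whenever pre(op) holds in s and the update lands in s'."""
    return {(s, op, classify(update(reps[s])))
            for s in states
            for op, (pre, update) in ops.items() if pre(reps[s])}
```

Test sequences are then generated by traversing the resulting state machine, as discussed below for the FSM approaches.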
Axiomatic Approaches to MBT
Many derivations of this basic work:

 Helke, Neustupny, Santen '97: Automatic test generation from Z
  specifications

 Legeard, Peureux, Utting '02: Automated boundary testing from Z and B

 Grieskamp et al. '02: Deriving finite state machines from abstract state
  machines

 Kuliamin, Petrenko, et al. '03: The UniTesK approach to designing test
  suites

 …

Recent:

 Brucker, Wolff '12: On theorem-prover based testing
FSM Approaches to MBT
The basic model is a Mealy machine (an FSM whose
transitions are labeled with inputs and outputs).

In practice often refined to an EFSM (extended FSM).

Objective: find sequences from the FSM model which cover all behavior of the SUT.

Problem: even though equivalence between two FSMs is decidable, the FSM representing
the SUT is "unknown": it can always differ from what the model assumes.

Solutions:

 Chow '78: if the maximal number of states of the SUT FSM is known, completeness is
  possible.

 Vasilevskii '73, Naito & Tsunoyama '81, Aho, Dahbura, Lee, Uyar '88: various
  traversal methods, e.g. the transition-tour method (a rural-postman variation), the
  Unique-Input-Output method, etc.

 Many tools in practice ignore the completeness problem.
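A minimal sketch of transition-tour generation, on a made-up two-state Mealy machine rather than one from the references above:

```python
from collections import deque

# A tiny Mealy machine as (state, input) -> (output, next_state);
# the machine itself is a hypothetical example.
MODEL = {
    ("s0", "a"): ("x", "s1"),
    ("s0", "b"): ("y", "s0"),
    ("s1", "a"): ("y", "s0"),
    ("s1", "b"): ("x", "s1"),
}

def transition_tour(machine, start="s0", max_len=12):
    """Breadth-first search for a shortest input sequence covering every transition."""
    goal = frozenset(machine)
    queue = deque([(start, frozenset(), ())])
    while queue:
        state, covered, seq = queue.popleft()
        if covered == goal:
            return list(seq)
        if len(seq) < max_len:
            for (s, i), (_, nxt) in machine.items():
                if s == state:
                    queue.append((nxt, covered | {(s, i)}, seq + (i,)))
    return None  # no tour within the length bound

def run(machine, inputs, start="s0"):
    """Drive a machine with an input sequence; return the observed outputs."""
    state, outputs = start, []
    for i in inputs:
        out, state = machine[(state, i)]
        outputs.append(out)
    return outputs
```

Running the tour against both the model and the SUT and comparing the output sequences gives a conformance check, which, per the completeness problem above, is necessarily incomplete.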
FSM Approaches to MBT
Overview:

 Lee & Yannakakis '96: Principles and methods of testing finite state
  machines - a survey

 Utting & Legeard '07: Practical Model-Based Testing (textbook)

Refinements:

 Nachmanson et al. '04: Traversals in the presence of non-determinism

 Huo & Petrenko '05: Implications of buffering input/output

 Hierons & Ural '08: Distributed Testing

 Ernits, Kull, et al. '06: Symbolic EFSMs

 …
LTS Approaches to MBT
Represents the model as a labeled-transition system (not
necessarily finite; distinct transitions for input/output)

Tretmans '96: IOCO (input-output conformance)

        outs(model after trace) >= outs(sut after trace)
               for all suspension traces in model

Properties:

 Deals with non-determinism

 Deals with under-specification

 Deals with quiescence
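A much-simplified reading of the relation can be sketched as below; it assumes LTSs that are deterministic per trace and ignores quiescence, so it is a caricature of Tretmans' definition rather than an implementation of it. The vending-machine LTSs are made-up examples:

```python
# Simplified ioco check. An LTS is a dict: state -> list of
# (action, kind, next_state) with kind in {"in", "out"}.
MODEL = {
    0: [("coin", "in", 1)],
    1: [("tea", "out", 2), ("coffee", "out", 2)],  # model allows either drink
    2: [],
}
SUT = {
    0: [("coin", "in", 1)],
    1: [("tea", "out", 2)],                        # only serves tea: conforming
    2: [],
}

def after(lts, trace, state=0):
    """Follow a trace from the initial state; None if the trace is impossible."""
    for act in trace:
        nexts = [n for (a, _, n) in lts[state] if a == act]
        if not nexts:
            return None
        state = nexts[0]
    return state

def outs(lts, state):
    return {a for (a, kind, _) in lts.get(state, []) if kind == "out"}

def ioco(model, sut, traces):
    """outs(sut after t) must be contained in outs(model after t) for each t."""
    for t in traces:
        ms, ss = after(model, t), after(sut, t)
        if ss is not None and ms is not None and not outs(sut, ss) <= outs(model, ms):
            return False
    return True
```

Note how under-specification falls out of the subset direction: the SUT may implement fewer outputs than the model allows, but never more.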
LTS Approaches to MBT
Refinements:

 Nielsen & Skou '03: Real-time

 Frantzen & Tretmans '04, Jeannet, Jeron, Rusu & Zinovieva
  '05: Symbolic LTS

 Hierons, Merayo, Nunez '08: Distribution

 …
LTS Approaches to MBT
Alternating Simulation (Veanes, Grieskamp, et al. '04): testing as a
two-player game:

  In each step, SUT must accept every input the model
  provides, model must accept every output the SUT provides

Properties:

 External non-determinism only

 Does not require input completeness

 Symmetric, thus suitable for model composition (Grieskamp &
  Kicilloff '06)

Refinements:

 Grieskamp & Kicilloff '06: alternating simulation with
  symbolic transition systems (basis of Spec Explorer)
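The game condition of a single step amounts to two subset checks, sketched below; the dictionary shape of the states is an assumption made for illustration, not Spec Explorer's actual data model:

```python
# One round of the two-player testing game over interface-automaton states,
# each given as {"in": set_of_inputs, "out": set_of_outputs} (hypothetical shape).

def step_conforms(model_state, sut_state):
    # The SUT must accept every input the model may provide ...
    accepts_inputs = model_state["in"] <= sut_state["in"]
    # ... and the model must accept every output the SUT may provide.
    accepts_outputs = sut_state["out"] <= model_state["out"]
    return accepts_inputs and accepts_outputs
```

The symmetry of the two conditions is what makes the relation compose well, as noted above.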
II

Ten years MBT at Microsoft – an Example of
                Adoption?
Tools at Microsoft
Source: Stobie's keynote at MBTUC11


2000: IE introduces TMT (Test Model Toolkit)
2001: MS Research introduces AsmL/T
2004: MS Research introduces Spec Explorer 2004
2004: Windows introduces MDE, successor of TMT
2005: SQL introduces Kokomo
2005: Windows introduces PICT
2006: MS Research introduces Spec Explorer for VS
2007: MS Research introduces NModel
2007: Windows takes over Spec Explorer for VS
2009: Windows releases Spec Explorer for VS as power tool

Subscribers to MBT mailing list:

2001    ~   a couple of dozen
2003    ~   a few hundred
2005    ~   around five hundred
2007    ~   around seven hundred
2009    ~   around one thousand
2011    ~   …
Technical Document Testing Program
of Windows (2007-2011)

 300 protocols/technical documents tested
   30,000 pages studied and converted into
   tests
    69% using MBT w/ Spec Explorer 2010
    31% tested using traditional test automation

 66,962 person days (250+ years)

See also: CACM 7/11 interview with Wolfgang Grieskamp and Nico Kicilloff
Spec Explorer 2010 Technology Breakdown
 Model programs
   Guarded state update rules
   Rich object-oriented model state (collections, object graphs)
   Language agnostic (Based on .Net intermediate language interpretation)

 Trace patterns
   Regular style language to represent scenarios
   Slicing of model program by composition

 Symbolic state exploration and test generation
   Expands parameters using combinatorial interaction testing
   Extracts a finite interface automaton (IA) from composed model
   Traverses the IA to generate standalone test code, or runs on-the-fly tests from the IA

 Integrated into Visual Studio 2010 and released as a VS power tool
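In the spirit of guarded state update rules, a model program and its exhaustive state exploration can be sketched in a few lines. This is illustrative Python over a bounded queue's size, not Spec Explorer's .Net model programs:

```python
# A minimal model-program sketch: each action is a (guard, update) pair over
# the model state (here just the queue's size). Hypothetical example.

CAPACITY = 2

RULES = {
    "enqueue": (lambda s: s < CAPACITY, lambda s: s + 1),
    "dequeue": (lambda s: s > 0,        lambda s: s - 1),
}

def explore(rules, init=0):
    """Exhaustively explore the finite state space induced by the guarded rules."""
    seen, frontier, transitions = {init}, [init], []
    while frontier:
        s = frontier.pop()
        for name, (guard, update) in rules.items():
            if guard(s):                      # action enabled in this state?
                t = update(s)
                transitions.append((s, name, t))
                if t not in seen:
                    seen.add(t)
                    frontier.append(t)
    return seen, transitions
```

Trace patterns then slice this full exploration down to the scenarios of interest via composition.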
Spec Explorer 2010 Look & Feel

[Diagram: a C# model (or another .Net language) is explored into a model graph; the graph can be analyzed, and from it a test suite is generated and executed, producing a VSTT result.]
Comparison MBT vs Traditional

[Bar chart: Vendor 1: Model-Based 58 vs. Traditional 100; Vendor 2: Model-Based 66 vs. Traditional 100]

• In % of total effort per requirement, normalizing individual vendor
  performance
• Vendor 2 modeled 85% of all test suites, performing relatively much
  better than Vendor 1
Evaluation
Good:

 Many different tools bring healthy internal competition

 Mega-application in the course of the technical document testing program

 As of late '11, a dozen product areas using MBT (source: Stobie's keynote
  at MBTUC'11)

Not-so-good:

 Of the roughly 10 teams which adopted MBT, 7 later seem to drop it
   Adoption often bound to individuals: as those people leave the team, there is no
    follow-up business
   Paradigm shift does not stick as people rotate at shipping cycles
   Maintenance problem

 The tool which finally prevailed (Spec Explorer for VS) is horribly complex
   Contains some rocket science from MSR
   The Spec Explorer VS power tool was last updated 2/11: investment in the tooling
    seems not sustainable outside of research
III

Some Lessons Learnt
Lesson #1:
The theory is (mostly) Consolidated
What do we understand?

 We will never have complete test selection for non-trivial systems
  (even when restricting to FSMs)

 We have sound theories for non-deterministic systems with arbitrarily
  large state spaces, but they do not result in computable test
  selection procedures

How do we deal with this?

 Symbolic representations

 Heuristics and stochastic methods

 User control (MBT as an 'assistance' technology)
Lesson #2:
The technology is complex
Tooling which can succeed on industrial scale needs the following:

 Test selection via symbolic state space exploration

 Ways for an engineer to control and visualize exploration and test
  selection – no push-button technology will do.

 A modeling notation which meets the requirements of modern programming
  languages (expressiveness, modularization, etc.)

 Seamless integration into an IDE, code assistance, intellisense, etc.

 Being rocket-stable and reliable – production quality.


These are mostly engineering problems, not research problems.
It may still take another 5-10 years until we are there.
IV

MBT and the Cloud – new Opportunities?
How Google Tests Cloud Products

 Production level: monitoring

 Staging level: a simulation of the production environment with faked
  identities etc.; monitoring and load testing

 Integration level: automated testing of every code change over the
  dependency closure; end-to-end testing with partial component isolation

 Unit level: super-strict component isolation using e.g. dependency
  injection; extensive use of mock-based testing

Unit Testing

 Using open-source tools like EasyMock, Mockito, …

 These tools allow one to 'mock' the behavior of a
  dependency

Mock-Based Unit Testing
class NotifierTest extends MockingTestCase {
  @Mock Emailer emailer;

  public void testNotify() {
    // Setting up the mock
    expect(emailer).send("wgg", "hello");
    expect(emailer).send("wrwg", "hello");
    …

    // Running the test
    Notifier notifier = new Notifier(emailer);
    notifier.notify(ImmutableList.of("wgg", "wrwg"), "hello");
    verify();
  }
}

“Mocking” is rather similar to scenario-oriented modeling but strictly
more verbose
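To make the analogy concrete, here is the same scenario replayed with Python's standard unittest.mock; this Notifier is a hypothetical re-implementation for illustration, not code from the talk:

```python
# Mock-based unit test using the standard-library unittest.mock module.
from unittest.mock import Mock, call

class Notifier:
    """Hypothetical class under test: emails a message to each recipient."""
    def __init__(self, emailer):
        self.emailer = emailer

    def notify(self, recipients, message):
        for r in recipients:
            self.emailer.send(r, message)

def test_notify():
    emailer = Mock()  # stands in for the Emailer dependency
    Notifier(emailer).notify(["wgg", "wrwg"], "hello")
    # Verify the expected interaction, in order
    emailer.send.assert_has_calls([call("wgg", "hello"), call("wrwg", "hello")])
```

The expected call sequence set up on the mock is, in effect, a small scenario model of the dependency's behavior.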
Integration Testing

[Diagram: several services plugged together with a shared storage backend]

 Two or more services are plugged together within a
  simulated environment

 Tests are usually very 'flaky' (unreliable) because:
   It is difficult to construct a simulated component's precise
    behavior (it is more than a simple mock in a unit test)
   It is difficult to synthesize a simulated component's initial state
    (it may have a complex state)
   Hidden dependencies on production services

 Use a test model of a service as a simulation model
Conclusions
 MBT Theory mostly consolidated
   There is more to do but not expecting
    breakthroughs

 MBT Practice is ambivalent
   Adoption is promising but does not (yet) stick
   Better tools may help, but they need significant
    investment
   Stepping forward from MBT to full
    modeling/simulation may help

Model-Based Testing: Theory and Practice. Keynote @ MoTiP (ISSRE) 2012.
