Popular Delusions, Crowds, and the Coming Deluge: end of the Oracle?


Invited Talk at the 20th CREST Open Workshop, The Oracle Problem for Automated Software Testing. University College London. May 21, 2012
Topics: pragmatic innovations for test oracles, a new oracle taxonomy, characterization of test oracles, and challenges.


  1. Popular Delusions, Crowds, and the Coming Deluge: end of the Oracle?
     Robert V. Binder
     The 20th CREST Open Workshop: The Oracle Problem for Automated Software Testing
     University College London, May 21, 2012
  2. Overview
     • Pragmatic Innovations
     • Oracle Taxonomy
     • Characterization
     • Challenges
  3. Crowd-Sourced Evaluation
     • Google presents street-sign images in reCAPTCHA
     • Crowd Sourced Formal Verification (DARPA)
       – Correctness proof as a web game
       – Players devise strategies to win
     • Bio: Foldit, Foldit@home, Phylo, Rosetta@home
       – "Top-ranked Foldit players can fold proteins better than a computer."
  4. Testing and its Discontents
     • "Testing is Dead"
     • Exploratory Testing
     • Crowd Testing
       – MobTest: http://www.youtube.com/watch?v=X1jWe5rOu3g
       – uTest
       – Mob4Hire: "58,159 people (mobsters) have 33473 different mobile handsets on 439 carriers in 155 countries"
  5. What is a Test Oracle?
     Any strategy that can produce a verdict from an observation of a SUT in action.
     (John Collier's Priestess of Delphi, oil, 1891)
  6. Survey of Test Oracles
     • 600+ publications
     • Many strategies
       – Mostly esoteric
       – Some pragmatic
     • Hard to compare
     • No basis for evaluation
  7. Test Oracle Taxonomy
     Predictive
     • For selected test inputs, predict or constrain expected result
     • Expect expected and actual same, for each test input
     Imitative
     • Develop one or more facsimile systems
     • Submit any input to SUT and facsimile
     • Expect all outputs equivalent
     Reactive
     • Define output criteria
     • Submit any input to SUT
     • Expect output criteria met
     Judging
     • Cultivate sense of appropriate
     • Submit any input to SUT
     • Decide if response is appropriate
     Any strategy that can produce a verdict from an observation of an SUT in action.
  8. Predictive Test Oracles
     Strategy              Tactics
     Special Values        Sensitive Points, Rejection Response
     Solved Example        Reference, Table Lookup
     Design by Test        Test-First Design
     I-O Invariants        Range, Input-output balancing, Behavior
     Metamorphic Testing   Constant, Step, Permute, Reorder, Add/drop
     Regression Test       Reference Testing, Capture/Replay
     Specification-based   Abstract (I-O Grammar Checker), Concrete (Transition system trace)
  9. Predictive Test Oracles: I-O Invariants
     For specific input, expected output is within a range or a member of a set; a "sanity check."
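The "sanity check" idea can be sketched as a small predicate. This is a minimal illustration, not from the talk: the square-root SUT, the tolerance, and the rejection rule for negative input are my own assumptions.

```python
import math

def sqrt_oracle(x: float, actual: float) -> bool:
    """I-O invariant oracle: without predicting the exact output,
    check that a sqrt result falls within a plausible range."""
    if x < 0:
        return False  # expect rejection of invalid input (assumed policy)
    # Sanity check: result must be non-negative and its square near x
    return actual >= 0 and abs(actual * actual - x) < 1e-6 * max(1.0, x)

# Verdicts for a hypothetical SUT (math.sqrt stands in for the SUT here)
print(sqrt_oracle(9.0, math.sqrt(9.0)))  # True: within the invariant
print(sqrt_oracle(9.0, 2.9))             # False: fails the sanity check
```

The oracle never computes the "right answer"; it only constrains where the answer must lie, which is what makes this tactic cheap.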
  10. Predictive Test Oracles: Metamorphic Testing
      Output tuples are expected to meet certain properties.
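As a minimal sketch (the sine identities are my own example, not the slide's), a metamorphic oracle checks relations that must hold between outputs of related inputs, with no single expected value needed:

```python
import math

def metamorphic_sine_check(f, x: float, tol: float = 1e-9) -> bool:
    """Metamorphic oracle: the tuple (f(x), f(pi - x), f(-x)) must
    satisfy known relations, even though no output is predicted."""
    # Relation 1: sin(x) == sin(pi - x)
    # Relation 2: sin(-x) == -sin(x)
    return (abs(f(x) - f(math.pi - x)) < tol and
            abs(f(-x) + f(x)) < tol)

print(metamorphic_sine_check(math.sin, 0.7))  # True
```

The "Constant, Step, Permute, Reorder, Add/drop" tactics on the previous slide are other ways of generating such related-input tuples.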
  11. Imitative Test Oracles
      Strategy                   Tactics
      Neural Network             Machine Learning
      Reduced Implementation
      Executable Specification   Compiled Abstraction, I-O Grammar Checker
      Voting                     Reference Stack, Variation Implementation, N-way Voting, Parallel Testing
  12. Imitative Test Oracles: Executable Specification
      An SUT specification is translated into an executable, which maps inputs to expected outputs.
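A hedged sketch of the idea, assuming a hypothetical leap-year SUT: the written rule is transcribed into a trivially reviewable function, which then generates the expected output for any input.

```python
def leap_year_spec(year: int) -> bool:
    """Executable specification: a direct transcription of the
    Gregorian leap-year rule, used to map inputs to expected outputs."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def oracle(sut, year: int) -> bool:
    """Verdict: the SUT's output must match the spec's output."""
    return sut(year) == leap_year_spec(year)

# A hypothetical SUT with a bug in the century rule:
buggy_sut = lambda y: y % 4 == 0
print(oracle(buggy_sut, 2012))  # True: SUT agrees with the spec
print(oracle(buggy_sut, 1900))  # False: 1900 is not a leap year
```

The spec need not be efficient or production-grade; it only needs to be obviously faithful to the specification document.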
  13. Imitative Test Oracles: Voting
      Submit any input to the SUT and one or more facsimile systems; expect the result of each to be equivalent.
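An N-way voting verdict can be sketched as below; the three absolute-value implementations are illustrative assumptions, and the tactic presumes their failures are mostly independent.

```python
from collections import Counter

def voting_oracle(implementations, x):
    """N-way voting oracle: run one input through several facsimile
    implementations and treat the majority result as correct."""
    results = [impl(x) for impl in implementations]
    majority, _votes = Counter(results).most_common(1)[0]
    # Per-implementation verdict: does each agree with the majority?
    return majority, [r == majority for r in results]

# Three hypothetical implementations of absolute value (third is buggy):
impls = [abs, lambda v: v if v >= 0 else -v, lambda v: v]
majority, agree = voting_oracle(impls, -5)
print(majority, agree)  # 5 [True, True, False]
```

With only two systems (reference testing) a disagreement flags a failure but cannot say which side is wrong; three or more allow a majority verdict.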
  14. Reactive Test Oracles
      Strategy              Tactics
      Environment Monitor   Resource Utilization, Abend, Timers
      Output Invariants     No Change, Content, Range, Entity Relationships, Behavior, Parametric, Format
      Built-In Test         Assertions, Application-specific, DBC - Sampling, DBC - Built-in, DBC - Pragmas
      Parametric            Output Stream, Persistent Store
      Trace Analysis        As Built, Additional
      Algebraic             ADT, SQL, API
      Performance           Response Time, Throughput, Reliability, Availability
  15. Reactive Test Oracles: Algebraic
      Exploit externally observable algebraic relationships, e.g.
      assert(date.yesterday() == date.today - 1)
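For the ADT tactic, the same idea checks axioms of an abstract data type through its public interface. A minimal sketch, assuming a hypothetical stack SUT with `push`, `pop`, and `size` methods:

```python
def algebraic_stack_check(stack_cls) -> bool:
    """Algebraic oracle: check externally observable ADT axioms, e.g.
    pop(push(s, x)) == x and size(push(s, x)) == size(s) + 1."""
    s = stack_cls()
    before = s.size()
    s.push(42)
    ok_size = s.size() == before + 1   # axiom: push grows size by one
    ok_pop = s.pop() == 42             # axiom: pop returns the last push
    return ok_size and ok_pop

# A minimal, hypothetical SUT:
class ListStack:
    def __init__(self): self._d = []
    def push(self, x): self._d.append(x)
    def pop(self): return self._d.pop()
    def size(self): return len(self._d)

print(algebraic_stack_check(ListStack))  # True
```

Because the axioms are stated over any stack and any value, the same check works with generated inputs, making it a reactive rather than predictive oracle.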
  16. Reactive Test Oracles: Trace Analysis
      Parse available outputs; check conditions, relations, grammar.
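A small sketch of trace analysis, with an assumed log format (the `open`/`close` lines are hypothetical): parse the SUT's output stream and check a relation that any correct run must satisfy.

```python
import re

# Assumed trace grammar: lines of the form "open NAME" or "close NAME".
LINE = re.compile(r"(open|close) (\w+)")

def trace_verdict(log: str) -> bool:
    """Trace-analysis oracle: every 'open' must be matched by a later
    'close', and nothing may be closed that was never opened."""
    open_files = set()
    for m in LINE.finditer(log):
        action, name = m.groups()
        if action == "open":
            open_files.add(name)
        elif name not in open_files:
            return False        # close without a matching open
        else:
            open_files.remove(name)
    return not open_files       # fail if anything is left open

print(trace_verdict("open a\nopen b\nclose b\nclose a\n"))  # True
print(trace_verdict("open a\nclose b\n"))                   # False
```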
  17. Judging Test Oracles
      Exploratory Testing      The tester critiques the SUT while following a general interaction strategy
        Ad hoc                 The tester improvises interactions
        Tour-based             The tester improvises interactions based on a pre-defined strategy
      FDA Validation Testing   The SUT is used in situ to see how well it supports realistic tasks and workflow
      Beta Testing             Users interact with the SUT according to idiosyncratic interest
      Crowd Testing            Users selected for operational environments, modes, and configurations
      Usability Testing        Evaluate HCI against external standards
        Quantitative           Compare measurements of user physiological responses to structured and unstructured interaction with the SUT
        Qualitative            Study subjective likes/dislikes
  18. Oracle Characterization
      • What attributes or properties are useful to characterize or compare oracle types?
      • Questions must be germane and answerable for all types
      Candidate attributes: Cause Coverage, Effect Coverage, Precision, Point of Control/Observation, Test Strategies Supported, Average Cost per Verdict, Antecedent, Comparator
  19. Example Comparison
      [Radar chart comparing Predictive (Model Program) with Imitative (Reduced Implementation) on five axes: Causes Covered, Effects Covered, Precision, Generality, Average Cost]
  20. Challenges
      • Scalability
      • Novel interfaces
      • Can judging be reduced to an expert system?
      • Effective integration of automated oracles with crowds?
  21. High and Sly
      • Prophecy not free
      • "The" oracle was many individuals
      • Indeterminate questions got ambiguous answers
      • Opportunistic use of natural resources (ethylene)
  22. Recommended Reading
      William J. Broad, The Oracle: Ancient Delphi and the Science Behind Its Lost Secrets (2006)