Mutation testing


Published in: Technology

  • Thanks to Yue Jia and Mark Harman
  • i.e., test data that distinguishes all programs differing from a correct one by only simple errors is so sensitive that it also implicitly distinguishes more complex errors
  • Is Milu open source? No.
    Can we use Milu's API? It can be used as a tool.
    However, the generated code has many useless macro definitions inserted. Why is that, and how do the inserted macro definitions affect my experiments?
      • Evaluation stage: the line numbers of the faulty code change.
      • Compile-and-run stage: can the code still be compiled? No: gcc can only be invoked through Milu (the main function is gone).
      • When capturing coverage information, line-number consistency cannot be guaranteed.
    Is the GUI the only way to operate it? There is also a command line, but the only documentation is the following:
      Usage:
        milu [OPTION...] - Milu first order mutation testing system
      Help Options:
        -h, --help             Show help options
      Application Options:
        --GUI                  Call from milu_GUI
        --mode=M               Mode for each options
        --killingtime=K        Killing time for loop mutants
        --gcc=G                Specify a compiler to use
        -v, --verbose          Be verbose
        -s, --silent           Be silent
        --noindent             Not apply indent to pretty print source code
        -f, --test-file        Try to parse individual file
        -p, --test-project     Try to parse source folder
        -i, --init             Init Milu with project source path, output function list
        -m, --generate_mut     Generate mutants
        -t, --run_test         Run mutants

    CSAW (variable type optimization): the mutation types cannot be configured (it is possible at the source-code level, which is table-driven, but very hard to configure).
    Worked Example 1 – Bubble sort
      • How do we configure the mutation types?
      • What needs to be prepared before mutation? Copy out the relevant function.
      • What needs to be done after mutation? Put the mutated function back in place.
      • Do we need the driver and oracle programs? No. According to the CSAW paper, they mutate at function level, which is why they need something like a driver. What we need to do is put the function back into the original program, manually or automatically.
      • Which tool was it that, as I recall, was evaluated on Siemens in an earlier paper?
      • What are the symbol tables?
      • Which chapter of the manual is the most important? Procedure.
      • Why do global variables and global functions need to be defined in test.c? Because test.c does not know the global information.
      • Which program files does the process use, in order?
          test.c
          nogen.txt: suppress the generation of specific numbered mutants
          line.c
          mutant.c
          symbol_table.txt: be examined for errors
          pointers.h
          driver.c
    We first try to run the first step, line.c, to completion:
      ERROR : could not find input file <param.txt>
      • Tried moving the files over from the trash: Failed
      • Extracted the archive again: Failed
      • Wrote nogen.txt: Succeeded
    The most important question remains: how do we configure the mutation types?
    Next we try to run mutant.c to completion.
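  • Our mutation does not actually need to simulate the mistakes people make.
    Goals:
      • Make correct code extremely wrong, ideally so that the error propagates to the output on most test cases. That is, mutate to produce more failed runs.
      • Mutate faulty code so that its suspiciousness ranking changes little.
      • A problem here: if there was a lot of coincidental correctness originally and mutation reduces it, the suspiciousness of the fault will also increase sharply. So the goal is to mutate to produce more coincidental correctness.
    Open questions: which mutation operators exist, and how are mutations classified?
    Our selection criteria are also different:
      • Change in suspiciousness
      • Change in suspiciousness ranking
      • Change in coverage hit
      • Change in coverage count
    Statements that are not covered even though a run failed may also be faulty, because there may be multiple faults.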

    1. An Analysis and Survey of the Development of Mutation Testing [JH09]
       About 30 minutes
       Tao He, Software Engineering Laboratory, Department of Computer Science, Sun Yat-Sen University
       With thanks to Yue Jia and Mark Harman. These slides are mainly extracted from Jia and Harman’s survey work.
       May 2011, Sun Yat-Sen University, Guangzhou, China
       [JH09] Yue Jia, Mark Harman (September 2009). "An Analysis and Survey of the Development of Mutation Testing" (PDF). CREST Centre, King's College London, Technical Report TR-09-06.
    2. Outline
       • Objectives
       • Scope
       • Classification of Research
       • Fundamental Hypotheses
       • Related Concepts
       • Future Trend
       • Tools
    3. Objectives of Mutation Testing
       • Provide a “mutation adequacy score”
       • Measure the effectiveness of a test set in terms of its ability to detect faults
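As a small illustration of the mutation adequacy score, the sketch below uses the usual definition: the number of killed mutants divided by the number of non-equivalent mutants. The function name and the example counts are made up for illustration and do not come from the slides.

```python
def mutation_score(killed: int, total: int, equivalent: int = 0) -> float:
    """Mutation adequacy score: killed mutants / non-equivalent mutants."""
    non_equivalent = total - equivalent
    if non_equivalent <= 0:
        raise ValueError("there must be at least one non-equivalent mutant")
    if not 0 <= killed <= non_equivalent:
        raise ValueError("killed must lie between 0 and the non-equivalent count")
    return killed / non_equivalent


# Hypothetical numbers: 60 mutants generated, 10 judged equivalent, 45 killed.
score = mutation_score(killed=45, total=60, equivalent=10)
print(f"mutation score = {score:.2f}")  # mutation score = 0.90
```

A score of 1.0 would mean the test set kills every non-equivalent mutant, i.e. it is mutation-adequate.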
    4. Scope of Mutation Testing
       • The unit level
       • The integration level
       • The specification level
       • As a white-box unit test technique
    5. Theoretical work on Mutation Testing
       • Hypotheses supporting Mutation Testing
       • Optimization techniques
           • techniques for reducing computational cost
           • techniques for the detection of equivalent mutants
    6. Practical work on Mutation Testing
       • Applications of Mutation Testing
       • Development work on Mutation Testing tools
       • Empirical work
    7. Fundamental Hypotheses
    8. Competent Programmer Hypothesis (CPH)
       • Programs can be corrected by a few small syntactical changes.
    9. Coupling Effect
       • Test data sets that detect simple types of faults are sensitive enough to detect more complex types of faults
           • A simple fault is represented by a simple mutant, which is created by making a single syntactical change
           • A complex fault is represented by a complex mutant, which is created by making more than one change
    10. A Mutation Operator
       • A transformation rule that generates a mutant from the original program
       • Generates ‘realistic faults’
    11. Behaviors of Mutation Operators
       • Mutation Objects
           • Variables
           • Expressions
       • Mutation Operation
           • Replacement
           • Insertion
           • Deletion
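To make the idea of a mutation operator concrete, here is a minimal sketch of one replacement operator (turn a `+` into a `-`) that produces first-order mutants, one syntactic change each. It works on Python source through the standard `ast` module purely for illustration; the tools discussed in these slides target C, and the class and function names are hypothetical. `ast.unparse` needs Python 3.9 or later.

```python
import ast


class ReplaceNthAdd(ast.NodeTransformer):
    """Replacement operator: turn the n-th `+` (in visiting order) into `-`."""

    def __init__(self, target_index):
        self.target_index = target_index
        self.seen = 0

    def visit_BinOp(self, node):
        self.generic_visit(node)  # visit nested expressions first
        if isinstance(node.op, ast.Add):
            if self.seen == self.target_index:
                node.op = ast.Sub()
            self.seen += 1
        return node


def first_order_mutants(source):
    """Return one mutant per `+` occurrence, each with a single change."""
    additions = sum(
        isinstance(n, ast.BinOp) and isinstance(n.op, ast.Add)
        for n in ast.walk(ast.parse(source))
    )
    mutants = []
    for index in range(additions):
        tree = ReplaceNthAdd(index).visit(ast.parse(source))
        mutants.append(ast.unparse(tree))
    return mutants


SOURCE = "def total(a, b, c):\n    return a + b + c\n"
for mutant in first_order_mutants(SOURCE):
    print(mutant)
    print("---")
```

Applying the transformer twice to the same tree would yield a higher-order mutant, which corresponds to the "complex fault" of the coupling effect.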
    12. Problems of Mutation Analysis
       • Executing the enormous number of mutants
       • Human effort
           • The human oracle problem
           • The equivalent mutant problem
    13. Cost Reduction Techniques
    14. Classification of Cost Reduction Techniques
       • Reduction of the generated mutants (may be taken into consideration in our approach for Fault Localization)
       • Reduction of the execution cost
    15. Mutants Reduction Techniques
       • Mutant Sampling
           • Random
           • Based on the Bayesian sequential probability ratio test (SPRT)
       • Mutant Clustering
           • Based on killable test cases
       • Selective Mutation
           • By reducing the number of mutation operators applied
           • Ignore operators ASR and SVR (redundant generation)
           • Only using ABS and ROR
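Two of the reduction techniques above, random mutant sampling and selective mutation, can be sketched in a few lines. The mutant records (dictionaries with an `id` and an `operator` field), the function names, and the sample list are assumptions made for this example; they are not the data format of any tool mentioned in the slides.

```python
import random


def sample_mutants(mutants, fraction, seed=0):
    """Mutant sampling: keep a random fraction of the generated mutants."""
    if not mutants:
        return []
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    k = max(1, round(len(mutants) * fraction))
    return rng.sample(mutants, min(k, len(mutants)))


def selective_mutation(mutants, keep_operators):
    """Selective mutation: keep only mutants made by the chosen operators."""
    return [m for m in mutants if m["operator"] in keep_operators]


# Hypothetical mutant list; each mutant records the operator that produced it.
operators = ["ABS", "ROR", "ASR", "SVR", "ROR", "ABS", "SVR", "ASR", "ROR", "ABS"]
mutants = [{"id": i, "operator": op} for i, op in enumerate(operators)]

print(len(sample_mutants(mutants, fraction=0.3)))        # 3
print(len(selective_mutation(mutants, {"ABS", "ROR"})))  # 6
```

Sampling cuts the mutant set after generation, while selective mutation avoids generating mutants for the dropped operators in the first place; the filter here only imitates that effect on an existing list.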
    16. Evaluation of Mutants Reduction
       • A mean mutation score
       • Reduction in the number of mutants
       • Number of equivalent mutants
    17. Execution Cost Reduction Techniques
       • Strong Mutation
       • Weak Mutation
       • Firm Mutation
           • Provides a continuum of intermediate possibilities: its compare state lies between the intermediate state right after execution of the mutated component (Weak Mutation) and the final output (Strong Mutation).
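The difference between weak and strong mutation is where the original and the mutant are compared. The toy sketch below, with invented functions, returns both the state right after the mutated statement and the final output, so the two kill criteria can be checked side by side.

```python
def original(x):
    y = x * 2          # statement that will be mutated
    return y, y >= 0   # (state right after the statement, final output)


def mutant(x):
    y = x + 2          # mutated statement: `*` replaced by `+`
    return y, y >= 0


def weakly_killed(x):
    """Weak mutation: compare the state immediately after the mutated statement."""
    return original(x)[0] != mutant(x)[0]


def strongly_killed(x):
    """Strong mutation: compare only the final output."""
    return original(x)[1] != mutant(x)[1]


for x in (3, -1, 2):
    print(x, weakly_killed(x), strongly_killed(x))
# 3  True  False  (states 6 vs 5 differ, but both outputs are True)
# -1 True  True   (states -2 vs 1 differ, outputs False vs True)
# 2  False False  (both states are 4, so the mutant survives)
```

Input 3 shows why weak mutation is cheaper but less demanding: the infection is visible in the state and never propagates to the output. Firm mutation would place the comparison somewhere between these two points.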
    18. Equivalent Mutant Detection Techniques
       • 10% to 40% of mutants are equivalent
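An equivalent mutant differs syntactically from the original but behaves the same on every input, so no test case can kill it. A small invented example for integers: replacing `>` with `>=` in a two-argument maximum only changes which operand is returned when the two are equal, and equal operands give the same result either way.

```python
import itertools


def original_max(a, b):
    return a if a > b else b


def mutant_max(a, b):
    return a if a >= b else b  # relational operator replaced: `>` became `>=`


def killing_inputs(values):
    """Return every input pair on which original and mutant disagree."""
    return [
        (a, b)
        for a, b in itertools.product(values, repeat=2)
        if original_max(a, b) != mutant_max(a, b)
    ]


print(killing_inputs(range(-5, 6)))  # []
```

The empty result is only evidence, not proof: equivalence here follows from the reasoning above, not from the search. Deciding equivalence in general is undecidable, which is why it costs human effort.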
    19. Empirical Study
       • Compare mutation criteria with data flow criteria such as all-use
       • Compare mutants with real faults
    20. Future Trend
       • A need for high quality higher order mutants
       • A need to reduce the equivalent mutants
       • A preference for semantics over syntax mutation
       • An interest in achieving a better balance between cost and value
       • A pressing need to generate test cases to kill mutants
    21. Life Cycle of Mutation Testing
       • Generate mutants based on specified mutation operators
       • Reduce mutants
       • Run mutants against a test suite
    22. Mutation Testing Tools
    23. Goals of Mutation to Enhance Fault Localization
       • Does not aim to simulate real faults
       • Analyze the impact of mutation on the suspiciousness of the mutated statement
    24. Published C Mutation Testing Tools
        Name           | Application       | Year | Character                                                              | Available    | Suitable
        MILU           | C                 | 2008 | Higher Order Mutation, Search-based technique, Test harness embedding | Yes          | No
        MUFORMAT       | C                 | 2008 | Format String Bugs                                                     | No           | No
        ESPT           | C/C++             | 2008 | Tabular                                                                | No           | No
        CSAW           | C                 | 2007 | Variable type optimization                                             | Yes          | No
        ExMAn          | C, Java           | 2006 | TXL                                                                    | No           | No
        SESAME         | C, Lustre, Pascal | 2006 | Assembler Injection                                                    | No           | No
        Certitude      | C/C++             | 2006 | General (Commercial)                                                   | Commercially | No
        Plextest       | C/C++             | 2005 | General (Commercial)                                                   | Commercially | No
        Proteum/IM 2.0 | C                 | 2001 | Interface Mutation, Finite State Machines                              | Yes          | Yes
        Insure++       | C/C++             | 1998 | Source Code Instrumentation (Commercial)                               | Commercially | No
        TUMS           | C                 | 1995 | Mutant Schemata Generation                                             | No           | No
        Proteum 1.4    | C                 | 1993 | Interface Mutation, Finite State Machines                              | No           | No
    25. Proteum
       • Environment Variable: PROTEUMIMHOME
       • li -P pre-filename [-D directory] source-filename LI-filename
           • Calls gcc to preprocess the source code file source-filename.c and produce a file pre-filename.c, then parses pre-filename.c and generates various info files
       • opmuta [-<operator> n m] [-all n m] source-filename LI-filename
           • Applies mutant operators to the source code and LI files. As output, opmuta produces a description file in a format that muta is able to read and include in the mutant database.
    26. Proteum – An Example: Space (workflow diagram)
       • Input: space.c (source code) with its included header strutt .h
       • li: ~/smart_debugger/toolkit/proteum/li -P pre-space space li-space
           • Outputs: pre-space.c, li-space.nli, li-space.cgr, li-space.gfc
           • Contents: statement info, function info, call graph info, def-use pair info
       • opmuta: ~/smart_debugger/toolkit/proteum/opmuta -O Operators pre-one_statement li one_statement > mutants.txt
           • mutants.txt: all mutants info
       • GenetrateMutants .py: turns mutants.txt into Mutants, with the function name by directory, and the line number and mutation operator by mutant .dsc
    27. Thoughts on a Self-Made Mutant Generator
       • I once considered implementing a self-made mutant generator.
           • Only mutates one statement
           • Does not aim to simulate real faults
       • IPO
           • Input
               • A piece of source code
               • Line number to mutate
               • Mutation operators
           • Process
               • Preprocess -> Scan -> Parse -> Mutate
           • Output
               • Mutants
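The input/process/output outline above can be sketched as follows. This is only an illustration under several assumptions: it mutates Python source instead of C, it relies on the standard `ast` module to cover the scan and parse steps (there is no preprocess step), and the operator table and function names are hypothetical. `ast.unparse` needs Python 3.9 or later.

```python
import ast

# Hypothetical operator table: AST operator type -> replacement operator type.
OPERATORS = {ast.Add: ast.Sub, ast.Sub: ast.Add, ast.Lt: ast.GtE, ast.Gt: ast.LtE}


def _slots(tree, line, operators):
    """Find mutable operators on `line` as (node, index-or-None) pairs."""
    found = []
    for node in ast.walk(tree):
        if getattr(node, "lineno", None) != line:
            continue
        if isinstance(node, ast.BinOp) and type(node.op) in operators:
            found.append((node, None))
        elif isinstance(node, ast.Compare):
            for i, op in enumerate(node.ops):
                if type(op) in operators:
                    found.append((node, i))
    return found


def generate_mutants(source, line, operators=OPERATORS):
    """Input: source, line number, operators. Output: one mutant per change."""
    count = len(_slots(ast.parse(source), line, operators))
    mutants = []
    for k in range(count):
        tree = ast.parse(source)              # fresh parse for every mutant
        node, index = _slots(tree, line, operators)[k]
        if index is None:                     # arithmetic operator
            node.op = operators[type(node.op)]()
        else:                                 # comparison operator
            node.ops[index] = operators[type(node.ops[index])]()
        mutants.append(ast.unparse(tree))
    return mutants


SOURCE = "def area(w, h):\n    if w < 0:\n        return 0\n    return w + h\n"
print(generate_mutants(SOURCE, line=2)[0])
```

Restricting the search to one line number matches the "only mutates one statement" goal; re-parsing the source for each mutant keeps every mutant first-order.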
    28. Collection of Compiler Front-end Tools
       • GCC-XML … failed
           • E.g. source code and its parse tree (figure not preserved)
       • LLVM … not tried, good for Objective-C?
    29. Collection of Compiler Front-end Tools
       • GCC … failed, maybe no parse tree for GCC
           • -fdump-tree-fixup cfg-lineno
           • -fdump-tree-all -fdump-rtl-all
           • -fprofile-arcs -ftest-coverage
           • gcno, gcda: not human-readable
           • gcov: provides the line numbers, with statements separated
    30. Useful links
    31. Q&A
    32. Thank you! Contact me via
    33. Abstract
       Fault localization is a technique that aims to pinpoint faults by analyzing program execution spectra, while mutation is a testing technique that generates faulty programs called mutants. Most existing research in fault localization focuses on the spectrum of the given programs, and there have been few attempts to use the impact of mutation to enhance fault localization techniques. In this paper, we propose a strategy that automatically introduces mutation into fault localization techniques, and present variations of a heuristic method that computes the suspiciousness of each statement by considering the impact of mutation. To validate our method, experiments are conducted on benchmark programs, namely Siemens, grep, gzip, sed, space, flex, make, vim, and bash. Results indicate that the method can help programmers locate faults more effectively than methods without mutation.
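The abstract does not say which formula is used to compute suspiciousness. As background only, the sketch below computes spectrum-based suspiciousness with the well-known Ochiai metric, chosen here as a stand-in and not taken from the slides. The coverage data and names are invented for the example.

```python
import math


def ochiai(coverage, passed):
    """Spectrum-based suspiciousness per statement (Ochiai metric).

    coverage[t] is the set of statements executed by test t;
    passed[t] is True if test t passed.
    """
    total_failed = sum(1 for ok in passed if not ok)
    statements = set().union(*coverage)
    scores = {}
    for s in statements:
        ef = sum(1 for cov, ok in zip(coverage, passed) if s in cov and not ok)
        ep = sum(1 for cov, ok in zip(coverage, passed) if s in cov and ok)
        denominator = math.sqrt(total_failed * (ef + ep))
        scores[s] = ef / denominator if denominator else 0.0
    return scores


# Invented spectrum: three tests over statements 1..3; only the last test fails.
coverage = [{1, 2, 3}, {1, 3}, {1, 2}]
passed = [True, True, False]
scores = ochiai(coverage, passed)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # [2, 1, 3]
```

Re-running such a computation on a mutated program and comparing the scores or the ranking of the mutated statement is one way to observe the "impact of mutation" that the abstract refers to.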