A Survey on Automatic Software Evolution Techniques

Jindae's PQE

Published in: Software
  1. A Survey on Automatic Software Evolution Techniques
     Jindae Kim, PhD Qualifying Examination, Aug 31, 2015, HKUST
  2. Overview
     • Automatic Software Evolution
     • Approaches
     • Challenges
     • Proposed Idea
  3. Automatic Software Evolution
     • An activity or technique that evolves software automatically.
     • Supports the software development process and increases the productivity of human developers.
  4. Table: Area, Techniques
     • Refactoring: Henkel et al. (2005), Murphy-Hill et al. (2007), Higo et al. (2008), Tsantalis et al. (2009), Tsantalis et al. (2010), Tsantalis et al. (2011), Dijkman et al. (2011)
     • Automatic Patch Generation: Arcuri (2008), Arcuri et al. (2008), Dallmeier et al. (2009), Weimer et al. (2009), Wei et al. (2010), Orlov and Sipper (2011), Le Goues et al. (2012), Kim et al. (2013), Nguyen et al. (2013), Long et al. (2015)
     • Automatic Runtime Recovery: Rinard et al. (2004), Elkarablieh and Khurshid (2008), Dobolyi et al. (2008), Nagarajan et al. (2009), Perkins et al. (2009), Carbin et al. (2011), Kling et al. (2012), Carzaniga et al. (2013), Long et al. (2014)
     • Performance Improvement: White et al. (2008), Langdon et al. (2010), Orlov et al. (2011), White et al. (2011), Harman et al. (2012), Langdon et al. (2013), Petke et al. (2014)
  5. Approaches
  6. Generate and Validate
     • The most recent and popular approach in automatic software evolution.
     • Evolves a program in various aspects and validates the result.
  7. [Figure: Overview of a Generate and Validate system. Components shown: Seed Program, Generate, Program Variants, Validate (with Pass and Fail outcomes), Target Program.]
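The generate-and-validate loop in the overview figure can be sketched as follows. This is a minimal illustration, not any surveyed tool's implementation: the `Program` type, `generate_and_validate`, `toy_generate`, and `toy_validate` are hypothetical names, and the "test suite" is two hard-coded checks.

```python
from typing import Callable, Iterable, Optional

Program = str  # assumption: a program is represented as its source text


def generate_and_validate(
    seed: Program,
    generate: Callable[[Program], Iterable[Program]],
    validate: Callable[[Program], bool],
) -> Optional[Program]:
    """Return the first variant of `seed` that passes validation, else None."""
    for variant in generate(seed):
        if validate(variant):  # "Pass": accept this variant as the target program
            return variant
        # "Fail": discard the variant and move on to the next one
    return None


# Toy example: the seed computes a + b, but the tests expect a * b.
def toy_generate(program: Program) -> Iterable[Program]:
    """Hypothetical generator: swap the operator for each alternative."""
    for op in ["-", "*", "/"]:
        yield program.replace("+", op)


def toy_validate(program: Program) -> bool:
    """Hypothetical test suite; eval is tolerable only because the input is trusted."""
    f = eval(program)
    return f(2, 3) == 6 and f(4, 5) == 20


result = generate_and_validate("lambda a, b: a + b", toy_generate, toy_validate)
print(result)
```

A variant that passes validation is only plausible with respect to the tests used; whether it is also correct depends on how well those tests capture the intended behavior.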
  8. [Figure, built up over slides 8 to 13: Search Space of Possible Program Variants, drawn over two Program Feature axes. Successive builds add the Seed Program, Plausible Variants, and the Target Variant, followed by the annotations "Good!", "Not Bad", and "Wrong!".]
  14. Categorization of Generate and Validate Approaches (variant generation, search method)
     • Simple Mutations, Genetic Programming (GP): Arcuri et al. (2008), FINCH (Orlov et al. 2011).
     • Mutations with Existing Source Code, Genetic Programming (GP): GenProg (Le Goues et al. 2012), Petke et al. (2014).
     • Pre-defined Templates, Genetic Programming (GP): PAR (Kim et al. 2013).
     • Simple Mutations, Random/Heuristic Search: Debroy and Wong (2010), SemFix (Nguyen et al. 2013).
     • Mutations with Existing Source Code, Random/Heuristic Search: AE (Weimer et al. 2013), TrpAutoRepair (Qi et al. 2013), RSRepair (Qi et al. 2014).
     • Pre-defined Templates, Random/Heuristic Search: SPR (Long et al. 2015), Prophet (Long et al. 2015).
  15. Simple Mutations + Genetic Programming
  16. Genetic Programming
     [Figure: Overview of GenProg (Le Goues, 2013)]
  17. Co-Evolutionary Method
     • Co-evolves source code and a test suite.
     • Uses 34 basic primitives.
     • Evaluated on eight seeded faults, of which it fixed five.
  18. Co-Evolutionary Method
  19. [Figure: Partial GP implementation of Bubble Sort, by Arcuri et al.]
  20. Simple Mutations + Random/Heuristic Search
  21. SemFix
     • Program repair via semantic analysis.
     • Selects a target statement based on fault localization.
     • Generates a repair constraint based on a test suite.
     • Synthesizes a new statement that satisfies the constraint.
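To make the "repair constraint, then synthesis" idea concrete, here is a toy sketch. It is not SemFix's method: it swaps in brute-force enumeration over a few simple components for the tool's semantic analysis, and the `Constraint` type, `synthesize`, and the example data are all hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

# Assumed form of a repair constraint: for each test, the variable values seen
# at the suspicious statement and the value the repaired expression must yield.
Constraint = List[Tuple[Dict[str, int], int]]


def synthesize(constraint: Constraint, variables: List[str]) -> Optional[str]:
    """Enumerate `left op right` expressions; return the first that fits every test."""
    operands = variables + ["0", "1"]  # simple components: variables and constants
    for left, op, right in product(operands, ["+", "-", "*"], operands):
        expr = f"{left} {op} {right}"
        if all(eval(expr, {}, env) == want for env, want in constraint):
            return expr
    return None


# Three tests, each giving the state at the faulty statement and the needed result.
constraint = [({"x": 2, "y": 3}, 5), ({"x": 4, "y": 1}, 5), ({"x": 0, "y": 7}, 7)]
print(synthesize(constraint, ["x", "y"]))
```

Because the components here are so small, only very simple statements can be synthesized, which mirrors the limitation the slides attribute to this family of techniques.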
  22. SemFix
     [Figure: Basic components used in SemFix, by Nguyen et al.]
  23. SemFix
     • SemFix repaired 48 out of 90 bugs from SIR and GNU Coreutils.
     • SemFix also generated repairs faster than a GP-based technique.
  24. Table: Technique, Description, Limitation
     • Co-evolutionary Method. Description: evolves source code and a test suite together, using simple primitives. Limitation: evaluated on seeded faults for a simple program; primitives are too small.
     • SemFix. Description: derives repair constraints from source code and test cases, then synthesizes a statement satisfying the constraints. Limitation: components used in statement synthesis are simple; only evaluated on small programs.
  25. Existing Source Code + Genetic Programming
  26. GenProg
     [Figure: Overview of GenProg (Le Goues, 2013)]
  27. GenProg
     • An automatic program repair technique.
     • Mutates programs by statement insertion, deletion, and replacement.
     • Takes statements from source code in the same revision.
     • Fixed 55 out of 105 bugs.
     • Assumes that the code needed for a patch already exists in the existing source code.
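The statement-level mutations listed above can be sketched as below. This is a simplified illustration under stated assumptions, not GenProg's implementation: a program is modeled as a flat list of statement strings, and `mutate` is a hypothetical helper. The point it shows is that donor statements are only copied from the existing code, never created.

```python
import random
from typing import List


def mutate(statements: List[str], rng: random.Random) -> List[str]:
    """Apply one insert/delete/replace mutation, reusing only existing statements."""
    variant = list(statements)              # never modify the seed in place
    op = rng.choice(["insert", "delete", "replace"])
    target = rng.randrange(len(variant))    # location to change
    donor = rng.choice(statements)          # donor comes from the same program
    if op == "insert":
        variant.insert(target, donor)
    elif op == "delete":
        del variant[target]
    else:
        variant[target] = donor
    return variant


seed = ["x = 1", "y = 2", "z = x + y"]
print(mutate(seed, random.Random(0)))
```

Every statement in a variant already appears in the seed, so a fix that needs a genuinely new statement is outside this search space.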
  28. Genetic Improvement
     • Petke et al. evolve the MiniSAT solver for Combinatorial Interaction Testing (CIT).
     • Uses multiple human-written variations of the MiniSAT solver as the code base.
     • The evolved MiniSAT is even faster on CIT than the human-written versions.
  29. Existing Source Code + Random/Heuristic Search
  30. AE
     • Generates all possible variants one by one.
     • Uses the same mutations as GenProg.
     • Selects a fix location based on fault localization.
     • Detects equivalent variants and skips their validation.
     • Generates first-order mutants only.
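A rough sketch of this deterministic, first-order enumeration follows. It rests on loud assumptions: programs are lists of statement strings, `first_order_variants` is a hypothetical name, and exact textual duplication stands in for AE's equivalence detection, which the slides do not describe in detail.

```python
from typing import Iterator, List, Tuple


def first_order_variants(
    statements: List[str], suspicious: List[int]
) -> Iterator[Tuple[str, ...]]:
    """Yield each distinct single-mutation variant once, in a fixed order."""
    seen = {tuple(statements)}  # the unchanged seed need not be validated again
    for loc in suspicious:      # locations as ranked by fault localization
        candidates = [statements[:loc] + statements[loc + 1:]]                    # delete
        for donor in statements:
            candidates.append(statements[:loc] + [donor] + statements[loc + 1:])  # replace
            candidates.append(statements[:loc] + [donor] + statements[loc:])      # insert
        for cand in candidates:
            key = tuple(cand)
            if key not in seen:  # stand-in for equivalence detection
                seen.add(key)
                yield key


stmts = ["a = 1", "a = 1", "b = 2"]
variants = list(first_order_variants(stmts, [0, 1]))
print(len(variants))
```

Skipping duplicates saves validation runs, which matter because each validation means running the test suite. The enumeration never combines two mutations, so multi-edit fixes are out of reach.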
  31. RSRepair
     • RSRepair has the same search space as GenProg.
     • Generates variants one by one, which requires fewer patch trials.
     • Randomly selects a change location.
     • Outperforms GenProg in 23 out of 24 cases.
  32. Table: Technique, Description, Limitation
     • GenProg. Description: statement-level mutations; inserts or replaces statements using existing code.
     • Genetic Improvement. Description: statement-level mutations; uses multiple variations of the same program as the code base.
     • Limitation (GenProg and Genetic Improvement): lack of ability to create new statements; fitness-guided search is not effective.
     • AE. Description: deterministic search; equivalent variant detection.
     • RSRepair. Description: random search.
     • Limitation (AE and RSRepair): lack of ability to create new statements; search space is limited to variants with only one mutation.
  33. Pre-defined Templates + Genetic Programming
  34. PAR
     [Figure: Overview of the PAR framework, by Kim et al.]
  35. PAR
     • PAR uses 10 pre-defined fix templates.
     • The fix templates are drawn from manual inspection of human-written patches.
     • The fix templates include a null checker, parameter changes, and expression changes.
     • Patches that require new code can be generated.
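As an illustration of what a fix template does, the sketch below applies a null-checker style template to one statement. It is a hypothetical Python analogue written for this transcript, not one of PAR's actual templates; `null_checker` and the example statement are assumptions.

```python
def null_checker(statement: str, variable: str) -> str:
    """Hypothetical fix template: guard `statement` with a None check on `variable`."""
    body = statement.lstrip()
    indent = statement[: len(statement) - len(body)]  # keep the original indentation
    return f"{indent}if {variable} is not None:\n{indent}    {body}"


patched = null_checker("    total += order.price", "order")
print(patched)
```

The guard line is new code that does not appear anywhere in the original program, which is how a template can produce patches that statement copying alone cannot.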
  36. Patch Acceptability
     [Figure: Average ranks as evaluated by 68 developers, by Kim et al.]
  37. Pre-defined Templates + Heuristic Search
  38. SPR
     • Uses seven transformation schemas.
     • Uses condition synthesis to instantiate transformation schemas.
     • Applies schemas in a pre-defined order.
     • Prioritizes transformations on branches and memory initializations.
  39. Plausible vs. Correct Patches
     Table: Repair generation results on a total of 69 defects, by Long et al.
     • Plausible: SPR 37, GenProg 16, AE 25
     • Correct: SPR 11, GenProg 1, AE 2
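The gap between plausible and correct patches is easier to see as ratios. The short calculation below uses only the counts from the table above; the `rates` helper and its key names are hypothetical.

```python
from typing import Dict, Tuple

TOTAL_DEFECTS = 69
# (plausible, correct) counts per tool, copied from the table above
RESULTS: Dict[str, Tuple[int, int]] = {"SPR": (37, 11), "GenProg": (16, 1), "AE": (25, 2)}


def rates(plausible: int, correct: int, total: int = TOTAL_DEFECTS) -> Dict[str, float]:
    """Return plausible and correct patch rates, plus the share of plausible patches that are correct."""
    return {
        "plausible_rate": plausible / total,
        "correct_rate": correct / total,
        "correct_among_plausible": correct / plausible,
    }


for tool, (plausible, correct) in RESULTS.items():
    r = rates(plausible, correct)
    print(tool, {name: round(value, 3) for name, value in r.items()})
```

For every tool, most plausible patches are not correct, which is why passing the test suite alone is a weak acceptance criterion.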
  40. [Figure: Search space comparison, by Long et al.]
  41. Prophet
     • Uses the same transformation schemas as SPR.
     • Learns a model from successful patches.
     • Ranks candidate patches based on the trained model.
     • Prophet generated correct patches for 15 defects, while SPR generated correct patches for 11.
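Model-based ranking of candidate patches can be sketched as follows. The slides do not give Prophet's features or model form, so everything here is an assumption for illustration: a plain linear score, made-up feature names, and hand-picked weights standing in for learned ones.

```python
from typing import Dict, List


def rank_candidates(
    candidates: List[Dict[str, float]], weights: Dict[str, float]
) -> List[int]:
    """Return candidate indices ordered from highest to lowest model score."""

    def score(features: Dict[str, float]) -> float:
        # Linear score; features without a weight contribute nothing.
        return sum(weights.get(name, 0.0) * value for name, value in features.items())

    return sorted(range(len(candidates)), key=lambda i: score(candidates[i]), reverse=True)


# Hypothetical weights, as if learned from past successful patches.
weights = {"adds_guard": 1.5, "deletes_statement": -2.0}
candidates = [{"deletes_statement": 1.0}, {"adds_guard": 1.0}, {}]
print(rank_candidates(candidates, weights))  # [1, 2, 0]
```

Validation would then try candidates in this order, so a good model reaches a correct patch before a merely plausible one. The search space itself is unchanged; only the order of exploration differs.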
  42. [Figure, built up over slides 42 to 44: Repair Searches of SPR and Prophet, drawn over two Program Feature axes with the Seed Program, Plausible Variants, and the Target Variant marked. Successive builds add the SPR and Prophet searches.]
  45. Table: Technique, Description, Limitation
     • PAR. Description: 10 fix templates from manual inspection. Limitation: only some of the fix templates are useful; template instantiation uses existing code.
     • SPR. Description: seven transformation schemas; condition synthesis for schema instantiation. Limitation: hard-coded heuristic search; search space is limited by the schemas.
     • Prophet. Description: same transformation schemas as SPR; ranks variants based on a probabilistic model. Limitation: search space is limited by the schemas.
  46. Challenges
  47. Search Space Explosion
     • Pre-defined templates limit the search space.
     • SPR's search space contains correct variants for only 19 out of 69 defects.
     • 35 out of 69 defects can be fixed with an extended search space (Long et al. 2015).
     • What about the additional costs?
  48. Search Method
     • Random application of mutations may generate plausible but incorrect variants (Qi et al. 2015).
     • Prophet finds four more correct patches than SPR.
     • The only difference between the two is the search method.
     • Extending the search space makes the search even harder.
     • An effective and efficient search method is necessary.
  49. How to address the issues?
     • Avoid error-prone changes by learning from existing changes.
     • PAR and SPR show that the template approach works.
     • Identify frequent changes from software repositories, then use them as templates.
     • Mine usage patterns of such changes to assist the search.
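The "identify frequent changes, then use them as templates" step could start from something as simple as counting change kinds. The sketch below assumes changes have already been extracted from a repository and labeled; the labels, `frequent_changes`, and `min_support` are hypothetical, and the slides do not specify how the proposed mining would actually work.

```python
from collections import Counter
from typing import List, Tuple


def frequent_changes(history: List[str], min_support: int) -> List[Tuple[str, int]]:
    """Return change kinds seen at least `min_support` times, most frequent first."""
    counts = Counter(history)
    return [(kind, n) for kind, n in counts.most_common() if n >= min_support]


# Hypothetical change labels extracted from past commits.
history = [
    "add_null_check", "change_parameter", "add_null_check",
    "rename_variable", "add_null_check", "change_parameter",
]
print(frequent_changes(history, min_support=2))
```

Change kinds that clear the support threshold become template candidates, and the same counts could serve as a prior when deciding which template to try first.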
  50. Summary
     • Automatic software evolution has been used in many areas.
     • Generate and Validate systems have advanced in two major directions: program variant generation and search method.
     • The current challenges are search space explosion and the need for an effective search method.
  51. Table: Approach, Limitation
     • Variant Generation, Simple Mutations: applied modifications are very simple; scalability issue, only works for small programs.
     • Variant Generation, Mutations with Existing Source Code: existing code restricts possible program variants; low possibility that the necessary code fragments exist.
     • Variant Generation, Pre-defined Templates: pre-defined templates restrict the search space; only a small number of templates are used.
     • Search Method, Genetic Programming: additional costs for fitness evaluation; fitness-guided search is not effective.
     • Search Method, Random/Heuristic Search: search space is limited based on the number of mutations; mostly considers only one mutation.
