Upcoming SlideShare
×

# The Robust Optimization of Non-Linear Requirements Models

885 views

Published on

Masters Defense, WVU Dept. CSEE. April 7, 2010.

Published in: Technology
0 Likes
Statistics
Notes
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

• Be the first to like this

Views
Total views
885
On SlideShare
0
From Embeds
0
Number of Embeds
71
Actions
Shares
0
19
0
Likes
0
Embeds 0
No embeds

No notes for slide

### The Robust Optimization of Non-Linear Requirements Models

1. 1. The Robust Optimization of Non- Linear Requirements Models GREGORY GAY THESIS DEFENSE WEST VIRGINIA UNIVERSITY GREG@GREGGAY.COM
2. 2. 2 Consider a requirements model…   Contains:   Various goals of a project.   Methods for reaching those goals.   Risks that prevent those goals.   Mitigations that remove risks.   (but carry costs)   A solution: balance between cost and attainment.   This is a non-linear optimization problem!
3. 3. 3 Understanding the Solution Space   Open and pressing issue. [Harman ‘07]   Many SE problems are over-constrained.   No right answer, so give partial solutions.   Robustness of solutions is key – many algorithms give brittle results. [Harman ‘01]   Important to present insight into the neighborhood.   What happens if I do B instead of A?
4. 4. 4 The Naïve Approach   Naive approaches to understanding neighborhood:   Run N times and report (a) the solutions appearing in more than N/2 cases, or (b) results with a 95% confidence interval.   Both are flawed – they require multiple trials!   Neighborhood assessment must be fast [Feather ’08]   Real-time if possible.   [Nielson ‘93] states that results must be within:   1Second before the mind begins to drift.   10 seconds before the mind has completely moved on.
5. 5. 5 Research Goals   Two important concerns:   Is demonstrating solution robustness a time-consuming task?   Must solution quality be traded against solution robustness?
6. 6. 6 What We Want   These are the features we want out of an algorithm: Feature Algorithm High-Quality Results* Yes Low Result Variance Yes Neighborhood (Tame) Assessment Scores Plateau to Stability Yes (Well-behaved) Speed Real-Time Results Yes Scalability Yes *(We want “more for less”)
7. 7. 7 Why Do We Care?   The later in the development process that a defect is identified, the more exponentially expensive it becomes to fix it (Paraphrased from Barry Boehm [Boehm ‘81])
8. 8. 8 Roadmap   Model   Algorithms   Experiments   Future Work   Conclusions
9. 9. 9 The Defect Detection and Prevention Model   Used at NASA JPL by Martin Feather’s “Team X”   [Cornford ’01, Feather ‘02, Feather ’08, Jalali ‘08]   Early-lifecycle requirements model   Light-weight ontology represents:   Requirements: project objectives, weighted by importance.   Risks: events that damage attainment of requirements.   Mitigations: precautions that remove risk, carry a cost value.   Mappings: Directed, weighted edges between requirements and risks and between risks and mitigations.   Part-of-relations: Provide structure between model components.
10. 10. 10 Light-weight != Trivial
11. 11. 11 Why Use DDP?   Three Reasons:   1. Demonstrably useful. [Feather ‘02, Feather ‘08]   Costsavings often over \$100,000   Numerous design improvements seen in DDP sessions   Overall shift in risks in JPL projects.   2. Availability of real-world models [Jalali ‘08]   Now and in the future.
12. 12. 12 The Third Reason   DDP is representative of other requirements tools.   Set of influences, expressed in a hierarchy, with relationships modeled through equations. QOC [MacLean ‘96] Soft-Goals [Myloupolus ‘99]   Sensitivity analysis [Cruz ‘73] not applicable to DDP.
13. 13. 13 Using DDP   Input = Set of enabled mitigations.   Output = Two values: (Cost, Attainment)   Those values are normalized and combined into a single score [Jalali ‘08]:
14. 14. 14 Roadmap   Model   Algorithms   Experiments   Future Work   Conclusions
15. 15. 15 Search-Based Software Engineering No single solution, so reformulate as a search problem and find several!   Four factors must be met: [Harman ’01, Harman ‘04]   1. A large search space.   2. Low computational complexity.   3. Approximate continuity (in the score space).   4. No known optimal solutions.   DDP Problem fits all:   1. Some models have up to (299 = 6.33*1029) possible settings.   2. Calculating the score is fast, algorithms run in O(N2)   3. Discrete variables, but continuous score space.   4. Solutions depend on project settings, optimal not known.
16. 16. 16 Theory of KEYS   Theory: A minority of variables control the majority of the search space. [Menzies ‘07]   If so, then a search that (a) finds those keys and (b) explores their ranges will rapidly plateau to stable, optimal solutions.   This is not new: narrows, master-variables, back doors, and feature subset selection all work on the same theory.   [Amarel ’86, Crawford ’94, Kohavi ’97, Menzies ’03, Williams ’03]   Everyone reports them, but few exploit them!
17. 17. 17 KEYS Algorithm   Two components: greedy search and a Bayesian ranking method (BORE = “Best or Rest”).   Each round, a greedy search:   Generate 100 configurations of mitigations 1…M.   Score them.   Sort top 10% of scores into “Best” grouping, bottom 90% into “Rest.”   Rank individual mitigations using BORE.   The top ranking mitigation is fixed for all subsequent rounds.   Stop when every mitigation has a value, return final cost and attainment values.
18. 18. 18 BORE Ranking Heuristic   We don’t have to actually search for the keys, just keep frequency counts for “best” and “rest” scores.   BORE [Clark ‘05] based on Bayes’ theorem. Use those frequency counts to calculate:   To avoid low-frequency evidence, add support term:
19. 19. 19 KEYS vs KEYS2   KEYS fixes a single top- ranked mitigation each round.   KEYS2 [Gay ‘10] incrementally sets more (1 in round 1, 2 in round 2… M in round M)   Slightly less tame, much faster.
20. 20. 20 Discovering KEYS2   KEYS2 is simple, developing it was not!   Many different heuristics tried:   Simple: Complex:   Set more in powers of 2 Shift best/rest slope   ln(round) Neighborhood exploration   top N% KEYS-R   Lesson in simplicity.
21. 21. 21 Benchmarked Algorithms   KEYS much be benchmarked against standard SBSE techniques.   Simulated Annealing, MaxWalkSat, A* Search   Chosen techniques are discrete, sequential, unconstrained algorithms. [Gu ‘97]   Constrained searches work towards a pre-determined number of solutions, unconstrained adjust to their goal space.
22. 22. 22 Simulated Annealing   Classic, yet common, approach. [Kirkpatrick ’83]   Choose a random starting position.   Look at a “neighboring” configuration.   If it is better, go to it.   If not, move based on guidance from probability function (biased by the current temperature).   Over time, temperature lowers. Wild jumps stabilize to small wiggles.
23. 23. 23 MaxWalkSat   Hybridized local/random search. [Kautz ‘96, Selman ‘93]   Start with random configuration.   Either perform   Local Search: Move to a neighboring configuration with a better score. (70%)   Random Search: Change one random mitigation setting. (30%)   Keeps working towards a score threshold. Allotted a certain number of resets, which it will use if it fails to pass the threshold within a certain number of rounds.
24. 24. 24 A* Search   Best first path-finding heuristic. [Hart ‘68]   Uses distance from origin (G) and estimated cost to goal (H), and moves to the neighbor that minimizes G+H.   Moves to new location and adds the previous location to a closed list to prevent backtracking.   Optimal search because it always underestimates H.   Stops after being stuck for 10 rounds.
25. 25. 25 Other Methods   [Gu ‘97]’s survey lists hundreds of methods!   Gradient descent methods and sensitivity analysis assume a continuous range for model variables.   DDP models are discrete!   Integer programming could still be used (CPLEX [Mittelmann ‘07])   Too slow! [Coarfa ‘00]   SE problems are over-constrained, so a solution over all constraints is not possible. [Harman ’07]   Parallel algorithms   Communications overhead overwhelms benefits.
26. 26. 26 Roadmap   Model   Algorithms   Experiments   Future Work   Conclusions
27. 27. 27 Experiment 1: Costs and Attainments   We use real-world models 2,4,5 (1,3 are too small and were only used for debugging).   Models discussed in [Feather ‘02, Jalali ‘08, Menzies ‘03]   Run each algorithm 1000 times per model.   Removed outlier problems by generating a lot of data points.   Still a small enough number to collect results in a short time span.   Graph cost and attainment values.   Values towards bottom-right better.
28. 28. 28 Experiment 1 Results Model Cost (Y-Axis) Algorithm Attainment (X-Axis)
29. 29. 29 Experiment 1 Results Bad Good
30. 30. 30 Experiment 1 Results
31. 31. 31 Summing it Up… Feature Simulated MaxWalkSat A* KEYS KEYS2 Annealing High-Quality Results No No Yes Yes Yes Low Result Variance No No No Yes Yes (Tame) Scores Plateau to Stability ? ? ? ? ? (Well-behaved) Real-Time Results ? ? ? ? ? Scalability ? ? ? ? ?
32. 32. 32 Experiment 2: Runtimes   For each model:   Run each algorithm 100 times.   Record runtime using Unix “time” command.   Divide runtime/100 to get average.
33. 33. 33 Experiment 2 Results Model 2 Model 4 Model 5 (31 mitigations) (58 mitigations) (99 mitigations) Simulated 0.577 1.258 0.854 Annealing MaxWalkSat 0.122 0.429 0.398 A* Search 0.003 0.017 0.048 KEYS 0.011 0.053 0.115 KEYS2 0.006 0.018 0.038
34. 34. 34 Summing it Up… Feature Simulated MaxWalkSat A* KEYS KEYS2 Annealing High-Quality Results No No Yes Yes Yes Low Result Variance No No No Yes Yes (Tame) Scores Plateau to Stability ? ? ? ? ? (Well-behaved) Real-Time Results No No Yes Yes Yes Scalability ? ? ? ? ?
35. 35. 35 Experiment 3: Scale-Up Study   By 2013, we expect DDP models 8x larger than those used in this thesis (from 2008) Year Num. Variables 2004 30   KEYS and KEYS2 are 2008 100 “real time” now, but will 2010 300 they scale? 2013 800
36. 36. 36 Artificial Model Generation   To test scalability, a generator builds synthesized models by:   Studying the existing real-world models and collecting statistics on its internal structure.   Mutating them into larger models based on user-input size, density, and distribution altering parameters   Models built that were 2,4, and 8 times larger than existing models.
37. 37. 37 Scale-Up Results (1)
38. 38. 38 Scale-Up Results (2)
39. 39. 39 Scale-Up Results (3) KEYS KEYS2 Runtimes Model Calls Runtimes Model Calls Exponential 0.82 0.83 0.88 0.93 Polynomial 0.99 0.99 0.99 0.98 (of degree 2)   KEYS and KEYS2 fit to O(N2)   Both scale to larger models, but KEYS requires exponentially more model calls (thus, large jump in execution time)   Thus, we recommend KEYS2
40. 40. 40 Summing it Up… Feature Simulated MaxWalkSat A* KEYS KEYS2 Annealing High-Quality Results No No Yes Yes Yes Low Result Variance No No No Yes Yes (Tame) Scores Plateau to Stability ? ? ? ? ? (Well-behaved) Real-Time Results No No Yes Yes Yes Scalability No No Maybe No Yes
41. 41. 41 Decision Ordering Diagrams   Design of KEYS2 automatically provides a way to explore the decision neighborhood.   Decision ordering diagrams – Visual format that ranks decisions from most to least important . [Gay ‘10]
42. 42. 42 Decision Ordering Diagrams (2)   These diagrams can be used to assess solution robustness in linear time by   (A) Considering the variance in performance after applying X decisions.   Spread = measure of variance (75th-25th quartile)
43. 43. 43 Decision Ordering Diagrams (3)   These diagrams can be used to assess solution robustness in linear time by   (B) Comparing the results of using the first X decisions to that of X-1 or X+1.
44. 44. 44 Decision Ordering Diagrams (4)   Useful under three conditions:   (a) scores output are well-behaved   (b) variance is tamed   (c) they are generated in a timely manner.
45. 45. 45 Summing it Up… Feature Simulated MaxWalkSat A* KEYS KEYS2 Annealing High-Quality Results No No Yes Yes Yes Low Result Variance No No No Yes Yes (Tame) Scores Plateau to Stability No No No Yes Yes (Well-behaved) Real-Time Results No No Yes Yes Yes Scalability No No Maybe Yes Yes
46. 46. 46 Roadmap   Model   Algorithms   Experiments   Future Work   Conclusions
47. 47. 47 Future Work   Weakness in KEYS2 – it must call the model 100x per round.   70% of execution time is spent calling the model.   To improve –call the model less.   Problem: do this without losing support!
48. 48. 48 Future Directions   Two main parts – cache and active learning.   Don’t call the model, just look up scores.   Clustering must be efficient [Jiang ’08, Poshyvanyk ‘08]   First round – generate 100 random configurations.   Greedily cluster them, assign identifiers to each, build a hierarchical distance tree.   After this, can simply drop new instances into the tree (as the number of clusters will remain fixed).   The model only needs to be called for new instances.
49. 49. 49 Future Directions (2)   Most learners passive – blindly generating or reading data.   Active learners exercise control over what data they learn on [Cohn ‘94].   Many clusters are linear – their members form a smooth area of the search space with similar scores [Shepperd ‘97].   If a new configuration falls into a linear region, don’t poll the model, just interpolate its score.   Train KEYS2 to outright ignore linear regions with poor scores.
50. 50. 50 KEYS as a Component Treatment Learning Bayesian Nets (NASA Ames) (Tsinghua University)   Work with Misty Davies, Karen   Work with Hongyu Zhang. Gundy-Burlet.   Bayes Net = directed graph of   Design process: Run variables and their influences. simulations, then apply   Variable ordering problem Treatment Learning via [Hsu ‘04] TAR3 or TAR4.1   Order of variables examined by [Menzies ‘03b] learning process is crucial to performance!   KEYS is a multi-stage   Need a ranking from best -> worst version of TAR4.1   Genetic algorithms sometimes used, but they are slow.   Future – KEYS as part of   KEYS offers a potential solution the simulator. to the VOP.
51. 51. 51 Roadmap   Model   Algorithms   Experiments   Future Work   Conclusions
52. 52. 52 Conclusions   Optimization tools can study the space of requirements, risks, and mitigations.   Finding a balance between costs and attainment is hard!   Such solutions can be brittle, so we must comment on solution robustness. [Harman ‘01]
53. 53. 53 Conclusions (2)   Pre-experimental concerns:   An algorithm would need to trade solution quality for robustness (variance vs score).   Demonstrating solution robustness is time-consuming and requires multiple procedure calls.   KEYS2 defies both concerns.   Generates higher quality solutions than standard methods, and generates results that are tame and well-behaved (thus, we can generate decision ordering graphs to assess robustness).   Is faster than other techniques, and can generate decision ordering graphs in O(N2)
54. 54. 54 We Recommend KEYS2 Feature Simulated MaxWalkSat A* KEYS KEYS2 Annealing High-Quality Results No No Yes Yes Yes Low Result Variance No No No Yes Yes (Tame) Scores Plateau to Stability No No No Yes Yes (Well-behaved) Real-Time Results No No Yes Yes Yes Scalability No No Maybe No Yes
55. 55. 55 References   Slide 3:   [Harman ‘01] M. Harman and B.F. Jones. Search-based software engineering. Journal of Information and Software Technology, 43:833–839, December 2001.   [Harman ‘07] Mark Harman. The current state and future of search based software engineering. In Future of Software Engineering, ICSE’07. 2007.   Slide 4:   [Feather ‘08] M. Feather, S. Cornford, K. Hicks, J. Kiper, and T. Menzies. Application of a broad- spectrum quantitative requirements model to early-lifecycle decision making. IEEE Software, 2008.   [Nielson ‘93] J. Nielson. Usability Engineering. Academic Press, 1993.   Slide 7:   [Boehm ‘81] B. W. Boehm. Software Engineering Economics. Prentice Hall PTR, Upper Saddle River, NJ, USA, 1981.   Slide 9:   [Cornford ‘01] S.L. Cornford, M.S. Feather, and K.A. Hicks. DDP a tool for life-cycle risk management. In IEEE Aerospace Conference, Big Sky, Montana, pages 441–451, March 2001.   [Feather ‘02] M.S. Feather and T. Menzies. Converging on the optimal attainment of requirements. In IEEE Joint Conference On Requirements Engineering ICRE’02 and RE’02, 9-13th September, University of Essen, Germany, 2002.   [Jalali ‘08] Tim Menzies, Omid Jalali, and Martin Feather. Optimizing requirements decisions with keys. In Proceedings PROMISE ’08 (ICSE), 2008.   Slide 12:   [Cruz ‘73] Cruz, J.B., editor. System Sensitivity Analysis. Dowden, Hutchinson, & Ross. Stroudsburg, PA. 1973.   [Maclean ‘96] A. MacLean, R.M. Young, V. Bellotti, and T.P. Moran. Questions, options and criteria: Elements of design space analysis. In T.P. Moran and J.M. Carroll, editors, Design Rationale: Concepts, Techniques, and Use, pages 53–106. Lawerence Erlbaum Associates, 1996.   [Mylopoulos ‘99] J. Mylopoulos, L. Cheng, and E. Yu. From object-oriented to goal-oriented requirements analysis. Communications of the ACM, 42(1):31–37, January 1999.
56. 56. 56 References (2)   Slide 15:   [Harman ‘04] Mark Harman and John Clark. Metrics are fitness functions too. In 10th International Software Metrics Symposium (METRICS 2004), 2004), pages = 58–69, location = Chicago, IL, USA, publisher = IEEE Computer Society Press, address = Los Alamitos, CA, USA.   Slide 16:   [Amarel ‘86] S. Amarel. Program synthesis as a theory formation task: Problem representations and solution methods. In R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, editors, Machine Learning: An Artificial Intelligence Approach: Volume II, pages 499–569. Kaufmann, Los Altos, CA, 1986.   [Crawford ‘94] J.Crawford and A.Baker. Experimental results on the application of satisfiability algorithms to scheduling problems. In AAAI ’94, 1994.   [Kohavi ‘97] Ron Kohavi and George H. John. Wrappers for feature subset selection. Artificial Intelli- gence, 97(1-2):273–324, 1997.   [Menzies ‘03] T. Menzies and H. Singh. Many maybes mean (mostly) the same thing. In M. Madravio, editor, Soft Computing in Software Engineering. Springer-Verlag, 2003.   [Menzies ‘07] T. Menzies, D. Owen, and K. Richardson. The Strangest Thing About Software. Computer 40, 1 (Jan. 2007), 54-60.   [Williams ’03] R.Williams, C.P.Gomes, and B.Selman. Backdoors to typical case complexity. In Proceedings of IJCAI 2003, 2003.   Slide 18:   [Clark ‘05] R. Clark. Faster treatment learning, Computer Science, Portland State University. Master’s thesis, 2005.   Slide 19:   [Gay ‘10] Gay, Gregory and Menzies, Tim and Jalali, Omid and Mundy, Gregory and Gilkerson, Beau and Feather, Martin and Kiper, James. Finding robust solutions in requirements models. Automated Software Engineering, 17(1): 87-116, 2010.
57. 57. 57 References (3)   Slide 21:   [Gu ‘97] Jun Gu, Paul W. Purdom, John Franco, and Benjamin W. Wah. Algorithms for the satisfiability (sat) problem: A survey. In DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 19–152. American Mathematical Society, 1997.   Slide 22:   [Kirkpatrick ‘83] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing. Sci- ence, Number 4598, 13 May 1983, 220, 4598:671–680, 1983.   Slide 23:   [Kautz ‘96] Henry Kautz and Bart Selman. Pushing the envelope: Planning, propositional logic and stochastic search. In Proceedings of the Thirteenth National Conference on Artificial Intel- ligence and the Eighth Innovative Applications of Artificial Intelligence Conference, pages 1194–1201, Menlo Park, August 4–8 1996. AAAI Press / MIT Press. Available from http://www.cc.gatech.edu/ ~jimmyd/summaries/kautz1996.ps.   [Selman ‘93] Bart Selman, Henry A. Kautz, and Bram Cohen. Local search strategies for satisfiability testing. In Michael Trick and David Stifler Johnson, editors, Proceedings of the Second DIMACS Challange on Cliques, Coloring, and Satisfiability, Providence RI, 1993.   Slide 24:   [Hart ‘68] P.E. Hart, N.J. Nilsson, and B. Raphael. A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics, 4:100–107, 1968.
58. 58. 58 References (4)   Slide 25:   [Coarfa ‘00] Cristian Coarfa, Demetrios D. Demopoulos, Alfonso San, Miguel Aguirre, Devika Subramanian, and Moshe Y. Vardi. Random 3-sat: The plot thickens. In In Principles and Practice of Constraint Programming, pages 143–159, 2000.   [Mittelmann ‘07] H.D. Mittelmann. Recent benchmarks of optimization software. In 22nd Euorpean Conference on Operational Research, 2007.   Slide 48:   [Jiang ‘08] Y. Jiang, B. Cukic, and T. Menzies. Does transformation help? In Defects 2008, 2008.   [Poshyvanyk ‘08] D. Poshyvanyk A. Marcus and R. Ferenc. Using the conceptual cohesion of classes for fault prediction in object oriented systems. IEEE Transactions on Software Engineering, 34:287–300, 2 2008.   Slide 49:   [Cohn ‘94] David Cohn, Les Atlas, and Richard Ladner. Improving generalization with active learning. Mach. Learn., 15(2):201–221, 1994.   [Shepperd ‘97] M. Shepperd and C. Schofield. Estimating software project effort using analogies. IEEE Transactions on Software Engineering, 23(12), November 1997.   Slide 50:   [Menzies ’03b] T. Menzies and Y. Hu. Data mining for very busy people. In IEEE Computer, November 2003. Available from http://menzies.us/pdf/03tar2.pdf.   [Hsu ‘04] Hsu, W.H. Genetic Wrappers for feature selection in decision tree induction and variable ordering in Bayesian network structure learning. Inf. Sci. 163, 1-3 (June 2004), 103-122.