1.
Finding Robust Solutions to Requirements Models
Gregory Gay, West Virginia University, greg@greggay.com
2.
Consider a requirements model…
It contains the various goals of a project, all methods for reaching those goals, the risks that could compromise those goals, and the mitigations that remove those risks. A solution is a balance between cost and attainment. This is a non-linear optimization problem!
3.
Understanding the Solution Space
An open and pressing issue. Many SE problems are over-constrained: there is no right answer, so we give partial solutions. Robustness of solutions is key, since many algorithms give brittle results. It is important to present insight into the neighborhood: what happens if I do B instead of A?
4.
The Naïve Approach
Naïve approaches to understanding the neighborhood: run N times and report (a) the solutions appearing in more than N/2 cases, or (b) results with a 95% confidence interval. Both are flawed because they require multiple trials! Neighborhood assessment must be fast, real-time if possible.
5.
Research Goals
Two important concerns: Is demonstrating solution robustness a time-consuming task? Must solution quality be traded against solution robustness?
6.
The Defect Detection and Prevention Model
Used at NASA JPL. An early-lifecycle requirements model. A light-weight ontology represents:
Requirements: project objectives, weighted by importance.
Risks: events that damage attainment of requirements.
Mitigations: precautions that remove risk; each carries a cost value.
Mappings: directed, weighted edges between requirements and risks and between risks and mitigations.
Part-of relations: provide structure between model components.
8.
Why Use DDP?
Three reasons: (1) It is demonstrably useful: cost savings often over $100,000, numerous design improvements seen in DDP sessions, and an overall shift in risks in JPL projects. (2) Real-world models are available now and will be in the future. (3) DDP is representative of other requirements tools: a set of influences, expressed in a hierarchy, with relationships modeled through equations.
9.
Using DDP
Input: the set of enabled mitigations. Output: two values, (Cost, Attainment). Those values are normalized and combined into a single score.
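The scoring equation from this slide did not survive in the transcript. As a minimal sketch only, one plausible way to normalize and combine the two outputs is to measure the distance from the ideal point (zero cost, full attainment). The function name `score` and the parameters `max_cost` and `max_attainment` are hypothetical, and the formula is an assumption, not necessarily the one used with DDP.

```python
import math


def score(cost, attainment, max_cost, max_attainment):
    """Combine cost and attainment into one number in [0, 1]; higher is better.

    Assumption: the score is based on the distance to the ideal point
    (cost 0, full attainment). This is an illustrative choice, not the
    equation from the original slide.
    """
    c = cost / max_cost              # normalized cost, 0 is best
    a = attainment / max_attainment  # normalized attainment, 1 is best
    distance = math.sqrt(c ** 2 + (1.0 - a) ** 2)
    # sqrt(2) is the largest possible distance (full cost, no attainment).
    return 1.0 - distance / math.sqrt(2)
```

With this choice, a free solution that attains everything scores 1, and a maximally expensive solution that attains nothing scores 0.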
10.
Theory of KEYS
Theory: a minority of variables control the majority of the search space. If so, then a search that (a) finds those keys and (b) explores their ranges will rapidly plateau to stable, optimal solutions. This is not new: narrows, master-variables, back doors, and feature subset selection all work on the same theory.
11.
KEYS Algorithm
Two components: a greedy search and a Bayesian ranking method. Each round, the greedy search:
Generates 100 configurations of mitigations 1…M.
Scores them.
Sorts the top 10% of scores into a “Best” grouping and the bottom 90% into “Rest.”
Ranks individual mitigations using BORE.
Fixes the top-ranking mitigation for all subsequent rounds.
The search stops when every mitigation has a value, and returns the final cost and attainment values.
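The rounds described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: mitigations are modeled as booleans, `toy_score` and `GAINS` are a hypothetical stand-in for a DDP model, and the ranking inside `rank` is one assumed BORE-style formula.

```python
import random

GAINS = [5, 3, -4, -2]  # hypothetical toy model: net benefit of each mitigation


def toy_score(config):
    """Stand-in for a DDP model: sum the gains of the enabled mitigations."""
    return sum(g for g, on in zip(GAINS, config) if on)


def keys(num_mitigations, score_fn, samples=100, best_fraction=0.1, rng=None):
    """Greedy KEYS-style search that fixes one mitigation setting per round."""
    rng = rng or random.Random()
    fixed = {}  # mitigation index -> True/False, decided in earlier rounds
    while len(fixed) < num_mitigations:
        # Generate random configurations that respect the settings fixed so far.
        scored = []
        for _ in range(samples):
            config = [fixed.get(i, rng.random() < 0.5)
                      for i in range(num_mitigations)]
            scored.append((score_fn(config), config))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        cut = max(1, int(samples * best_fraction))
        best, rest = scored[:cut], scored[cut:]

        def rank(i, value):
            # Assumed BORE-style rank: like_best^2 / (like_best + like_rest).
            b = sum(1 for _, c in best if c[i] == value) / len(best)
            r = sum(1 for _, c in rest if c[i] == value) / max(1, len(rest))
            like_best = b * best_fraction
            like_rest = r * (1 - best_fraction)
            total = like_best + like_rest
            return like_best ** 2 / total if total else 0.0

        # Fix the single top-ranked (mitigation, value) pair among open ones.
        candidates = [(rank(i, v), i, v)
                      for i in range(num_mitigations) if i not in fixed
                      for v in (True, False)]
        _, index, value = max(candidates)
        fixed[index] = value
    return [fixed[i] for i in range(num_mitigations)]
```

Calling `keys(len(GAINS), toy_score)` returns one boolean per mitigation. On this toy model the helpful mitigations should end up enabled and the harmful ones disabled.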
12.
BORE Ranking Heuristic
We don’t have to actually search for the keys; we just keep frequency counts for “best” and “rest” scores. BORE is based on Bayes’ theorem. Those frequency counts are used to calculate a rank for each setting, and a support term is added so that low-frequency evidence does not dominate the ranking.
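The equations from this slide are missing from the transcript. The sketch below shows one formulation consistent with the description (Bayes' theorem over "best" and "rest" frequency counts, multiplied by a support term). Treat it as an assumption: the function name `bore_rank` and its parameters are hypothetical.

```python
def bore_rank(count_in_best, count_in_rest, n_best, n_rest):
    """Rank one mitigation setting from its frequency counts.

    Assumed formulation:
        like(best|E) = freq(E|best) * P(best)
        like(rest|E) = freq(E|rest) * P(rest)
        rank = P(best|E) * support, with support = like(best|E)
    """
    total = n_best + n_rest
    like_best = (count_in_best / n_best) * (n_best / total)
    like_rest = (count_in_rest / n_rest) * (n_rest / total)
    if like_best + like_rest == 0:
        return 0.0  # setting never observed
    probability = like_best / (like_best + like_rest)  # Bayes' theorem
    support = like_best  # small when the setting is rare in "best"
    return probability * support
```

The support term matters for rare evidence. A setting seen in 1 of 10 "best" samples and never in "rest" has probability 1 but little support, so under this formula it ranks below a setting seen in all 10 "best" samples and 9 of 90 "rest" samples.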
13.
KEYS vs KEYS2
KEYS fixes a single top-ranked mitigation each round. KEYS2 incrementally sets more (1 in round 1, 2 in round 2, … M in round M). Slightly less tame, much faster.
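To see why the incremental schedule is faster, here is a small sketch that counts the rounds needed. It reads the schedule as "round r fixes r settings, until none remain"; that reading, and the function name `keys2_schedule`, are assumptions.

```python
def keys2_schedule(num_mitigations):
    """Return how many mitigations get fixed in each KEYS2 round."""
    schedule = []
    remaining = num_mitigations
    round_number = 1
    while remaining > 0:
        step = min(round_number, remaining)  # the last round may be partial
        schedule.append(step)
        remaining -= step
        round_number += 1
    return schedule
```

Under this reading, a model with 31 mitigations needs 8 rounds, where fixing one mitigation per round would need 31.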
14.
Benchmarked Algorithms
KEYS must be benchmarked against standard SBSE techniques: Simulated Annealing, MaxWalkSat, and A* Search. The chosen techniques are discrete, sequential, unconstrained algorithms. Constrained searches work towards a pre-determined number of solutions; unconstrained searches adjust to their goal space.
15.
Simulated Annealing
A classic, yet still common, approach. Choose a random starting position. Look at a “neighboring” configuration. If it is better, go to it. If not, move based on guidance from a probability function (biased by the current temperature). Over time, the temperature lowers, and wild jumps stabilize to small wiggles.
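A minimal sketch of this loop over boolean mitigation settings follows. The linear cooling schedule, the exponential acceptance rule, and the names `simulated_annealing`, `toy_score`, and `GAINS` are assumptions for illustration; the slides do not state which variants were benchmarked.

```python
import math
import random

GAINS = [5, 3, -4, -2]  # hypothetical toy model: net benefit of each mitigation


def toy_score(config):
    """Stand-in for a DDP model: sum the gains of the enabled mitigations."""
    return sum(g for g, on in zip(GAINS, config) if on)


def simulated_annealing(num_mitigations, score_fn, steps=1000,
                        start_temp=1.0, rng=None):
    """Maximize score_fn over boolean configurations."""
    rng = rng or random.Random()
    current = [rng.random() < 0.5 for _ in range(num_mitigations)]
    current_score = score_fn(current)
    best, best_score = list(current), current_score
    for step in range(steps):
        # Linear cooling (assumed): temperature falls towards zero.
        temperature = start_temp * (1 - step / steps)
        # A neighbor differs from the current configuration in one setting.
        neighbor = list(current)
        flip = rng.randrange(num_mitigations)
        neighbor[flip] = not neighbor[flip]
        neighbor_score = score_fn(neighbor)
        delta = neighbor_score - current_score
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / max(temperature, 1e-9)):
            current, current_score = neighbor, neighbor_score
        if current_score > best_score:
            best, best_score = list(current), current_score
    return best, best_score
```

The function returns the best configuration visited together with its score, for example `simulated_annealing(len(GAINS), toy_score)`.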
16.
MaxWalkSat
A hybridized local/random search. Start with a random configuration. Each step, perform either a local search (70% of the time), moving to a neighboring configuration with a better score, or a random search (30% of the time), changing one random mitigation setting. It keeps working towards a score threshold. It is allotted a certain number of resets, which it uses if it fails to pass the threshold within a certain number of rounds.
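The search described above might look like the following sketch. The 70/30 split comes from the slide. The reset and round limits, the "best single flip" local move, and the names `max_walk_sat`, `flip`, `toy_score`, and `GAINS` are assumptions for illustration.

```python
import random

GAINS = [5, 3, -4, -2]  # hypothetical toy model: net benefit of each mitigation


def toy_score(config):
    """Stand-in for a DDP model: sum the gains of the enabled mitigations."""
    return sum(g for g, on in zip(GAINS, config) if on)


def flip(config, i):
    """Return a copy of config with mitigation i toggled."""
    copy = list(config)
    copy[i] = not copy[i]
    return copy


def max_walk_sat(num_mitigations, score_fn, threshold, max_resets=10,
                 max_rounds=100, p_local=0.7, rng=None):
    """Mix local and random moves until the score threshold is reached."""
    rng = rng or random.Random()
    best, best_score = None, float("-inf")
    for _ in range(max_resets):
        # Each reset starts again from a fresh random configuration.
        config = [rng.random() < 0.5 for _ in range(num_mitigations)]
        for _ in range(max_rounds):
            current_score = score_fn(config)
            if current_score > best_score:
                best, best_score = list(config), current_score
            if current_score >= threshold:
                return best, best_score
            if rng.random() < p_local:
                # Local search: take the best single flip if it improves.
                candidate = max((flip(config, i) for i in range(num_mitigations)),
                                key=score_fn)
                if score_fn(candidate) > current_score:
                    config = candidate
            else:
                # Random search: change one random mitigation setting.
                config = flip(config, rng.randrange(num_mitigations))
    return best, best_score
```

If the threshold is never reached, the function returns the best configuration scored across all resets.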
17.
A* Search
A best-first path-finding heuristic. It uses the distance from the origin (G) and the estimated cost to the goal (H), and moves to the neighbor that minimizes G+H. After moving to the new location, it adds the previous location to a closed list to prevent backtracking. The search is optimal because H always underestimates the cost to the goal. It stops after being stuck for 10 rounds.
18.
Experiment 1: Costs and Attainments
Using real-world models 2, 4, and 5 (models 1 and 3 are too small and were only used for debugging): Run each algorithm 1000 times per model. Generating this many data points dampens the effect of outliers, yet the number is still small enough to collect results in a short time span. Graph the cost and attainment values. Values towards the bottom-right are better.
20.
Experiment 2: Runtimes
For each model: run each algorithm 100 times, record the total runtime using the Unix “time” command, and divide that runtime by 100 to get the average.
22.
Decision Ordering Diagrams
The design of KEYS2 automatically provides a way to explore the decision neighborhood. Decision ordering diagrams are a visual format that ranks decisions from most to least important.
23.
Decision Ordering Diagrams
These diagrams can be used to assess solution robustness in linear time by (1) considering the variance in performance after applying X decisions, and (2) comparing the results of using the first X decisions to those of using X-1 or X+1. They are useful under three conditions: (a) the output scores are well-behaved, (b) variance is tamed, and (c) they are generated in a timely manner.
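As a rough sketch of the data behind such a diagram: apply the first X decisions, leave the remaining mitigations random, and record the median and spread of the resulting scores for each X. The sampling scheme and the names `decision_ordering`, `toy_score`, and `GAINS` are assumptions for illustration.

```python
import random
import statistics

GAINS = [5, 3, -4, -2]  # hypothetical toy model: net benefit of each mitigation


def toy_score(config):
    """Stand-in for a DDP model: sum the gains of the enabled mitigations."""
    return sum(g for g, on in zip(GAINS, config) if on)


def decision_ordering(decisions, num_mitigations, score_fn, samples=50, rng=None):
    """For each prefix of the ranked decisions, summarize the score spread.

    decisions: list of (mitigation index, value), most important first.
    Returns rows of (X, median score, population standard deviation).
    """
    rng = rng or random.Random()
    rows = []
    for x in range(len(decisions) + 1):
        fixed = dict(decisions[:x])  # apply only the first X decisions
        scores = []
        for _ in range(samples):
            # Undecided mitigations are set at random.
            config = [fixed.get(i, rng.random() < 0.5)
                      for i in range(num_mitigations)]
            scores.append(score_fn(config))
        rows.append((x, statistics.median(scores), statistics.pstdev(scores)))
    return rows
```

Reading the rows top to bottom shows how much each additional decision improves the median and tightens the spread, which is the X versus X-1 or X+1 comparison the slide describes. The work grows linearly with the number of decisions.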
24.
Conclusions
Optimization tools can study the space of requirements, risks, and mitigations. Finding a balance between costs and attainment is hard! Such solutions can be brittle, so we must comment on solution robustness. Candidate solution: KEYS2.
25.
Conclusions (2)
Pre-experimental concerns: An algorithm would need to trade solution quality for robustness (variance vs. score). Demonstrating solution robustness is time-consuming and requires multiple procedure calls. KEYS2 defies both concerns. It generates higher-quality solutions than standard methods, and its results are tame and well-behaved (thus, we can generate decision ordering graphs to assess robustness). It is faster than other techniques, and can generate decision ordering graphs in O(N^2).
26.
Conclusions (3)
Therefore, we recommend KEYS2 for the optimization of requirements models (and other SBSE problems) because it is fast, its results are well-behaved and tame, and it allows for exploration of the search space.
27.
Questions?
I would like to thank Dr. Zhang for the invitation and all of you for attending! Want to contact me later? Email: greg@greggay.com MSN Messenger: greg@4colorrebellion.com Gtalk: momoku@gmail.com More about me: http://www.greggay.com