IN THE AGE OF BIG DATA, WHAT ROLE FOR SOFTWARE ENGINEERS?
TIM MENZIES, CS, NCSTATE, JUNE 2015

ABSTRACT:

Consider the premise of Big Data:

better conclusions = same algorithms + more data + more cpu

If this were always true, then there would be no role for human analysts
who reflect on the domain to offer insights that produce better solutions
(since all such insight would be generated automatically by the CPUs).

This talk proposes a marriage of sorts between Big Data and software
engineering. It reviews over a decade of work by the author in exploring
user goals using CPU-intensive methods. It will be shown that analyst insight was
useful for building “better” tools (where “better” means generating
more succinct recommendations, running faster, and scaling to much larger problems).

The conclusion will be that in the age of big data, human analysis is still
useful and necessary. But a new kind of software engineering analyst is required: one
who knows how to take full advantage of the power of Big Data.

ABOUT THE AUTHOR:

Tim Menzies (Ph.D., UNSW) is a Professor in CS at WVU; the author of
over 230 refereed publications; and is one of the 50 most cited
authors in software engineering (out of 50,000+ researchers, see
http://goo.gl/wqpQl). At WVU, he has been a lead researcher on
projects for NSF, NIJ, DoD, NASA, USDA, as well as joint research work
with private companies. He teaches data mining, artificial
intelligence, and programming languages.

Prof. Menzies is the co-founder of the PROMISE conference series
devoted to reproducible experiments in software engineering (see
http://promisedata.googlecode.com). He is an associate editor of IEEE
Transactions on Software Engineering, Empirical Software Engineering
and the Automated Software Engineering Journal. In 2012, he served as
co-chair of the program committee for the IEEE Automated Software
Engineering conference. In 2015, he will serve as co-chair for the
ICSE'15 NIER track. For more information, see his web site
http://menzies.us or his vita at http://goo.gl/8eNhY or his list of
pubs at http://goo.gl/0SWJ2p.


  1. IN THE AGE OF BIG DATA, WHAT ROLE FOR SOFTWARE ENGINEERS? TIM MENZIES, CS, NCSTATE, JUNE 2015
  2. THE DECLARATION OF (HUMAN) DEPENDENCE • We hold these truths to be self-evident… • Better conclusions = more data + more cpu + human analysts finding better questions + automatic systems that better understand the questions
  3. BUT NOT EVERYONE AGREES Edsger Dijkstra, ICSE 4, 1979 • “The notion of ‘user’ cannot be precisely defined, and therefore has no place in CS or SE.” Anonymous machine learning researcher, 1986 • “Kill all living human experts then resurrect the dead ones”
  4. SO WHAT ROLE FOR SE IN THE AGE OF BIG DATA? ANALYSIS IS A “SYSTEMS” TASK? The premise of Big Data: • better conclusions = same algorithms + more data + more cpu If so, then… • No role for human analysts • All insight is auto-generated from CPUs. ANALYSIS IS A “HUMAN” TASK? Current results on “software analytics”: • A human-intensive process
  5. Q: IS BIG DATA A “SYSTEMS” OR “HUMAN” TASK? A: YES
  6. THIS TALK: IN THE AGE OF BIG DATA, SE ANALYSTS ARE “GOAL ENGINEERS” Search-based software engineering • CPU-intensive analysis • Taming the CPU crisis by understanding user goals Algorithms need goal-oriented requirements engineering • Goals are a primary design construct • To optimize, find the “landscape of the goals” Goal-oriented RE needs algorithms • Better tools for better explorations of user goals
  7. ROAD MAP 1. Define: • “CPU crisis” • “search-based software engineering” • “goal-oriented requirements engineering” 2. Why more tools? (aren’t there enough already?) 3. The power of goal-oriented tools (IBEA) • Feature maps, product-line engineering 4. Next-gen goal-oriented tools (GALE) • Safety-critical analysis of cockpit software 5. Conclusions 6. Future work
  8. ACKNOWLEDGEMENTS • SBSE + feature maps: Dr. Abdel Salam Sayyad, Ph.D., WVU, 2014 • GALE + air traffic control: Dr. Joseph Krall, Ph.D., WVU, 2014
  9. WHAT IS… • GOAL-ORIENTED REQUIREMENTS ENGINEERING? • THE CPU CRISIS? • SEARCH-BASED SOFTWARE ENGINEERING?
  10. GOAL-ORIENTED RE Axel van Lamsweerde: Goal-Oriented Requirements Engineering: A Guided Tour [vanLam RE’01] • Goals capture objectives for the system. • Goal-oriented RE: using goals for eliciting, specifying, documenting, structuring, elaborating, analyzing, negotiating, and modifying requirements. [Figure: grid contrasting “mostly manual” with “mostly automatic” and “notation-based, e.g. UML” with “search-based SE”; cites [Kang’90]]
  11. OLDE AND NEW STYLE SE MANUAL SOFTWARE ENGINEERING • e.g. full-stack DevOps development • Engineers laboriously convert (by hand) non-executable paper models into executable code. • Focus of much prior and current work MODEL-BASED SE • Engineers codify the current understanding of the domain into a model, • then study those models • My bet: focus of much future work
  12. MODELS: EVERYWHERE Karplus and Levitt • 2013 Nobel Prize in Chemistry • Development of multi-scale models for complex chemical systems • Explored complex chemical reactions (e.g. split-second changes of photosynthesis). Models are now a central tool in scientific research • in physics, biology and other fields of science • complex simulations using supercomputers. E.g. the genomic map required analyzing 80 trillion bytes. E.g. other computational modeling projects: • the rise and fall of native cultures • subnuclear particles • the Big Bang.
  13. MODELS: EVERYWHERE If you call an ambulance in London or New York, • those ambulances are controlled by emergency response models. If you cross the border from Arizona to Mexico, • a model determines if you are taken away for extra security measures. If you default on your car loans, • a model determines when (or if) someone comes to repossess your car. If the stock market crashes, • it might be that some model caused the crash.
  14. “BIG MODELS”: MORE AND MORE PEOPLE WRITING AND RUNNING MORE AND MORE MODELS [Chart: Berkeley, Stanford, Washington; vertical axis 500 to 2500; years 2004, 2009, 2013; http://goo.gl/MJuxSt] “Great coders are today’s rock stars.” (Will.i.am, http://goo.gl/ljFtX)
  15. THE CPU CRISIS You do the math. What happens to a resource when • an exponentially increasing number of people • make exponentially increasing demands upon it?
  16. TO SOLVE THE CPU CRISIS: DON’T BUILD MORE CPUS CPU power requirements (and the pollution associated with generating that power) are now a significant issue. • Data centers consume 1.5% of global electrical output. • This value is predicted to grow dramatically in the very near future. • Google reports that a 1% reduction in CPU requirements saves them millions of dollars in power costs. • Welcome to the age of green software engineering. Moore’s Law is over • Power consumption and heat dissipation issues block further exponential increases to CPU clock frequencies. • CPU memory access time to extended memory can vary widely. • E.g. for systems on a chip, access time across the bus to the memory of a neighboring chip can be orders of magnitude slower than accessing memory on the local chip.
  17. “BIG MODELS” AND THE CPU CRISIS: EXAMPLE #1 Cognitive models of the agents (both pilots and computers) • Late descent, • Unpredicted rerouting, • Different tailwind conditions Goal: validate operations procedures (are they safe?) NASA’s analysts want to explore 7000 scenarios. • With current tools (NSGA-II) • 300 weeks to complete Limited access to hardware • Queue of researchers wanting hardware access • Hardware is pulled away if there are in-flight incidents on manned space missions [Image: Asiana Airlines Flight 214]
  18. “BIG MODELS” AND THE CPU CRISIS: EXAMPLE #2 • Very rapid agile software development • Continually retesting all code • 4 billion unit tests, Jan to Oct 2013 • Welcome to the resource economy. [Stokely et al. 2009]
  19. SEARCH-BASED SE (SBSE) Many SE activities are like optimization problems [Harman, Jones’01]. Due to computational complexity, exact optimization methods can be impractical for large SBSE problems. So researchers and practitioners use metaheuristic search to find near-optimal or good-enough solutions. • E.g. simulated annealing [Rosenbluth et al.’53] • E.g. genetic algorithms [Goldberg’79] • E.g. tabu search [Glover’86]
  20. PARETO OPTIMALITY AND EVOLUTIONARY COMPUTING Repeat till happy or exhausted: • Selection (cull the herd) • Cross-over (the rude bit) • Mutation (stochastic jiggle) Pareto frontier: better on some criteria, worse on none. Selection: generation[i+1] comes from the Pareto frontier of generation[i]. [Figure: nine numbered points plotted in goal space]
  21. APPLICATIONS OF SBSE 1. Requirements: Menzies, Feather, Bagnall, Mansouri, Zhang; 2. Transformation: Cooper, Ryan, Schielke, Subramanian, Fatiregun, Williams; 3. Effort prediction: Aguilar-Ruiz, Burgess, Dolado, Lefley, Shepperd; 4. Management: Alba, Antoniol, Chicano, Di Penta, Greer, Ruhe; 5. Heap allocation: Cohen, Kooi, Srisa-an; 6. Regression test: Li, Yoo, Elbaum, Rothermel, Walcott, Soffa, Kapfhammer; 7. SOA: Canfora, Di Penta, Esposito, Villani; 8. Refactoring: Antoniol, Briand, Cinneide, O’Keeffe, Merlo, Seng, Tratt; 9. Test generation: Alba, Binkley, Bottaci, Briand, Chicano, Clark, Cohen, Gutjahr, Harrold, Holcombe, Jones, Korel, Pargas, Reformat, Roper, McMinn, Michael, Sthamer, Tracey, Tonella, Xanthakis, Xiao, Wegener, Wilkins; 10. Maintenance: Antoniol, Lutz, Di Penta, Mahdavi, Mancoridis, Mitchell, Swift; 11. Model checking: Alba, Chicano, Godefroid; 12. Probing: Cohen, Elbaum; 13. UIOs: Derderian, Guo, Hierons; 14. Comprehension: Gold, Li, Mahdavi; 15. Protocols: Alba, Clark, Jacob, Troya; 16. Component selection: Baker, Skaliotis, Steinhofel, Yoo; 17. Agent-oriented: Haas, Peysakhov, Sinclair, Shami, Mancoridis
  22. EXPLOSIVE GROWTH IN SBSE Q: Why? A: Thanks to Big Data, more access to more cpu.
  23. WHY BUILD MORE TOOLS FOR SBSE AND GOAL-ORIENTED RE? (AREN’T THERE ENOUGH ALREADY?)
  24. DO WE NEED MORE SBSE TOOLS FOR GOAL-BASED RE? Existing tools: SPEA2, NSGA-II, DE, scatter search, PSO, SA, MOCell, Z3, IBEA, SMT solvers, GALE, NSGA-III, MOEA/D
  25. CASE STUDY: FEATURE MAPS → PRODUCTS Design product line [Kang’90] Add in known constraints • E.g. “if we use a camera then we need a high resolution screen”. Extract products • Find subsets of the product lines that satisfy constraints. • If no constraints, linear time • Otherwise, can defeat state-of-the-art optimizers [Pohl et al., ASE’11] [Sayyad, Menzies ICSE’13]. [Figure: feature map with cross-tree constraints]
  26. SIZE OF FEATURE MAPS • This model: 10 features, 8 rules • ESHOP [www.splot-research.org]: 290 features, 421 rules • LINUX x86 kernel (LINUX kernel variability project): 6,888 features; 344,000 rules [Figure: feature map with cross-tree constraints]
  27. 4 STUDIES: 2 OR 3 OR 4 OR 5 GOALS Software engineering = navigating the user goals: 1. Satisfy the most domain constraints (0 ≤ #violations ≤ 100%) 2. Offer the most features 3. Build “stuff” in the least time 4. That we have used most before 5. Using features with the least known defects Binary goals = 1,2 Tri-goals = 1,2,3 Quad-goals = 1,2,3,4 Five-goals = 1,2,3,4,5 Abdel Salam Sayyad, Tim Menzies, Hany Ammar: On the value of user preferences in search-based software engineering: a case study in software product lines. ICSE 2013: 492-501
  28. EXAMPLE PERFORMANCE CRITERIA HV = HYPERVOLUME OF DOMINATED REGION SPREAD = COVERAGE OF FRONTIER % CORRECT = % CONSTRAINTS SATISFIED [Figure: example in bi-goal space] Note: the example on the next slide reports HV and spread for bi-, tri-, quad- and five-objective space. [Sayyad, Menzies ICSE’13]
  29. HV = HYPERVOLUME OF DOMINATED REGION SPREAD = COVERAGE OF FRONTIER % CORRECT = % CONSTRAINTS SATISFIED [Figure: results for ESHOP (290 features, 421 rules). Labels: “Very similar”; “Very different, particularly in % correct”; “Continuous dominance”; “Binary dominance”] [Sayyad, Menzies ICSE’13]
  30. Q: WHAT IS SO DIFFERENT ABOUT IBEA? A: CONTINUOUS DOMINANCE CONTINUOUS IBEA [Zitzler, Künzli, 2004] I(x1,x2): • How much do we have to adjust goal scores such that x1 dominates x2? Repeat till just a few left: • Sort all instances by F • Delete worst Then, standard GA (cross-over, mutation) on the survivors DISCRETE Two sets of decisions: one dominates the other if worse on none and better on at least one. Note: returns true or false, not the size of the domination. [Figure: goal space with axes “cost of car” and “time to 100 mph”, “heaven” marked; K = 0.05] [Wagner et al. 2007]
  31. WHAT ARE THE ADDED BENEFITS OF GOAL-ORIENTED REASONING? CASE STUDY: FEATURE MAPS FOR PRODUCT-LINE ENGINEERING
  32. STATE OF THE ART [Chart: prior work plotted by number of features (9, 290, 544, 6888; SPLOT and Linux (LVAT), 300,000+ clauses) and by objectives (single-goal, multi-goal). Studies shown: Pohl ’11; Lopez-Herrejon ’11; Henard ’12; Sayyad, Menzies ’13a; Velazco ’13; Sayyad, Menzies ’13b; Johansen ’11; Benavides ’05; White ’07, ’08, ’09a, ’09b; Shi ’10; Guo ’11]
  33. THE SEEDING HEURISTIC Given M < N goals that are hardest to solve • Before running an N-optimization problem: • Seed an initial population via M-optimization Study 1 (with Z3): • Optimize for min constraint violations using Z3 Study 2 (with IBEA): • Optimize for (a) max features and (b) min violations
  34. CORRECT SOLUTIONS AFTER 30 MINUTES FOR THE LARGE LINUX KERNEL MODEL [Figure: seeds from IBEA and seeds from Z3; labels “130 of 6888 features” and “5704 of 6888 features”] Abdel Salam Sayyad, Joseph Ingram, Tim Menzies, Hany Ammar: Scalable Product Line Configuration: A Straw to Break the Camel’s Back. IEEE ASE 2013
  35. HOW TO MAKE GOAL-BASED REASONING FASTER? (GALE = GEOMETRIC ACTIVE LEARNING) CASE STUDY: SAFETY-CRITICAL ANALYSIS OF AVIATION PROCEDURES
  36. WMC: GIT’S WORK MODELS THAT COMPUTE [KIM’11] Cognitive models of the agents (both pilots and computers) • Late descent, • Unpredicted rerouting, • Different tailwind conditions Goal: validate operations procedures (are they safe?) NASA’s analysts want to explore 7000 scenarios. • With current tools (NSGA-II) • 300 weeks to complete Limited access to hardware • Queue of researchers wanting hardware access • Hardware is pulled away if there are in-flight incidents on manned space missions [Image: Asiana Airlines Flight 214]
  37. ACTIVE LEARNING AND EVOLUTIONARY COMPUTING Repeat till happy or exhausted: • Selection (cull the herd) • Cross-over (the rude bit) • Mutation (stochastic jiggle) Naïve selection • Score every candidate Active learning • Score only the most informative candidates • e.g. just score the most distant points in data clusters
  38. RECURSIVELY CLUSTER DATA, FIND MOST DISTANT POINTS IN LEAF CLUSTERS • Find splits using FASTMAP, O(n) [Faloutsos & Lin ’95] • At each level, only check for dominance of the two most extreme points • 2 log2(N) evals, or less • Leaves = non-dominated examples (i.e. the Pareto frontier) [Figure example: 398 cars; maximize acceleration, maximize mpg; 14 evaluations of goals]
  39. FOR FRONTIER AS CONVEX HULL, FOR EACH LINE SEGMENT, PUSH TOWARDS BEST END Given goals u, v, … • utopia = best values • hell = furthest from utopia • All distances normalized 0..1 Given a line east to west • s1 = I(east, hell) • s2 = I(west, hell), s2 > s1 • C = dist(west, east) p = push on line east, west • direction = towards better (west) • magnitude[i] = • D = west[i] - east[i] • new = old + old * C * D • Reject if over C * 1.5 [Figure: goal space (u, v) showing utopia, hell, east, west, s1, s2 and the push p]
  40. REPEAT FOR ALL POINTS ON LINE SEGMENTS ON NON-DOMINATED REGION OF CONVEX HULL GALE: 1. Population[0] = N random points 2. Find M points on local Pareto frontier (approximated as convex hull) 3. Mutants = mutate M over line segments on hull 4. Population[i+1] = Mutants + (N - #Mutants) random points 5. Goto 2 Related work: [Zuluaga et al. ICML’13]
  41. RESULTS ON NASA MODELS: SCORES AS GOOD AS OTHER METHODS, ORDERS OF MAGNITUDE FEWER EVALUATIONS Goals: 1. #forgotten tasks 2. #interrupted acts 3. Interruption time 4. #delayed acts 5. Delay time Runtime: 4 mins (GALE) vs 7 hours (rest) “Better Model-Based Analysis of Human Factors for Safe Aircraft Approach”, Krall, Joe; Menzies, Tim; Davies, Misty. IEEE Transactions on Human-Machine Systems, to appear 2015
  42. [Figure: runtimes and number of evaluations] GALE: Geometric Active Learning for Search-Based Software Engineering, IEEE TSE, 2015, to appear. Joseph Krall, Tim Menzies, and Misty Davies
  43. MINIMIZATIONS OF OBJECTIVE SCORES [Table legend: gray = significantly different (Mann-Whitney, 95%) and least] [Krall, Menzies, Davies, IEEE TSE 2015]
  44. GALE’S SEARCH: A MORE THOROUGH SEARCH OF A SMALLER VOLUME Less hypervolume, better spread [Krall, Menzies, Davies, IEEE TSE 2015]
  45. CONCLUSION
  46. THE CPU CRISIS You do the math. What happens to a resource when • an exponentially increasing number of people • make exponentially increasing demands upon it?
  47. TO MANAGE THE CPU CRISIS: NEED A BETTER UNDERSTANDING OF THE “SHAPE” OF THE USER GOALS [Figure: tools grouped by approach. Labels: “Domination is a binary concept”; “Aggressive exploration of preference space”. Tools: SPEA2, NSGA-II, DE, scatter search, PSO, SA, MOCell, Z3, IBEA, SMT solvers, GALE, TAR, WHICH, NSGA-III, MOEA/D]
  48. Q: IN THE AGE OF BIG DATA, WHAT ROLE FOR SOFTWARE ENGINEERS? A: GOAL ENGINEERING Search-based software engineering • CPU-intensive analysis • Taming the CPU crisis by understanding user goals Algorithms need goal-oriented requirements engineering • Goals are a primary design construct • To optimize, find the “landscape of the goals” Goal-oriented RE needs algorithms • Better tools for better explorations of user goals
  49. GALE: A TOOLKIT FOR UNDERSTANDING THE SHAPE OF GOAL SPACE • An optimization algorithm • A data miner • A visualization tool • A requirements negotiation tool • A compression algorithm • summarize interesting regions of complex space • An anomaly detector • The story thus far • Data exchange tool for agents • Share least data with most value • A comment on the paradoxical success of beings as confused as humans • seemingly complex problems, aren’t
  50. Analysis = humans + systems • better conclusions = more data + more cpu + human analysts finding better questions + automatic systems that better understand the questions
  51. COMBINING ALGORITHMS AND GOAL-ORIENTED RE Edsger Dijkstra, ICSE 4, 1979 • “The notion of ‘user’ cannot be precisely defined, and therefore has no place in CS or SE.” TIM MENZIES, 2015 • Mathematical definition of “user” • “The force that changes the geometry of search space.”
  52. [Blank slide]
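
A quick back-of-the-envelope check of the figures on slide 17 (7000 scenarios, 300 weeks with NSGA-II). The slide does not say how the 300 weeks were measured, so this sketch assumes one machine running around the clock; `hours_per_scenario` is a hypothetical helper, not part of any tool named in the talk.

```python
# Figures quoted on slide 17.
SCENARIOS = 7000
WEEKS_TOTAL = 300

# Assumption (not stated on the slide): the hardware runs 24 hours a day, 7 days a week.
HOURS_PER_WEEK = 7 * 24


def hours_per_scenario(scenarios: int, weeks: int) -> float:
    """Average wall-clock hours spent on each scenario."""
    return weeks * HOURS_PER_WEEK / scenarios


print(round(hours_per_scenario(SCENARIOS, WEEKS_TOTAL), 1))  # 7.2
```

Under that assumption each scenario costs roughly seven hours, which is the same order as the "7 hours (rest)" runtime quoted on slide 41, though the slides do not confirm the two numbers describe the same measurement.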
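
Slide 20 defines the Pareto frontier as the candidates that are "better on some criteria, worse on none". A minimal sketch of that test, assuming every goal is minimized; the function names and the nine sample points are illustrative, not taken from the slides.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every goal and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_frontier(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Nine hypothetical candidates scored on two goals (lower is better on both).
candidates = [(1, 9), (2, 7), (3, 8), (4, 4), (6, 5), (7, 2), (8, 3), (9, 1), (9, 9)]
print(pareto_frontier(candidates))  # [(1, 9), (2, 7), (4, 4), (7, 2), (9, 1)]
```

Note that `dominates` only answers yes or no, which is the "binary dominance" that slide 30 contrasts with IBEA.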
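
Slide 30 describes IBEA's continuous dominance: I(x1,x2) measures how far goal scores must shift for x1 to dominate x2, and the population is culled by repeatedly deleting the instance with the worst F. The sketch below is one plausible reading, not the authors' code. It assumes the additive epsilon indicator, goals minimized and normalized to 0..1, and a scaling factor of 0.05 (the slide's "K = 0.05"); it omits any rescaling of indicator values and the crossover and mutation steps.

```python
import math

KAPPA = 0.05  # scaling factor; slide 30 shows "K = 0.05"


def eps_indicator(a, b):
    """Additive epsilon indicator: smallest shift that lets a weakly dominate b."""
    return max(x - y for x, y in zip(a, b))


def fitness(x1, population):
    """F(x1): summed loss that x1 suffers from every other member (higher is better)."""
    return sum(-math.exp(-eps_indicator(x2, x1) / KAPPA)
               for x2 in population if x2 is not x1)


def cull(population, keep):
    """Repeatedly delete the worst-scoring member until `keep` remain."""
    survivors = list(population)
    while len(survivors) > keep:
        worst = min(survivors, key=lambda x: fitness(x, survivors))
        survivors.remove(worst)
    return survivors


# Hypothetical goal scores: (0.6, 0.6) is beaten on both goals by (0.5, 0.5).
scores = [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1), (0.6, 0.6)]
print(cull(scores, 3))  # [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1)]
```

Because the indicator returns a magnitude rather than true or false, a point that is only slightly dominated is penalized less than one that is badly dominated, which is the difference the slide draws against the discrete test.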
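
Slide 38 outlines how GALE avoids scoring every candidate: split the data with FASTMAP, evaluate only the two most distant points, and recurse. The following is a simplified sketch of that idea rather than the published implementation. It assumes Euclidean distance over decisions, a median split, minimized goals, and a stand-in model (`goals`) that simply returns the decision vector; all names are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two decision vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dominates(a, b):
    """Binary dominance over goal scores, all goals minimized."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def leaves(points, evaluate, rng, min_size=4):
    """Recursively halve the data, scoring only the two poles of each split."""
    if len(points) <= min_size:
        return [points]
    # FASTMAP heuristic: two linear passes find a pair of distant poles.
    anyone = rng.choice(points)
    east = max(points, key=lambda p: dist(p, anyone))
    west = max(points, key=lambda p: dist(p, east))
    c = dist(east, west)
    if c == 0:  # all points coincide, so there is nothing to split
        return [points]

    def position(p):
        # Cosine rule: where p falls along the east-west line, measured from east.
        a, b = dist(p, east), dist(p, west)
        return (a * a + c * c - b * b) / (2 * c)

    ordered = sorted(points, key=position)
    mid = len(ordered) // 2
    near_east, near_west = ordered[:mid], ordered[mid:]
    e, w = evaluate(east), evaluate(west)  # the only goal evaluations at this level
    found = []
    if not dominates(w, e):  # east pole is not beaten, so keep exploring its half
        found += leaves(near_east, evaluate, rng, min_size)
    if not dominates(e, w):  # likewise for the west pole
        found += leaves(near_west, evaluate, rng, min_size)
    return found


calls = []  # records every model evaluation


def goals(p):
    """Toy model (hypothetical): the goals are the two coordinates themselves."""
    calls.append(p)
    return p


rng = random.Random(1)
data = [(rng.random(), rng.random()) for _ in range(128)]
clusters = leaves(data, goals, rng)
print(len(clusters), "leaf clusters after", len(calls), "evaluations of", len(data), "points")
```

When one pole dominates the other, half the data is discarded after two evaluations, which is where the slide's "2 log2(N) evals, or less" comes from. When neither pole dominates, this sketch explores both halves and so uses more evaluations than that bound.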
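
Slide 39 gives the mutation rule as `new = old + old * C * D` with `D = west[i] - east[i]`, followed by "Reject if over C*1.5". The sketch below applies that formula literally, one decision at a time. How the rejection test is measured is not stated on the slide, so this sketch assumes it means the mutant must stay within 1.5 * C of the better pole (west); that reading, the `push` name, and the sample vectors are assumptions.

```python
import math


def dist(a, b):
    """Euclidean distance, with values assumed normalized to 0..1."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def push(old, east, west, limit=1.5):
    """Nudge `old` towards the better pole (west); return None if the move is rejected."""
    c = dist(east, west)
    # Slide 39: D = west[i] - east[i]; new = old + old * C * D, applied per decision.
    new = tuple(x + x * c * (w - e) for x, e, w in zip(old, east, west))
    # Assumed reading of "Reject if over C*1.5": mutant strays too far from the better pole.
    if dist(new, west) > limit * c:
        return None
    return new


east, west = (0.8, 0.8), (0.2, 0.2)  # hypothetical poles, west being the better one
old = (0.5, 0.5)
new = push(old, east, west)
print(new is not None and dist(new, west) < dist(old, west))  # True
```

With these sample values the mutant moves from the midpoint towards west; a candidate that starts far from a very short east-west segment is rejected instead.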