IN THE AGE OF BIG
DATA, WHAT ROLE
FOR SOFTWARE
ENGINEERS?
TIM MENZIES
CS, NCSTATE,
JUNE 2015
2
• We hold these truths
to be self-evident…
• Better conclusions =
+ more data
+ more cpu
+ human analysts finding
better questions
+ automatic systems that better
understand the questions
THE DECLARATION OF
(HUMAN) DEPENDENCE
BUT NOT EVERYONE AGREES
Edsger Dijkstra, ICSE 4, 1979
• “The notion of ‘user’ cannot be
precisely defined, and therefore
has no place in CS or SE.”
3
Anonymous machine learning researcher, 1986
• “Kill all living human experts
then resurrect the dead ones”
SO WHAT ROLE FOR SE
IN THE AGE OF BIG DATA?
ANALYSIS IS A
“SYSTEMS” TASK?
The premise of Big
Data:
• better conclusions =
same algorithms +
more data + more cpu
If so, then …
• No role for human
analysts
• All insight is auto-
generated from CPUs.
ANALYSIS IS A
“HUMAN” TASK?
Current results on
“software analytics”
• A human-intensive
process
4
Q: IS BIG DATA A “SYSTEMS”
OR “HUMAN”-TASK?
A: YES
5
THIS TALK: IN THE AGE OF BIG DATA
SE ANALYSTS ARE “GOAL
ENGINEERS”
Search-based software engineering
• CPU-intensive analysis
• Taming the CPU crisis by understanding user goals
Algorithms need goal-oriented requirements engineering
• Goals are a primary design construct
• To optimize, find the “landscape of the goals”
Goal-oriented RE needs algorithms
• Better tools for better explorations of user goals
6
ROAD MAP
1. Define:
• “CPU crisis”
• “search-based software engineering”
• “goal-oriented requirements engineering”
2. Why more tools? (Aren’t there enough already?)
3. The power of goal-oriented tools (IBEA)
• Feature maps, product-line engineering
4. Next-gen goal-oriented tools (GALE)
• Safety-critical analysis of cockpit software
5. Conclusions
6. Future work
7
ACKNOWLEDGEMENTS
8
• SBSE + Feature Maps:
– Dr. Abdel Salam Sayyad, Ph.D., WVU, 2014
• GALE + air traffic control:
– Dr. Joseph Krall, Ph.D., WVU, 2014
WHAT IS…
• GOAL-ORIENTED REQUIREMENTS
ENGINEERING?
• THE CPU CRISIS?
• SEARCH-BASED SOFTWARE
ENGINEERING?
9
GOAL-ORIENTED RE
Axel van Lamsweerde: Goal-Oriented Requirements
Engineering: A Guided Tour [vanLam RE’01]
• Goals capture objectives for the system.
• Goal-oriented RE : using goals for eliciting, specifying, documenting,
structuring, elaborating, analyzing, negotiating, modifying requirements.
10
[Figure: grid contrasting mostly manual vs. mostly automatic SE, and notation-based SE (e.g. UML) vs. search-based SE, with one cell ticked; includes an example from [Kang’90]]
OLDE AND NEW STYLE SE
MANUAL SOFTWARE
ENGINEERING
• E.g. full-stack DevOps
development
• Engineers laboriously convert
(by hand) non-executable
paper models into
executable code.
• Focus of much prior and
current work
MODEL-BASED
SE
• Engineers codify the
current understanding of
the domain into a
model,
• Then study those
models
• My bet: focus of much
future work
11
Karplus and Levitt
• 2013 Nobel prize in chemistry
• development of multi-scale models
for complex chemical systems
• Explored complex chemical
reactions (e.g. split-second
changes of photosynthesis).
12
Models are now a central tool in
scientific research.
• in physics, biology and other fields
of science
• complex simulations using
supercomputers.
E.g. the genomic map required
analyzing 80 trillion bytes
E.g. other computational
modeling projects
• the rise and fall of native cultures,
• subnuclear particles
• the Big Bang.
MODELS: EVERYWHERE
MODELS: EVERYWHERE
If you call an ambulance in London or New
York,
• those ambulances are controlled by emergency
response models.
If you cross the border from Arizona to Mexico,
• A model determines if you are taken away for
extra security measures.
If you default on your car loans,
• A model determines when (or if) someone comes to
repossess your car.
If the stock market crashes,
• it might be that some model caused the crash.
13
“BIG MODELS”: MORE AND MORE PEOPLE
WRITING AND RUNNING MORE AND MORE
MODELS
[Figure: trend lines for Berkeley, Stanford, and Washington over 2004, 2009, and 2013; vertical axis from 500 to 2500. Source: http://goo.gl/MJuxSt]
“Great coders are today’s rock stars.” -- Will.i.am (http://goo.gl/ljFtX)
THE CPU CRISIS
You do the math.
What happens to a resource when
• an exponentially increasing number of people
• make exponentially increasing demands upon it?
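One way to "do the math" is sketched below. It assumes, purely for illustration, that the number of people running models grows at rate a and that the demand per person grows at rate b; the symbols a, b, P_0 and D_0 are not in the original slides.

```latex
\begin{align*}
\text{people}(t) &= P_0\, e^{a t}, \qquad
\text{demand per person}(t) = D_0\, e^{b t},\\
\text{total demand}(t) &= P_0 D_0\, e^{(a+b)\,t}.
\end{align*}
```

Under those assumptions the total demand doubles every ln(2)/(a+b) time units, which is faster than either trend would manage alone.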
15
TO SOLVE THE CPU CRISIS:
DON’T BUILD MORE CPUS
CPU power requirements (and the
pollution associated with generating
that power) are now a significant issue.
• Data centers consume 1.5% of
global electrical output
• This value is predicted to grow
dramatically in the very near
future.
• Google reports that a 1%
reduction in CPU requirements
saves them millions of dollars in
power costs.
• Welcome to the age of green
software engineering
Moore’s Law is over
• Power consumption and heat
dissipation issues block further
exponential increases to CPU
clock frequencies.
• CPU memory access time to
extended memory can vary
widely.
• E.g. for systems on a chip,
access time across the bus to the
memory of a neighboring chip can
be orders of magnitude slower
than accessing memory on the
local chip.
16
“BIG MODELS” AND THE CPU
CRISIS: EXAMPLE #1
Cognitive models of the agents
(both pilots and computers)
• Late descent,
• Unpredicted rerouting,
• Different tailwind conditions
Goal: validate operations
procedures (are they safe?)
NASA’s analysts want to
explore 7000 scenarios.
• With current tools (NSGA-II)
• 300 weeks to complete
Limited access to hardware
• Queue of researchers wanting
hardware access
• Hardware is pulled away if there are in-flight
incidents on manned space
missions
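To get a feel for the numbers on this slide, the following back-of-the-envelope sketch converts "300 weeks for 7000 scenarios" into hours per scenario. It assumes the scenarios run one after another, around the clock; that assumption, and the function name, are illustrative and not from the slides.

```python
# Assumption (not in the slides): scenarios run sequentially, 24 hours a day.
HOURS_PER_WEEK = 7 * 24

def hours_per_scenario(total_weeks: float, scenarios: int) -> float:
    """Average wall-clock hours that each scenario would take."""
    return total_weeks * HOURS_PER_WEEK / scenarios

print(round(hours_per_scenario(300, 7000), 1))  # 7.2
```

Under those assumptions each scenario costs roughly seven hours of compute, which is why a queue for shared hardware hurts so much.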
17
Asiana Airlines
Flight 214
“BIG MODELS” AND THE CPU
CRISIS: EXAMPLE #2
18
• Very rapid agile software development
• Continually retesting all code
• 4 billion unit tests Jan to Oct 2013
• Welcome to the resource economy. [Stokely et al. 2009]
SEARCH-BASED SE (SBSE)
Many SE activities are like optimization
problems [Harman,Jones’01].
Due to computational complexity, exact optimization
methods can be impractical for large SBSE problems
So researchers and practitioners use metaheuristic search
to find near optimal or good-enough solutions.
• E.g. simulated annealing [Rosenbluth et al.’53]
• E.g. genetic algorithms [Goldberg’79]
• E.g. tabu search [Glover’86]
19
Repeat till happy or exhausted
• Selection (cull the herd)
• Cross-over (the rude bit)
• Mutation (stochastic jiggle)
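The select / cross-over / mutate loop can be sketched as a tiny genetic algorithm over bit strings. This is a generic textbook-style sketch, not the specific optimizers used in the talk; the function name `evolve`, the parameter values, and the "count the 1 bits" fitness are all assumptions chosen for illustration.

```python
import random

def evolve(fitness, n_bits=20, pop_size=30, generations=40,
           mutation_rate=0.05, seed=1):
    """Tiny genetic algorithm over bit strings (maximizes `fitness`)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):  # "till exhausted": a fixed budget
        # Selection (cull the herd): keep the better half.
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            # Cross-over: splice two parents at a random cut point.
            cut = rng.randrange(1, n_bits)
            child = mum[:cut] + dad[cut:]
            # Mutation (stochastic jiggle): flip each bit with small probability.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve(sum)  # fitness = number of 1 bits
print(sum(best), "of 20 bits set")
```

Because the survivors are carried over unchanged, the best score never gets worse from one generation to the next.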
PARETO OPTIMALITY AND
EVOLUTIONARY COMPUTING
20
[Figure: nine candidate solutions, labeled 1 to 9, with the Pareto frontier marked]
Pareto frontier
-- better on some
criteria, worse on none
Selection:
-- generation[i+1] comes
from Pareto frontier of
generation[i]
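The "better on some criteria, worse on none" test, and the frontier it induces, can be sketched as follows. The sketch assumes every goal is to be minimized; the sample points are made up for illustration.

```python
def dominates(a, b):
    """Binary domination, all goals minimized:
    a is worse on none and better on at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_frontier(points):
    """Points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical generation[i], scored on two goals.
generation = [(1, 9), (2, 7), (3, 8), (4, 4), (6, 5), (7, 2), (9, 1), (8, 8)]
print(pareto_frontier(generation))
# [(1, 9), (2, 7), (4, 4), (7, 2), (9, 1)]
```

Selection then builds generation[i+1] from the returned frontier. Note that this naive version compares every pair of points, so it needs every candidate to be scored first.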
APPLICATIONS OF SBSE
1. Requirements: Menzies, Feather, Bagnall, Mansouri, Zhang
2. Transformation: Cooper, Ryan, Schielke, Subramanian, Fatiregun, Williams
3. Effort prediction: Aguilar-Ruiz, Burgess, Dolado, Lefley, Shepperd
4. Management: Alba, Antoniol, Chicano, Di Penta, Greer, Ruhe
5. Heap allocation: Cohen, Kooi, Srisa-an
6. Regression test: Li, Yoo, Elbaum, Rothermel, Walcott, Soffa, Kampfhamer
7. SOA: Canfora, Di Penta, Esposito, Villani
8. Refactoring: Antoniol, Briand, Cinneide, O’Keeffe, Merlo, Seng, Tratt
9. Test generation: Alba, Binkley, Bottaci, Briand, Chicano, Clark, Cohen, Gutjahr, Harrold, Holcombe, Jones, Korel, Pargass, Reformat, Roper, McMinn, Michael, Sthamer, Tracy, Tonella, Xanthakis, Xiao, Wegener, Wilkins
10. Maintenance: Antoniol, Lutz, Di Penta, Mahdavi, Mancoridis, Mitchell, Swift
11. Model checking: Alba, Chicano, Godefroid
12. Probing: Cohen, Elbaum
13. UIOs: Derderian, Guo, Hierons
14. Comprehension: Gold, Li, Mahdavi
15. Protocols: Alba, Clark, Jacob, Troya
16. Component selection: Baker, Skaliotis, Steinhofel, Yoo
17. Agent oriented: Haas, Peysakhov, Sinclair, Shami, Mancoridis
21
EXPLOSIVE GROWTH IN
SBSE
Q: Why?
A: Thanks to Big Data, more access to more cpu.
22
WHY BUILD MORE
TOOLS FOR SBSE
AND
GOAL-ORIENTED RE?
(AREN’T THERE ENOUGH ALREADY?)
23
DO WE NEED MORE SBSE TOOLS
FOR GOAL-BASED RE?
24
[Figure: map of existing tools: SPEA2, NSGA-II, DE, scatter search, PSO, SA, MOCell, Z3, IBEA, SMT solvers, GALE, NSGA-III, MOEA/D]
CASE STUDY:
FEATURE MAPS → PRODUCTS
Design product line
[Kang’90]
Add in known constraints
• E.g. “if we use a camera
then we need a high
resolution screen”.
Extract products
• Find subsets of the product
lines that satisfy
constraints.
• If there are no constraints, this takes linear time
• Otherwise, it can defeat
state-of-the-art optimizers
[Pohl et al., ASE’11]
[Sayyad, Menzies ICSE’13].
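"Extract products" can be sketched as a brute-force search over feature subsets. The three feature names are hypothetical; the single constraint is the camera example from this slide. Real feature models need far smarter search than this, which is the point of the case study.

```python
from itertools import product

FEATURES = ["camera", "hi_res_screen", "gps"]  # hypothetical feature names

# Each constraint takes a product (feature -> bool), returns True if satisfied.
CONSTRAINTS = [
    # "if we use a camera then we need a high resolution screen"
    lambda p: (not p["camera"]) or p["hi_res_screen"],
]

def violations(p):
    """Number of constraints that product p breaks."""
    return sum(1 for c in CONSTRAINTS if not c(p))

def valid_products():
    """Enumerate all 2^n subsets: fine for 3 features, hopeless for thousands."""
    for bits in product([False, True], repeat=len(FEATURES)):
        p = dict(zip(FEATURES, bits))
        if violations(p) == 0:
            yield p

products = list(valid_products())
print(len(products), "valid products out of", 2 ** len(FEATURES))
```

The enumeration doubles with every added feature, so this approach is only a way to see what "satisfying the constraints" means, not a way to solve large models.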
25
Cross-Tree
Constraints
SIZE OF FEATURE MAPS
This model: 10 features, 8 rules
[www.splot-research.org]:
ESHOP: 290 Features, 421
Rules
LINUX kernel variability project
LINUX x86 kernel
6,888 Features; 344,000 Rules
26
Cross-Tree Constraints
4 STUDIES:
2 OR 3 OR 4 OR 5 GOALS
27
Software engineering = navigating the user goals:
1. Satisfy the most domain constraints (0 ≤ #violations ≤ 100%)
2. Offers most features
3. Build “stuff” in least time
4. That we have used most before
5. Using features with least known defects
Binary goals= 1,2
Tri-goals= 1,2,3
Quad-goals= 1,2,3,4
Five-goals= 1,2,3,4,5
Abdel Salam Sayyad, Tim Menzies, Hany Ammar: On the value of user preferences in search-based software engineering: a case study in software product lines. ICSE 2013: 492-501
HV = HYPERVOLUME OF DOMINATED REGION
SPREAD = COVERAGE OF FRONTIER
% CORRECT = %CONSTRAINTS SATISFIED
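For two goals, the hypervolume of the dominated region is just an area, and can be sketched as below. The sketch assumes both goals are minimized and that the caller supplies a reference ("worst case") point; the function name and the sample frontier are illustrative. Spread is not computed here.

```python
def hypervolume_2d(frontier, reference):
    """Area dominated by `frontier` and bounded by `reference`,
    for two goals that are both minimized."""
    rx, ry = reference
    area, prev_y = 0.0, ry
    for x, y in sorted(frontier):        # ascending on the first goal
        if x < rx and y < prev_y:        # skip points that add no new area
            area += (rx - x) * (prev_y - y)
            prev_y = y
    return area

# Hypothetical bi-goal frontier, reference point (4, 4).
print(hypervolume_2d([(1, 3), (2, 2), (3, 1)], (4, 4)))  # 6.0
```

A larger value means the frontier covers more of the goal space between itself and the reference point.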
28
Example performance
criteria
Example in bi-goal space
Note: example on next slide reports
HV, spread for bi, tri, quad, five objective space
29
Very similar
Very different, particularly in % correct
Continuous
dominance
Binary
dominance
ESHOP: 290 features, 421 rules
[Sayyad, Menzies ICSE’13]
Q: WHAT IS SO DIFFERENT ABOUT IBEA?
A: CONTINUOUS DOMINANCE
CONTINUOUS
IBEA : [Zitzler, Kunzli, 2004]
I(x1,x2):
• How much do we have to adjust goal
scores such that x1 dominates x2
Repeat till just a few left
• Sort all instances by F
• Delete worst
Then, standard GA (cross-over,
mutation) on the survivors
DISCRETE
Two sets of decisions
One dominates the other if it is worse
on none and better on at least one
Note: returns true or false, not the
size of the domination
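The contrast with binary domination can be sketched in code. This follows the additive epsilon indicator and the fitness assignment described in [Zitzler, Kunzli, 2004], but leaves out that paper's adaptive scaling step; it assumes goals are minimized and normalized to 0..1, and it takes k = 0.05 to be the scaling factor used in the fitness formula. All function names are illustrative.

```python
import math

def eps_indicator(x1, x2):
    """I(x1, x2): smallest shift eps such that x1, moved by eps on every
    goal, is at least as good as x2 (negative if x1 already dominates)."""
    return max(a - b for a, b in zip(x1, x2))

def ibea_fitness(pop, k=0.05):
    """F(x1) = sum over every other x2 of -exp(-I(x2, x1) / k)."""
    return [sum(-math.exp(-eps_indicator(x2, x1) / k)
                for j, x2 in enumerate(pop) if j != i)
            for i, x1 in enumerate(pop)]

def cull(pop, keep, k=0.05):
    """Repeat till just a few left: score everything by F, delete the worst."""
    pop = list(pop)
    while len(pop) > keep:
        scores = ibea_fitness(pop, k)
        del pop[scores.index(min(scores))]
    return pop

# Hypothetical goal scores for four candidates.
print(cull([(0.1, 0.1), (0.5, 0.5), (0.9, 0.9), (0.2, 0.8)], keep=2))
```

Unlike a true/false domination test, F says how badly a candidate loses, so two candidates that are both dominated can still be ranked against each other.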
30
[Figure: example goal space with axes “cost of car” and “time to 100 mph”, with “heaven” marking the ideal point; K = 0.05. [Wagner et al. 2007]]
WHAT ARE THE
ADDED
BENEFITS OF
GOAL-ORIENTED
REASONING?
CASE STUDY: FEATURE MAPS FOR
PRODUCT-LINE ENGINEERING
31
STATE OF THE ART
32
[Figure: prior work plotted by model size in features (9, 290, 544, 6888; SPLOT and Linux (LVAT) models, 300,000+ clauses) and by objectives (single-goal vs. multi-goal). Studies shown: Benavides ‘05; White ‘07, ‘08, ‘09a, ‘09b; Shi ‘10; Guo ‘11; Pohl ‘11; Lopez-Herrejon ‘11; Johansen ‘11; Henard ‘12; Sayyad, Menzies ‘13a; Velazco ‘13; Sayyad, Menzies ‘13b]
THE SEEDING HEURISTIC
33
Given M < N goals that are hardest to solve
• Before running an N-optimization problem:
• Seed an initial population via M-optimization
Study1 (with Z3) :
• Optimize for min constraint violations using Z3
Study2 (with IBEA):
• Optimize for (a) max features and (b) min violations
CORRECT SOLUTIONS AFTER 30 MINUTES
FOR THE LARGE LINUX KERNEL MODEL
34
From IBEA
From Z3
Abdel Salam Sayyad, Joseph Ingram, Tim Menzies, Hany Ammar: Scalable Product Line Configuration: A Straw to Break the Camel’s Back, IEEE ASE 2013
130 of
6888
features
5704 of
6888
features
HOW TO MAKE GOAL-
BASED REASONING
FASTER?
(GALE = GEOMETRIC
ACTIVE LEARNING)
CASE STUDY: SAFETY CRITICAL
ANALYSIS OF AVIATION PROCEDURES
35
WMC: GIT’S WORK MODELS
THAT COMPUTE [KIM’11]
Cognitive models of the agents
(both pilots and computers)
• Late descent,
• Unpredicted rerouting,
• Different tailwind conditions
Goal: validate operations
procedures (are they safe?)
NASA’s analysts want to
explore 7000 scenarios.
• With current tools (NSGA-II)
• 300 weeks to complete
Limited access to hardware
• Queue of researchers wanting
hardware access
• Hardware is pulled away if there are in-flight
incidents on manned space
missions
36
Asiana Airlines
Flight 214
Repeat till happy or exhausted
• Selection (cull the herd)
• Cross-over (the rude bit)
• Mutation (stochastic jiggle)
ACTIVE LEARNING AND
EVOLUTIONARY COMPUTING
37
Naïve selection
• score every candidate
Active learning
• Score only the most
informative candidates
• e.g. just score most
distant points in data
clusters
38
e.g. 398 cars
Maximize acceleration,
Maximize mpg
14 evaluations
of goals
• Find splits using
FASTMAP O(n)
[Faloutsos & Lin ’95]
• At each level only check
for dominance of two
most extreme points
• 2 log2(N) evals, or
fewer
• Leaves =
non-dominated
examples (i.e. the
Pareto frontier)
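The split step can be sketched as follows: pick any point, walk to the point furthest from it (east), then to the point furthest from east (west), and divide the data by where each point falls on the east-west line. This is a sketch of the FASTMAP heuristic only, with illustrative names; it is not the GALE implementation, and the final sort is used for clarity rather than speed.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fastmap_split(points, rng=random):
    """Return (east, west, near_east_half, near_west_half)."""
    anyone = rng.choice(points)
    east = max(points, key=lambda p: dist(anyone, p))
    west = max(points, key=lambda p: dist(east, p))
    c = dist(east, west)
    mid = len(points) // 2
    if c == 0:                       # all points identical: nothing to project
        return east, west, points[:mid], points[mid:]

    def project(p):
        a, b = dist(p, east), dist(p, west)
        return (a * a + c * c - b * b) / (2 * c)   # cosine rule

    ordered = sorted(points, key=project)          # nearest to east first
    return east, west, ordered[:mid], ordered[mid:]

# Hypothetical one-dimensional-looking data with two obvious clusters.
data = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
east, west, near_east, near_west = fastmap_split(data, random.Random(0))
print(east, west, near_east, near_west)
```

Only east and west need their goals evaluated at each level, which is where the 2 log2(N) bound comes from. For the 398-car example, 2 log2(398) is about 17.3, an upper bound that is consistent with the 14 evaluations shown on the slide.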
RECURSIVELY CLUSTER DATA, FIND
MOST DISTANT POINTS IN LEAF
CLUSTERS
FOR FRONTIER AS CONVEX HULL,
FOR EACH LINE SEGMENT, PUSH
TOWARDS BEST END
Given goals u, v, …
• utopia = best values
• hell = furthest from utopia
• All distances normalized 0..1
Given a line east to west
• s1 = I(east, hell)
• s2 = I(west, hell), s2 > s1
• C = dist(west,east)
p = push on line east,west
• direction = towards better (west)
• magnitude[i]=
• D= west[i] – east[i]
• new = old + old * C * D
• Reject if over C*1.5
39
• utopia
u
v
hell •
s2
s1
east
west
p
hell • u
v
hell • u
v
REPEAT FOR ALL POINTS ON LINE
SEGMENTS ON NON-DOMINATED
REGION OF CONVEX HULL
40
GALE:
1. Population[ 0 ] = N random points
2. Find M points on local Pareto frontier (approximated as convex
hull)
3. Mutants = mutate M over line segments on hull
4. Population[ i+1 ] = Mutants + (N – #Mutants) random points
5. Goto 2
Related work: [Zuluaga et al. ICML’13]
RESULTS ON NASA MODELS:
SCORES AS GOOD AS OTHER METHODS
ORDERS OF MAGNITUDE FEWER EVALUATIONS
41
[Figure: results on five objectives]
1. #forgotten tasks
2. #interrupted acts
3. Interruption time
4. #delayed acts
5. Delay time
4 mins (GALE) vs 7 hours (rest)
“Better Model-Based Analysis of Human Factors for Safe Aircraft Approach”, Krall, Joe; Menzies, Tim; Davies, Misty. IEEE Transactions on Human-Machine Systems, to appear 2015
42
Runtimes,
Number of
evaluations
GALE: Geometric Active Learning for Search-Based Software Engineering, IEEE TSE, 2015, to appear. Joseph Krall, Tim Menzies, and Misty Davies
MINIMIZATIONS OF
OBJECTIVE SCORES
43
Gray = significantly different (Mann-Whitney, 95%) and least
GALE’S SEARCH: A MORE THOROUGH
SEARCH OF A SMALLER VOLUME
Less
hypervolume
Better
spread
44
CONCLUSION
45
THE CPU CRISIS
You do the math.
What happens to a resource when
• an exponentially increasing number of people
• make exponentially increasing demands upon it?
46
TO MANAGE THE CPU CRISIS: NEED
A BETTER UNDERSTANDING OF THE
“SHAPE” OF THE USER GOALS
47
[Figure: map of tools (SPEA2, NSGA-II, DE, scatter search, PSO, SA, MOCell, Z3, IBEA, SMT solvers, NSGA-III, MOEA/D, GALE, TAR, WHICH), annotated “Domination is a binary concept” and “Aggressive exploration of preference space”]
Q: IN THE AGE OF BIG DATA, WHAT
ROLE FOR SOFTWARE ENGINEERS?
A: GOAL ENGINEERING
Search-based software engineering
• CPU-intensive analysis
• Taming the CPU crisis by understanding user goals
Algorithms need goal-oriented requirements engineering
• Goals are a primary design construct
• To optimize, find the “landscape of the goals”
Goal-oriented RE needs algorithms
• Better tools for better explorations of user goals
48
49
• An optimization algorithm
• A data miner
• A visualization tool
• A requirements negotiation tool
• A compression algorithm
• summarize interesting regions of
complex space
• An anomaly detector
• The story thus far
• Data exchange tool for agents
• Share least data with most value
• A comment on the paradoxical success of
beings as confused as humans
• seemingly complex problems, aren’t
GALE : A TOOLKIT FOR UNDERSTANDING
THE SHAPE OF GOAL SPACE
50
Analysis = humans + systems
• better conclusions =
+ more data
+ more cpu
+ human analysts finding better
questions
+ automatic systems that better
understand the questions
COMBINING ALGORITHMS
AND GOAL-ORIENTED RE
Edsger Dijkstra,
ICSE 4, 1979
• “The notion of ‘user’
cannot be precisely
defined, and
therefore has no
place in CS or SE.”
TIM MENZIES,
2015
• Mathematical
definition of “user”
• “The force that
changes the
geometry of search
space.”
51
52

More Related Content

Similar to In the age of Big Data, what role for Software Engineers?

Automated Software Enging, Fall 2015, NCSU
Automated Software Enging, Fall 2015, NCSUAutomated Software Enging, Fall 2015, NCSU
Automated Software Enging, Fall 2015, NCSUCS, NcState
 
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairIt Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairClaire Le Goues
 
Artificial Intelligence for Automated Software Testing
Artificial Intelligence for Automated Software TestingArtificial Intelligence for Automated Software Testing
Artificial Intelligence for Automated Software TestingLionel Briand
 
Performance Issue? Machine Learning to the rescue!
Performance Issue? Machine Learning to the rescue!Performance Issue? Machine Learning to the rescue!
Performance Issue? Machine Learning to the rescue!Maarten Smeets
 
Modern Computing: Cloud, Distributed, & High Performance
Modern Computing: Cloud, Distributed, & High PerformanceModern Computing: Cloud, Distributed, & High Performance
Modern Computing: Cloud, Distributed, & High Performanceinside-BigData.com
 
FutureOfTesting2008
FutureOfTesting2008FutureOfTesting2008
FutureOfTesting2008vipulkocher
 
Automated Software Engineering
Automated Software EngineeringAutomated Software Engineering
Automated Software EngineeringCS, NcState
 
Big Data in the Cloud: How the RISElab Enables Computers to Make Intelligent ...
Big Data in the Cloud: How the RISElab Enables Computers to Make Intelligent ...Big Data in the Cloud: How the RISElab Enables Computers to Make Intelligent ...
Big Data in the Cloud: How the RISElab Enables Computers to Make Intelligent ...Amazon Web Services
 
PAC 2019 virtual Alexander Podelko
PAC 2019 virtual Alexander Podelko PAC 2019 virtual Alexander Podelko
PAC 2019 virtual Alexander Podelko Neotys
 
Using MLOps to Bring ML to Production/The Promise of MLOps
Using MLOps to Bring ML to Production/The Promise of MLOpsUsing MLOps to Bring ML to Production/The Promise of MLOps
Using MLOps to Bring ML to Production/The Promise of MLOpsWeaveworks
 
On Parameter Tuning in Search-Based Software Engineering: A Replicated Empiri...
On Parameter Tuning in Search-Based Software Engineering: A Replicated Empiri...On Parameter Tuning in Search-Based Software Engineering: A Replicated Empiri...
On Parameter Tuning in Search-Based Software Engineering: A Replicated Empiri...Abdel Salam Sayyad
 
Solving Large Scale Optimization Problems using CPLEX Optimization Studio
Solving Large Scale Optimization Problems using CPLEX Optimization StudioSolving Large Scale Optimization Problems using CPLEX Optimization Studio
Solving Large Scale Optimization Problems using CPLEX Optimization Studiooptimizatiodirectdirect
 
ExaLearn Overview - ECP Co-Design Center for Machine Learning
ExaLearn Overview - ECP Co-Design Center for Machine LearningExaLearn Overview - ECP Co-Design Center for Machine Learning
ExaLearn Overview - ECP Co-Design Center for Machine Learninginside-BigData.com
 
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudUsing SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudSigOpt
 
Scalable Software Testing and Verification of Non-Functional Properties throu...
Scalable Software Testing and Verification of Non-Functional Properties throu...Scalable Software Testing and Verification of Non-Functional Properties throu...
Scalable Software Testing and Verification of Non-Functional Properties throu...Lionel Briand
 
Enabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial IntelligenceEnabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial IntelligenceLionel Briand
 
On the Value of User Preferences in Search-Based Software Engineering
On the Value of User Preferences in Search-Based Software EngineeringOn the Value of User Preferences in Search-Based Software Engineering
On the Value of User Preferences in Search-Based Software EngineeringAbdel Salam Sayyad
 
TLC2018 Thomas Haver: The Automation Firehose - Be Strategic and Tactical
TLC2018 Thomas Haver: The Automation Firehose - Be Strategic and TacticalTLC2018 Thomas Haver: The Automation Firehose - Be Strategic and Tactical
TLC2018 Thomas Haver: The Automation Firehose - Be Strategic and TacticalAnna Royzman
 
'A critique of testing' UK TMF forum January 2015
'A critique of testing' UK TMF forum January 2015 'A critique of testing' UK TMF forum January 2015
'A critique of testing' UK TMF forum January 2015 Georgina Tilby
 
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...Lucas Jellema
 

Similar to In the age of Big Data, what role for Software Engineers? (20)

Automated Software Enging, Fall 2015, NCSU
Automated Software Enging, Fall 2015, NCSUAutomated Software Enging, Fall 2015, NCSU
Automated Software Enging, Fall 2015, NCSU
 
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairIt Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
 
Artificial Intelligence for Automated Software Testing
Artificial Intelligence for Automated Software TestingArtificial Intelligence for Automated Software Testing
Artificial Intelligence for Automated Software Testing
 
Performance Issue? Machine Learning to the rescue!
Performance Issue? Machine Learning to the rescue!Performance Issue? Machine Learning to the rescue!
Performance Issue? Machine Learning to the rescue!
 
Modern Computing: Cloud, Distributed, & High Performance
Modern Computing: Cloud, Distributed, & High PerformanceModern Computing: Cloud, Distributed, & High Performance
Modern Computing: Cloud, Distributed, & High Performance
 
FutureOfTesting2008
FutureOfTesting2008FutureOfTesting2008
FutureOfTesting2008
 
Automated Software Engineering
Automated Software EngineeringAutomated Software Engineering
Automated Software Engineering
 
Big Data in the Cloud: How the RISElab Enables Computers to Make Intelligent ...
Big Data in the Cloud: How the RISElab Enables Computers to Make Intelligent ...Big Data in the Cloud: How the RISElab Enables Computers to Make Intelligent ...
Big Data in the Cloud: How the RISElab Enables Computers to Make Intelligent ...
 
PAC 2019 virtual Alexander Podelko
PAC 2019 virtual Alexander Podelko PAC 2019 virtual Alexander Podelko
PAC 2019 virtual Alexander Podelko
 
Using MLOps to Bring ML to Production/The Promise of MLOps
Using MLOps to Bring ML to Production/The Promise of MLOpsUsing MLOps to Bring ML to Production/The Promise of MLOps
Using MLOps to Bring ML to Production/The Promise of MLOps
 
On Parameter Tuning in Search-Based Software Engineering: A Replicated Empiri...
On Parameter Tuning in Search-Based Software Engineering: A Replicated Empiri...On Parameter Tuning in Search-Based Software Engineering: A Replicated Empiri...
On Parameter Tuning in Search-Based Software Engineering: A Replicated Empiri...
 
Solving Large Scale Optimization Problems using CPLEX Optimization Studio
Solving Large Scale Optimization Problems using CPLEX Optimization StudioSolving Large Scale Optimization Problems using CPLEX Optimization Studio
Solving Large Scale Optimization Problems using CPLEX Optimization Studio
 
ExaLearn Overview - ECP Co-Design Center for Machine Learning
ExaLearn Overview - ECP Co-Design Center for Machine LearningExaLearn Overview - ECP Co-Design Center for Machine Learning
ExaLearn Overview - ECP Co-Design Center for Machine Learning
 
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudUsing SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
 
Scalable Software Testing and Verification of Non-Functional Properties throu...
Scalable Software Testing and Verification of Non-Functional Properties throu...Scalable Software Testing and Verification of Non-Functional Properties throu...
Scalable Software Testing and Verification of Non-Functional Properties throu...
 
Enabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial IntelligenceEnabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial Intelligence
 
On the Value of User Preferences in Search-Based Software Engineering
On the Value of User Preferences in Search-Based Software EngineeringOn the Value of User Preferences in Search-Based Software Engineering
On the Value of User Preferences in Search-Based Software Engineering
 
TLC2018 Thomas Haver: The Automation Firehose - Be Strategic and Tactical
TLC2018 Thomas Haver: The Automation Firehose - Be Strategic and TacticalTLC2018 Thomas Haver: The Automation Firehose - Be Strategic and Tactical
TLC2018 Thomas Haver: The Automation Firehose - Be Strategic and Tactical
 
'A critique of testing' UK TMF forum January 2015
'A critique of testing' UK TMF forum January 2015 'A critique of testing' UK TMF forum January 2015
'A critique of testing' UK TMF forum January 2015
 
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
The Art of Intelligence – A Practical Introduction Machine Learning for Orac...
 

More from CS, NcState

Talks2015 novdec
Talks2015 novdecTalks2015 novdec
Talks2015 novdecCS, NcState
 
GALE: Geometric active learning for Search-Based Software Engineering
GALE: Geometric active learning for Search-Based Software EngineeringGALE: Geometric active learning for Search-Based Software Engineering
GALE: Geometric active learning for Search-Based Software EngineeringCS, NcState
 
Big Data: the weakest link
Big Data: the weakest linkBig Data: the weakest link
Big Data: the weakest linkCS, NcState
 
Three Laws of Trusted Data Sharing: (Building a Better Business Case for Dat...
Three Laws of Trusted Data Sharing:(Building a Better Business Case for Dat...Three Laws of Trusted Data Sharing:(Building a Better Business Case for Dat...
Three Laws of Trusted Data Sharing: (Building a Better Business Case for Dat...CS, NcState
 
Lexisnexis june9
Lexisnexis june9Lexisnexis june9
Lexisnexis june9CS, NcState
 
Welcome to ICSE NIER’15 (new ideas and emerging results).
Welcome to ICSE NIER’15 (new ideas and emerging results).Welcome to ICSE NIER’15 (new ideas and emerging results).
Welcome to ICSE NIER’15 (new ideas and emerging results).CS, NcState
 
Icse15 Tech-briefing Data Science
Icse15 Tech-briefing Data ScienceIcse15 Tech-briefing Data Science
Icse15 Tech-briefing Data ScienceCS, NcState
 
Kits to Find the Bits that Fits
Kits to Find  the Bits that Fits Kits to Find  the Bits that Fits
Kits to Find the Bits that Fits CS, NcState
 
Ai4se lab template
Ai4se lab templateAi4se lab template
Ai4se lab templateCS, NcState
 
Requirements Engineering
Requirements EngineeringRequirements Engineering
Requirements EngineeringCS, NcState
 
172529main ken and_tim_software_assurance_research_at_west_virginia
172529main ken and_tim_software_assurance_research_at_west_virginia172529main ken and_tim_software_assurance_research_at_west_virginia
172529main ken and_tim_software_assurance_research_at_west_virginiaCS, NcState
 
Next Generation “Treatment Learning” (finding the diamonds in the dust)
Next Generation “Treatment Learning” (finding the diamonds in the dust)Next Generation “Treatment Learning” (finding the diamonds in the dust)
Next Generation “Treatment Learning” (finding the diamonds in the dust)CS, NcState
 
Tim Menzies, directions in Data Science
Tim Menzies, directions in Data ScienceTim Menzies, directions in Data Science
Tim Menzies, directions in Data ScienceCS, NcState
 
Dagstuhl14 intro-v1
Dagstuhl14 intro-v1Dagstuhl14 intro-v1
Dagstuhl14 intro-v1CS, NcState
 
The Art and Science of Analyzing Software Data
The Art and Science of Analyzing Software DataThe Art and Science of Analyzing Software Data
The Art and Science of Analyzing Software DataCS, NcState
 
What Metrics Matter?
What Metrics Matter? What Metrics Matter?
What Metrics Matter? CS, NcState
 
Sayyad slides ase13_v4
Sayyad slides ase13_v4Sayyad slides ase13_v4
Sayyad slides ase13_v4CS, NcState
 

More from CS, NcState (20)

Talks2015 novdec
Talks2015 novdecTalks2015 novdec
Talks2015 novdec
 
Future se oct15
Future se oct15Future se oct15
Future se oct15
 
GALE: Geometric active learning for Search-Based Software Engineering
GALE: Geometric active learning for Search-Based Software EngineeringGALE: Geometric active learning for Search-Based Software Engineering
GALE: Geometric active learning for Search-Based Software Engineering
 
Big Data: the weakest link
Big Data: the weakest linkBig Data: the weakest link
Big Data: the weakest link
 
Three Laws of Trusted Data Sharing: (Building a Better Business Case for Dat...
In the age of Big Data, what role for Software Engineers?

  • 1. IN THE AGE OF BIG DATA, WHAT ROLE FOR SOFTWARE ENGINEERS? TIM MENZIES CS, NCSTATE, JUNE 2015
  • 2. THE DECLARATION OF (HUMAN) DEPENDENCE • We hold these truths to be self-evident… • Better conclusions = more data + more cpu + human analysts finding better questions + automatic systems that better understand the questions
  • 3. BUT NOT EVERYONE AGREES Edsger Dijkstra, ICSE 4, 1979 • “The notion of ‘user’ cannot be precisely defined, and therefore has no place in CS or SE.” Anonymous machine learning researcher, 1986 • “Kill all living human experts then resurrect the dead ones”
  • 4. SO WHAT ROLE FOR SE IN THE AGE OF BIG DATA? ANALYSIS IS A “SYSTEMS” TASK? The premise of Big Data: • better conclusions = same algorithms + more data + more cpu If so, then … • No role for human analysts • All insight is auto-generated from CPUs. ANALYSIS IS A “HUMAN” TASK? Current results on “software analytics” • A human-intensive process
  • 5. Q: IS BIG DATA A “SYSTEMS” OR “HUMAN”-TASK? A: YES
  • 6. THIS TALK: IN THE AGE OF BIG DATA, SE ANALYSTS ARE “GOAL ENGINEERS” Search-based software engineering • CPU-intensive analysis • Taming the CPU crisis by understanding user goals Algorithms need goal-oriented requirements engineering • Goals are a primary design construct • To optimize, find the “landscape of the goals” Goal-oriented RE needs algorithms • Better tools for better explorations of user goals
  • 7. ROAD MAP 1. Define: • “CPU crisis” • “search-based software engineering” • “goal-oriented requirements engineering” 2. Why more tools? (aren’t there enough already?) 3. The power of goal-oriented tools (IBEA) • Feature maps, product-line engineering 4. Next-gen goal-oriented tools (GALE) • Safety-critical analysis of cockpit software 5. Conclusions 6. Future work
  • 8. ACKNOWLEDGEMENTS • SBSE + Feature Maps: – Dr. Abdel Salam Sayyad, Ph.D., WVU, 2014 • GALE + air traffic control: – Dr. Joseph Krall, Ph.D., WVU, 2014
  • 9. WHAT IS… • GOAL-ORIENTED REQUIREMENTS ENGINEERING? • THE CPU CRISIS? • SEARCH-BASED SOFTWARE ENGINEERING?
  • 10. GOAL-ORIENTED RE Axel van Lamsweerde: Goal-Oriented Requirements Engineering: A Guided Tour [vanLam RE’01] • Goals capture objectives for the system. • Goal-oriented RE: using goals for eliciting, specifying, documenting, structuring, elaborating, analyzing, negotiating, modifying requirements. [Figure: grid contrasting mostly manual with mostly automatic approaches, and notation-based (e.g. UML) with search-based SE [Kang’90]; marks: ✗ ✔ ✗ ✗]
  • 11. OLDE AND NEW STYLE SE MANUAL SOFTWARE ENGINEERING • e.g. full-stack DevOps development • Engineers laboriously convert (by hand) non-executable paper models into executable code. • Focus of much prior and current work MODEL-BASED SE • Engineers codify the current understanding of the domain into a model, • then study those models • My bet: focus of much future work
  • 12. MODELS: EVERYWHERE Karplus and Levitt • 2013 Nobel prize in chemistry • Development of multi-scale models for complex chemical systems • Explored complex chemical reactions (e.g. split-second changes of photosynthesis). Models are now a central tool in scientific research • in physics, biology and other fields of science • complex simulations using supercomputers. E.g. the genomic map required analyzing 80 trillion bytes. E.g. other computational modeling projects: • the rise and fall of native cultures, • subnuclear particles, • the Big Bang.
  • 13. MODELS: EVERYWHERE If you call an ambulance in London or New York, • those ambulances are controlled by emergency response models. If you cross the border from Arizona to Mexico, • a model determines if you are taken away for extra security measures. If you default on your car loans, • a model determines when (or if) someone comes to repossess your car. If the stock market crashes, • it might be that some model caused the crash.
  • 14. “BIG MODELS”: MORE AND MORE PEOPLE WRITING AND RUNNING MORE AND MORE MODELS [Figure: chart for Berkeley, Stanford and Washington, 2004 to 2013, axis from 500 to 2500; source http://goo.gl/MJuxSt] “Great coders are today’s rock stars.” --Will.i.am http://goo.gl/ljFtX
  • 15. THE CPU CRISIS You do the math. What happens to a resource when • an exponentially increasing number of people • make exponentially increasing demands upon it?
  • 16. TO SOLVE THE CPU CRISIS: DON’T BUILD MORE CPUS CPU power requirements (and the pollution associated with generating that power) are now a significant issue. • Data centers consume 1.5% of global electrical output • This value is predicted to grow dramatically in the very near future. • Google reports that a 1% reduction in CPU requirements saves them millions of dollars in power costs. • Welcome to the age of green software engineering Moore’s Law is over • Power consumption and heat dissipation issues block further exponential increases to CPU clock frequencies. • CPU memory access time to extended memory can vary widely. • E.g. for systems on a chip, access time across the bus to the memory of a neighboring chip can be orders of magnitude slower than accessing memory on the local chip.
  • 17. “BIG MODELS” AND THE CPU CRISIS: EXAMPLE #1 Cognitive models of the agents (both pilots and computers) • Late descent, • Unpredicted rerouting, • Different tailwind conditions Goal: validate operations procedures (are they safe?) NASA’s analysts want to explore 7000 scenarios. • With current tools (NSGA-II) • 300 weeks to complete Limited access to hardware • Queue of researchers wanting hardware access • Hardware is pulled away if there are in-flight incidents on manned space missions [Image: Asiana Airlines Flight 214]
  • 18. “BIG MODELS” AND THE CPU CRISIS: EXAMPLE #2 • Very rapid agile software development • Continually retesting all code • 4 billion unit tests, Jan to Oct 2013 • Welcome to the resource economy. [Stokely et al. 2009]
  • 19. SEARCH-BASED SE (SBSE) Many SE activities are like optimization problems [Harman, Jones’01]. Due to computational complexity, exact optimization methods can be impractical for large SBSE problems. So researchers and practitioners use metaheuristic search to find near-optimal or good-enough solutions. • E.g. simulated annealing [Rosenbluth et al.’53] • E.g. genetic algorithms [Goldberg’79] • E.g. tabu search [Glover’86]
  • 20. PARETO OPTIMALITY AND EVOLUTIONARY COMPUTING Repeat till happy or exhausted • Selection (cull the herd) • Cross-over (the rude bit) • Mutation (stochastic jiggle) Pareto frontier: better on some criteria, worse on none. Selection: generation[i+1] comes from the Pareto frontier of generation[i]. [Figure: nine numbered candidates plotted against two goals]
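The selection step on this slide can be sketched in a few lines. This is a minimal illustration, not the speaker's code: it assumes every goal is minimized, and the function names and the eight sample points are invented for the example.

```python
from typing import List, Sequence, Tuple

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Binary dominance (all goals minimized): a is worse on none, better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    better_somewhere = any(x < y for x, y in zip(a, b))
    return no_worse and better_somewhere

def pareto_frontier(pop: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the candidates that no other candidate dominates."""
    return [p for p in pop if not any(dominates(q, p) for q in pop)]

# Hypothetical generation[i], scored on two goals (lower is better on both).
population = [(1, 9), (2, 7), (3, 8), (4, 4), (6, 5), (7, 2), (8, 3), (9, 9)]

# Selection: the parents of generation[i+1] come from this frontier.
frontier = pareto_frontier(population)
print(frontier)
```

Cross-over and mutation would then be applied to `frontier` to build the next generation. Note that `dominates` answers only yes or no; it says nothing about how far apart two candidates are.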
  • 21. APPLICATIONS OF SBSE 1. Requirements: Menzies, Feather, Bagnall, Mansouri, Zhang 2. Transformation: Cooper, Ryan, Schielke, Subramanian, Fatiregun, Williams 3. Effort prediction: Aguilar-Ruiz, Burgess, Dolado, Lefley, Shepperd 4. Management: Alba, Antoniol, Chicano, Di Penta, Greer, Ruhe 5. Heap allocation: Cohen, Kooi, Srisa-an 6. Regression test: Li, Yoo, Elbaum, Rothermel, Walcott, Soffa, Kapfhammer 7. SOA: Canfora, Di Penta, Esposito, Villani 8. Refactoring: Antoniol, Briand, Cinneide, O’Keeffe, Merlo, Seng, Tratt 9. Test generation: Alba, Binkley, Bottaci, Briand, Chicano, Clark, Cohen, Gutjahr, Harrold, Holcombe, Jones, Korel, Pargas, Reformat, Roper, McMinn, Michael, Sthamer, Tracy, Tonella, Xanthakis, Xiao, Wegener, Wilkins 10. Maintenance: Antoniol, Lutz, Di Penta, Mahdavi, Mancoridis, Mitchell, Swift 11. Model checking: Alba, Chicano, Godefroid 12. Probing: Cohen, Elbaum 13. UIOs: Derderian, Guo, Hierons 14. Comprehension: Gold, Li, Mahdavi 15. Protocols: Alba, Clark, Jacob, Troya 16. Component selection: Baker, Skaliotis, Steinhofel, Yoo 17. Agent-oriented: Haas, Peysakhov, Sinclair, Shami, Mancoridis
  • 22. EXPLOSIVE GROWTH IN SBSE Q: Why? A: Thanks to Big Data, more access to more cpu.
  • 23. WHY BUILD MORE TOOLS FOR SBSE AND GOAL-ORIENTED RE? (AREN’T THERE ENOUGH ALREADY?)
  • 24. DO WE NEED MORE SBSE TOOLS FOR GOAL-BASED RE? [Figure: existing tools: SPEA2, NSGA-II, DE, scatter search, PSO, SA, MOCell, Z3, IBEA, SMT solvers, GALE, NSGA-III, MOEA/D]
  • 25. CASE STUDY: FEATURE MAPS → PRODUCTS Design product line [Kang’90] Add in known constraints • E.g. “if we use a camera then we need a high resolution screen”. Extract products • Find subsets of the product line that satisfy the constraints. • If there are no constraints, this takes linear time • Otherwise, it can defeat state-of-the-art optimizers [Pohl et al., ASE’11] [Sayyad, Menzies ICSE’13]. [Figure: feature map with cross-tree constraints]
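To make "extract products" concrete, here is a toy sketch under stated assumptions: the three feature names, the single `REQUIRES` rule (taken from the camera example on the slide) and the helper names are all hypothetical, and real feature maps also carry mandatory, optional and alternative groups that are ignored here.

```python
from itertools import product
from typing import Dict, Iterator, List, Tuple

# Hypothetical three-feature product line.
FEATURES: List[str] = ["camera", "hi_res_screen", "gps"]

# Cross-tree constraints as (a, b) pairs, read as "a requires b".
REQUIRES: List[Tuple[str, str]] = [("camera", "hi_res_screen")]

def violations(cfg: Dict[str, bool]) -> int:
    """Count the cross-tree constraints that a candidate product breaks."""
    return sum(1 for a, b in REQUIRES if cfg[a] and not cfg[b])

def valid_products() -> Iterator[Dict[str, bool]]:
    """Brute force: try every on/off combination, keep the ones with no violations."""
    for bits in product([False, True], repeat=len(FEATURES)):
        cfg = dict(zip(FEATURES, bits))
        if violations(cfg) == 0:
            yield cfg

products = list(valid_products())
print(len(products), "valid products out of", 2 ** len(FEATURES))
```

The brute-force loop visits 2^n candidates, so it is only usable on toy models; with thousands of features and rules, a search-based method has to use something like `violations` as one of its goals instead.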
  • 26. SIZE OF FEATURE MAPS This model: 10 features, 8 rules. [www.splot-research.org] ESHOP: 290 features, 421 rules. LINUX kernel variability project, LINUX x86 kernel: 6,888 features, 344,000 rules. [Figure: feature map with cross-tree constraints]
  • 27. 4 STUDIES: 2 OR 3 OR 4 OR 5 GOALS Software engineering = navigating the user goals: 1. Satisfy the most domain constraints (0 ≤ #violations ≤ 100%) 2. Offer the most features 3. Build “stuff” in the least time 4. That we have used most before 5. Using features with the least known defects Binary goals = 1,2 Tri-goals = 1,2,3 Quad-goals = 1,2,3,4 Five-goals = 1,2,3,4,5 Abdel Salam Sayyad, Tim Menzies, Hany Ammar: On the value of user preferences in search-based software engineering: a case study in software product lines. ICSE 2013: 492-501
  • 28. HV = HYPERVOLUME OF DOMINATED REGION; SPREAD = COVERAGE OF FRONTIER; % CORRECT = % CONSTRAINTS SATISFIED Example performance criteria [Figure: example in bi-goal space] Note: the example on the next slide reports HV and spread for bi-, tri-, quad- and five-objective space. Abdel Salam Sayyad, Tim Menzies, Hany Ammar: On the value of user preferences in search-based software engineering: a case study in software product lines. ICSE 2013: 492-501
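For the bi-goal case, the hypervolume of the dominated region is just an area, which makes it easy to sketch. The code below is an illustration only: it assumes both goals are minimized, and the reference point `(10, 10)` and the sample frontier are invented; tools used in the studies may normalize goals or pick the reference point differently.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]

def hypervolume_2d(points: Iterable[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by `ref` (both goals minimized)."""
    # Only points that beat the reference point on both goals contribute.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area, best_y = 0.0, ref[1]
    for x, y in pts:            # sweep left to right on the first goal
        if y < best_y:          # point improves the second goal: add a new strip
            area += (ref[0] - x) * (best_y - y)
            best_y = y
    return area

frontier = [(1, 9), (2, 7), (4, 4), (7, 2)]   # hypothetical bi-goal frontier
hv = hypervolume_2d(frontier, ref=(10, 10))
print(hv)
```

A larger value means the frontier pushes further toward the ideal corner. Dominated points add nothing to the area, so the function can be given a whole population rather than a pre-filtered frontier.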
  • 29. HV = HYPERVOLUME OF DOMINATED REGION; SPREAD = COVERAGE OF FRONTIER; % CORRECT = % CONSTRAINTS SATISFIED [Figure: results for ESHOP (290 features, 421 rules), annotated “Very similar” and “Very different, particularly in % correct”, with the labels “Continuous dominance” and “Binary dominance”] [Sayyad, Menzies ICSE’13]
  • 30. Q: WHAT IS SO DIFFERENT ABOUT IBEA? A: CONTINUOUS DOMINANCE CONTINUOUS: IBEA [Zitzler, Kunzli, 2004] • I(x1,x2): how much do we have to adjust goal scores such that x1 dominates x2? • Repeat till just a few are left: sort all instances by F, delete the worst • Then, standard GA (cross-over, mutation) on the survivors DISCRETE: • Two sets of decisions • One dominates the other if worse on none and better on at least one • Note: returns true or false, not the size of the domination K = 0.05 [Figure: cost of car vs. time to 100 mph, with “heaven” marked] [Wagner et al. 2007]
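The cull loop on this slide can be sketched as follows. This is a simplified reading of IBEA, not a faithful reimplementation: it assumes goals are minimized and already normalized to 0..1, uses the additive epsilon indicator for I, takes F(x1) to be the sum over all other x2 of −exp(−I(x2, x1)/K) as in the basic form of the Zitzler and Künzli algorithm, and leaves out the adaptive scaling, cross-over and mutation. The function names and the sample population are invented.

```python
import math
from typing import List, Sequence, Tuple

K = 0.05  # scaling factor shown on the slide

def eps_indicator(a: Sequence[float], b: Sequence[float]) -> float:
    """Smallest amount a's goal scores must be shifted so that a is no worse than b.
    Negative when a is already better than b on every goal."""
    return max(x - y for x, y in zip(a, b))

def fitness(i: int, pop: List[Tuple[float, ...]]) -> float:
    """F for pop[i]: each rival that beats it by a wide margin costs it exponentially."""
    return sum(-math.exp(-eps_indicator(other, pop[i]) / K)
               for j, other in enumerate(pop) if j != i)

def cull(pop: List[Tuple[float, ...]], keep: int) -> List[Tuple[float, ...]]:
    """Repeat till just a few are left: score everyone by F, delete the worst."""
    pop = list(pop)
    while len(pop) > keep:
        worst = min(range(len(pop)), key=lambda i: fitness(i, pop))
        del pop[worst]
    return pop

population = [(0.2, 0.2), (0.8, 0.8), (0.1, 0.9)]   # hypothetical normalized scores
survivors = cull(population, keep=2)
print(survivors)
```

Unlike the true/false test of binary dominance, `eps_indicator` returns a size, so a candidate that loses by a lot is removed before one that loses by a little. That extra information is what "continuous dominance" refers to.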
  • 31. WHAT ARE THE ADDED BENEFITS OF GOAL-ORIENTED REASONING? CASE STUDY: FEATURE MAPS FOR PRODUCT-LINE ENGINEERING
  • 32. STATE OF THE ART [Figure: prior work plotted by number of features (9, 290, 544, 6888; from SPLOT to Linux (LVAT), 300,000+ clauses) and by objectives (single-goal vs. multi-goal): Pohl ’11; Lopez-Herrejon ’11; Henard ’12; Sayyad, Menzies ’13a; Velazco ’13; Sayyad, Menzies ’13b; Johansen ’11; Benavides ’05; White ’07, ’08, ’09a, ’09b; Shi ’10; Guo ’11]
  • 33. THE SEEDING HEURISTIC Given M < N goals that are hardest to solve • Before running an N-goal optimization problem: • Seed an initial population via M-goal optimization Study 1 (with Z3): • Optimize for min constraint violations using Z3 Study 2 (with IBEA): • Optimize for (a) max features and (b) min violations
  • 34. CORRECT SOLUTIONS AFTER 30 MINUTES FOR THE LARGE LINUX KERNEL MODEL [Figure: two result panels, “From IBEA” and “From Z3”, labelled “130 of 6888 features” and “5704 of 6888 features”] Abdel Salam Sayyad, Joseph Ingram, Tim Menzies, Hany Ammar: Scalable Product Line Configuration: A Straw to Break the Camel’s Back, IEEE ASE 2013
  • 35. HOW TO MAKE GOAL-BASED REASONING FASTER? (GALE = GEOMETRIC ACTIVE LEARNING) CASE STUDY: SAFETY CRITICAL ANALYSIS OF AVIATION PROCEDURES
  • 36. WMC: GIT’S WORK MODELS THAT COMPUTE [KIM’11] Cognitive models of the agents (both pilots and computers) • Late descent, • Unpredicted rerouting, • Different tailwind conditions Goal: validate operations procedures (are they safe?) NASA’s analysts want to explore 7000 scenarios. • With current tools (NSGA-II) • 300 weeks to complete Limited access to hardware • Queue of researchers wanting hardware access • Hardware is pulled away if there are in-flight incidents on manned space missions [Image: Asiana Airlines Flight 214]
  • 37. ACTIVE LEARNING AND EVOLUTIONARY COMPUTING Repeat till happy or exhausted • Selection (cull the herd) • Cross-over (the rude bit) • Mutation (stochastic jiggle) Naïve selection • Score every candidate Active learning • Score only the most informative candidates • e.g. just score the most distant points in data clusters
  • 38. RECURSIVELY CLUSTER DATA, FIND MOST DISTANT POINTS IN LEAF CLUSTERS e.g. 398 cars: maximize acceleration, maximize mpg; 14 evaluations of goals • Find splits using FASTMAP, O(n) [Faloutsos & Lin ’95] • At each level only check for dominance of the two most extreme points • 2log2(N) evals, or less • Leaves = non-dominated examples (i.e. the Pareto frontier)
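The recursive split can be sketched as below. This is a guess at the general shape, not GALE itself: the first pole is found from the first point in the list rather than a random one (to keep the example repeatable), goals are minimized, both halves are kept when neither pole dominates, and all names and the 16-point example are invented.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, ...]

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Binary dominance with all goals minimized."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fastmap_split(points: List[Point]):
    """FASTMAP-style split: find two distant poles, project everything onto the
    line between them (cosine rule), and cut at the median. Linear in len(points)."""
    anchor = points[0]                                  # assumption: fixed, not random
    east = max(points, key=lambda q: math.dist(anchor, q))
    west = max(points, key=lambda q: math.dist(east, q))
    c = math.dist(east, west) or 1.0                    # guard: all points identical
    def along(p: Point) -> float:                       # distance from east along the line
        a, b = math.dist(p, east), math.dist(p, west)
        return (a * a + c * c - b * b) / (2 * c)
    ordered = sorted(points, key=along)
    mid = len(ordered) // 2
    return east, west, ordered[:mid], ordered[mid:]

def descend(points: List[Point], goals: Callable[[Point], Sequence[float]],
            min_size: int, stats: Dict[str, int]) -> List[Point]:
    """Score only the two poles at each level; recurse into the half near the better pole."""
    if len(points) <= min_size:
        return points
    east, west, near_east, near_west = fastmap_split(points)
    g_east, g_west = goals(east), goals(west)
    stats["evals"] += 2
    if dominates(g_east, g_west):
        return descend(near_east, goals, min_size, stats)
    if dominates(g_west, g_east):
        return descend(near_west, goals, min_size, stats)
    return (descend(near_east, goals, min_size, stats)
            + descend(near_west, goals, min_size, stats))

# Hypothetical example: 16 candidates on a line, one goal (minimize the first decision).
candidates = [(float(i), 0.0) for i in range(16)]
stats = {"evals": 0}
leaf = descend(candidates, goals=lambda p: (p[0],), min_size=4, stats=stats)
print(sorted(leaf), stats)
```

Here 16 candidates are narrowed to a leaf of 4 after scoring only 4 of them, where naive selection would score all 16. When neither pole dominates, this sketch explores both halves, so the evaluation count can exceed the 2log2(N) bound quoted on the slide.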
  • 39. FOR FRONTIER AS CONVEX HULL, FOR EACH LINE SEGMENT, PUSH TOWARDS BEST END Given goals u, v, … • utopia = best values • hell = furthest from utopia • All distances normalized 0..1 Given a line east to west • s1 = I(east, hell) • s2 = I(west, hell), s2 > s1 • C = dist(west, east) p = push on line east, west • direction = towards better (west) • magnitude[i]: D = west[i] – east[i]; new = old + old * C * D • Reject if over C*1.5 [Figure: goal space (u, v) showing utopia, hell, scores s1 and s2, and the push p from east towards west]
  • 40. REPEAT FOR ALL POINTS ON LINE SEGMENTS ON NON-DOMINATED REGION OF CONVEX HULL GALE: 1. Population[0] = N random points 2. Find M points on local Pareto frontier (approximated as convex hull) 3. Mutants = mutate M over line segments on hull 4. Population[i+1] = Mutants + (N – #Mutants) random points 5. Goto 2 Related work: [Zuluaga et al. ICML’13]
  • 41. RESULTS ON NASA MODELS: SCORES AS GOOD AS OTHER METHODS, ORDERS OF MAGNITUDE FEWER EVALUATIONS Goals: 1. #forgotten tasks 2. #interrupted acts 3. Interruption time 4. #delayed acts 5. Delay time [Figure: per-goal results, panels numbered 1 to 5] 4 mins (GALE) vs 7 hours (rest) Krall, Joe; Menzies, Tim; Davies, Misty: “Better Model-Based Analysis of Human Factors for Safe Aircraft Approach”, IEEE Transactions on Human-Machine Systems, to appear 2015
  • 42. Runtimes, number of evaluations. Joseph Krall, Tim Menzies, and Misty Davies: GALE: Geometric Active Learning for Search-Based Software Engineering, IEEE TSE, 2015, to appear
  • 43. MINIMIZATIONS OF OBJECTIVE SCORES [Figure legend: gray = significantly different (Mann-Whitney, 95%) and least] Joseph Krall, Tim Menzies, and Misty Davies: GALE: Geometric Active Learning for Search-Based Software Engineering, IEEE TSE, 2015, to appear
  • 44. GALE’S SEARCH: A MORE THOROUGH SEARCH OF A SMALLER VOLUME Less hypervolume, better spread. Joseph Krall, Tim Menzies, and Misty Davies: GALE: Geometric Active Learning for Search-Based Software Engineering, IEEE TSE, 2015, to appear
  • 46. THE CPU CRISIS You do the math. What happens to a resource when • an exponentially increasing number of people • make exponentially increasing demands upon it?
  • 47. TO MANAGE THE CPU CRISIS: NEED A BETTER UNDERSTANDING OF THE “SHAPE” OF THE USER GOALS [Figure: tools: Spea2, Nsga-II, DE, scatter search, PSO, SA, mocell, Z3, IBEA, SMT solvers, GALE, TAR, WHICH, Nsga-III, MOEA/D; annotations: “Domination is a binary concept” and “Aggressive exploration of preference space”]
  • 48. Q: IN THE AGE OF BIG DATA, WHAT ROLE FOR SOFTWARE ENGINEERS? A: GOAL ENGINEERING Search-based software engineering • CPU-intensive analysis • Taming the CPU crisis by understanding user goals Algorithms need goal-oriented requirements engineering • Goals are a primary design construct • To optimize, find the “landscape of the goals” Goal-oriented RE needs algorithms • Better tools for better explorations of user goals
  • 49. GALE: A TOOLKIT FOR UNDERSTANDING THE SHAPE OF GOAL SPACE • An optimization algorithm • A data miner • A visualization tool • A requirements negotiation tool • A compression algorithm: summarize interesting regions of complex space • An anomaly detector: the story thus far • Data exchange tool for agents: share least data with most value • A comment on the paradoxical success of beings as confused as humans: seemingly complex problems, aren’t
  • 50. Analysis = humans + systems • better conclusions = more data + more cpu + human analysts finding better questions + automatic systems that better understand the questions
  • 51. COMBINING ALGORITHMS AND GOAL-ORIENTED RE Edsger Dijkstra, ICSE 4, 1979 • “The notion of ‘user’ cannot be precisely defined, and therefore has no place in CS or SE.” TIM MENZIES, 2015 • Mathematical definition of “user” • “The force that changes the geometry of search space.”