On Parameter Tuning in Search-Based
Software Engineering:
A Replicated Empirical Study
Abdel Salam Sayyad
Katerina Goseva-Popstojanova
Tim Menzies
Hany Ammar
West Virginia University, USA
International Workshop on Replication in Software
Engineering Research (RESER)
Oct 9, 2013
Sound bites
Search-based Software Engineering
Is here… to stay.
A helper… Not an alternative to human SE

Randomness…
is an essential part of Search Algorithms
… hence the need for statistical examination (A lot to learn from Empirical SE)

Parameter Tuning
A real problem…
Default values (rules of thumb) do exist… and (sadly?) they are being followed

Default parameter values fail to optimize performance…
… As seen in the original study, and in this replication…
No Free Lunch Theorems for Optimization [Wolpert and Macready ’97]
the same parameter values don’t optimize all algorithms for all problems.
Roadmap
① Randomness of Search
② The original study
③ The replication
④ Conclusion
Searching for what?
• Correct solutions…
– Conform to system relationships and constraints.

• Optimal solutions…
– Achieve user objectives/preferences…

• Complex problems have big Search spaces
– Exhaustive search not a practical idea.
Genetic Algorithm
• Start with a large population of candidate
solutions… (How large?)
• Evaluate the fitness of your solutions.
• Let your candidate solutions crossover –
exchange genes… (How often?)
• Mutate a small portion of your solutions.
(How small?)
• How do those choices affect performance?
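
The three "how" questions above correspond to three tunable parameters. The following is a minimal sketch, assuming a bit-string encoding, binary tournament selection, and single-point crossover (the slide specifies none of these). The function `run_ga` and its argument names are hypothetical, and the default values are placeholders rather than recommendations.

```python
import random

def run_ga(fitness, n_bits, pop_size=100, crossover_rate=0.8,
           mutation_rate=None, generations=50, seed=0):
    """Minimal generational GA over bit strings; maximizes `fitness`."""
    rng = random.Random(seed)            # fixed seed makes a run repeatable
    if mutation_rate is None:
        mutation_rate = 1.0 / n_bits     # on average one flipped bit per child
    # How large? -> pop_size
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def tournament():
        # Binary tournament: pick two at random, keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            c1, c2 = p1[:], p2[:]
            # How often? -> crossover_rate
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_bits)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            # How small? -> mutation_rate (per-bit flip probability)
            for child in (c1, c2):
                for i in range(n_bits):
                    if rng.random() < mutation_rate:
                        child[i] ^= 1
                children.append(child)
        pop = children[:pop_size]
    return max(pop, key=fitness)
```

Calling `run_ga(sum, n_bits=20)` searches for the all-ones string; changing `pop_size`, `crossover_rate`, or `mutation_rate` changes how quickly (and whether) it gets there, which is the effect the study measures.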
Multi-objective Optimization
• The Pareto Front
• Higher-level Decision Making
• The Chosen Solution
Survival of the fittest
(according to NSGA-II [Deb et al. 2002])
Boolean dominance (x Dominates y, or does not):
- In no objective is x worse than y
- In at least one objective, x is better than y
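
The two conditions above translate directly into code. A minimal sketch, assuming every objective is minimized and solutions are represented as tuples of objective values; the function names are illustrative and not taken from any NSGA-II implementation.

```python
def dominates(x, y):
    """True if x Pareto-dominates y (all objectives are minimized)."""
    no_worse = all(a <= b for a, b in zip(x, y))          # in no objective is x worse
    strictly_better = any(a < b for a, b in zip(x, y))    # in at least one, x is better
    return no_worse and strictly_better

def non_dominated(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

Because the test is Boolean, two solutions that trade off against each other are simply "incomparable", however large or small the trade-off is.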

Crowd pruning
Indicator-Based Evolutionary Algorithm (IBEA) [Zitzler and Künzli ’04]
1) For {old generation + new generation} do
– Add up every individual’s amount of dominance with
respect to everyone else

– Sort all instances by F
– Delete worst, recalculate, delete worst, recalculate, …

2) Then, standard GA (cross-over, mutation) on the
survivors → Create a new generation → Back to 1.
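
Step 1 can be sketched as follows. This transcript does not carry the formula for F, so the sketch assumes the formulation commonly given for IBEA: the additive epsilon indicator as the "amount of dominance", with fitness F(x) equal to the sum over all other y of -exp(-I(y, x) / κ). Objectives are assumed to be minimized, and objective normalization is omitted for brevity. `ibea_select` and `eps_indicator` are hypothetical names.

```python
import math

def eps_indicator(a, b):
    """Additive epsilon indicator (minimization): the smallest shift
    that would make a weakly dominate b. Negative if a already dominates b."""
    return max(ai - bi for ai, bi in zip(a, b))

def ibea_select(points, n_survivors, kappa=0.05):
    """Delete the worst individual, update fitness, repeat."""
    alive = list(range(len(points)))
    ind = {(i, j): eps_indicator(points[i], points[j])
           for i in alive for j in alive if i != j}
    # Scale indicator values so the exponent stays in a safe range.
    scale = max((abs(v) for v in ind.values()), default=0.0) or 1.0

    def loss(i, j):
        # How much individual i "hurts" individual j.
        return math.exp(-ind[(i, j)] / (scale * kappa))

    # F(j): summed over everyone else; heavily dominated points get very low F.
    fit = {j: -sum(loss(i, j) for i in alive if i != j) for j in alive}
    while len(alive) > n_survivors:
        worst = min(alive, key=fit.get)
        alive.remove(worst)
        for j in alive:                 # "recalculate": remove the deleted one's influence
            fit[j] += loss(worst, j)
    return [points[i] for i in alive]
```

Unlike the Boolean test used by NSGA-II, the indicator is continuous: it records how far one solution is from dominating another, not just whether it does.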
NSGA-II… the default algorithm
• Much prior work in SBSE (*) used NSGA-II
• Didn’t state why!

(*) Sayyad and Ammar, RAISE’13

The Original Study
• A. Arcuri and G. Fraser, "On Parameter Tuning in Search
Based Software Engineering," in Proc. SSBSE, 2011, pp.
33-47.
• A. Arcuri and G. Fraser, "Parameter Tuning or Default
Values? An Empirical Investigation in Search-Based
Software Engineering," Empirical Software Engineering,
Feb 2013.

• Problem: generating test vectors for object-oriented software.
• Fitness function: percentage of test coverage.
Results of original study
• Different parameter settings cause very large
variance in the performance.
• Default parameter settings perform relatively well,
but are far from optimal on individual problem
instances.

Feature–oriented domain analysis [Kang 1990]
• Feature models = a
lightweight method for
defining a space of options
• De facto standard for
modeling variability, e.g.
Software Product Lines
[Figure: example feature model, annotated with its cross-tree constraints]
What are the user preferences?
• Suppose each feature had the following metrics:
1. Boolean USED_BEFORE?
2. Integer DEFECTS
3. Real COST
• Show me the space of “best options” according to the objectives:
1. Satisfy the most domain constraints (0 ≤ #violations ≤ 100%)
2. Offer the most features
3. Maximize the number of features that were used before (promote reuse)
4. Minimize overall known defects
5. Minimize cost
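
The five objectives can be computed from a candidate configuration roughly as follows. This is a sketch under assumptions: a configuration is a list of 0/1 flags (one per feature), each constraint is a hypothetical callable that returns True when satisfied, and the "maximize" objectives are rewritten as "minimize the complement" so that all five point the same way. The `Feature` class and `objectives` function are illustrative names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Feature:
    name: str
    used_before: bool   # Boolean USED_BEFORE?
    defects: int        # Integer DEFECTS
    cost: float         # Real COST

def objectives(selected, features, constraints):
    """Return the five objective values for one configuration (all minimized)."""
    chosen = [f for f, on in zip(features, selected) if on]
    violations = sum(1 for rule in constraints if not rule(selected))
    return (
        violations,                                   # 1. constraint violations
        len(features) - len(chosen),                  # 2. features NOT offered
        sum(1 for f in chosen if not f.used_before),  # 3. chosen features NOT used before
        sum(f.defects for f in chosen),               # 4. known defects
        sum(f.cost for f in chosen),                  # 5. cost
    )
```

A constraint such as "feature 1 requires feature 0" could be written as `lambda s: not s[1] or s[0]`.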

Previous Work [Sayyad et al. ICSE’13]
• IBEA (continuous dominance criterion) beats NSGA-II
and a host of other algorithms based on Boolean
dominance criterion.
• Especially with a high number of objectives.
• Quality indicators:
– Percentage of conforming (useable) solutions
• We’re interested in 100% conforming solutions.

– Hypervolume (how close to optimal?)
– Spread (how diverse?)
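
Two of these indicators are simple enough to sketch. The code below assumes minimized objectives and, for hypervolume, restricts itself to two objectives with a user-chosen reference point; the study itself uses more objectives, where hypervolume needs a more general algorithm. Spread is omitted. Function names are illustrative.

```python
def percent_conforming(violation_counts):
    """Percentage of solutions with zero constraint violations."""
    if not violation_counts:
        return 0.0
    ok = sum(1 for v in violation_counts if v == 0)
    return 100.0 * ok / len(violation_counts)

def hypervolume_2d(front, ref):
    """Area dominated by a 2-objective front, bounded by reference point `ref`.
    Larger is better: the front sits closer to the ideal corner."""
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    area, best_y = 0.0, ref[1]
    for x, y in pts:                 # sweep left to right
        if y < best_y:               # dominated points add nothing
            area += (ref[0] - x) * (best_y - y)
            best_y = y
    return area
```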

Setup

What are “default settings”?
• Population size = 100
• Crossover rate = 80%
– 60% < Crossover rate < 90%
• [A. E. Eiben and J. E. Smith, Introduction to Evolutionary
Computing. Springer, 2003.]

• Mutation rate = 1/Features
• [one bit out of the whole string]
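
Written as configuration, the defaults above look like this. The `default_settings` values come from the slide; `tuning_grid` is a hypothetical helper showing how alternative settings might be enumerated around them, and the candidate values passed to it are up to the experimenter.

```python
from itertools import product

def default_settings(n_features):
    """The "rule of thumb" settings for a feature model with n_features bits."""
    if n_features < 1:
        raise ValueError("n_features must be positive")
    return {
        "population_size": 100,
        "crossover_rate": 0.8,               # within the 60%..90% range
        "mutation_rate": 1.0 / n_features,   # one bit out of the whole string
    }

def tuning_grid(n_features, population_sizes, crossover_rates, mutation_multipliers):
    """Every combination of the candidate values (a full-factorial grid)."""
    base = 1.0 / n_features
    return [
        {"population_size": p, "crossover_rate": c,
         "mutation_rate": min(1.0, m * base)}
        for p, c, m in product(population_sizes, crossover_rates, mutation_multipliers)
    ]
```

Note that the mutation default depends on the problem size, so even the "default" is not a single fixed number.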
Research Questions

Results [10 sec / algorithm / FM]

Answer to RQ1
• RQ1: How Large is the Potential Impact of a
Wrong Choice of Parameter Settings?
• We confirm Arcuri and Fraser’s conclusion:
“Different parameter settings cause very large
variance in the performance.”

Answer to RQ2
• RQ2: How Does a “Default” Setting Compare to the
Best and Worst Achievable Performance?
• Arcuri and Fraser concluded that: “Default parameter
settings perform relatively well, but are far from
optimal on individual problem instances.”
• We make a stronger conclusion: “Default parameter
settings perform generally poorly, but might perform
relatively well on individual problem instances.”
Answer to RQ3
• RQ3: How does the performance of IBEA’s
best tuning compare to NSGA-II’s best
tuning?

• Our results show that “IBEA’s best tuning
performs generally much better than NSGA-II’s
best tuning.”

RQ4: Parameter Training
• Find best tuning for a group of problem instances, apply it
to a new problem instance, would it be best tuning for the
new problem?
• Arcuri and Fraser concluded that: “Tuning should be done
on a very large sample of problem instances. Otherwise, the
obtained parameter settings are likely to be worse than
arbitrary default values.”
• Our conclusion: “Tuning on a sample of problem instances
does not, in general, result in the best parameter values for
a new problem instance, but the obtained settings are
generally better than the default settings.”
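
The question behind RQ4 can be framed as a leave-one-out procedure. This is a sketch of that framing only, not the protocol used in either study: it assumes a precomputed table `scores[setting][instance]` where higher is better, and the function names are hypothetical.

```python
def best_on_average(scores, instances):
    """Setting with the highest mean score over the given instances."""
    return max(scores,
               key=lambda s: sum(scores[s][i] for i in instances) / len(instances))

def leave_one_out(scores, default):
    """For each instance: tune on all the others, then test on the held-out one."""
    instances = sorted(next(iter(scores.values())))
    report = {}
    for held_out in instances:
        train = [i for i in instances if i != held_out]
        tuned = best_on_average(scores, train)
        report[held_out] = {
            "tuned": scores[tuned][held_out],        # setting learned elsewhere
            "default": scores[default][held_out],    # rule-of-thumb setting
            "best": max(s[held_out] for s in scores.values()),  # best achievable
        }
    return report
```

Comparing the three numbers per instance shows whether a transferred tuning lands between the default and the best achievable, which is the pattern the conclusion above describes.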
Conclusion
• Default parameter values fail to optimize performance…
• And, sadly, many SBSE researchers choose “default” algorithms (e.g. NSGA-II) along with “default” parameters.
• Alternatives?
– Parameter control
– Adaptive parameter control
– A long way to go!

Acknowledgment
This research work was funded by the Qatar National Research Fund under the National Priorities Research Program.

From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 

Recently uploaded (20)

Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 

On Parameter Tuning in Search-Based Software Engineering: A Replicated Empirical Study

  • 1. On Parameter Tuning in Search-Based Software Engineering: A Replicated Empirical Study Abdel Salam Sayyad Katerina Goseva-Popstojanova Tim Menzies Hany Ammar West Virginia University, USA International Workshop on Replication in Software Engineering Research (RESER) Oct 9, 2013
  • 2. Sound bites Search-based Software Engineering Is here… to stay. A helper… Not an alternative to human SE Randomness… is an essential part of Search Algorithms … hence the need for statistical examination (A lot to learn from Empirical SE) Parameter Tuning A real problem… Default values (rules of thumb) do exist… and (sadly?) they are being followed Default parameter values fail to optimize performance… … As seen in the original study, and in this replication… No Free Lunch Theorems for Optimization [Wolpert and Macready ’97]: the same parameter values don’t optimize all algorithms for all problems.
  • 3. Roadmap ① Randomness of Search ② The original study ③ The replication ④ Conclusion
  • 5. Searching for what? • Correct solutions… – Conform to system relationships and constraints. • Optimal solutions… – Achieve user objectives/preferences… • Complex problems have big search spaces – Exhaustive search is not practical.
  • 6. Genetic Algorithm • Start with a large population of candidate solutions… (How large?) • Evaluate the fitness of your solutions. • Let your candidate solutions crossover – exchange genes… (How often?) • Mutate a small portion of your solutions. (How small?) • How do those choices affect performance?
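
The three questions on this slide (how large, how often, how small) correspond to three tunable parameters. A minimal sketch of a genetic algorithm that exposes them is given here; the toy OneMax fitness (count of 1-bits), the tournament selection, and the function name `run_ga` are illustrative assumptions and are not taken from either study.

```python
import random


def run_ga(n_bits=30, pop_size=100, crossover_rate=0.8, mutation_rate=None,
           generations=50, seed=0):
    """Toy GA on OneMax (maximize the number of 1-bits); returns best fitness."""
    rng = random.Random(seed)
    if mutation_rate is None:
        mutation_rate = 1.0 / n_bits  # rule of thumb: one bit per string
    fitness = sum  # assumption: OneMax stands in for a real SBSE fitness

    # "How large?" -> pop_size
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def pick():
        # Binary tournament: the fitter of two random individuals wins.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = pick()[:], pick()[:]
            # "How often?" -> crossover_rate (single-point crossover)
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_bits)
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            # "How small?" -> mutation_rate (independent bit flips)
            for child in (p1, p2):
                for i in range(n_bits):
                    if rng.random() < mutation_rate:
                        child[i] ^= 1
                children.append(child)
        pop = children[:pop_size]
    return max(fitness(ind) for ind in pop)
```

Because the search is randomized, a single call says little; comparing settings fairly would mean repeating `run_ga` over many seeds per setting and examining the distribution of results.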
  • 7. Multi-objective Optimization: The Pareto Front → Higher-level Decision Making → The Chosen Solution
  • 8. Survival of the fittest (according to NSGA-II [Deb et al. 2002]) Boolean dominance (x dominates y, or does not): - In no objective is x worse than y - In at least one objective, x is better than y Crowd pruning
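
The two dominance conditions on this slide translate directly into code. The sketch below assumes every objective is to be minimized and represents a solution as a tuple of objective values; `dominates` and `pareto_front` are illustrative names, and NSGA-II's crowd pruning step is not shown.

```python
def dominates(x, y):
    """True if x dominates y, assuming every objective is minimized."""
    no_worse = all(a <= b for a, b in zip(x, y))        # in no objective is x worse
    strictly_better = any(a < b for a, b in zip(x, y))  # in at least one, x is better
    return no_worse and strictly_better


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

Note that the answer is strictly yes or no: a solution that is slightly worse in one objective and far better in four others still does not dominate.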
  • 9. Indicator-Based Evolutionary Algorithm (IBEA) [Zitzler and Künzli ’04] 1) For {old generation + new generation} do – Add up every individual’s amount of dominance with respect to everyone else (its fitness F) – Sort all instances by F – Delete worst, recalculate, delete worst, recalculate, … 2) Then, standard GA (cross-over, mutation) on the survivors → Create a new generation → Back to 1.
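
Step 1 of this slide can be sketched as follows. The sketch assumes the additive epsilon indicator as the "amount of dominance", minimized objectives already normalized to [0, 1], and a scaling factor `kappa` of 0.05; the function names are illustrative, and the indicator scaling used in the published algorithm is omitted.

```python
import math


def eps_indicator(a, b):
    """Additive epsilon indicator for two points (minimization):
    the smallest shift that makes a weakly dominate b."""
    return max(ai - bi for ai, bi in zip(a, b))


def ibea_select(points, keep, kappa=0.05):
    """Delete the worst individual and recalculate until `keep` remain."""
    pool = list(points)
    # F(x) = sum over every other y of -exp(-I(y, x) / kappa)
    fit = [sum(-math.exp(-eps_indicator(y, x) / kappa)
               for j, y in enumerate(pool) if j != i)
           for i, x in enumerate(pool)]
    while len(pool) > keep:
        worst = min(range(len(pool)), key=fit.__getitem__)
        removed = pool.pop(worst)
        fit.pop(worst)
        # "Recalculate": take the removed individual's contribution back out.
        fit = [f + math.exp(-eps_indicator(removed, x) / kappa)
               for f, x in zip(fit, pool)]
    return pool
```

Unlike Boolean dominance, F is a continuous quantity, so it can still rank individuals when almost nothing dominates anything else, which is common with many objectives.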
  • 10. NSGA-II… the default algorithm • Much prior work in SBSE (*) used NSGA-II and didn’t state why! (*) Sayyad and Ammar, RAISE’13
  • 11. Roadmap ① Randomness of Search ② The original study ③ The replication ④ Conclusion
  • 12. The Original Study • A. Arcuri and G. Fraser, "On Parameter Tuning in Search Based Software Engineering," in Proc. SSBSE, 2011, pp. 33-47. • A. Arcuri and G. Fraser, "Parameter Tuning or Default Values? An Empirical Investigation in Search-Based Software Engineering," Empirical Software Engineering, Feb 2013. • Problem: generating test vectors for object-oriented software. • Fitness function: percentage of test coverage.
  • 13. Results of original study • Different parameter settings cause very large variance in the performance. • Default parameter settings perform relatively well, but are far from optimal on individual problem instances.
  • 14. Roadmap ① Randomness of Search ② The original study ③ The replication ④ Conclusion
  • 15. Feature-oriented domain analysis [Kang 1990] • Feature models = a lightweight method for defining a space of options • De facto standard for modeling variability, e.g. Software Product Lines [Figure: feature model with cross-tree constraints]
  • 16. What are the user preferences? • Suppose each feature had the following metrics: 1. Boolean USED_BEFORE? 2. Integer DEFECTS 3. Real COST • Show me the space of “best options” according to the objectives: 1. Satisfy the most domain constraints (0 ≤ #violations ≤ 100%) 2. Offer the most features 3. Maximize the overall number of features that were used before (promote re-use) 4. Minimize overall known defects 5. Minimize cost
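
One possible encoding of these five objectives is sketched below. Turning every objective into a "lower is better" value, modelling constraints as Python predicates over the set of selected feature names, and the names `Feature` and `evaluate` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    used_before: bool  # USED_BEFORE?
    defects: int       # DEFECTS
    cost: float        # COST


def evaluate(selected, features, constraints):
    """Return the five objectives for one configuration, all minimized."""
    chosen = [f for f in features if f.name in selected]
    violations = sum(1 for rule in constraints if not rule(selected))
    return (
        violations,                                   # 1. constraint violations
        len(features) - len(chosen),                  # 2. features not offered
        sum(1 for f in chosen if not f.used_before),  # 3. features never used before
        sum(f.defects for f in chosen),               # 4. known defects
        sum(f.cost for f in chosen),                  # 5. cost
    )
```

The resulting tuple is what a multi-objective algorithm would compare across configurations, whether by Boolean dominance or by a continuous indicator.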
  • 17. Previous Work [Sayyad et al. ICSE’13] • IBEA (continuous dominance criterion) beats NSGA-II and a host of other algorithms based on the Boolean dominance criterion. • Especially with a high number of objectives. • Quality indicators: – Percentage of conforming (usable) solutions • We’re interested in 100% conforming solutions. – Hypervolume (how close to optimal?) – Spread (how diverse?)
  • 19. What are “default settings”? • Population size = 100 • Crossover rate = 80% – 60% < Crossover rate < 90% [A. E. Eiben and J. E. Smith, Introduction to Evolutionary Computing. Springer, 2003] • Mutation rate = 1/Features [one bit out of the whole string]
  • 21. Results [10 sec / algorithm / FM]
  • 22. Answer to RQ1 • RQ1: How Large is the Potential Impact of a Wrong Choice of Parameter Settings? • We confirm Arcuri and Fraser’s conclusion: “Different parameter settings cause very large variance in the performance.”
  • 23. Answer to RQ2 • RQ2: How Does a “Default” Setting Compare to the Best and Worst Achievable Performance? • Arcuri and Fraser concluded that: “Default parameter settings perform relatively well, but are far from optimal on individual problem instances.” • We make a stronger conclusion: “Default parameter settings perform generally poorly, but might perform relatively well on individual problem instances.”
  • 24. Answer to RQ3 • RQ3: How does the performance of IBEA’s best tuning compare to NSGA-II’s best tuning? • Our results show that “IBEA’s best tuning performs generally much better than NSGA-II’s best tuning.”
  • 25. RQ4: Parameter Training • If we find the best tuning for a group of problem instances and apply it to a new problem instance, would it be the best tuning for the new problem? • Arcuri and Fraser concluded that: “Tuning should be done on a very large sample of problem instances. Otherwise, the obtained parameter settings are likely to be worse than arbitrary default values.” • Our conclusion: “Tuning on a sample of problem instances does not, in general, result in the best parameter values for a new problem instance, but the obtained settings are generally better than the default settings.”
  • 26. Roadmap ① Randomness of Search ② The original study ③ The replication ④ Conclusion
  • 27. Conclusion • Default parameter values fail to optimize performance… • And, sadly, many SBSE researchers choose “default” algorithms (e.g. NSGA-II) along with “default” parameters. • Alternatives? – Parameter control – Adaptive parameter control – A long way to go! Acknowledgment: This research work was funded by the Qatar National Research Fund under the National Priorities Research Program.