Slides used in the seminar-workshop held as part of the doctoral courses of the Universidad de Almería on 26 and 27 October 2020.
This document provides an overview of data modeling and SQL. It introduces key concepts in relational databases including relations, schemas, tuples, domains, keys, and referential integrity. It also describes the relational data model including the structure of relations, attributes, and relation instances. Finally, it covers the relational algebra including operations like select, project, join, union, difference, and rename that form the basis for SQL queries. The document uses examples from a banking domain to illustrate these concepts.
Tutorial at the European Nanoelectronics Applications, Design & Technology Co... by Eugenio Villar
The document discusses electronic system modeling and design beyond Moore's law. It proposes the Contrex modeling methodology which uses UML profiles like MARTE for model-driven design of embedded systems. The methodology supports mixed-criticality modeling, design space exploration, performance analysis, and software synthesis from a single model. It also discusses how system design needs to evolve beyond Moore's law as chip scaling slows, focusing on programming platforms for the Internet of Everything.
Introduction to database-Formal Query language and Relational calculus by Ajit Nayak
The document provides an introduction to relational databases and formal relational query languages. It discusses relational algebra and relational calculus as the two formal query languages that form the mathematical foundation for commercial relational query languages. Relational algebra is a procedural query language that supports operations like select, project, union, set difference, cartesian product and rename. Example queries are provided for each operation to demonstrate their usage. Relational calculus is described as a non-procedural query language with examples of queries written using its syntax.
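The operations listed in the summary above can be sketched in a few lines of Python. This is only an illustration, not the notation used in the slides: relations are modeled as lists of dicts, the function names are hypothetical, and the banking-style `account` data is made up for the example.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the given attributes and drop duplicate tuples."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def union(r, s):
    """Union of two relations over the same attributes (r is assumed duplicate-free)."""
    result = list(r)
    for row in s:
        if row not in result:
            result.append(row)
    return result


def difference(r, s):
    """Set difference: tuples of r that do not appear in s."""
    return [row for row in r if row not in s]


def rename(relation, mapping):
    """Rename attributes according to mapping (old name -> new name)."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in relation]


def cartesian_product(r, s):
    """Cartesian product; attribute names of r and s are assumed to be disjoint."""
    return [{**a, **b} for a in r for b in s]


# Made-up example data.
account = [
    {"branch": "Downtown", "number": "A-101", "balance": 500},
    {"branch": "Perryridge", "number": "A-102", "balance": 400},
    {"branch": "Downtown", "number": "A-201", "balance": 900},
]

# Branches holding at least one account with a balance above 450.
rich_branches = project(select(account, lambda row: row["balance"] > 450), ["branch"])
```

Composing `project` over `select` mirrors how a relational algebra expression is built from nested operations.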
This document describes an approach for supporting software change tasks using automated query reformulations. It begins with an example of a software change request between Alex and Bob. It then discusses using techniques like TextRank and POSRank, adapted from PageRank, to identify important terms from change requests for querying a codebase. The approach was evaluated on a dataset of over 1,900 change tasks from eight open source projects. Experimental results found the proposed approach improved 57.84% of queries and outperformed existing state-of-the-art methods based on measures like query effectiveness and retrieval performance.
This document discusses various techniques for optimizing R code performance, including profiling code to identify bottlenecks, vectorizing operations, avoiding copies, and byte code compilation. It provides examples demonstrating how to measure performance, compare alternative implementations, and apply techniques like doing less work, vectorization, and avoiding method dispatch overhead. The key message is that optimizing performance is an iterative process of measuring, testing alternatives, and applying strategies like these to eliminate bottlenecks.
Introduction to R - from Rstudio to ggplot by Olga Scrivner
This document provides an outline for an introduction to R programming. It discusses the materials needed like R Studio and example datasets. It covers R basics like assigning variables, vectors, indexing, logical operators and data types. It also discusses importing data from files like CSV, installing and loading packages, and basic data visualization. The document is intended to introduce learners to key concepts in R through examples and exercises in R Studio.
This document provides an overview and introduction to using the statistical software R. It outlines R's interface, workspace, help system, packages, input/output functions, and how to reuse results. It also discusses downloading and installing R, basic functions and syntax, data manipulation techniques like sorting and merging, creating graphs, and performing statistical analyses such as t-tests, regression, ANOVA, and multiple comparisons. The document recommends several tutorials that provide more in-depth information on using R for statistical modeling, data analysis, and graphics.
The document discusses various topics related to reactive and functional programming including NGRX, RxJS, Redux, Reactive Streams specification, and computing derived data using Reselect. It provides code examples for setting up an NGRX application with state management, effects, selectors, and composing the root reducer. It also discusses hot and cold streams, converting cold streams to hot, and the anatomy of RxJS operators.
A Case Elaboration Methodology for a Semantic Web Service Discovery System Ba... by IJERA Editor
Case-Based Reasoning (CBR) is a paradigm of intelligent reasoning that consists of reusing the results of previously solved problems (Source Cases) to solve new problems (Target Cases). It has been formalized as a five-step process consisting of "Elaboration", "Retrieve", "Reuse", "Revise" and "Retain". In this paper we focus on the first phase of the CBR cycle, with all of the modeling required to formalize a Case in our CBR-based system for semantic Web service discovery (CBR4WSD). This phase consists of formalizing and structuring the problem description before launching the “Retrieve” phase and selecting the most appropriate Source Cases from the Case Base. We identify a set of basic descriptors to formalize the Cases handled in our CBR4WSD system. To this end, and in accordance with CBR policies, we put forward our Case representation model.
This document presents a goal-driven framework for software project data analytics. It uses qualitative goal models represented as AND/OR trees to capture different stakeholder views and contexts. Past project data is used to train the models using the Alchemy statistical learning and inference engine. This allows reasoning under uncertainty to determine satisfaction probabilities for goals like high effort, low cost, and high product quality on new projects. An evaluation on a dataset of 5000 projects found correctness between 60-74% and the approach was stable and able to handle different policy views. Future work includes developing goal models for specific methodologies and increasing model expressiveness.
Machine learning techniques can be applied in formal verification in several ways:
1) To enhance current formal verification tools by automating tasks like debugging, specification mining, and theorem proving.
2) To enable the development of new formal verification tools by applying machine learning to problems like SAT solving, model checking, and property checking.
3) Specific applications include using machine learning for debugging and root cause identification, learning specifications from runtime traces, aiding theorem proving by selecting heuristics, and tuning SAT solver parameters and selection.
This document discusses query languages in database management systems. It covers the main categories of query languages: procedural languages like relational algebra, and non-procedural languages like tuple and domain relational calculus. Relational algebra operators like selection, projection, union, and join are defined. Example queries are provided in both relational algebra and relational calculus formats. Functional dependencies, candidate keys, and the closure of attribute sets under a set of functional dependencies are also explained.
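The closure of an attribute set mentioned above is usually computed with a simple fixed-point loop. The following is a minimal sketch of that textbook procedure, not code taken from the slides; the schema and the functional dependencies are invented for illustration.

```python
def closure(attributes, fds):
    """Closure of `attributes` under `fds`, a list of (lhs, rhs) pairs of attribute sets."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left-hand side is already implied, the right-hand side is too.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attributes, schema, fds):
    """An attribute set is a superkey when its closure covers the whole schema."""
    return closure(attributes, fds) == set(schema)


# Hypothetical schema R(A, B, C, D) with A -> B, B -> C and CD -> A.
schema = {"A", "B", "C", "D"}
fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"C", "D"}, {"A"})]
```

With these dependencies, `{"A", "D"}` is a superkey while neither `{"A"}` nor `{"D"}` is, so `{"A", "D"}` is also a candidate key.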
This was a brief 1-hour introduction to R programming, presented at the 1st Inter-experimental Machine Learning (IML) Working Group Workshop at CERN, 20-22 March 2017.
Surrogate modeling for industrial design by Shinwoo Jang
We describe GTApprox, a new tool for medium-scale surrogate modeling in industrial design. Compared to existing software, GTApprox brings several innovations: a few novel approximation algorithms, several advanced methods of automated model selection, and novel options in the form of hints. We demonstrate the efficiency of GTApprox on a large collection of test problems. In addition, we describe several applications of GTApprox to real engineering problems.
This document provides an overview of object-oriented programming concepts using Python. It discusses how classes define objects that encapsulate state and methods. An example Counter class is provided that initializes count to 0 and provides methods like reset(), current(), and advance() to manipulate the count. Commercial databases are discussed as being more complex than the databases implemented in PS5 due to features like atomic transactions, security, large storage needs, and ability to scale. Indexing is identified as key to databases performing table lookups faster than a linear search.
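A class along the lines of the Counter described above might look as follows. The method names come from the summary; the attribute name `count` and the assumption that `advance()` increments by one are guesses, since the original slides are not reproduced here.

```python
class Counter:
    """Encapsulates a count together with the methods that manipulate it."""

    def __init__(self):
        self.count = 0      # state starts at 0, as the summary describes

    def reset(self):
        """Set the count back to 0."""
        self.count = 0

    def current(self):
        """Return the current count."""
        return self.count

    def advance(self):
        """Increase the count (assumed here to be by exactly one)."""
        self.count += 1


counter = Counter()
counter.advance()
counter.advance()
value_after_two_steps = counter.current()
counter.reset()
```

Each `Counter` object carries its own state, so two counters advance independently of each other.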
Keynote given at the Asia Pacific Software Engineering Conference (APSEC), December 2020, on Automated Program Repair technologies and their applications.
Tools using AI will affect and, in many cases, redefine most areas of societal impact such as medical practice and intervention, autonomous transportation and law enforcement. While so far, most of the focus and time is invested into optimizing models’ performance, whenever a single wrong prediction has big implications in terms of value or life, accuracy becomes less important than explainability.
In this talk, we will learn about explainable AI and see how to apply some of the available tools to answer the question ‘What did my system consider in order to output a specific prediction?’.
Best corporate-r-programming-training-in-mumbai by Unmesh Baile
Vibrant Technologies is headquartered in Mumbai, India. We are the best Teradata training provider in Navi Mumbai and provide live projects to students. We also provide corporate training. According to our students and corporate clients, we offer the best Teradata database classes in Mumbai.
This document discusses recommending job ads to people based on their profile and interests. It describes a job recommendation framework that uses features like a user's career path, social connections, interests and interactions to estimate the relevance of job postings. A regression model is trained on past user interactions to combine these feature scores. Additional filters may then be applied to further refine recommendations. Career path graphs are mined from user profiles to infer appropriate job roles and industries based on their experience and education. The system aims to identify job postings that closely match a user's demands and skills.
Aggregate Computing Platforms: Bridging the Gaps by Roberto Casadei
This presentation, held in the context of the CS & Eng M.D. course "Pervasive Computing" (Unibo, Cesena), drafts some analysis for an Aggregate Computing platform and suggests areas of investigation.
The document discusses query optimization in database management systems. It covers converting SQL queries to logical and physical query plans, improving logical plans through algebraic transformations, and choosing the optimal physical query plan by considering the order of operations and join trees. The goal is to select the most efficient physical plan by estimating the size of relations and intermediate results.
Automatic Task-based Code Generation for High Performance DSEL by Joel Falcou
Providing high-level tools for parallel programming while sustaining a high level of performance has been a challenge that techniques like Domain Specific Embedded Languages try to solve. In previous works, we investigated the design of such a DSEL, NT2, which provides a Matlab-like syntax for parallel numerical computations inside a C++ library.
The main issue addressed here is how the limitations of classical DSEL generation and multithreaded code generation can be overcome.
A Validation of Object-Oriented Design Metrics as Quality Indicators by vie_dels
The document summarizes a research paper that empirically validated several object-oriented design metrics proposed by Chidamber and Kemerer as indicators of fault-prone classes. The study analyzed 6 metrics on 180 classes from a system. Univariate analysis found 5 metrics to be significantly correlated with fault probability. Multivariate analysis using these 5 metrics achieved better prediction of faulty classes than models using traditional code metrics. The research validated that these OO design metrics can help identify fault-prone classes early in the development lifecycle.
The document outlines various statistical and data analysis techniques that can be performed in R including importing data, data visualization, correlation and regression, and provides code examples for functions to conduct t-tests, ANOVA, PCA, clustering, time series analysis, and producing publication-quality output. It also reviews basic R syntax and functions for computing summary statistics, transforming data, and performing vector and matrix operations.
This chapter discusses recursion, including recursive definitions, algorithms, and methods. It explores the concepts of base cases and general cases in recursion. The chapter covers tracing and designing recursive methods, as well as direct and indirect recursion. Examples of recursive functions like factorial, Fibonacci, and drawing Sierpinski gaskets are provided. Recursion is compared to iteration, and both approaches are discussed as solutions to problems.
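The base case and general case mentioned above can be seen in two of the examples the summary names, factorial and Fibonacci. This is a small sketch in Python rather than the chapter's own code; an iterative Fibonacci is included for the comparison with iteration.

```python
def factorial(n):
    """n! defined recursively; n is assumed to be a non-negative integer."""
    if n == 0:                      # base case: stops the recursion
        return 1
    return n * factorial(n - 1)     # general case: a smaller instance of the same problem


def fibonacci(n):
    """n-th Fibonacci number with fibonacci(0) == 0 and fibonacci(1) == 1."""
    if n < 2:                       # two base cases: n == 0 and n == 1
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


def fibonacci_iterative(n):
    """Same result as fibonacci(n), computed with a loop instead of recursion."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

The recursive `fibonacci` recomputes the same subproblems many times, which is one reason the iterative version is often preferred for larger `n`.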
SWE 316 discusses software design and architecture using object orientation principles. It covers key concepts like classes and objects, where classes represent concepts and objects are instances of classes. The document discusses how classes can relate to each other through inheritance, aggregation, and as clients. It also covers polymorphism, which allows the same method to perform different actions depending on the object's type. The goal is to describe software design using object-oriented techniques.
This is the presentation of the paper "Quasi-Optimal Recombination Operator" presented in EvoCOP 2019 (Best paper session). The paper is available in LNCS with doi: https://doi.org/10.1007/978-3-030-16711-0_9
Similar to Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Uso de CMSA para resolver el problema de selección de requisitos by jfrchicanog
The document describes the use of the Construct, Merge, Solve and Adapt (CMSA) algorithm to solve the requirements selection problem (Next Release Problem, NRP). Two versions of CMSA for NRP are proposed, in which the components are either the requirements or the customers. Random NRP instances are generated, and the results of CMSA are compared with those of an exact solver (CPLEX) in terms of the mean objective value obtained. The results show that CMSA is able to find solutions of similar quality to those of the exact solver, but in less time.
Enhancing Partition Crossover with Articulation Points Analysis by jfrchicanog
This is the presentation of the paper entitled "Enhancing Partition Crossover with Articulation Points Analysis", given in the ECOM track at GECCO 2018 (Kyoto). The paper received a Best Paper Award.
The document discusses different formulations of the search-based software project scheduling problem:
1) A basic formulation aims to minimize project cost and duration by assigning employees to tasks while satisfying constraints like all tasks being performed and employee skills matching task requirements.
2) A multi-objective formulation treats minimizing project cost and minimizing project duration as two separate objectives rather than as a single objective.
3) Additional formulations include robust formulations to handle uncertainty and preference-based formulations to include decision-maker preferences.
The document outlines the objectives, constraints, and solution representations used for the different problem formulations.
Efficient Hill Climber for Constrained Pseudo-Boolean Optimization Problems by jfrchicanog
This document describes research on developing an efficient hill climber algorithm for constrained pseudo-Boolean optimization problems. It discusses how scores can be computed to identify improving moves in the search space and updated efficiently as the solution changes. The key ideas are to compute scores initially and then only update a small, constant number of scores after each move instead of recomputing all possible scores from scratch. This approach is extended to handle multi-objective optimization problems with constraints by considering both strong and weak improving moves.
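The idea of updating only a few scores after each move can be illustrated with a simplified sketch. It covers only the single-objective, unconstrained case and is not the algorithm from the slides: it assumes the objective is a sum of subfunctions that each depend on a few bits, and all names and the example instance are hypothetical.

```python
def evaluate(subfunctions, x):
    """Full evaluation of f(x) as the sum of all subfunctions."""
    return sum(g(*(x[v] for v in variables)) for variables, g in subfunctions)


def flip_score(subfunctions, touching, x, i):
    """Change in f if bit i were flipped, using only the subfunctions that contain i."""
    local = [subfunctions[k] for k in touching[i]]
    before = evaluate(local, x)
    x[i] ^= 1                       # flip temporarily
    after = evaluate(local, x)
    x[i] ^= 1                       # restore
    return after - before


def hill_climb(subfunctions, start):
    """Best-improvement hill climber (maximization) with incremental score updates."""
    x = list(start)
    n = len(x)
    # touching[i]: indices of the subfunctions that depend on variable i.
    touching = [[] for _ in range(n)]
    for k, (variables, _) in enumerate(subfunctions):
        for v in variables:
            touching[v].append(k)
    # interacting[i]: variables sharing a subfunction with i (including i itself).
    interacting = [
        sorted({v for k in touching[i] for v in subfunctions[k][0]})
        for i in range(n)
    ]
    scores = [flip_score(subfunctions, touching, x, i) for i in range(n)]
    while True:
        best = max(range(n), key=lambda i: scores[i])
        if scores[best] <= 0:       # no improving move left: local optimum
            return x
        x[best] ^= 1
        # Only the scores of variables interacting with the flipped one can change.
        for j in interacting[best]:
            scores[j] = flip_score(subfunctions, touching, x, j)


# Made-up instance: each subfunction is (variables, function of those bits).
subfunctions = [
    ((0, 1), lambda a, b: a ^ b),
    ((1, 2), lambda a, b: a & b),
    ((2, 3), lambda a, b: 2 * (a ^ b)),
]
local_optimum = hill_climb(subfunctions, [0, 0, 0, 0])
```

For brevity, the sketch still scans all scores to pick the best move; only the score update step is restricted to the interacting variables.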
Efficient Hill Climber for Multi-Objective Pseudo-Boolean Optimization by jfrchicanog
1) The document proposes an efficient hill climber algorithm for multi-objective pseudo-boolean optimization problems.
2) It computes scores that represent the change in fitness from moving to neighboring solutions, and updates these scores incrementally as the solution moves rather than recomputing from scratch.
3) The scores can be decomposed and updated in constant time by analyzing the variable interaction graph to identify variables that do not interact.
Mixed Integer Linear Programming Formulation for the Taxi Sharing Problem by jfrchicanog
The document presents a mixed integer linear programming (MILP) formulation for solving the taxi sharing problem. The taxi sharing problem aims to optimize taxi routes by allowing passengers with similar pick-up and drop-off locations to share taxis. The formulation models the problem as sequences of passenger locations that represent taxi rides. Experiments on real-world taxi trip data show the MILP formulation finds lower cost solutions than a parallel evolutionary algorithm, especially on medium and large problem instances, demonstrating the benefits of the exact MILP approach.
Descomposición en Landscapes Elementales del Problema de Diseño de Redes de R... by jfrchicanog
El documento describe la descomposición del problema de diseño de redes de radio en funciones elementales. Explica que la función objetivo que minimiza el número de antenas es elemental, mientras que la función que maximiza la cobertura puede escribirse como suma de hasta n funciones elementales, donde n es el número máximo de posiciones para antenas. Además, presenta ejemplos de cómo otras funciones objetivo complejas en otros problemas de optimización también se pueden descomponer en funciones elementales para analizar mejor la estructura del problema.
Resolviendo in problema multi-objetivo de selección de requisitos mediante re...jfrchicanog
El documento presenta el problema de selección de requisitos (Next Release Problem, NRP), un problema de optimización multiobjetivo que busca minimizar el coste y maximizar el valor de un conjunto de requisitos sujeto a restricciones funcionales entre los requisitos. Se transforma el problema de optimización en una serie de problemas de decisión mediante el uso de restricciones pseudobooleanas, las cuales pueden ser resueltas de forma eficiente por resolutores SAT como MiniSAT+. Esto permite aprovechar los avances en resolución de problemas SAT para resolver problemas de optim
On the application of SAT solvers for Search Based Software Testingjfrchicanog
The document discusses using SAT solvers to solve optimization problems in search-based software testing. It introduces optimization problems and techniques like metaheuristics and evolutionary algorithms. The document then focuses on applying SAT solvers to the test suite minimization problem, which aims to minimize the number of tests needed to achieve full code coverage. It describes translating the optimization problem into a SAT instance that can be solved by SAT solvers to obtain optimal solutions.
Elementary Landscape Decomposition of the Hamiltonian Path Optimization Problemjfrchicanog
The document describes research on decomposing optimization problem landscapes into elementary components. It defines key landscape concepts like configuration space, neighborhood operators, and objective functions. It then introduces the idea of elementary landscapes where the objective function is a linear combination of eigenfunctions. The paper discusses decomposing general landscapes into a sum of elementary components and proposes using average neighborhood fitness for selection in non-elementary landscapes. It applies these concepts to the Hamiltonian Path Optimization problem, analyzing the problem's reversals and swaps neighborhoods.
Efficient Identification of Improving Moves in a Ball for Pseudo-Boolean Prob...jfrchicanog
The document proposes a new method to efficiently identify improving moves within a ball of radius r for k-bounded pseudo-Boolean optimization problems. The key ideas are to (1) decompose the scores of potential moves into scores of individual subfunctions, and (2) update only a constant number of subfunction scores in constant time as the solution moves within the ball, rather than recomputing all scores from scratch. This avoids the typical computational cost of O(nr) and allows identifying improving moves in constant time O(1), independent of the problem size n.
This document summarizes research on using ant colony optimization (ACO) metaheuristics to find safety errors in software models. It introduces ACO and describes its key components, such as pheromone trails and probabilistic solution construction. It then presents ACOhg, a new ACO model for exploring huge graphs with bounded memory. ACOhg allows construction of partial solutions and uses expanding path lengths and periodic pheromone removal. The researchers applied ACOhg to 5 Promela models and found it could find errors in much larger models than exhaustive search algorithms like DFS and BFS, using less memory. They conclude ACO metaheuristics show promise for scalable heuristic model checking of safety properties.
Elementary Landscape Decomposition of Combinatorial Optimization Problemsjfrchicanog
This document discusses elementary landscape decomposition for analyzing combinatorial optimization problems. It begins with definitions of landscapes, elementary landscapes, and landscape decomposition. Elementary landscapes have specific properties, like local maxima and minima. Any landscape can be decomposed into a set of elementary components. This decomposition provides insights into problem structure and can be used to design selection strategies and predict search performance. The document concludes that landscape decomposition is useful for understanding problems but methodology is still needed to decompose general landscapes.
Seminar-workshop: Introduction to Search-Based Software Engineering
1. Seminar-workshop
Introduction to Search-Based Software Engineering
Francisco Chicano
Departamento de Lenguajes y Ciencias de la Computación
2. Seminar-workshop: Introduction to Search-Based Software Engineering
Universidad de Almería, October 26 and 27, 2020 (online)
José Francisco Chicano García
chicano@lcc.uma.es
@francischicano
www.franciscochicano.es
3. Schedule
Time         Monday 26                     Tuesday 27
9:00-10:30   Introduction to SBSE and NRP  Test case minimization
10:30-10:45  Break                         Break
10:45-12:15  NRP (continued)               Refactoring
12:15-12:30  Break                         Break
12:30-14:00  Module clustering             Project scheduling and knowledge test
There will be a short knowledge test on Tuesday 27 in the last time slot.
4. Materials for following the workshop
Software:
• RStudio (online version at https://rstudio.cloud)
• Symphony (open-source ILP solver)
• Rsymphony (R package that interfaces with Symphony)
Code and examples:
• Available on GitHub: https://github.com/jfrchicanog/TallerUAL2020
• And on RStudio Cloud: https://rstudio.cloud/project/1815713
Task: log in to RStudio and install Rsymphony
5. Outline
• Introduction to SBSE
• Next Release Problem (NRP)
• Integer Linear Programming
• Multi-Objective Optimization
• Software Module Clustering
• Test Case Minimization
• Automatic Software Refactoring
• Software Project Scheduling
• Conclusion
• Knowledge Test
7. Software Engineering
8. Search problems
A search problem is a binary relation R ⊆ X×Y such that, given an x ∈ X (instance), we want to find a y ∈ Y (solution) with (x,y) ∈ R.
Examples of search problem instances:
- Find the prime factors of 15
- Find a string that matches the regular expression a*b
- Find a real number x that minimizes the expression (x-1)^2
We will focus mainly on one subtype of search problems: optimization problems.
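As a minimal sketch of the first two examples (Python is used here purely for illustration, and the helper names `prime_factors` and `first_match` are hypothetical), each instance x is mapped to one solution y such that (x, y) belongs to the relation:

```python
import re
from itertools import product

def prime_factors(x):
    """Return the prime factorization of x by trial division."""
    factors, d = [], 2
    while d * d <= x:
        while x % d == 0:
            factors.append(d)
            x //= d
        d += 1
    if x > 1:
        factors.append(x)
    return factors

def first_match(pattern, alphabet="ab", max_len=5):
    """Enumerate strings by increasing length; return the first that fully matches."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if re.fullmatch(pattern, s):
                return s
    return None  # no match within the length bound

print(prime_factors(15))   # [3, 5]
print(first_match("a*b"))  # b
```

Both helpers search by plain enumeration, which is only practical for tiny instances; the rest of the material is about doing better than that.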
9. Optimization problems
An optimization problem is a pair P = (S,f) where:
S is a set of solutions (or search space)
f: S → ℝ is an objective function to minimize or maximize
If our goal is to minimize the function, we look for:
s' ∈ S | f(s') ≤ f(s), ∀s ∈ S
[Figure: curve of an objective function with its global maximum, local maximum, global minimum, and local minimum labeled]
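The difference between a global and a local minimum can be sketched on a small finite search space. The function, the search space, and the neighborhood below are hypothetical choices made only for illustration:

```python
def global_minimum(S, f):
    """Return s' in S with f(s') <= f(s) for every s in S."""
    return min(S, key=f)

def is_local_minimum(s, f, neighbors):
    """s is a local minimum if no neighbor has a strictly smaller value."""
    return all(f(s) <= f(t) for t in neighbors(s))

# Hypothetical instance: the integers 0..10, where the neighbors of s are s-1 and s+1.
S = range(11)

def f(s):
    return (s - 3) ** 2 * (s - 8) ** 2 + s  # two valleys, near s=3 and s=8

def neighbors(s):
    return [t for t in (s - 1, s + 1) if t in S]

print(global_minimum(S, f))                                # 3
print([s for s in S if is_local_minimum(s, f, neighbors)])  # [3, 8]
```

Here s = 8 is a local minimum that is not global: both of its neighbors are worse, yet f(3) < f(8).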
10. Optimization algorithms
OPTIMIZATION TECHNIQUES
• EXACT
  • Calculus-based: gradient, Lagrange multipliers
  • Exhaustive: dynamic programming, branch and bound, ILP solver
• APPROXIMATE
  • AD HOC HEURISTICS
  • METAHEURISTICS
    • Trajectory-based: SA (simulated annealing), VNS (variable neighborhood search), TS (tabu search)
    • Population-based: EA (evolutionary algorithms), ACO (ant colony optimization), PSO (particle swarm optimization)
• Hybrids
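To make the exact/approximate distinction concrete, the sketch below compares exhaustive enumeration with a plain hill climber. The hill climber is a simpler trajectory-based method than the SA, VNS, or TS techniques named in the taxonomy, and the objective function and neighborhood are hypothetical:

```python
def exhaustive_search(S, f):
    """Exact: evaluate every solution and keep the best one."""
    return min(S, key=f)

def hill_climb(start, f, neighbors):
    """Approximate: move to the best neighbor until no neighbor improves."""
    current = start
    while True:
        best = min(neighbors(current), key=f)
        if f(best) >= f(current):
            return current  # stuck in a local (possibly global) minimum
        current = best

# Hypothetical instance: integers 0..10 with neighbors s-1 and s+1.
S = range(11)

def f(s):
    return (s - 3) ** 2 * (s - 8) ** 2 + s

def neighbors(s):
    return [t for t in (s - 1, s + 1) if t in S]

print(exhaustive_search(S, f))      # 3
print(hill_climb(0, f, neighbors))  # 3
print(hill_climb(10, f, neighbors)) # 8
```

Starting from 10, the hill climber stops at the local minimum 8, whereas exhaustive search always returns the global minimum at the price of evaluating the whole search space.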
11. Search-Based Software Engineering (SBSE)
Spanish term: Ingeniería del Software Guiada por Búsqueda
[Figure: a search or optimization problem is given to a search or optimization algorithm, which returns a solution; the accompanying plot labels the global maximum, local maximum, global minimum, and local minimum]
12. Search-Based Software Engineering (SBSE)
• Requirements for the next release
• Software module clustering
• Test case minimization
• Automatic refactoring
• Project scheduling
14. Next Release Problem (NRP)
Given:
• A set of requirements R = {r1, r2, ..., rn} …
• … each with a cost cj and a value sj (Bagnall et al. → customers)
• A set of functional interactions between requirements:
  • Implication or precedence (ri before rj), written ri ⇒ rj: requirement rj cannot be selected unless ri has been implemented previously
  • Combination or coupling (ri together with rj): ri and rj must be included in the software jointly
  • Exclusion (not together): ri cannot be included together with rj
Find a subset of requirements that satisfies the interactions while minimizing the cost and maximizing the value.
Each requirement rj ∈ R has a cost cj for the company if it is implemented. The importance of requirement rj for customer i is represented by vij ∈ ℝ. The value added by including rj in the next release can be computed as the weighted sum of the importance values:
\[ s_j = \sum_{i=1}^{m} w_i \, v_{ij} \]
Requirements interact with each other, imposing a particular development order, which limits the available alternatives.
If X ⊆ R is the set of selected requirements, the cost and value of X are given by the functions
\[ \mathrm{cost}(X) = \sum_{j \,:\, r_j \in X} c_j \qquad \text{and} \qquad \mathrm{value}(X) = \sum_{j \,:\, r_j \in X} s_j \]
respectively (van den Akker et al.). We consider a multi-objective version that minimizes the cost and maximizes the value of the set of requirements.
In the formulation of Bagnall et al. the value comes from the customers: customer i contributes its weight wi only when every requirement it requests is selected,
\[ \mathrm{value}(\hat{R}) = \sum_{i=1}^{m} w_i \prod_{(j,i) \in Q} [\, j \in \hat{R} \,] \]
where Q contains the pairs (j, i) such that customer i requests requirement j, and [·] equals 1 when the condition holds and 0 otherwise.
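A minimal sketch of these definitions follows, assuming requirements are identified by integers and the interactions are given as lists of pairs. The function names and the sample data are hypothetical and do not come from the workshop code:

```python
def satisfies_interactions(X, implications, combinations, exclusions):
    """Check a selection X (a set of requirement ids) against the three interaction types.

    implications: pairs (i, j), ri must be selected whenever rj is.
    combinations: pairs (i, j), ri and rj are selected together or not at all.
    exclusions:   pairs (i, j), ri and rj cannot both be selected.
    """
    ok_impl = all(i in X for (i, j) in implications if j in X)
    ok_comb = all((i in X) == (j in X) for (i, j) in combinations)
    ok_excl = all(not (i in X and j in X) for (i, j) in exclusions)
    return ok_impl and ok_comb and ok_excl

def cost(X, c):
    """cost(X): sum of c_j over the selected requirements."""
    return sum(c[j] for j in X)

def value(X, w, v):
    """value(X): sum over selected r_j of s_j = sum_i w_i * v_ij."""
    return sum(sum(w[i] * v[i][j] for i in range(len(w))) for j in X)

# Hypothetical data: three requirements, two customers.
c = {1: 2, 2: 4, 3: 3}                              # cost of each requirement
w = [4, 2]                                          # weight of each customer
v = [{1: 1, 2: 0, 3: 2}, {1: 0, 2: 3, 3: 1}]        # v[i][j]: importance of rj for customer i

X = {1, 3}
print(cost(X, c), value(X, w, v))                    # 5 14
print(satisfies_interactions({3}, [(1, 3)], [], []))  # False: r3 needs r1
print(satisfies_interactions(X, [(1, 3)], [], []))    # True
```

This covers the additive value model; the customer-based value of Bagnall et al. needs the request relation Q instead of the importance matrix.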
15. Next Release Problem (NRP): example
Customers (importance)
Requirement Cost Customer 1 (4) Customer 2 (2) Customer 3 (5)
r1 2 x x
r2 4 x
r3 3 x x
r4 5 x
cost({r1, r3}) =
value({r1, r3}) =
cost({r1, r2, r3}) =
value({r1, r2, r3}) =
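The exercise can be sketched in code under one important assumption. The costs and customer weights below are the ones in the table, but the table as it survives here does not show which customer each x belongs to, so the `requests` sets are a hypothetical assignment chosen only to make the example runnable. The cost results do not depend on that assumption; the value results do.

```python
# Costs and customer weights as listed in the example table.
cost_of = {"r1": 2, "r2": 4, "r3": 3, "r4": 5}
weight = {"c1": 4, "c2": 2, "c3": 5}

# HYPOTHETICAL request sets: the column positions of the x marks are not
# recoverable from the table, so this assignment is an assumption.
requests = {
    "c1": {"r1", "r2"},
    "c2": {"r1", "r3"},
    "c3": {"r3", "r4"},
}

def cost(X):
    """Sum of the costs of the selected requirements."""
    return sum(cost_of[r] for r in X)

def value(X):
    """Customer-based value: a customer counts only if all its requests are in X."""
    return sum(w for c, w in weight.items() if requests[c] <= X)

print(cost({"r1", "r3"}))          # 5
print(cost({"r1", "r2", "r3"}))    # 9
print(value({"r1", "r3"}))         # depends on the hypothetical request sets
print(value({"r1", "r2", "r3"}))   # depends on the hypothetical request sets
```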
17. Introduction to linear programming
A linear programming problem has the form
\[ \max \sum_{j=1}^{n} c_j x_j \]
subject to:
\[ \sum_{j=1}^{n} a_{1j} x_j \le b_1, \quad \sum_{j=1}^{n} a_{2j} x_j \le b_2, \quad \ldots, \quad \sum_{j=1}^{n} a_{mj} x_j \le b_m \]
\[ x_j \ge 0, \quad j = 1, 2, \ldots, n \]
Equivalently, with the constraints indexed:
\[ \max \sum_{j=1}^{n} c_j x_j \quad \text{subject to} \quad \sum_{j=1}^{n} a_{ij} x_j \le b_i \;\; (i = 1, 2, \ldots, m), \quad x_j \ge 0 \;\; (j = 1, 2, \ldots, n) \]
And in matrix form:
\[ \max c \cdot x \quad \text{subject to} \quad Ax \le b, \quad x \ge 0 \]
18. Introduction to linear programming
Example:
Maximize x1 + x2
Subject to:
-x1 + 9x2 ≤ 36
9x1 + x2 ≤ 45
x1, x2 ≥ 0
[Figure: feasible region in the (x1, x2) plane, both axes from 0 to 5, with level lines x1 + x2 = constant]
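Because the feasible region of this example is a bounded polygon, an optimum is attained at one of its vertices. The sketch below finds it by enumerating intersections of constraint boundaries with exact fractions. This is an illustration for two variables only, not the method used by an LP solver such as Symphony, and the helper names are hypothetical:

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a1, a2, b) meaning a1*x1 + a2*x2 <= b.
constraints = [
    (-1, 9, 36),   # -x1 + 9*x2 <= 36
    (9, 1, 45),    # 9*x1 + x2 <= 45
    (-1, 0, 0),    # x1 >= 0
    (0, -1, 0),    # x2 >= 0
]

def intersection(c1, c2):
    """Point where both constraints hold with equality (Cramer's rule)."""
    (a1, b1, r1), (a2, b2, r2) = c1, c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None  # parallel boundaries never meet
    return (Fraction(r1 * b2 - r2 * b1, det), Fraction(a1 * r2 - a2 * r1, det))

def feasible(p):
    """True if the point satisfies every constraint."""
    return all(a1 * p[0] + a2 * p[1] <= b for a1, a2, b in constraints)

vertices = [p for c1, c2 in combinations(constraints, 2)
            if (p := intersection(c1, c2)) is not None and feasible(p)]
best = max(vertices, key=lambda p: p[0] + p[1])
print(best, best[0] + best[1])
```

The optimum is x1 = x2 = 9/2 with objective value 9, the point where the two main constraints intersect.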
20. Introduction to linear programming: with Rsymphony
Maximize x1 + x2
Subject to:
-x1 + 9x2 ≤ 36
9x1 + x2 ≤ 45
x1, x2 ≥ 0
By default, columns are filled first (in R, matrix() fills the matrix column by column unless byrow = TRUE)
Task: solve the program with RStudio
[Figure: feasible region in the (x1, x2) plane, both axes from 0 to 5]
21. Integer linear programming
An additional restriction is imposed: the variables can only take integer values.
Example:
Maximize x1 + x2
Subject to:
-x1 + 9x2 ≤ 36
9x1 + x2 ≤ 45
x1, x2 ≥ 0
x1, x2 integers
[Figure: feasible integer solutions in the (x1, x2) plane, both axes from 0 to 5]
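For an instance this small, the integer program can be checked by listing every feasible integer point. This brute-force sketch is only an illustration; an ILP solver uses branch and bound and related techniques rather than full enumeration:

```python
from itertools import product

def feasible(x1, x2):
    """Both linear constraints of the example (non-negativity comes from the ranges)."""
    return -x1 + 9 * x2 <= 36 and 9 * x1 + x2 <= 45

# 9*x1 + x2 <= 45 with x2 >= 0 gives x1 <= 5;
# -x1 + 9*x2 <= 36 with x1 <= 5 gives x2 <= 4.
points = [(x1, x2) for x1, x2 in product(range(6), range(5)) if feasible(x1, x2)]
best = max(points, key=lambda p: p[0] + p[1])

print(best, sum(best))   # (4, 4) 8
print(feasible(5, 5))    # False
```

The integer optimum (4, 4) has value 8, below the value 9 of the continuous optimum (4.5, 4.5). Rounding the continuous solution up to (5, 5) is infeasible, which is why integrality has to be handled by the solver rather than by rounding.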
22. Integer linear programming: with Rsymphony
Maximize x1 + x2
Subject to:
-x1 + 9x2 ≤ 36
9x1 + x2 ≤ 45
x1, x2 ≥ 0
x1, x2 integers
Task: solve the program with RStudio
[Figure: feasible integer solutions in the (x1, x2) plane, both axes from 0 to 5]
25. ILP model of NRP: objective
Here we solve the single-objective version of Bagnall et al., with the cost limited to a fraction of the total cost of implementing all the requirements.
We define n variables ri for the requirements and m variables si for the customers. They take the values 0 and 1.
If ri = 1 requirement i is implemented; if ri = 0 it is not.
If si = 1 customer i is satisfied (all of its requirements are implemented).
The value of customer i to the company is wi.
The cost of implementing requirement i is ci.
The budget is B.
Task: find the objective expression
\[ \max \sum_{i=1}^{m} w_i s_i \]
27. ILP model of NRP: cost constraint
Task: find the cost constraint
\[ \sum_{i=1}^{n} c_i r_i \le B \]
29. ILP model of NRP: dependencies
Task: find the constraints for the dependencies between requirements (implication)
\[ r_j \le r_i \quad \forall (i, j) \in P \]
where (i, j) ∈ P means that ri must be implemented before rj, so selecting rj forces ri to be selected.
31. ILP model of NRP: dependencies
Task: find the constraints for the dependencies between requirements (combination)
\[ r_j = r_i \quad \forall (i, j) \in C \]
33. ILP model of NRP: customer satisfaction
Task: find the customer satisfaction constraints
\[ s_j \le r_i \quad \forall (i, j) \in Q \]
where (i, j) ∈ Q means that customer j requests requirement i. Together with the previous steps, the model reads:
\[ \max \sum_{i=1}^{m} w_i s_i \]
subject to
\[ \sum_{i=1}^{n} c_i r_i \le B \]
\[ s_j \le r_i \quad \forall (i, j) \in Q \]
\[ r_j \le r_i \quad \forall (i, j) \in P \]
with all variables ri and si taking values 0 or 1.
35. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
NRP ILP model
In the R implementation, the first n entries of the variable vector are used for the requirements and the remaining m entries for the customers.
Relevant functions:
• readNrpInstance(file): reads an instance file and returns a list with an internal representation
• ilpModel(nrpInstance, budgetLimitFraction): takes a list with an instance and a fraction (a real number) and creates an ILP model for the instance
Task: solve some instances with R
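To make the variable layout concrete, here is a small Python sketch (not the seminar's R code) that builds the objective and constraint rows with the first n positions for requirements and the last m for customers, and then solves a tiny instance by brute force instead of calling an ILP solver. The function names, the 0-based indices and the instance are all assumptions made for illustration.

```python
from itertools import product


def build_nrp_ilp(costs, weights, requests, prerequisites, budget_fraction):
    """Return (objective, rows, rhs) for: max objective.x  s.t.  rows.x <= rhs.

    Layout: x[0..n-1] are the requirement variables r_i,
            x[n..n+m-1] are the customer variables s_j.
    """
    n, m = len(costs), len(weights)
    objective = [0] * n + list(weights)
    rows, rhs = [], []
    # Budget: sum_i c_i r_i <= B, with B a fraction of the total cost.
    rows.append(list(costs) + [0] * m)
    rhs.append(budget_fraction * sum(costs))
    # Customer satisfaction: s_j - r_i <= 0 for every (i, j) in Q.
    for i, j in requests:
        row = [0] * (n + m)
        row[n + j], row[i] = 1, -1
        rows.append(row)
        rhs.append(0)
    # Precedence: r_j - r_i <= 0 for every (i, j) in P.
    for i, j in prerequisites:
        row = [0] * (n + m)
        row[j], row[i] = 1, -1
        rows.append(row)
        rhs.append(0)
    return objective, rows, rhs


def solve_by_enumeration(objective, rows, rhs):
    """Try every 0/1 vector; only viable for tiny instances."""
    best_value, best_x = None, None
    for x in product((0, 1), repeat=len(objective)):
        feasible = all(sum(a * v for a, v in zip(row, x)) <= b
                       for row, b in zip(rows, rhs))
        if feasible:
            value = sum(c * v for c, v in zip(objective, x))
            if best_value is None or value > best_value:
                best_value, best_x = value, x
    return best_value, best_x


# Hypothetical instance: 4 requirements, 3 customers, 0-based indices.
costs = [2, 4, 3, 5]
weights = [4, 2, 5]
requests = [(0, 0), (2, 0), (1, 1), (2, 1), (0, 2), (3, 2)]  # (requirement, customer)
prerequisites = [(0, 2)]                                      # r0 before r2
objective, rows, rhs = build_nrp_ilp(costs, weights, requests, prerequisites, 0.5)
best_value, best_x = solve_by_enumeration(objective, rows, rhs)
print(best_value, best_x)
```

With a real solver only `solve_by_enumeration` would change; the matrix layout stays the same.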
37. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Multi-objective optimization
• In a multi-objective (MO) problem there are several objectives (functions) that we want to optimize
[Figure: objective space with axes f1 and f2, showing efficient (non-dominated) solutions, weakly efficient solutions, a non-supported solution and a dominated solution]
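The notion of a dominated solution can be sketched in a few lines of Python. This assumes both objectives are minimized and uses made-up points; `dominates` and `non_dominated` are illustrative names.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points):
    """Keep the points that no other point dominates (the efficient solutions)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors (f1, f2), both to be minimized.
points = [(1, 5), (2, 3), (3, 4), (4, 1), (2, 5)]
print(non_dominated(points))
```

In this data (3, 4) is dominated by (2, 3), and (2, 5) is dominated by (1, 5) because it ties in f2 and is worse in f1.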
38. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Multi-objective optimization
If we minimize both objectives:
Convex front: easy to solve with weighted sums of the objectives
Concave front: cannot be solved with weighted sums of the objectives
[Figure: two plots with axes f1 and f2, one with a convex front and one with a concave front]
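A small Python experiment illustrates why weighted sums miss part of a concave front. It assumes two objectives to be minimized and uses two made-up three-point fronts; `weighted_sum_optima` is an illustrative helper, not part of the seminar code.

```python
def weighted_sum_optima(points, steps=100):
    """Return the points that minimize w*f1 + (1-w)*f2 for some w on a grid in [0, 1]."""
    found = set()
    for k in range(steps + 1):
        w = k / steps
        best = min(points, key=lambda p: w * p[0] + (1 - w) * p[1])
        found.add(best)
    return found


# Hypothetical fronts: every point is non-dominated in both cases.
convex_front = [(0, 4), (1, 1), (4, 0)]
concave_front = [(0, 4), (3, 3), (4, 0)]

print(weighted_sum_optima(convex_front))   # all three points appear
print(weighted_sum_optima(concave_front))  # (3, 3) never wins for any weight
```

The point (3, 3) scores 3 for every weight, while one of the two extreme points always scores at most 2, so it is efficient but non-supported.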
39. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
What will the front look like in NRP?
[Figure: two plots with axes cost and value]
40. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Some examples
[Figure 1. Pareto front and approximations of the metaheuristic algorithms (ACS, NSGAII, GRASP), with axes value and cost: (a) dataset1, (b) dataset2.]
We should point out that these times again refer to a different machine (a Pentium 4 at 3.2 GHz) and that the goal was not to find the complete front, [...]
C., Domínguez-Ríos, del Águila, del Sagrado, Alba, JISBD 2016
41. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Multi-objective NRP
Task: find the Pareto front for our example by hand
Customers (importance)
Requirement Cost Customer 1 (4) Customer 2 (2) Customer 3 (5)
r1 2 x x
r2 4 x
r3 3 x x
r4 5 x
42. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Multi-objective NRP
Task: compute the front using R for the same example (requirements r1 to r4 with costs 2, 4, 3 and 5; customers 1 to 3 with importance 4, 2 and 5)
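For an instance this small the whole front can also be obtained by enumerating every subset of requirements. The Python sketch below takes the costs and importances from the table, but the assignment of requirements to customers is an assumption: the transcript keeps only the number of marks per row, not their columns. `nrp_pareto_front` and `wants` are illustrative names.

```python
from itertools import product

costs = {"r1": 2, "r2": 4, "r3": 3, "r4": 5}   # from the table
importance = {"c1": 4, "c2": 2, "c3": 5}        # from the table
# Assumed marks: which requirements each customer asks for (hypothetical).
wants = {"c1": {"r1", "r3"}, "c2": {"r2", "r3"}, "c3": {"r1", "r4"}}


def nrp_pareto_front(costs, importance, wants):
    """Enumerate all requirement subsets and keep the non-dominated (cost, value) pairs."""
    names = sorted(costs)
    outcomes = set()
    for mask in product((0, 1), repeat=len(names)):
        chosen = {r for r, bit in zip(names, mask) if bit}
        cost = sum(costs[r] for r in chosen)
        value = sum(w for c, w in importance.items() if wants[c] <= chosen)
        outcomes.add((cost, value))
    # q dominates p when it is a different pair with lower-or-equal cost
    # and higher-or-equal value.
    front = [p for p in outcomes
             if not any(q != p and q[0] <= p[0] and q[1] >= p[1] for q in outcomes)]
    return sorted(front)


print(nrp_pareto_front(costs, importance, wants))
```

With a different assignment of marks the front changes, so the printed pairs are only valid for the assumed `wants`.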
44. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Software module clustering
We want to find a partition of a set of software modules such that the software is structured into subsystems that make it easier to develop and maintain
45. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Software module clustering
How to measure the quality of the solution obtained:
Intra-connectivity: measures the cohesion between modules that belong to the same subsystem.
Inter-connectivity: measures the coupling between modules that belong to different subsystems.
The Modularization Quality (MQ) of the system combines both.
46. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Software module clustering
Given a module dependency graph G = (V, A), we define a weight w for each edge. We call n the number of nodes (modules) and m the number of edges (the number of relations or dependencies).
The modularization quality of the system is defined as
$MQ = \sum_{k} MF_k, \qquad MF_k = \begin{cases} 0 & \text{if } i = 0 \\ \dfrac{i}{i + \frac{1}{2} j} & \text{if } i > 0 \end{cases}$
The value i (intra-connectivity) is the sum of the weights of the edges whose two endpoints are both inside the subsystem. It measures cohesion.
The value j (inter-connectivity) is the sum of the weights of the edges with one endpoint in the subsystem and the other outside it. It measures coupling.
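As a rough illustration, MQ can be computed from an edge list and a module-to-subsystem assignment. The Python sketch below assumes the modularization factor MF = i / (i + j/2) for a subsystem with i > 0 and MF = 0 otherwise, which is the form used in the worked example of this deck; the graph and the name `modularization_quality` are made up.

```python
def modularization_quality(edges, assignment):
    """edges: iterable of (u, v, weight); assignment: dict module -> subsystem."""
    intra = {k: 0 for k in set(assignment.values())}
    inter = {k: 0 for k in intra}
    for u, v, weight in edges:
        cu, cv = assignment[u], assignment[v]
        if cu == cv:
            intra[cu] += weight          # both endpoints inside the subsystem
        else:
            inter[cu] += weight          # the edge leaves both subsystems
            inter[cv] += weight
    mq = 0.0
    for k in intra:
        if intra[k] > 0:
            mq += intra[k] / (intra[k] + inter[k] / 2)
    return mq


# Hypothetical path graph a - b - c - d with unit weights.
edges = [("a", "b", 1), ("b", "c", 1), ("c", "d", 1)]
two_groups = {"a": 1, "b": 1, "c": 2, "d": 2}
one_group = {m: 1 for m in "abcd"}
all_apart = {m: k for k, m in enumerate("abcd")}
print(modularization_quality(edges, two_groups))  # 2/3 + 2/3
```

The two extreme assignments give 1 (a single subsystem) and 0 (every module alone) on this graph.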
47. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Software module clustering: example
Task: find MQ
$MF_1 = MF_3 = MF_6 = MF_7 = MF_8 = 0$
$MF_5 = \dfrac{1}{1 + \frac{1}{2} \times 2} = \dfrac{1}{2}$
$MF_2 = \dfrac{2}{2 + \frac{1}{2} \times 3} = \dfrac{4}{7}$
$MF_4 = \dfrac{3}{3 + \frac{1}{2} \times 1} = \dfrac{6}{7}$
$MQ = \dfrac{1}{2} + \dfrac{4}{7} + \dfrac{6}{7} = \dfrac{27}{14} = 1.928571\ldots$
49. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Software module clustering: questions
What is the value of MQ if all the modules are in different groups?
What is the value of MQ if all the modules are in the same group?
What is the maximum value that MQ can take?
50. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Software module clustering: solving
The number of partitions of a set of n elements is a Bell number:
1, 1, 2, 5, 15, 52, 203, 877, 4140, 21147, 115975, …
This grows very fast!
Enumerative algorithms are infeasible when there are many modules
The problem is nonlinear (integer linear programming is ruled out)
Exact algorithms: branch and bound
Approximate algorithms: heuristics and metaheuristics
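The growth of the Bell numbers is easy to check with the Bell triangle. A short Python sketch (the function name `bell_numbers` is illustrative):

```python
def bell_numbers(count):
    """Return the first `count` Bell numbers B_0, B_1, ... using the Bell triangle."""
    numbers = []
    row = [1]
    for _ in range(count):
        numbers.append(row[0])
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry above-left in the previous row.
        next_row = [row[-1]]
        for value in row:
            next_row.append(next_row[-1] + value)
        row = next_row
    return numbers


print(bell_numbers(11))
```

Already for 15 modules the count exceeds one billion partitions, which is why plain enumeration stops being practical so early.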
51. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Software module clustering: solving
Analysis of the model:
- If n = 1, MQ* = 0
- If n = 2, MQ* = 1
- If all the nodes are isolated, MQ = 0
- If there is a single subsystem (and more than one node), MQ = 1
- For k subsystems and n-k subsystems: MQ <= k
- Experimentally, the value of MQ* is usually low compared with the number of modules
- For fixed k, if there is a large difference in cardinality between the largest group and the smallest one, a lower value of MQ is obtained.
$MF_i \in [0, 1]$
Solution format: $x^* = (1, 2, 1, 3, 1, 2)$
Why?
52. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Software module clustering: solving
53. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Software module clustering: solving
Instance | MQ      | Enumerative: solutions visited | Enumerative: time (s) | B&B algorithm: solutions visited | B&B algorithm: time (s)
MDG 8    | 1.92857 | 4140                           | 0.09                  | 6                                | 0.10
MDG 10   | 2.5     | 115975                         | 0.14                  | 11                               | 0.13
MDG 15   | 2.812   | 1382958545                     | 226.00                | 24                               | 23.00
mtunis: 2.314*, 2.314*, 121.00*
Value obtained by the best heuristic algorithm of Praditwong et al
55. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Test Suite Minimization
Given:
- A set of test cases T = {t1, t2, ..., tn}
- A set of software elements to be covered (e.g., use cases) E = {e1, e2, ..., em}
- A coverage matrix M:
      e1  e2  e3  ...  em
t1     1   0   1  ...   1
t2     0   0   1  ...   0
...
tn     1   1   0  ...   0
Find a subset of tests X ⊆ T maximizing coverage and minimizing the testing cost
3 Test Suite Minimization Problem
When a piece of software is modified, the new software is tested using previous test cases in order to check if new errors were introduced. This is known as regression testing. One problem related to regression testing is the Test Suite Minimization Problem (TSMP). This problem is equivalent to the Minimal Hitting Set Problem, which is NP-hard [17]. Let T = {t1, t2, ..., tn} be a set of tests for a program, where the cost of running test ti is ci, and let E = {e1, e2, ..., em} be a set of elements of the program that we want to cover with the tests. After running all the tests T we find that each test can cover several program elements. This information is stored in a matrix M = [mij] of dimension n × m that is defined as:
$m_{ij} = \begin{cases} 1 & \text{if element } e_j \text{ is covered by test } t_i \\ 0 & \text{otherwise} \end{cases}$
The single-objective version of this problem consists in finding a subset of tests X ⊆ T with minimum cost covering all the program elements. In formal terms:
$\text{minimize } cost(X) = \sum_{i=1,\ t_i \in X}^{n} c_i \qquad (2)$
subject to: $\forall e_j \in E,\ \exists t_i \in X$ such that element ej is covered by test ti, that is, mij = 1.
The multi-objective version of the TSMP does not impose the constraint of full coverage, but it defines the coverage as the second objective to optimize, leading to a bi-objective problem. In short, the bi-objective TSMP consists in finding a subset of tests X ⊆ T having minimum cost and maximum coverage. Formally:
$\text{minimize } cost(X) = \sum_{i=1,\ t_i \in X}^{n} c_i \qquad (3)$
$\text{maximize } cov(X) = |\{ e_j \in E \mid \exists t_i \in X \text{ with } m_{ij} = 1 \}| \qquad (4)$
56. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Example
A small example of how to model the TSMP with PB constraints: E = {e1, e2, e3, e4} and M:
      e1  e2  e3  e4
t1     1   0   1   0
t2     1   1   0   0
t3     0   0   1   0
t4     1   0   0   0
t5     1   0   0   1
t6     0   1   1   0
Assume unitary cost for tests: ci = 1
cost({t1, t5}) =
cov({t1, t5}) =
cost({t1, t2, t5}) =
cov({t1, t2, t5}) =
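The two objective functions can be evaluated mechanically for any subset of tests. The Python sketch below encodes the 6 × 4 matrix of this example with unit costs; the helper names `cost` and `cov` follow the notation of the slides but the code itself is only an illustration.

```python
# Coverage matrix of the example: one row per test, one column per element e1..e4.
M = {
    "t1": (1, 0, 1, 0),
    "t2": (1, 1, 0, 0),
    "t3": (0, 0, 1, 0),
    "t4": (1, 0, 0, 0),
    "t5": (1, 0, 0, 1),
    "t6": (0, 1, 1, 0),
}


def cost(tests, costs=None):
    """Sum of the costs of the selected tests (unit cost when none is given)."""
    return sum(1 if costs is None else costs[t] for t in set(tests))


def cov(tests):
    """Number of elements covered by at least one selected test."""
    columns = len(next(iter(M.values())))
    return sum(1 for j in range(columns) if any(M[t][j] for t in tests))


for subset in ({"t1", "t5"}, {"t1", "t2", "t5"}):
    print(sorted(subset), cost(subset), cov(subset))
```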
57. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Modelling the TSM Problem using ILP
Let us use n Boolean variables ti and m Boolean variables ej:
- ti = 1 iff test i is selected
- ej = 1 iff element j is covered (it depends on the ti)
ci is the cost of test ti
Task: constraints relating covered elements and tests
We can translate the 2-obj TSMP first and then infer the translation of the 1-obj TSMP. Let us introduce n binary variables ti ∈ {0, 1}: if ti = 1 the corresponding test case is included, otherwise the test case is not included. We also introduce m binary variables ej ∈ {0, 1}, one for each program element to cover: if ej = 1 the element is covered by one of the selected test cases, otherwise it is not covered by a selected test case.
The values of the ej variables are not independent of the ti variables: a variable ej must be 1 if and only if there exists a ti variable for which mij = 1 and ti = 1. The dependence between both sets of variables can be written with the following 2m PB constraints:
$e_j \le \sum_{i=1}^{n} m_{ij} t_i \le n \cdot e_j, \qquad 1 \le j \le m$
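One way to convince oneself that the double inequality pins each ej to "covered or not" is to check it exhaustively on a small matrix. The Python sketch below does this for the 6 × 4 example matrix used in this part of the deck; `linking_holds`, `covered` and `check_all_selections` are illustrative names.

```python
from itertools import product

# Example coverage matrix: rows are tests t1..t6, columns are elements e1..e4.
M = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
n, m = len(M), len(M[0])


def linking_holds(t, e):
    """Check e_j <= sum_i m_ij t_i <= n * e_j for every element j."""
    return all(e[j] <= sum(M[i][j] * t[i] for i in range(n)) <= n * e[j]
               for j in range(m))


def covered(t):
    """Direct definition: e_j = 1 iff some selected test covers element j."""
    return tuple(int(any(M[i][j] and t[i] for i in range(n))) for j in range(m))


def check_all_selections():
    """For every test selection, exactly one e vector is feasible: covered(t)."""
    for t in product((0, 1), repeat=n):
        feasible = [e for e in product((0, 1), repeat=m) if linking_holds(t, e)]
        if feasible != [covered(t)]:
            return False
    return True


print(check_all_selections())
```

The left inequality forces ej = 0 when no selected test covers ej, and the right one forces ej = 1 as soon as one does.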
59. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Modelling the TSM Problem using ILP
Task: expression for coverage
$e_j \le \sum_{i=1}^{n} m_{ij} t_i \le n \cdot e_j, \qquad 1 \le j \le m$
We can see that if the sum in the middle is zero (no test is covering the element ej) then the variable ej = 0. However, if the sum is greater than zero then ej = 1. Now we need to introduce a constraint related to each objective in order to transform the optimization problem into a decision problem, as described in Section 2.2. These constraints are:
$\sum_{i=1}^{n} c_i t_i \le B, \qquad \sum_{j=1}^{m} e_j \ge P,$
where B ∈ Z is the maximum allowed cost and P ∈ {0, 1, ...} [...]
61. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Modelling the TSM Problem using ILP
Task: expression for cost
With ci the cost of test ti, the bounds on cost and coverage are:
$\sum_{i=1}^{n} c_i t_i \le B, \qquad \sum_{j=1}^{m} e_j \ge P$
63. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Example
Task: find equations for this example
E = {e1, e2, e3, e4} and M:
      e1  e2  e3  e4
t1     1   0   1   0
t2     1   1   0   0
t3     0   0   1   0
t4     1   0   0   0
t5     1   0   0   1
t6     0   1   1   0
If we want to solve the 2-obj TSMP we need to instantiate Eqs. (5), (6) and (7). The result is:
$e_1 \le t_1 + t_2 + t_4 + t_5 \le 6 e_1$
$e_2 \le t_2 + t_6 \le 6 e_2$
$e_3 \le t_1 + t_3 + t_6 \le 6 e_3$
$e_4 \le t_5 \le 6 e_4$
$t_1 + t_2 + t_3 + t_4 + t_5 + t_6 \le B$
$e_1 + e_2 + e_3 + e_4 \ge P$
where P, B ∈ N. (The cropped paper excerpt on the slide writes 4e_j on the right-hand sides; the slide's own version uses n · e_j = 6e_j.)
If we are otherwise interested in the 1-obj version the formulation is:
$t_1 + t_2 + t_4 + t_5 \ge 1$
$t_2 + t_6 \ge 1$
$t_1 + t_3 + t_6 \ge 1$
$t_5 \ge 1$
$t_1 + t_2 + t_3 + t_4 + t_5 + t_6 \le B$
Slide annotations: f(x) ≤ B for the cost bound, "min" for the cost and "max" for the coverage.
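The instantiated constraints follow mechanically from the columns of M, so they can be generated rather than written by hand. A small Python sketch, using the example matrix and n = 6 as the coefficient on the right-hand side; the function names are illustrative.

```python
# Example coverage matrix: rows are tests t1..t6, columns are elements e1..e4.
M = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]


def linking_constraints(M):
    """One line per element: e_j <= (tests covering e_j) <= n * e_j."""
    n = len(M)
    lines = []
    for j in range(len(M[0])):
        terms = " + ".join(f"t{i + 1}" for i in range(n) if M[i][j])
        lines.append(f"e{j + 1} <= {terms} <= {n}*e{j + 1}")
    return lines


def full_coverage_constraints(M):
    """Single-objective version: every element needs at least one selected test."""
    n = len(M)
    return [" + ".join(f"t{i + 1}" for i in range(n) if M[i][j]) + " >= 1"
            for j in range(len(M[0]))]


for line in linking_constraints(M) + full_coverage_constraints(M):
    print(line)
```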
65. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Algorithm for Solving the 2-obj TSM
[Figure: plot with axes cost and coverage, with the maximum coverage marked]
Find max coverage
Min cost, keeping cov
Decrease cost and find the maximum coverage again
And again
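The loop sketched on this slide can be written down in a few lines of Python. To keep the example self-contained, each "solve" step is done by brute-force enumeration of all subsets instead of an ILP solver, and integer costs are assumed so that the cost bound can be lowered by 1; `tsm_pareto_front` and the helper names are illustrative.

```python
from itertools import combinations

# Example coverage matrix: rows are tests t1..t6, columns are elements e1..e4.
M = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]


def coverage(M, tests):
    """Number of elements covered by at least one test in `tests`."""
    return sum(1 for j in range(len(M[0])) if any(M[i][j] for i in tests))


def tsm_pareto_front(M, costs):
    """Return the (cost, coverage) Pareto points, from highest to lowest coverage."""
    n = len(M)
    outcomes = [(sum(costs[i] for i in subset), coverage(M, subset))
                for size in range(n + 1)
                for subset in combinations(range(n), size)]
    front = []
    bound = sum(costs)                      # no effective cost limit at first
    while True:
        feasible = [(c, v) for c, v in outcomes if c <= bound]
        if not feasible:
            break
        best_cov = max(v for _, v in feasible)                    # find max coverage
        min_cost = min(c for c, v in feasible if v == best_cov)   # min cost, keeping cov
        front.append((min_cost, best_cov))
        bound = min_cost - 1                # decrease cost and repeat (integer costs)
    return front


print(tsm_pareto_front(M, [1] * 6))
```

Replacing the two list comprehensions inside the loop with two ILP solves (maximize coverage under a cost bound, then minimize cost with coverage fixed) would give the solver-based version of the same idea.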
66. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
TSM Instances
Instances from the Software-artifact Infrastructure Repository (SIR)
http://sir.unl.edu/portal/index.php
Instance     | Tests | Elements to cover
printtokens1 | 4130  | 189
printtokens2 | 4115  | 199
replace      | 5542  | 242
schedule     | 2650  | 151
schedule2    | 2710  | 128
tcas         | 1608  | 65
totinfo      | 1052  | 124
67. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Exercise
In the R implementation, the first n entries of the variable vector are used for the tests and the remaining m entries for the elements to cover.
Relevant functions:
• readTsmInstance(file, unitaryCost=FALSE): reads an instance file and returns a list with an internal representation
• ilpModel4Tsm(tsmInstance, costUpperBound=NULL, covLowerBound=NULL): takes an instance and a bound on cost or coverage and creates an ILP model for the instance that optimizes the objective that is not bounded
• solveModel(model): solves the ILP model passed as a parameter
Task: solve some instances with R
68. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Exercise
Complete the function computeParetoFront to compute the complete front of an instance
Task: complete computeParetoFront
69. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Reduction in the Number of Test Cases
We can reduce the number of test cases in the original test suite
If a test t1 covers at least the elements covered by another test t2 and does not cost more, t2 can be removed
      e1  e2  e3  ...  em
t1     1   0   0  ...   1
t2     1   0   1  ...   1
...
tn     1   1   0  ...   0
In this matrix, test t1 can be removed if c1 >= c2
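The reduction rule can be sketched as a pairwise comparison of rows. The Python code below is an illustration and not the seminar's reduceInstance function: it removes a test when some other test covers at least the same elements at no higher cost, and among identical tests it keeps the one with the lowest index (a tie-breaking choice made here so that one copy always survives).

```python
def reduce_tests(M, costs):
    """Return the indices of the tests that survive the reduction."""
    kept = []
    for a, row_a in enumerate(M):
        removable = False
        for b, row_b in enumerate(M):
            if a == b:
                continue
            covers_all = all(y >= x for x, y in zip(row_a, row_b))
            if covers_all and costs[b] <= costs[a]:
                same = row_a == row_b and costs[a] == costs[b]
                # Of two identical tests keep only the one with the lower index.
                if not same or b < a:
                    removable = True
                    break
        if not removable:
            kept.append(a)
    return kept


# Example matrix from the earlier slides (tests t1..t6, elements e1..e4), unit costs.
M = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
print(reduce_tests(M, [1] * 6))  # 0-based indices of the surviving tests
```

On this matrix t3 and t4 disappear because t1 covers everything they cover at the same cost.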
70. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Reduction in the Number of Test Cases
Task: complete the table
With the help of reduceInstance, complete the table.
Instance     | Tests | Reduced tests
printtokens1 | 4130  |
printtokens2 | 4115  |
replace      | 5542  |
schedule     | 2650  |
schedule2    | 2710  |
tcas         | 1608  |
totinfo      | 1052  |
How long does it take now to compute the Pareto front? Is it the same?
72. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Refactoring
Semantic-preserving change in the code
G29: Avoid Negative Conditionals
Negatives are just a bit harder to understand than positives. So, when possible, conditionals should be expressed as positives. For example:
    if (buffer.shouldCompact())
is preferable to
    if (!buffer.shouldNotCompact())
G30: Functions Should Do One Thing
It is often tempting to create functions that have multiple sections that perform a series of operations. Functions of this kind do more than one thing, and should be converted into many smaller functions, each of which does one thing.
For example:
    public void pay() {
      for (Employee e : employees) {
        if (e.isPayday()) {
          Money pay = e.calculatePay();
          e.deliverPay(pay);
        }
      }
    }
This bit of code does three things. It loops over all the employees, checks to see whether each employee ought to be paid, and then pays the employee. This code would be better written as:
    public void pay() {
      for (Employee e : employees)
        payIfNecessary(e);
    }
    private void payIfNecessary(Employee e) {
      if (e.isPayday())
        calculateAndDeliverPay(e);
    }
    private void calculateAndDeliverPay(Employee e) {
      Money pay = e.calculatePay();
      e.deliverPay(pay);
    }
Each of these functions does one thing. (See “Do One Thing” on page 35.)
G31: Hidden Temporal Couplings
Temporal couplings are often necessary, but you should not hide the coupling.
73. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Anti-pattern
Common solution to a problem with bad consequences
Automatic Refactoring
Boolean logic is hard enough to understand without having to see it in the context of an if or while statement.
Extract functions that explain the intent of the conditional.
For example:
    if (shouldBeDeleted(timer))
is preferable to
    if (timer.hasExpired() && !timer.isRecurrent())
To better illustrate the refactoring scheduling problem, and the effect that the consideration of dependencies and conflicts between refactorings has on the size of the search space, we present an example of the problem in Listing 1.
Listing 1. Example of classes to be refactored.
[…] reduce even more the search space by removing these permutations, as they lead to the same design (same solution). This occurs because they affect different code segments (the method and target class are different for r1 and r3), i.e., they are unrelated.
In addition, when a conflict exists between refactorings, it is possible to reduce the size of the search space further. For example, consider the sequential dependency conflict between r1 and r2, that is, r2 cannot be applied before r1 (inlining class Rectangle invalidates any move method refactoring from/to that class). Hence, by removing redundant solutions and invalid solutions (solutions with elements that are conflicted), we can reduce the search-space size of the motivating example by half (sequences 1, 2, 3, 4, 5, 6, 8 and 11). Thus, the value obtained after applying Eq. (2) should be used as an upper bound of the search-space size, as long as we assume that applying a refactoring sequence […]
Table 1
List of refactoring candidates for the example from Listing 1.
ID  Type                        Source class  Method                   Target class
r1  Move method                 Geometry      calcAreaRectangle        Rectangle
r2  Inline Class                Rectangle     All fields and methods   Shape
r3  Introduce Parameter Object  Geometry      longParameterListMethod  GeometryParamObj (new)

Table 2
Enumeration of possible refactoring sequences for the set of refactoring operations {r1, r2, r3}.
Sequence  Elements    Sequence  Elements
1         None        9         r3, r1
2         r1          10        r3, r2
3         r2          11        r1, r2, r3
4         r3          12        r1, r3, r2
5         r1, r2      13        r2, r1, r3
6         r1, r3      14        r2, r3, r1
7         r2, r1      15        r3, r2, r1
8         r2, r3      16        r3, r1, r2
R. Morales et al.
Example
[…] i.e., (1) the detection of classes that contain anti-patterns; (2) the generation of refactoring candidates to improve the design quality of the classes detected in (1); (3) the search for an optimal refactoring order; and (4) the application of the refactoring order from (3). To achieve this goal, we propose a new heuristic approach called RePOR (Refactoring approach based on Partial Order Reduction). Partial order reduction is a popular technique for controlling state space explosion in model checking (Lluch-Lafuente et al., 2002). The intuition is to reduce the number of refactoring sequences to be explored by removing equivalent sequences (i.e., refactoring sequences that lead to the same design). As a result, less search effort is required than when using metaheuristic algorithms. To evaluate RePOR, we conduct a series of experiments over a testbed of five open source software systems (OSS) and compare the results with Genetic Algorithm (GA) (Holland, 1975), Ant Colony Optimization (ACO) (Dorigo et al., 2006), the conflict-aware refactoring scheduling approach proposed by Liu et al. (2008) (referred to as LIU in this paper), and a new optimizer based on sampling (SWAY) (Chen et al., 2018). We show that the solutions obtained by RePOR outperform the ones obtained by the above-mentioned state-of-the-art optimization techniques in terms of performance (i.e., execution time) and effort (i.e., number of refactorings applied).
Tool and Data Replication. The Eclipse plug-in and all the data used in the experiments are available in the RePOR replication package (Morales et al., 2017b).
The remainder of the paper is organized as follows: Section 2 discusses the formulation of the refactoring scheduling problem, and describes how to reduce the search-space size using partial order reduction. Section 3 describes RePOR in detail. Section 4 presents the case study for evaluating our approach. Section 5 presents and discusses the results obtained in our case study. Section 7 discloses the threats to the validity of our study. Related work is discussed in Section 8. Finally, we present our conclusions and lay out directions for future work in Section 9.
2. Formulation of the refactoring scheduling problem
As a software system ages, its design quality deteriorates unless it is continually maintained (Parnas, 1994). Refactoring is a software maintenance activity that aims to keep the design quality of a software system at an acceptable level, in order to ensure a normal evolution of the system. Typically, refactoring is performed by applying small transformation operations (e.g., moving a method/field to another class) to a software system while preserving its original behavior. Since there is a wide range of candidate refactorings that can be applied on a system, depending on the domain of the system, an optimal solution may be comprised of several refactorings that improve different quality attributes. Hence, the refactoring scheduling problem consists of finding the best combination of refactorings that maximizes the design quality improvement of a software system. The problem of finding an optimal order can be solved using search-based techniques. Search algorithms start by generating one or more random sequences. Next, the quality of each sequence is computed by applying it to the software […]
[Excerpt cropped at the right margin; "[…]" marks the text lost at each line end.]
[…] the number of occurrences of an[…] The outcome of Q(SR) is a nega[…] moves anti-patterns; zero if the […] same, and positive otherwise. T[…] lated to the presence and the or[…]
Hence, we suggest that refac[…] on the classes that they affect. I[…] parately. Since the order of app[…] ferent classes in a sequence is irr[…] refactoring operations that we n[…] that we have a set of refact[…] According to Morales et al. (2[…] quences (S) that we could gener[…] given by Eq. (2).

$$ S = \begin{cases} \lfloor e \cdot n! \rfloor & \forall n \geq 1 \\ 1 & n = 0 \end{cases} \qquad (2) $$

where e is the Euler constant, […] available.
Applying Eq. (2) to our ex[…] ($\lfloor e \cdot 2! \rfloor = 5$): < >, < A >, < B […] if (iff) we assume that each per[…] (here the term solution refers to […] sequence to a system, i.e., the re[…] and < B, A > are two different […] and only 4 different solutions ex[…]
In the case of refactorings th[…] design may vary depending on […] factorings, as the application of […] the rest of refactorings. We can […] factorings as an undirected graph […] ru, rv ∈ Rk, k ∈ K, where K is the […] set of refactorings that affect cla[…] graph, is linked to the structure o[…] refactorings modify a class, an[…] factorings affect the number of […] after refactoring.
We use GB to find the conne[…] component is a maximal subgra[…] connected by a path. Connecte[…] over the refactoring operations. […] reduction from model checking (L[…] the removal of sequences of refa[…]
Partial order reduction (POR) i[…] tativity of asynchronous systems […] concurrent models impose an a[…] events, refactoring scheduling im[…] refactoring operations. The orde[…] instructions is meaningless (as th[…] Hence, we can consider just o[…] property since the other ordering[…] to construct a reduced state gra[…]
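The bound in Eq. (2) can be checked against a direct count. The sketch below assumes that n denotes the number of available refactorings and that a sequence is any ordered selection of distinct refactorings, including the empty one, which is how Table 2 enumerates 16 sequences for {r1, r2, r3}. The class and method names are illustrative only.

```java
/** Compares the closed form of Eq. (2) with a direct count of ordered selections. */
final class SequenceCount {

    /** Sums n!/(n-k)! for k = 0..n: the number of ordered selections of k distinct items. */
    static long countByEnumeration(int n) {
        long total = 0;
        long arrangements = 1;            // n!/(n-0)! = 1, the empty sequence
        for (int k = 0; k <= n; k++) {
            total += arrangements;
            arrangements *= (n - k);      // moves from length k to length k + 1
        }
        return total;
    }

    /** Closed form of Eq. (2): floor(e * n!) for n >= 1, and 1 for n = 0. */
    static long closedForm(int n) {
        if (n == 0) {
            return 1;
        }
        double factorial = 1.0;
        for (int i = 2; i <= n; i++) {
            factorial *= i;
        }
        // Double precision is enough for the small n used here; large n would need BigDecimal.
        return (long) Math.floor(Math.E * factorial);
    }
}
```

For n = 3 both methods give 16, and for n = 2 both give 5, matching the values quoted in the excerpt.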
Are all permutations relevant?
$$ Q(SR) = \sum_{k \in K} Q(sr_k); \quad \text{with } Q(sr_k) = AC(k') - AC(k) \qquad (1) $$

In Eq. (1), SR is a subset of R; R is the set of refactorings […] applied in a system SYS; K is the set of classes in SYS, K ∈ SYS […] subset of SR that modifies class k (k ∈ K). Each sub-function […] computed by subtracting the number of occurrences of anti-patterns […] class k after applying sr_k to k (i.e., AC(k′)) and the number of occurrences of anti-patterns before refactoring (i.e., AC(k)). Note that […] the number of occurrences of anti-patterns as a proxy of design […] The outcome of Q(SR) is a negative value when applying SR removes anti-patterns; zero if the number of anti-patterns remains the same, and positive otherwise. The quality effect of applying […]
Objective Function
Callouts for Eq. (1): k′ is the class after refactoring, k is the class before refactoring, and AC is the anti-patterns count.
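A minimal sketch of how Eq. (1) could be evaluated, assuming the anti-pattern counts AC(k) and AC(k′) are already available as maps from class name to count. The class name `DesignQuality`, the map representation, and the choice to treat a class missing from a map as having zero anti-patterns are assumptions made for illustration, not part of the source.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Evaluates Q(SR) as the sum over classes k of AC(k') - AC(k). */
final class DesignQuality {

    static int quality(Map<String, Integer> antiPatternsBefore,
                       Map<String, Integer> antiPatternsAfter) {
        // Consider every class that appears before or after refactoring.
        Set<String> classes = new TreeSet<>(antiPatternsBefore.keySet());
        classes.addAll(antiPatternsAfter.keySet());

        int q = 0;
        for (String k : classes) {
            int before = antiPatternsBefore.getOrDefault(k, 0); // AC(k)
            int after = antiPatternsAfter.getOrDefault(k, 0);   // AC(k')
            q += after - before;
        }
        return q; // negative: anti-patterns removed; zero: unchanged; positive: worse
    }
}
```

With hypothetical counts {Geometry: 2, Rectangle: 1} before and {Geometry: 1, Rectangle: 0} after, the result is -2, a negative value, which in the sign convention of Eq. (1) means the design improved.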
[Excerpt on k-bounded pseudo-Boolean optimization, cropped at both margins; "[…]" marks lost text.]
[…] method for identifying improving moves in the radius […] ball can be applied to all k-bounded pseudo-Boolean optimization problems. This makes our method […]: every compressible pseudo-Boolean optimization problem can be transformed into a quadratic pseudo-Boolean optimization problem with k = 2.
[…] family of k-bounded pseudo-Boolean optimization […] have also been described as an embedded landscape. […] embedded landscape [3] with bounded epistasis k is de[…] function f(x) that can be written as the sum […] subfunctions, each one depending at most on k input […] That is:

$$ f(x) = \sum_{i=1}^{m} f^{(i)}(x), $$

[…] subfunctions $f^{(i)}$ depend only on k components […] Embedded landscapes generalize NK-landscapes and […]SAT problem. We will consider in this paper that […] of subfunctions is linear in n, that is, m ∈ O(n). […]
[…] Let us define w[…] such that the i-th element of $w_l$ is […] on variable $x_i$. […] has bounded epistasis k, the num[…] $|w_l|$, is at most k. […]

f = f^{(1)}(x) + f^{(2)}(x) + f^{(3)}(x) + f^{(4)}(x)
The structure is well-known in optimization…
[Figure: Variable Interaction Graph over the variables x1, x2, x3, x4]
Objective Function
[Figure: variable interaction graph with nodes x1, x2, x3, x4, x5, x6]
If the variable interaction graph has several connected components, we can optimize each of them independently.
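The independence claim can be illustrated with a small sketch. It assumes an objective written as a sum of subfunctions, each declaring the variables it reads, and it minimizes by exhaustive search, which is only feasible for tiny instances. All names (`SeparableOptimization`, `Sub`, and so on) are hypothetical; the union-find and the brute-force search are stand-ins chosen for brevity, not the method used in the seminar.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

final class SeparableOptimization {

    /** A subfunction f^(i): the variables it reads and its value on a full assignment. */
    static final class Sub {
        final int[] vars;
        final ToIntFunction<boolean[]> value;
        Sub(int[] vars, ToIntFunction<boolean[]> value) { this.vars = vars; this.value = value; }
    }

    // Union-find root lookup with path halving.
    static int find(int[] parent, int v) {
        while (parent[v] != v) { parent[v] = parent[parent[v]]; v = parent[v]; }
        return v;
    }

    /** Labels each variable with the representative of its connected component in the VIG. */
    static int[] components(int n, List<Sub> subs) {
        int[] parent = new int[n];
        for (int v = 0; v < n; v++) parent[v] = v;
        for (Sub s : subs)                          // variables of one subfunction interact
            for (int j = 1; j < s.vars.length; j++)
                parent[find(parent, s.vars[j])] = find(parent, s.vars[0]);
        int[] label = new int[n];
        for (int v = 0; v < n; v++) label[v] = find(parent, v);
        return label;
    }

    static int f(List<Sub> subs, boolean[] x) {
        int total = 0;
        for (Sub s : subs) total += s.value.applyAsInt(x);
        return total;
    }

    /** Exhaustively minimizes f over the listed variables, leaving x at the best assignment found. */
    static void minimizeOver(List<Sub> subs, boolean[] x, List<Integer> vars) {
        int best = Integer.MAX_VALUE;
        long bestMask = 0;
        for (long mask = 0; mask < (1L << vars.size()); mask++) {
            for (int j = 0; j < vars.size(); j++) x[vars.get(j)] = ((mask >> j) & 1) == 1;
            int value = f(subs, x);
            if (value < best) { best = value; bestMask = mask; }
        }
        for (int j = 0; j < vars.size(); j++) x[vars.get(j)] = ((bestMask >> j) & 1) == 1;
    }

    /** Optimizes one connected component at a time. */
    static int minimizeByComponents(int n, List<Sub> subs) {
        int[] label = components(n, subs);
        boolean[] x = new boolean[n];
        for (int root = 0; root < n; root++) {
            List<Integer> vars = new ArrayList<>();
            for (int v = 0; v < n; v++) if (label[v] == root) vars.add(v);
            if (!vars.isEmpty()) minimizeOver(subs, x, vars);
        }
        return f(subs, x);
    }

    /** Reference: optimizes over all variables at once. */
    static int minimizeGlobally(int n, List<Sub> subs) {
        List<Integer> all = new ArrayList<>();
        for (int v = 0; v < n; v++) all.add(v);
        boolean[] x = new boolean[n];
        minimizeOver(subs, x, all);
        return f(subs, x);
    }
}
```

The component-wise search explores the sum of 2^|component| assignments instead of 2^n, and both methods return the same minimum because no subfunction spans two components.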
Dependency Graph (GB)
[Figure: dependency graph GB with nodes r1, r2, r3, r4, r5, r6]
Two refactoring operations are adjacent in GB when both touch the same class.
We can optimize each connected component of GB independently, exploring all the possible sequences in the component.
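As a rough sketch of the adjacency rule above, the following code groups refactorings into connected components, given the set of classes each one touches. The class name `DependencyGraph`, the map-based input, and the example data in the usage note are hypothetical; they are not taken from the seminar example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DependencyGraph {

    /**
     * Returns the connected components of the graph in which two refactorings
     * are adjacent when the sets of classes they touch intersect.
     */
    static List<List<String>> connectedComponents(Map<String, Set<String>> touchedClasses) {
        List<String> ids = new ArrayList<>(touchedClasses.keySet());
        Collections.sort(ids);                         // deterministic output order
        Set<String> visited = new HashSet<>();
        List<List<String>> components = new ArrayList<>();

        for (String start : ids) {
            if (!visited.add(start)) continue;         // already placed in a component
            List<String> component = new ArrayList<>();
            Deque<String> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty()) {                 // breadth-first search
                String current = queue.remove();
                component.add(current);
                for (String other : ids) {
                    boolean adjacent = !Collections.disjoint(
                            touchedClasses.get(current), touchedClasses.get(other));
                    if (adjacent && visited.add(other)) queue.add(other);
                }
            }
            Collections.sort(component);
            components.add(component);
        }
        return components;
    }
}
```

For instance, with hypothetical inputs r1 → {A, B}, r2 → {B, C}, r3 → {D}, r4 → {C}, r5 → {D, E}, r6 → {F}, the components are [r1, r2, r4], [r3, r5] and [r6]: r1 and r4 share no class but are linked through r2.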
Dependency Graph (GB): example
What is the dependency graph in our example?
[…] kinds of conflicts: sequential dependency conflicts and mutual exclusion conflicts. We elaborate more on these two kinds of conflicts in the following.
• Given two refactorings ri and rj, ri has a sequential dependency conflict with rj iff rj cannot be applied before ri. We represent sequential dependency conflicts as follows: r1 → r2, which means that r1 can be followed by r2, but r2 cannot be followed by r1. Note that conflicts are directional, i.e., the fact that applying rj disables ri does not necessarily mean that ri disables rj.
• Given two refactorings ri and rj, ri has a mutual exclusion conflict with rj iff ri and rj cannot be applied together in any order. We represent mutual exclusion with the following notation: ¬(r1 ↔ r2).
[…] belongs to class B instead, if A is a subclass of B.
To better illustrate the refactoring scheduling problem, and the effect that the consideration of dependencies and conflicts between refactorings has on the size of the search space, we present an example of the problem in Listing 1.
The refactorings presented in Table 1 can be applied to refactor the classes described in Listing 1.
Table 1 contains three types of refactorings from Fowler (1999b) that we describe below:
1. Move method. Move a method from one class to another (e.g., to one of its parameter types (Seng et al., 2006)).
2. Inline Class. If a class contains few responsibilities, move all its features to another class and remove it.
[…]
Listing 1. Example of classes to be refactored.
[…] code-analyses with typically 100% precision and recall for associations and a high precision and recall for aggregations. Composition relationships cannot be entirely identified statically because they involve the lifetime of the instances of the classes involved in such relationships. Hence, idiom-level models include association and aggregation relationships and only the few composition relationships that can be identified with high precision and recall statically. A design-level model contains information about occurrences of design motifs, code smells, and anti-patterns. A code meta-model should provide methods to manipulate the design model and generate other models. The objective of this step is to manipulate the design model of a system programmatically. Hence, the code meta-model is used to detect anti-patterns, apply refactoring sequences and evaluate their impact on the design quality of a system. More information related to code meta-models, design motifs and micro-architecture identification can be found in Guéhéneuc and Albin-Amiot (2004) and Guéhéneuc and Antoniol (2008).
3.2. Step 2: detect anti-patterns
In this step we detect anti-patterns in the meta-model using any […]
Table 1
List of refactoring candidates for the example from Listing 1.
ID  Type                        Source class  Method                   Target class
r1  Move method                 Geometry      calcAreaRectangle        Rectangle
r2  Inline Class                Rectangle     All fields and methods   Shape
r3  Introduce Parameter Object  Geometry      longParameterListMethod  GeometryParamObj (new)
R. Morales et al.
[Figure: nodes r1, r2, r3, with edges left to be drawn]
Task: find the dependency graph
Conflict Graph (GC)
[Figure: conflict graph GC with nodes r1, r2, r3, r4, r5, r6. Legend: sequential dependency conflict; mutual exclusion conflict.]
The conflict graph is used to reduce the number of sequences to explore in each component.
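A sketch of how conflicts can prune sequences, under two stated assumptions: a sequential dependency conflict ri → rj invalidates any sequence in which rj appears anywhere before ri, and a mutual exclusion conflict invalidates any sequence that contains both refactorings. Refactorings are identified by index (0 for r1, 1 for r2, and so on); the class name `ConflictFilter` and the matrix encoding are illustrative choices.

```java
import java.util.ArrayList;
import java.util.List;

final class ConflictFilter {

    /**
     * sequential[i][j] == true encodes r_i -> r_j (r_j must not precede r_i);
     * exclusive[i][j] == true encodes a mutual exclusion between r_i and r_j.
     */
    static boolean isValid(List<Integer> sequence, boolean[][] sequential, boolean[][] exclusive) {
        for (int a = 0; a < sequence.size(); a++) {
            for (int b = a + 1; b < sequence.size(); b++) {
                int first = sequence.get(a);
                int second = sequence.get(b);
                if (sequential[second][first]) return false;   // order is reversed
                if (exclusive[first][second] || exclusive[second][first]) return false;
            }
        }
        return true;
    }

    /** Enumerates every ordered selection of distinct refactorings out of n, including the empty one. */
    static List<List<Integer>> allSequences(int n) {
        List<List<Integer>> out = new ArrayList<>();
        extend(new ArrayList<>(), new boolean[n], out);
        return out;
    }

    private static void extend(List<Integer> prefix, boolean[] used, List<List<Integer>> out) {
        out.add(new ArrayList<>(prefix));
        for (int r = 0; r < used.length; r++) {
            if (used[r]) continue;
            used[r] = true;
            prefix.add(r);
            extend(prefix, used, out);
            prefix.remove(prefix.size() - 1);   // remove by index: undo the last choice
            used[r] = false;
        }
    }

    static int countValid(int n, boolean[][] sequential, boolean[][] exclusive) {
        int count = 0;
        for (List<Integer> sequence : allSequences(n)) {
            if (isValid(sequence, sequential, exclusive)) count++;
        }
        return count;
    }
}
```

With three refactorings and the single conflict r1 → r2, 12 of the 16 sequences remain valid under these assumptions. This sketch only drops invalid orderings; it does not remove redundant permutations of unrelated refactorings, which is a separate reduction.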
What is the conflict graph in our example?
[Figure: nodes r1, r2, r3, with edges left to be drawn]
Task: find the conflict graph
Conflict Graph (GC): example
Input: System to refactor (SYS), maximum number of refactoring operations in a connected component subgraph (threshold)
Output: An optimal sequence of refactoring operations (SR)
Require Proc: extractBestPermutation, getFirstValidSequenceFromccap
Steps RePOR(SYS, threshold)
    AM = code meta-model generation(SYS)
    A = Detect Anti-patterns(AM)
    R = Generate set of refactoring candidates(AM, A)
    GB = Build Graph of dependencies between refactorings and anti-patterns(AM, R, A)
    CCAP = Find connected components(GB)
    GC = Build Graph of conflicts between refactorings(AM, LR)
    SR = Schedule sequence of refactorings(CCAP, GC, AM)
Procedure Schedule sequence of refactorings(CCAP, GC, AM):
    SR = ∅
    for each ccap ∈ CCAP do
        ccap.RemoveInvalidRefactorings(SR)
        if ccap.size == 0 then
            continue
        else
            List permuts = enumeratePermutations(ccap)
            if permuts ≤ threshold then
                SR.addAll(extractBestPermutation(AM, GC, permuts))
            else
                SR.addAll(getFirstValidSequenceFromccap(AM, GC, ccap, R))
            end if
        end if
    end for
    return SR
end
Algorithm 1. RePOR.
RePOR
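The scheduling procedure of Algorithm 1 can be sketched in a few lines. This is a simplified illustration, not the plug-in's implementation: it assumes the connected components are already computed, compares the component size against the threshold (following the Input description), takes the quality function as a caller-supplied parameter where lower is better, and replaces getFirstValidSequenceFromccap with simply keeping the component's given order. Conflict handling and RemoveInvalidRefactorings are omitted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.ToIntFunction;

final class ReporScheduleSketch {

    static List<String> schedule(List<List<String>> components, int threshold,
                                 ToIntFunction<List<String>> quality) {
        List<String> sequence = new ArrayList<>();
        for (List<String> component : components) {
            if (component.isEmpty()) continue;
            if (component.size() <= threshold) {
                // Small component: explore every permutation and keep the best one.
                sequence.addAll(bestPermutation(component, quality));
            } else {
                // Large component: hypothetical fallback, keep the order as given.
                sequence.addAll(component);
            }
        }
        return sequence;
    }

    /** Returns the permutation with the lowest quality value (first one found on ties). */
    static List<String> bestPermutation(List<String> items, ToIntFunction<List<String>> quality) {
        List<List<String>> all = new ArrayList<>();
        permute(new ArrayList<>(items), 0, all);
        List<String> best = all.get(0);
        for (List<String> candidate : all) {
            if (quality.applyAsInt(candidate) < quality.applyAsInt(best)) best = candidate;
        }
        return best;
    }

    // Generates permutations in place by swapping each remaining item into position `start`.
    private static void permute(List<String> items, int start, List<List<String>> out) {
        if (start == items.size()) {
            out.add(new ArrayList<>(items));
            return;
        }
        for (int i = start; i < items.size(); i++) {
            Collections.swap(items, start, i);
            permute(items, start + 1, out);
            Collections.swap(items, start, i);
        }
    }
}
```

Because each component is scheduled on its own, the number of permutations explored grows with the factorial of the largest component under the threshold rather than with the factorial of the total number of refactorings.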
Experimental Setup
Subjects
Tools
• PADL to create a high level model of the software
• DECOR to detect and correct anti-patterns on the model
In Table 4 we describe the type of anti-patterns studied and […] refactoring strategies used to remove them. Table 5 shows the number of refactoring candidates that were automatically found in each system.
4.3. RePOR implementation
We instantiate RePOR as an Eclipse plug-in and compared it […] three refactoring approaches. Design improvement (DI) is measured using Eq. (3). To determine the value of the parameter threshold […]
Listing 2. Rule card of Blob anti-pattern from DECOR.
Table 3
Descriptive statistics about the studied systems.
System NOC KLOC BL LC LP SC SG Total
Apache Ant 1.8.2 697 191 57 40 35 3 6 141
ArgoUML 0.34 1754 183 131 25 281 1 19 457
GanttProject 1.10.2 188 44 47 4 68 5 6 130
JfreeChart 1.0.19 505 98 41 21 62 1 1 126
Xerces 2.7 540 71 56 25 119 2 3 205
Table 4
List of studied Anti-patterns and the refactorings used to correct them.
Type Description Refactoring(s) strategy
Blob (BL) (Brown et al., 1998) A large class that absorbs most of the functionality of the system with
very low cohesion between its constituents.
Move method (MM). Move the methods that does not seem to fit in
Blob class abstraction to more appropriate classes (Seng et al., 200
Lazy Class (LC) (Fowler, 1999a) Small classes with low complexity that do not justify their existence
in the system.
Inline class (IC). Move the attributes and methods of the LC to anot
class in the system.
Long Parameter List (LP)
(Fowler, 1999a)
A class with one or more methods having a long list of parameters,
specially when two or more methods are sharing a long list of
parameters that are semantically connected.
Introduce parameter object (IPO). Extract a new class with the long
of parameters and replace the method signature by a reference to
new object created. Then access to this parameters through the
parameter object.
Spaghetti Code (SC)
(Brown et al., 1998)
A class without structure that declares long methods without
parameters.
Replace method with method object (RMWO). Extract long methods i
new classes so all local variables become fields on that object.
Speculative Generality (SG) There is an abstract class created to anticipate further features, but it Collapse hierarchy (CH). Move the attributes and methods of the ch
described in Section 3.7, we executed 30 independent executions for
each of the systems studied in a Windows 10 64-bit, Intel Core 5 at 2.30
GHz, 12 GB of memory machine, and record the size of ccap, where the
performance of RePOR is acceptable, and found =threshold 10 to be the
best trade. The value of threshold indicates that for our experiments, we
only exhaustively explore the permutations of a ccap containing 10 or
less refactoring operations, and evaluate the resultant permutations
only after removing any conflicted refactoring operation.
The directed graph of conflicts (GC) is used for the three meta-
heuristics to avoid scheduling invalid refactorings. Due to the random
nature of the metaheuristics studied (i.e., ACO, GA, and SWAY) it is ne-
cessary to perform several independent runs to have an idea of the
behavior of the algorithms. Hence, we execute 30 independent runs for
all the approaches studied and for each system. This is a typical
minimum value (i.e., 30 runs) used in the search-based research com-
Table 5
Number of refactoring candidates automatically generated for each studied system.
System          CH    IC    IPO   MM     RMWO   Total
Ant             6     9     35    4269   3      4322
ArgoUML         19    25    281   2475   1      2800
Gantt Project   6     4     68    3861   5      3944
JfreeChart      1     21    62    4228   1      4313
Xerces          3     25    119   4118   2      4267
R. Morales et al.
86. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Experimental Setup
Performance measures
• Design Improvement
• Execution time (ET): runtime of algorithms
• Refactoring Effort (RE): number of refactoring operations in the sequence
For all statistical tests, we consider a significance level of …
For RQ1, we measure the effectiveness of RePOR at removing anti-patterns in software systems using the following dependent variable:
• Design Improvement (DI). DI represents the delta of anti-pattern occurrences between the refactored system (SYS′) and the original system (SYS), and it is computed using the following formulation:

\[
DI(SYS) = \frac{AC(SYS') - AC(SYS)}{AC(SYS)} \times 100 \qquad (3)
\]

where AC(SYS) is the number of anti-patterns in a system SYS and AC(SYS) ≥ 0. DI, which is a positive real number, represents the improvement amount in percentage, and high positive values are desired. Note that Eq. (3) assumes that AC(SYS′) − AC(SYS) < 0, as RePOR filters out solutions that make the design worse according to the desiredEffect threshold (cf. Algorithm 4).
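As a small worked example of the DI measure, the sketch below computes the percentage of anti-patterns removed. Because Table 7 reports positive percentages, the function takes the reduction AC(SYS) − AC(SYS′) as its numerator; that sign convention is an assumption of this sketch. The function name and the figure of 56 remaining anti-patterns are also illustrative: the latter is inferred from the 141 anti-patterns reported for Apache Ant and a DI of 60.28, not stated in the slides.

```python
def design_improvement(ac_before, ac_after):
    """Percentage of anti-patterns removed by a refactoring sequence.

    ac_before: number of anti-patterns in the original system, AC(SYS)
    ac_after:  number of anti-patterns in the refactored system, AC(SYS')
    """
    if ac_before <= 0:
        # The ratio is undefined when there is nothing to remove.
        raise ValueError("DI is undefined when the original system has no anti-patterns")
    return (ac_before - ac_after) / ac_before * 100.0


# Apache Ant has 141 detected anti-patterns; 56 remaining is an inferred, illustrative value.
ant_di = design_improvement(141, 56)
print(f"DI = {ant_di:.2f}%")
```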
The independent variable is the refactoring approach applied to each studied system. We statistically compare the number of remaining anti-patterns after refactoring a system using RePOR with the number of remaining anti-patterns when using other refactoring approaches. Specifically, we test the following hypothesis H01: There is no difference between the number of remaining anti-patterns of a system refactored using RePOR and a system refactored using other refactoring approaches. We test the hypothesis using a non-parametric test, i.e., the Mann–Whitney U test (Hollander et al., 201…).
For estimating the magnitude of the differences of means between …
Algorithms
• RePOR
• Conflict-aware scheduling of refactoring heuristic by Liu et al. (2008) (LIU)
• Ant Colony Optimization (ACO)
• Genetic Algorithm (GA)
• SWAY metaheuristic by Chen et al. (2018)
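To make the comparison between approaches more concrete, the Mann–Whitney U statistic named in the setup can be computed by counting pairs, as in the sketch below. It is only an illustration: the sample values are invented, and the p-value computation is omitted (in practice a statistics library would be used for it).

```python
def mann_whitney_u(xs, ys):
    """U statistic for xs: number of (x, y) pairs with x > y, ties counting one half."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# INVENTED samples: remaining anti-patterns over a few runs of two approaches.
remaining_a = [60, 61, 59, 62]
remaining_b = [56, 56, 57, 56]

u_a = mann_whitney_u(remaining_a, remaining_b)
u_b = mann_whitney_u(remaining_b, remaining_a)
print(u_a, u_b)
```

The two statistics always add up to the number of pairs, so a value of `u_a` equal to `len(remaining_a) * len(remaining_b)` means every run of the first approach left more anti-patterns than every run of the second.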
87. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda
Results
RQ1: To what extent can RePOR remove anti-patterns?
We present in Table 7 the design improvement (DI) in general and for different anti-pattern types. […] the rest of the systems.
We reject the null hypothesis H01 for Ant, ArgoUML, Gantt, JfreeChart, and Xerces. In these five systems, the number of remaining anti-patterns after refactoring using RePOR is significantly lower than the number of anti-patterns remaining in the systems after refactoring using the other refactoring approaches (i.e., ACO, …
Table 7
Design Improvement (%) in general and for different anti-pattern types.
Metaheuristic   DI      DI_BL   DI_LC   DI_LP   DI_SC   DI_SG
Ant
  ACO           57.45   68.42   22.5    74.29   66.67   100
  GA            58.16   68.42   22.5    74.29   66.67   100
  LIU           58.87   54.39   22.5    100     66.67   100
  RePOR         60.28   57.89   22.5    100     66.67   100
  SWAY          45.36   57.89   20      60      66.67   83.33
ArgoUML
  ACO           75.93   51.15   100     83.63   100     100
  GA            76.59   51.15   100     84.7    100     100
  LIU           81.40   50.38   100     92.88   100     100
  RePOR         81.62   38.93   100     98.58   100     100
  SWAY          62.91   48.09   84      66.01   100     86.84
Gantt Project
  ACO           60      17.02   100     83.82   70      100
  GA            60.77   14.89   100     85.29   80      100
  LIU           63.85   14.89   100     92.65   60      100
  RePOR         66.15   8.51    75      100     100     100
  SWAY          50      8.51    100     70.59   60      100
JfreeChart
  ACO           75.4    39.02   100     89.52   100     100
  GA            75.4    39.02   100     90.32   100     100
  LIU           72.22   31.71   100     88.71   100     100
  RePOR         75.4    24.39   100     100     100     100
  SWAY          61.90   36.59   90.48   73.39   100     100
Xerces
  ACO           56.59   14.29   100     65.55   100     100
  GA            57.56   14.29   100     67.23   100     100
  LIU           64.39   16.07   100     78.99   50      100
  RePOR         73.17   5.36    100     98.32   100     100
  SWAY          41.87   14.29   68.00   49.58   50      100
Table 8
Pair-wise Mann–Whitney U test for design improvement.
Pair            p-value         Cliff's δ     Magnitude
Ant
  ACO-RePOR     2.561349e−12    1             Large
  GA-RePOR      1.431438e−11    1             Large
  LIU-RePOR     1.685298e−14    1             Large
  SWAY-RePOR    1.190193e−12    1             Large
ArgoUML
  ACO-RePOR     1.176641e−12    1             Large
  GA-RePOR      1.143381e−12    1             Large
  LIU-RePOR     1.685298e−14    1             Large
  SWAY-RePOR    1.206843e−12    1             Large
Gantt Project
  ACO-RePOR     1.036681e−12    1             Large
  GA-RePOR      1.086586e−12    1             Large
  LIU-RePOR     1.685298e−14    1             Large
  SWAY-RePOR    1.165138e−12    1             Large
JfreeChart
  ACO-RePOR     0.06868602      0.2333333     Small
  GA-RePOR      0.2771456       −0.1333333    Negligible
  LIU-RePOR     1.685298e−14    1             Large
  SWAY-RePOR    1.183399e−12    1             Large
Xerces
  ACO-RePOR     1.0618e−12      1             Large
  GA-RePOR      9.946555e−13    1             Large
  LIU-RePOR     1.685298e−14    1             Large
  SWAY-RePOR    1.193116e−12    1             Large
R. Morales et al.
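The Cliff's δ column of Table 8 can be illustrated with a short sketch. Cliff's δ is the proportion of pairs in which a value from the first sample exceeds one from the second, minus the proportion in which it is smaller. The magnitude cut-offs used below (0.147, 0.33, 0.474) are the commonly used ones; the slides do not state which thresholds were applied, so they are an assumption here, although they agree with the labels shown in the table. The sample values are invented.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (x, y)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def magnitude(delta):
    """Label an effect size using commonly used cut-offs (assumed, not from the slides)."""
    d = abs(delta)
    if d < 0.147:
        return "Negligible"
    if d < 0.33:
        return "Small"
    if d < 0.474:
        return "Medium"
    return "Large"


# INVENTED samples: every value of the first sample exceeds every value of the second.
delta = cliffs_delta([60, 61, 59], [56, 57, 56])
print(delta, magnitude(delta))
```

A δ of 1, as in most rows of Table 8, means the two samples do not overlap at all.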