Seminar-workshop
Introduction to Search-Based Software Engineering
Francisco Chicano
Departamento de Lenguajes y Ciencias de la Computación
Universidad de Almería, 26 and 27 October 2020 (online)
chicano@lcc.uma.es
@francischicano
www.franciscochicano.es
José Francisco Chicano García
Schedule
Time          Monday 26                      Tuesday 27
9:00-10:30    Introduction to SBSE and NRP   Test case minimization
10:30-10:45   Break                          Break
10:45-12:15   NRP (continued)                Refactoring
12:15-12:30   Break                          Break
12:30-14:00   Module clustering              Project scheduling and knowledge test
There will be a short knowledge test on Tuesday 27 in the last time slot.
Workshop materials
Software:
• RStudio (online version at https://rstudio.cloud)
• Symphony (open-source ILP solver)
• Rsymphony (R package that interfaces with Symphony)
Code and examples:
• Available on GitHub: https://github.com/jfrchicanog/TallerUAL2020
• And on RStudio Cloud: https://rstudio.cloud/project/1815713
Task: log in to RStudio and install Rsymphony
Outline
• Introduction to SBSE
• Next Release Problem (NRP)
• Integer Linear Programming
• Multi-objective Optimization
• Software Module Clustering
• Test Case Minimization
• Automatic Software Refactoring
• Software Project Scheduling
• Conclusion
• Knowledge Test
Software Engineering
Search problems
A search problem is a binary relation R ⊆ X × Y such that, given an instance x ∈ X, we want to find a solution y ∈ Y with (x, y) ∈ R.
Examples of search problem instances:
- Find the prime factors of 15
- Find a string that matches the regular expression a*b
- Find a real number x that minimizes the expression (x-1)^2
We will focus mainly on one subtype of search problems: optimization problems.
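To make the relation R concrete, the sketch below treats the first two example instances as "find y, then check that (x, y) ∈ R". It is written in Python for illustration only (the workshop itself uses R), and every function name is an assumption introduced here.

```python
import re

def prime_factors(n):
    """Return the prime factors of n (with multiplicity) by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def in_relation_factoring(x, y):
    """(x, y) is in R when y is a list of primes whose product is x."""
    product = 1
    for p in y:
        product *= p
    return product == x and all(prime_factors(p) == [p] for p in y)

def in_relation_regex(pattern, candidate):
    """(pattern, candidate) is in R when the whole candidate matches the pattern."""
    return re.fullmatch(pattern, candidate) is not None

factors_of_15 = prime_factors(15)
```

Checking a candidate solution is usually much cheaper than finding one, which is why the relation is a convenient way to state the problem.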
Optimization problems
An optimization problem is a pair P = (S, f) where:
S is a set of solutions (the search space)
f: S → ℝ is an objective function to minimize or maximize
If the goal is to minimize the function, we look for:
$$s' \in S \ \text{such that}\ f(s') \le f(s) \quad \forall s \in S$$
[Figure: plot of an objective function marking a global maximum, a local maximum, a global minimum, and a local minimum]
Optimization algorithms
OPTIMIZATION TECHNIQUES
EXACT
  Calculus-based: gradient, Lagrange multipliers
  Exhaustive: dynamic programming, branch and bound, ILP solver
APPROXIMATE
  Ad hoc heuristics
  Metaheuristics
    Trajectory-based: SA, VNS, TS
    Population-based: EA, ACO, PSO
Hybrids
Search-Based Software Engineering (SBSE)
Spanish term: Ingeniería del Software Guiada por Búsqueda
Search or optimization problem → search or optimization algorithm → solution
[Figure: objective-function landscape marking a global maximum, a local maximum, a global minimum, and a local minimum]
Search-Based Software Engineering (SBSE)
• Requirements for the next release
• Software module clustering
• Test case minimization
• Automatic refactoring
• Project scheduling
Next Release Problem (NRP)
Given:
• A set of requirements R = {r1, r2, ..., rn} ...
• ... each with a cost cj and a value sj (in Bagnall et al. the value comes from the customers)
• A set of functional interactions between requirements:
  - Implication or precedence (ri before rj), $r_i \Rightarrow r_j$: requirement rj cannot be selected unless requirement ri has been implemented beforehand.
  - Combination or coupling (ri together with rj): requirements ri and rj must be included jointly in the software.
  - Exclusion (not both): requirement ri cannot be included together with rj.
Find a subset of requirements X ⊆ R that satisfies the interactions, minimizes the cost, and maximizes the value:
$$\min\ \mathrm{cost}(X) = \sum_{j,\, r_j \in X} c_j \qquad \max\ \mathrm{value}(X) = \sum_{j,\, r_j \in X} s_j$$
The importance of requirement rj for customer i is represented by $v_{ij} \in \mathbb{R}$. The value added by including rj in the next release can be computed as the weighted sum of these importance values:
$$s_j = \sum_{i=1}^{m} w_i\, v_{ij}$$
Formulations considered: Bagnall et al. and van der Akker et al.
In the customer-based formulation of Bagnall et al., customer i contributes its weight wi only when every requirement it requests is selected. With Q the set of pairs (j, i) such that customer i requests requirement j, and [·] equal to 1 when the condition holds and 0 otherwise:
$$\mathrm{value}(\hat{R}) = \sum_{i=1}^{m} w_i \prod_{(j,i) \in Q} \left[\, j \in \hat{R} \,\right]$$
Next Release Problem (NRP): example
Customers (importance)
Requirement Cost Customer 1 (4) Customer 2 (2) Customer 3 (5)
r1 2 x x
r2 4 x
r3 3 x x
r4 5 x
cost({r1, r3}) =
value({r1, r3}) =
cost({r1, r2, r3}) =
value({r1, r2, r3}) =
Introduction to linear programming
A linear program has the form
$$\max \sum_{j=1}^{n} c_j x_j$$
subject to:
$$\sum_{j=1}^{n} a_{1j} x_j \le b_1$$
$$\sum_{j=1}^{n} a_{2j} x_j \le b_2$$
$$\dots$$
$$\sum_{j=1}^{n} a_{mj} x_j \le b_m$$
$$x_j \ge 0 \qquad j = 1, 2, \dots, n$$
The same program in compact form:
$$\max \sum_{j=1}^{n} c_j x_j \quad \text{subject to} \quad \sum_{j=1}^{n} a_{ij} x_j \le b_i \quad i = 1, 2, \dots, m, \qquad x_j \ge 0 \quad j = 1, 2, \dots, n$$
And in matrix form:
$$\max\ c \cdot x \quad \text{subject to} \quad Ax \le b, \quad x \ge 0$$
Introduction to linear programming
Example:
Maximize x1 + x2
Subject to:
-x1 + 9x2 ≤ 36
9x1 + x2 ≤ 45
x1, x2 ≥ 0
[Figure: feasible region in the (x1, x2) plane, both axes from 0 to 5, with level lines x1 + x2 = const]
Introduction to linear programming: with Rsymphony
Maximize x1 + x2
Subject to:
-x1 + 9x2 ≤ 36
9x1 + x2 ≤ 45
x1, x2 ≥ 0
[Figure: feasible region in the (x1, x2) plane, both axes from 0 to 5]
Note: by default, the constraint matrix is filled column by column (the default of R's matrix()).
Task: solve the program with RStudio
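As a cross-check of what the solver should return for this program, the sketch below enumerates the vertices of the feasible region and keeps the best one. It relies on the fact that a bounded linear program attains its optimum at a vertex. It is plain Python with exact fractions, not the Rsymphony call used in the workshop, and all names are assumptions introduced here.

```python
from fractions import Fraction
from itertools import combinations

# Constraints written as a1*x1 + a2*x2 <= b; the last two rows encode x1 >= 0 and x2 >= 0.
A = [(-1, 9), (9, 1), (-1, 0), (0, -1)]
b = [36, 45, 0, 0]
c = (1, 1)  # objective: maximize x1 + x2

def intersection(i, j):
    """Solve the 2x2 system given by constraints i and j taken as equalities (Cramer's rule)."""
    (a11, a12), (a21, a22) = A[i], A[j]
    det = a11 * a22 - a12 * a21
    if det == 0:
        return None  # parallel lines, no single intersection point
    x1 = Fraction(b[i] * a22 - a12 * b[j], det)
    x2 = Fraction(a11 * b[j] - b[i] * a21, det)
    return (x1, x2)

def feasible(point):
    """Check every constraint at the given point."""
    return all(a1 * point[0] + a2 * point[1] <= rhs for (a1, a2), rhs in zip(A, b))

vertices = []
for i, j in combinations(range(len(A)), 2):
    point = intersection(i, j)
    if point is not None and feasible(point):
        vertices.append(point)

best = max(vertices, key=lambda p: c[0] * p[0] + c[1] * p[1])
best_value = c[0] * best[0] + c[1] * best[1]
```

For this program the best vertex is x1 = x2 = 9/2 with objective value 9. Vertex enumeration only scales to tiny programs; it is shown here because it mirrors the graphical solution on the slide.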
Integer linear programming
We add the restriction that the variables can take only integer values.
Example:
Maximize x1 + x2
Subject to:
-x1 + 9x2 ≤ 36
9x1 + x2 ≤ 45
x1, x2 ≥ 0
x1, x2 integer
[Figure: feasible solutions shown as integer points in the (x1, x2) plane, both axes from 0 to 5]
With Rsymphony
Task: solve the program with RStudio
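Because the integer version of this example has only a handful of candidate points, it can be checked by exhaustive enumeration. The sketch below does that in plain Python; it is an illustration under that assumption, not how an ILP solver such as Symphony works internally, and the names are hypothetical.

```python
from itertools import product

def is_feasible(x1, x2):
    """Constraints of the example program."""
    return -x1 + 9 * x2 <= 36 and 9 * x1 + x2 <= 45 and x1 >= 0 and x2 >= 0

# 9*x1 + x2 <= 45 with x2 >= 0 gives x1 <= 5;
# -x1 + 9*x2 <= 36 with x1 <= 5 gives x2 <= 4.
candidates = [(x1, x2) for x1, x2 in product(range(6), range(5)) if is_feasible(x1, x2)]
best = max(candidates, key=lambda p: p[0] + p[1])
best_value = best[0] + best[1]
```

The integer optimum is (4, 4) with value 8, below the value 9 of the continuous optimum at (4.5, 4.5): adding integrality can only keep or lower the maximum.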
ILP model of NRP: objective
We will solve the single-objective version of Bagnall et al., in which the cost is bounded by a fraction of the total cost of implementing all the requirements.
We define n variables ri for the requirements and m variables si for the customers. They take values 0 and 1.
If ri = 1, requirement i is implemented; if ri = 0, it is not.
If si = 1, customer i is satisfied (all of their requirements are implemented).
The value of customer i for the company is wi.
The cost of implementing requirement i is ci.
The budget is B.
General form of the program: $\max\ c \cdot x$ subject to $Ax \le b$, $x \ge 0$.
Task: find the objective expression
$$\max \sum_{i=1}^{m} w_i s_i$$
ILP model of NRP: cost constraint
Task: find the cost constraint
$$\max \sum_{i=1}^{m} w_i s_i$$
subject to
$$\sum_{i=1}^{n} c_i r_i \le B$$
ILP model of NRP: dependencies
Task: find the constraints for dependencies between requirements (implication)
With P the set of pairs (i, j) such that requirement i must be implemented before requirement j, selecting rj forces ri:
$$r_j \le r_i \qquad \forall (i, j) \in P$$
ILP model of NRP: dependencies
Task: find the constraints for dependencies between requirements (combination)
With C the set of pairs (i, j) of requirements that must be implemented together:
$$r_j = r_i \qquad \forall (i, j) \in C$$
ILP model of NRP: customer satisfaction
Task: find the customer satisfaction constraints
With Q the set of pairs (i, j) such that customer j requests requirement i, customer j can be satisfied only if requirement i is implemented:
$$s_j \le r_i \qquad \forall (i, j) \in Q$$
The model with objective, cost constraint, implication dependencies, and customer satisfaction is:
$$\max \sum_{i=1}^{m} w_i s_i$$
subject to
$$\sum_{i=1}^{n} c_i r_i \le B$$
$$s_j \le r_i \qquad \forall (i, j) \in Q$$
$$r_j \le r_i \qquad \forall (i, j) \in P$$
ILP model of NRP: implementation in R
In the R implementation, the first n entries of the variable vector are the requirement variables and the remaining m entries are the customer variables.
Relevant functions:
• readNrpInstance(file): reads an instance file and returns a list with an internal representation
• ilpModel(nrpInstance, budgetLimitFraction): takes a list with an instance and a fraction (a real number) and builds an ILP model for the instance
Task: solve some instances with R
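The ILP model of NRP can be checked on a tiny instance by enumerating every 0/1 assignment and discarding those that violate a constraint. The sketch below is plain Python, not the workshop's R code. The costs and customer weights come from the example table, but the request pairs in Q are an assumption made for illustration (the table's column alignment is not recoverable from this transcript), and P is assumed empty.

```python
from itertools import product

costs = [2, 4, 3, 5]        # c_i for r1..r4 (from the example table)
weights = [4, 2, 5]         # w_j for customers 1..3 (from the example table)
# ASSUMPTION: hypothetical request pairs (requirement i, customer j), 0-based.
Q = [(0, 0), (2, 0), (0, 1), (1, 1), (2, 2), (3, 2)]
P = []                      # precedence pairs (i, j): r_i before r_j (none assumed)

def solve_nrp(costs, weights, Q, P, budget):
    """Brute-force the ILP: maximize sum(w_j * s_j) under the model's constraints."""
    n, m = len(costs), len(weights)
    best_value, best_r = -1, None
    for bits in product((0, 1), repeat=n + m):
        r, s = bits[:n], bits[n:]   # first n variables: requirements; last m: customers
        if sum(c * x for c, x in zip(costs, r)) > budget:
            continue                # cost constraint
        if any(s[j] > r[i] for i, j in Q):
            continue                # customer satisfaction: s_j <= r_i
        if any(r[j] > r[i] for i, j in P):
            continue                # implication: r_j <= r_i
        value = sum(w * x for w, x in zip(weights, s))
        if value > best_value:
            best_value, best_r = value, r
    return best_value, best_r

value, selected = solve_nrp(costs, weights, Q, P, budget=0.5 * sum(costs))
```

With this assumed Q and a budget of half the total cost (7), only customer 1 can be satisfied, by selecting r1 and r3. Enumeration visits 2^(n+m) assignments, so it is only a sanity check for the solver on small instances.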
Multi-objective optimization
• In a multi-objective (MO) problem there are several objectives (functions) that we want to optimize
[Figure: objective space (f1, f2) showing efficient (non-dominated) solutions, weakly efficient solutions, a non-supported solution, and a dominated solution]
Multi-objective optimization
If we minimize both objectives:
Convex front: easy to solve with weighted sums of the objectives
Concave front: cannot be solved with weighted sums of the objectives, because the solutions in the concave part (non-supported solutions) are not optimal for any choice of weights
[Figure: two plots in the (f1, f2) plane, one with a convex front and one with a concave front]
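The difference between non-dominated and supported solutions can be seen on a handful of points. The sketch below (plain Python, minimizing both objectives) filters the non-dominated points and then collects the points that a weighted sum can reach; the point set and the function names are assumptions chosen for illustration.

```python
def dominates(a, b):
    """a dominates b (minimization): no worse in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def weighted_sum_optima(points, steps=100):
    """Points that minimize w*f1 + (1-w)*f2 for some weight w on a grid over [0, 1]."""
    found = set()
    for k in range(steps + 1):
        w = k / steps
        found.add(min(points, key=lambda p: w * p[0] + (1 - w) * p[1]))
    return found

# Hypothetical objective vectors (f1, f2): (6, 6) lies in a concave part of the front.
points = [(0, 10), (6, 6), (10, 0), (7, 7)]
front = non_dominated(points)
supported = weighted_sum_optima(points)
```

Here (6, 6) is efficient but never wins a weighted sum: its score is always 6, while one of the two extreme points scores at most 5. The point (7, 7) is simply dominated by (6, 6).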
What will the front look like in NRP?
[Figure: two candidate plots with axes cost and value]
Some examples
[Figure 1. Pareto front and approximations obtained by the metaheuristic algorithms. Axes: value and cost. Series: ACS, NSGAII, GRASP, Pareto. Panels: (a) dataset1, (b) dataset2.]
We must note that these times again refer to a different machine (Pentium 4 at 3.2 GHz) and the goal was not to find the complete front, ...
Source: C., Domínguez-Ríos, del Águila, del Sagrado, Alba, JISBD 2016
Multi-objective NRP
Task: find the Pareto front for our example by hand
Task: compute the front using R
Customers (importance)
Requirement Cost Customer 1 (4) Customer 2 (2) Customer 3 (5)
r1 2 x x
r2 4 x
r3 3 x x
r4 5 x
Software module clustering
We want to find a partition of a set of software modules such that the software is structured into subsystems that make it easier to develop and maintain.
Software module clustering
How to measure the quality of a solution:
Intra-connectivity: measures the cohesion between modules that belong to the same subsystem.
Inter-connectivity: measures the coupling between modules that belong to different subsystems.
The modularization quality of the system (Modularization Quality, MQ) combines both.
Software module clustering
Given a module dependency graph G = (V, A), we define a weight w for each edge. We call n the number of nodes (modules) and m the number of edges (relations or dependencies).
The modularization quality of the system is defined as the sum of the modularization factors MF of its subsystems, where for each subsystem (this form is the one used by the worked example of this section):
$$MF = \begin{cases} 0 & \text{if } i = 0 \\ \dfrac{i}{i + \frac{1}{2} j} & \text{if } i > 0 \end{cases}$$
The value i (intra-connectivity) is the sum of the weights of the edges whose two endpoints are inside the subsystem. It measures cohesion.
The value j (inter-connectivity) is the sum of the weights of the edges with one endpoint inside the subsystem and the other outside. It measures coupling.
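The definition of MQ in terms of i and j can be sketched in a few lines. The code below is plain Python with exact fractions and assumes integer edge weights and the factor MF = i / (i + j/2), which is the form that reproduces the values 1/2, 4/7 and 6/7 of the workshop's worked example. The graph and the function names are hypothetical.

```python
from fractions import Fraction

def cluster_factor(i, j):
    """Modularization factor of one subsystem from its intra (i) and inter (j) weights."""
    if i == 0:
        return Fraction(0)
    return Fraction(i) / (i + Fraction(j, 2))

def modularization_quality(edges, cluster_of):
    """edges: list of (u, v, weight); cluster_of: dict mapping module -> subsystem label."""
    intra, inter = {}, {}
    for u, v, w in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + w
        else:
            # An edge between two subsystems counts as inter-connectivity for both.
            inter[cu] = inter.get(cu, 0) + w
            inter[cv] = inter.get(cv, 0) + w
    labels = set(cluster_of.values())
    return sum((cluster_factor(intra.get(c, 0), inter.get(c, 0)) for c in labels), Fraction(0))

# Hypothetical dependency graph with unit weights.
edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1), ("c", "d", 1), ("d", "e", 1), ("e", "f", 1)]
two_groups = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
mq = modularization_quality(edges, two_groups)
```

For this graph the two subsystems have factors 6/7 and 4/5, so MQ = 58/35. Placing every module in its own subsystem gives 0, and placing all modules in one subsystem gives 1.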
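Software module clustering: example
Task: compute MQ
$$MF_1 = MF_3 = MF_6 = MF_7 = MF_8 = 0$$
$$MF_5 = \frac{1}{1 + \frac{1}{2} \times 2} = \frac{1}{2}$$
$$MF_2 = \frac{2}{2 + \frac{1}{2} \times 3} = \frac{4}{7}$$
$$MF_4 = \frac{3}{3 + \frac{1}{2} \times 1} = \frac{6}{7}$$
$$MQ = \frac{1}{2} + \frac{4}{7} + \frac{6}{7} = \frac{27}{14} = 1.928571\ldots$$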
Software module clustering: questions
What is the value of MQ if every module is in a different group?
What is the value of MQ if all modules are in the same group?
What is the maximum value MQ can take?
Software module clustering: solving
The number of partitions of a set of n elements is a Bell number:
1, 1, 2, 5, 15, 52, 203, 877, 4140, 21147, 115975, …
This grows very fast!
Enumerative algorithms are infeasible when there are many modules
The problem is nonlinear (integer linear programming is ruled out)
Exact algorithms: branch and bound
Approximate algorithms: heuristics and metaheuristics
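The growth of the number of partitions is easy to reproduce. The sketch below computes Bell numbers with the Bell triangle in plain Python; the function name is an assumption introduced here.

```python
def bell_numbers(count):
    """Return the first `count` Bell numbers B(0), B(1), ... using the Bell triangle."""
    bells, row = [], [1]
    for _ in range(count):
        bells.append(row[0])
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry above-left to the entry on its left.
        next_row = [row[-1]]
        for value in row:
            next_row.append(next_row[-1] + value)
        row = next_row
    return bells

first_eleven = bell_numbers(11)
partitions_of_15_modules = bell_numbers(16)[15]
```

A system with only 15 modules already has 1382958545 possible partitions, which is why enumeration stops being practical so quickly.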
Software module clustering: solving
Analysis of the model:
- If n = 1, MQ* = 0
- If n = 2, MQ* = 1
- If all nodes are isolated, MQ = 0
- If there is a single subsystem (and more than one node), MQ = 1
- For k subsystems and n-k subsystems: MQ ≤ k. Why? Because $MF_i \in [0, 1]$
- Experimentally, the value MQ* is usually low compared with the number of modules
- For a fixed k, if there is a large difference in cardinality between the largest and the smallest group, the value of MQ is lower
Format of a solution: $x^* = (1, 2, 1, 3, 1, 2)$
Software module clustering: solving
Instance | MQ | Enumerative: solutions visited | Enumerative: time (s) | B&B: solutions visited | B&B: time (s)
MDG 8 | 1.92857 | 4140 | 0.09 | 6 | 0.10
MDG 10 | 2.5 | 115975 | 0.14 | 11 | 0.13
MDG 15 | 2.812 | 1382958545 | 226.00 | 24 | 23.00
mtunis: 2.314*, 2.314*, 121.00*
* Value obtained by the best heuristic algorithm of Praditwong et al.
Test Suite Minimization
Given:
• A set of test cases T = {t1, t2, ..., tn}
• A set of software elements to be covered (e.g., use cases) E = {e1, e2, ..., ek}
• A coverage matrix M:
     e1  e2  e3  ...  ek
t1   1   0   1   …    1
t2   0   0   1   …    0
…    …   …   …   …    …
tn   1   1   0   …    0
Find a subset of tests X ⊆ T maximizing coverage and minimizing the testing cost.

Test Suite Minimization Problem
When a piece of software is modified, the new software is tested using previous test cases in order to check if new errors were introduced. This is known as regression testing. One problem related to regression testing is the Test Suite Minimization Problem (TSMP). This problem is equivalent to the Minimal Hitting Set Problem, which is NP-hard [17]. Let T = {t1, t2, ..., tn} be a set of tests for a program, where the cost of running test ti is ci, and let E = {e1, e2, ..., ek} be a set of elements of the program that we want to cover with the tests. After running all the tests T we find that each test can cover several program elements. This information is stored in a matrix M = [mij] of dimension n × k that is defined as:
$$m_{ij} = \begin{cases} 1 & \text{if element } e_j \text{ is covered by test } t_i \\ 0 & \text{otherwise} \end{cases}$$
The single-objective version of this problem consists in finding a subset of tests X ⊆ T with minimum cost covering all the program elements. In formal terms:
$$\text{minimize } \mathrm{cost}(X) = \sum_{\substack{i=1 \\ t_i \in X}}^{n} c_i \qquad (2)$$
subject to:
$$\forall e_j \in E,\ \exists t_i \in X \text{ such that element } e_j \text{ is covered by test } t_i, \text{ that is, } m_{ij} = 1.$$
The multi-objective version of the TSMP does not impose the constraint of full coverage, but it defines the coverage as the second objective to optimize, leading to a bi-objective problem. In short, the bi-objective TSMP consists in finding a subset of tests X ⊆ T having minimum cost and maximum coverage. Formally:
$$\text{minimize } \mathrm{cost}(X) = \sum_{\substack{i=1 \\ t_i \in X}}^{n} c_i \qquad (3)$$
$$\text{maximize } \mathrm{cov}(X) = |\{ e_j \in E \mid \exists t_i \in X \text{ with } m_{ij} = 1 \}| \qquad (4)$$
Example
$E = \{e_1, e_2, e_3, e_4\}$ and M:

     e1  e2  e3  e4
t1   1   0   1   0
t2   1   1   0   0
t3   0   0   1   0
t4   1   0   0   0
t5   1   0   0   1
t6   0   1   1   0

Assume unitary cost for tests: $c_i = 1$
cost({t1, t5}) =
cov({t1, t5}) =
cost({t1, t2, t5}) =
cov({t1, t2, t5}) =
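The two quantities asked for in the example can be checked with a few lines of Python. This is only an illustrative sketch (the workshop itself uses R); the names `M`, `COSTS`, `cost` and `cov` are chosen here and do not come from the workshop code. Tests are referred to by their 1-based index.

```python
# Coverage matrix of the example: rows are tests t1..t6, columns are elements e1..e4.
M = [
    [1, 0, 1, 0],  # t1
    [1, 1, 0, 0],  # t2
    [0, 0, 1, 0],  # t3
    [1, 0, 0, 0],  # t4
    [1, 0, 0, 1],  # t5
    [0, 1, 1, 0],  # t6
]
COSTS = [1] * len(M)  # unitary cost, c_i = 1 (as assumed on the slide)


def cost(selected, costs=COSTS):
    """Total cost of the selected tests (1-based test indices)."""
    return sum(costs[i - 1] for i in selected)


def cov(selected, matrix=M):
    """Number of elements covered by at least one selected test."""
    n_elements = len(matrix[0])
    return sum(
        1
        for j in range(n_elements)
        if any(matrix[i - 1][j] == 1 for i in selected)
    )


if __name__ == "__main__":
    for subset in ({1, 5}, {1, 2, 5}):
        print(sorted(subset), "cost =", cost(subset), "cov =", cov(subset))
```

Running the script prints the cost and coverage of the two subsets from the slide, which can be compared with a hand calculation on the matrix.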
Modelling the TSM Problem using ILP
Let us use n Boolean variables $t_i$ and m Boolean variables $e_j$:
- $t_i = 1$ iff test i is selected
- $e_j = 1$ iff element j is covered (it depends on the $t_i$)
$c_i$ is the cost of test $t_i$
Task: constraints relating covered elements and tests
Task: expression for coverage
Task: expression for cost

The single-objective formulation of TSMP is a p[...] formulation. Then, we can translate the 2-obj TSMP [...] and then infer the translation of the 1-obj TSMP.
Let us introduce n binary variables $t_i \in \{0, 1\}$. If $t_i = 1$ then the corresponding test case is included, otherwise the test case is not included. We also introduce m binary variables $e_j$, one for each program element to cover. If $e_j = 1$ then the corresponding element is covered by one of the selected test cases and if $e_j = 0$ the element is not covered by a selected test case.
The values of the $e_j$ variables are not independent of the $t_i$ variables. A given variable $e_j$ must be 1 if and only if there exists a $t_i$ variable for which $m_{ij} = 1$ and $t_i = 1$. The dependence between both sets of variables can be written with the following 2m PB constraints:
\[
e_j \leq \sum_{i=1}^{n} m_{ij} t_i \leq n \cdot e_j \qquad 1 \leq j \leq m.
\]
We can see that if the sum in the middle is zero (no test covers element $e_j$) then the variable $e_j = 0$. However, if the sum is greater than zero, $e_j = 1$. Now we need to introduce a constraint related to each objective function in order to transform the optimization problem in a decision problem, as described in Section 2.2. These constraints are:
\[
\sum_{i=1}^{n} c_i t_i \leq B, \qquad \sum_{j=1}^{m} e_j \geq P,
\]
where $B \in \mathbb{Z}$ is the maximum allowed cost and $P \in \{0, 1, \dots, m\}$ is the minimum coverage.
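A brute-force check illustrates why the pair of inequalities $e_j \leq \sum_i m_{ij} t_i \leq n \cdot e_j$ pins each $e_j$ to the value "covered or not". The sketch below uses the 6 x 4 coverage matrix that appears in the example slides; the function names are hypothetical and the exhaustive enumeration is only practical for such tiny instances.

```python
from itertools import product

# Coverage matrix of the running example (tests t1..t6 by elements e1..e4).
M = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
N_TESTS, N_ELEMENTS = len(M), len(M[0])


def satisfies_linking(t, e):
    """True if e_j <= sum_i m_ij * t_i <= n * e_j holds for every element j."""
    for j in range(N_ELEMENTS):
        covering = sum(M[i][j] * t[i] for i in range(N_TESTS))
        if not (e[j] <= covering <= N_TESTS * e[j]):
            return False
    return True


def implied_coverage(t):
    """The e vector that follows directly from the definition of coverage."""
    return tuple(
        1 if any(M[i][j] == 1 and t[i] == 1 for i in range(N_TESTS)) else 0
        for j in range(N_ELEMENTS)
    )


def linking_is_exact():
    """For every test selection t, exactly one e vector is feasible, and it is
    the coverage vector implied by t."""
    for t in product((0, 1), repeat=N_TESTS):
        feasible = [
            e for e in product((0, 1), repeat=N_ELEMENTS) if satisfies_linking(t, e)
        ]
        if feasible != [implied_coverage(t)]:
            return False
    return True


if __name__ == "__main__":
    print("linking constraints determine e uniquely:", linking_is_exact())
```

Because the constraints leave no freedom in the $e_j$ variables, the solver effectively searches only over the test variables.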
Example
Task: find equations for this example
$E = \{e_1, e_2, e_3, e_4\}$ and M:

     e1  e2  e3  e4
t1   1   0   1   0
t2   1   1   0   0
t3   0   0   1   0
t4   1   0   0   0
t5   1   0   0   1
t6   0   1   1   0

If we want to solve the 2-obj TSMP we need to instantiate Eqs. (5), (6) and (7). The result is:
\[
\begin{aligned}
e_1 &\leq t_1 + t_2 + t_4 + t_5 \leq 6e_1 && (10)\\
e_2 &\leq t_2 + t_6 \leq 6e_2 && (11)\\
e_3 &\leq t_1 + t_3 + t_6 \leq 6e_3 && (12)\\
e_4 &\leq t_5 \leq 6e_4 && (13)\\
t_1 + t_2 + t_3 + t_4 + t_5 + t_6 &\leq B \\
e_1 + e_2 + e_3 + e_4 &\geq P
\end{aligned}
\]
where $P, B \in \mathbb{N}$. With n = 6 tests the general constraint gives the factor 6 on the right-hand side. The paper excerpt prints 4 instead, which is also valid here because no element is covered by more than four tests.
In the optimization version the two bounds become objectives:
min $t_1 + t_2 + t_3 + t_4 + t_5 + t_6$ (the cost, bounded as $f(x) \leq B$ in the decision version)
max $e_1 + e_2 + e_3 + e_4$ (the coverage)
If we are otherwise interested in the 1-obj version the formulation is:
\[
\begin{aligned}
t_1 + t_2 + t_4 + t_5 &\geq 1 \\
t_2 + t_6 &\geq 1 \\
t_1 + t_3 + t_6 &\geq 1 \\
t_5 &\geq 1 \\
t_1 + t_2 + t_3 + t_4 + t_5 + t_6 &\leq B
\end{aligned}
\]
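For an instance this small, the 1-obj version (cheapest subset with full coverage) can be solved by exhaustive enumeration, which gives a reference value to compare against the ILP solver. The sketch below is an assumption-laden stand-in, not the workshop's method: `min_cost_full_cover` is a hypothetical helper and its running time is exponential in the number of tests.

```python
from itertools import combinations

# Coverage matrix of the example (tests t1..t6 by elements e1..e4).
M = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]


def min_cost_full_cover(matrix, costs):
    """Return (cost, tests) for a cheapest subset covering every element.

    Tests are reported with 1-based indices. Returns None when even the
    full suite does not cover all elements.
    """
    n, m = len(matrix), len(matrix[0])
    best = None
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            covers_all = all(any(matrix[i][j] for i in subset) for j in range(m))
            if not covers_all:
                continue
            subset_cost = sum(costs[i] for i in subset)
            if best is None or subset_cost < best[0]:
                best = (subset_cost, [i + 1 for i in subset])
    return best


if __name__ == "__main__":
    print(min_cost_full_cover(M, [1] * len(M)))
```

With unitary costs, element e4 forces t5 into any feasible solution, and t6 is the only single test that then covers both e2 and e3, so the enumeration ends with the pair {t5, t6}.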
Algorithm for Solving the 2-obj TSM
[Figure: plot with axes Cost and Coverage, with the maximum coverage marked]
- Find max coverage
- min cost, keeping cov
- Decrease cost and find the maximum coverage again
- and again
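The sweep outlined on this slide can be sketched in Python as follows. Two assumptions are made loudly here: the single-objective subproblems are solved by exhaustive enumeration instead of an ILP solver (only viable for tiny instances), and costs are positive integers so that "decrease cost" can be implemented as `budget = cost - 1`. All function names are hypothetical.

```python
from itertools import combinations

# Coverage matrix of the running example (tests t1..t6 by elements e1..e4).
M = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]


def all_subsets(n):
    """Yield every subset of {0, ..., n-1} as a tuple of indices."""
    for size in range(n + 1):
        yield from combinations(range(n), size)


def coverage(matrix, subset):
    return sum(1 for j in range(len(matrix[0])) if any(matrix[i][j] for i in subset))


def total_cost(costs, subset):
    return sum(costs[i] for i in subset)


def max_coverage(matrix, costs, budget):
    """Stand-in for the ILP 'maximize coverage subject to cost <= budget'."""
    return max(
        coverage(matrix, s)
        for s in all_subsets(len(matrix))
        if total_cost(costs, s) <= budget
    )


def min_cost(matrix, costs, required_cov):
    """Stand-in for the ILP 'minimize cost subject to coverage >= required_cov'."""
    return min(
        total_cost(costs, s)
        for s in all_subsets(len(matrix))
        if coverage(matrix, s) >= required_cov
    )


def pareto_front(matrix, costs):
    """Return the (cost, coverage) points of the front, sorted by cost."""
    front = []
    budget = sum(costs)              # start with no effective cost limit
    while budget >= 0:
        cov = max_coverage(matrix, costs, budget)   # find max coverage
        cost = min_cost(matrix, costs, cov)         # min cost, keeping cov
        front.append((cost, cov))
        budget = cost - 1                           # decrease cost and repeat
    return front[::-1]


if __name__ == "__main__":
    print(pareto_front(M, [1] * len(M)))
```

Each iteration yields one non-dominated point, so the number of solver calls is twice the number of points on the front.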
TSM Instances
Instances from the Software-artifact Infrastructure Repository (SIR)
http://sir.unl.edu/portal/index.php

Instance       Tests   Elements to cover
printtokens1   4130    189
printtokens2   4115    199
replace        5542    242
schedule       2650    151
schedule2      2710    128
tcas           1608    65
totinfo        1052    124
Exercise
In the R implementation, the first n variables of the variable vector are used for the tests and the remaining m variables for the elements to cover
Relevant functions:
• readTsmInstance(file, unitaryCost=FALSE): reads an instance file and returns a list with an internal representation
• ilpModel4Tsm(tsmInstance, costUpperBound=NULL, covLowerBound=NULL): takes an instance and a bound on cost or coverage and builds an ILP model for the instance that optimizes the objective that is not bounded
• solveModel(model): solves the ILP model passed as a parameter
Task: solve some instances with R
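To make the variable layout concrete (first the n test variables, then the m element variables), the sketch below builds the rows of the linking constraints and the two objective vectors in that layout. It is written in Python for illustration and is not the workshop's R code: `linking_rows` and `objective_vectors` are hypothetical names, and the `(coefficients, sense, rhs)` triple is just one plausible way to hand rows to an ILP solver.

```python
# Coverage matrix of the running example (tests t1..t6 by elements e1..e4).
M = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]


def linking_rows(matrix):
    """Build the 2m rows for e_j <= sum_i m_ij t_i <= n * e_j.

    Each row is (coefficients, sense, rhs) over the variable vector
    [t_1, ..., t_n, e_1, ..., e_m].
    """
    n, m = len(matrix), len(matrix[0])
    rows = []
    for j in range(m):
        column = [matrix[i][j] for i in range(n)]
        # sum_i m_ij t_i - e_j >= 0
        lower = column + [-1 if k == j else 0 for k in range(m)]
        # sum_i m_ij t_i - n * e_j <= 0
        upper = column + [-n if k == j else 0 for k in range(m)]
        rows.append((lower, ">=", 0))
        rows.append((upper, "<=", 0))
    return rows


def objective_vectors(costs, m):
    """Cost uses only the test variables, coverage only the element variables."""
    cost_objective = list(costs) + [0] * m
    coverage_objective = [0] * len(costs) + [1] * m
    return cost_objective, coverage_objective


if __name__ == "__main__":
    for coefficients, sense, rhs in linking_rows(M):
        print(coefficients, sense, rhs)
    print(objective_vectors([1] * len(M), len(M[0])))
```

Bounding one of the two objective vectors (cost at most B, or coverage at least P) and optimizing the other gives the single-objective models described above.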
Exercise
Complete the function computeParetoFront to compute the complete Pareto front of an instance
Task: complete computeParetoFront
Reduction in the Number of Test Cases
We can reduce the number of test cases in the original test suite
If a test covers every element covered by another test and does not cost more, the other test can be removed

     e1  e2  e3  ...  em
t1   1   0   0   …    1
t2   1   0   1   …    1
…    …   …   …   …    …
tn   1   1   0   …    0

Test t1 can be removed if c1 >= c2
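The dominance rule can be sketched in Python as below. This is an illustration under stated assumptions, not the workshop's `reduceInstance`: the name `reduce_tests` is hypothetical, and the tie-breaking choice (among tests with identical coverage and cost, keep the one with the lowest index) is a decision made here so that duplicates do not eliminate each other.

```python
# Coverage matrix of the running example (tests t1..t6 by elements e1..e4).
M = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]


def reduce_tests(matrix, costs):
    """Return the 1-based indices of the tests that survive the reduction.

    Test a is removed when some other test b covers every element that a
    covers and costs no more than a.
    """
    covered = [frozenset(j for j, v in enumerate(row) if v) for row in matrix]
    kept = []
    for a in range(len(matrix)):
        dominated = False
        for b in range(len(matrix)):
            if a == b:
                continue
            if covered[a] <= covered[b] and costs[b] <= costs[a]:
                strictly_better = covered[a] < covered[b] or costs[b] < costs[a]
                # Exact duplicates: only the first occurrence is kept.
                if strictly_better or b < a:
                    dominated = True
                    break
        if not dominated:
            kept.append(a + 1)
    return kept


if __name__ == "__main__":
    print(reduce_tests(M, [1] * len(M)))
```

In the example with unitary costs, t3 and t4 cover subsets of what t1 covers, so both can be dropped without changing the Pareto front.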
Reduction in the Number of Test Cases
Task: complete the table
With the help of reduceInstance, complete the table.
How long does it take now to compute the Pareto front? Is it the same?

Instance       Tests   Reduced tests
printtokens1   4130
printtokens2   4115
replace        5542
schedule       2650
schedule2      2710
tcas           1608
totinfo        1052
Refactoring
Semantics-preserving change in the code

G29: Avoid Negative Conditionals
Negatives are just a bit harder to understand than positives. So, when possible, conditionals should be expressed as positives. For example:
if (buffer.shouldCompact())
is preferable to
if (!buffer.shouldNotCompact())
G30: Functions Should Do One Thing
It is often tempting to create functions that have multiple sections that perform a series of operations. Functions of this kind do more than one thing, and should be converted into many smaller functions, each of which does one thing.
For example:
public void pay() {
  for (Employee e : employees) {
    if (e.isPayday()) {
      Money pay = e.calculatePay();
      e.deliverPay(pay);
    }
  }
}
This bit of code does three things. It loops over all the employees, checks to see whether each employee ought to be paid, and then pays the employee. This code would be better written as:
public void pay() {
  for (Employee e : employees)
    payIfNecessary(e);
}
private void payIfNecessary(Employee e) {
  if (e.isPayday())
    calculateAndDeliverPay(e);
}
private void calculateAndDeliverPay(Employee e) {
  Money pay = e.calculatePay();
  e.deliverPay(pay);
}
Each of these functions does one thing. (See “Do One Thing” on page 35.)
G31: Hidden Temporal Couplings
Temporal couplings are often necessary, but you should not hide the coupling.
Anti-pattern
Common solution to a problem with bad consequences
Automatic Refactoring
Boolean logic is hard enough to understand without having to see it in the context of an if or while statement.
Extract functions that explain the intent of the conditional.
For example:
if (shouldBeDeleted(timer))
is preferable to
if (timer.hasExpired() && !timer.isRecurrent())
Example
[...] sequential dependency conflicts and mutual exclusion [...] more on these two kinds of conflicts in the fol[...] belongs to class B instead, if A is a subclass of B.
To better illustrate the refactoring scheduling problem, and the effect that the consideration of dependencies and conflicts between refactorings has on the size of the search-space, we present an example of [...]
Listing 1. Example of classes to be refactored.
[...] reduce even more the search-space by removing these permutations as they lead to the same design (same solution). This occurs because they affect different code segments (the method and target class is different for r1 and r3), i.e., they are unrelated.
In addition, when a conflict exists between refactorings, it is possible to reduce the size of the search space further. For example, consider the sequential dependency conflict between r1, r2, that is r2 cannot be applied before r1 (inlining class Rectangle invalidates any move method refactoring from/to that class). Hence, by removing redundant solutions, and invalid solutions (solutions with elements that are conflicted) we can reduce the search-space size of the motivating example by half (sequences 1, 2, 3, 4, 5, 6, 8 and 11). Thus, the value obtained after applying Eq. (2) should be used as an upper bound of the search-space size, as long as we assume that applying a refactoring sequence [...]

Table 1
List of refactoring candidates for the example from Listing 1.
ID   Type                         Source class   Method                    Target class
r1   Move method                  Geometry       calcAreaRectangle         Rectangle
r2   Inline Class                 Rectangle      All fields and methods    Shape
r3   Introduce Parameter Object   Geometry       longParameterListMethod   GeometryParamObj (new)

Table 2
Enumeration of possible refactoring sequences for the set of refactoring operations {r1, r2, r3}.
Sequence   Elements   Sequence   Elements
1.         None       9.         r3, r1
2.         r1         10.        r3, r2
3.         r2         11.        r1, r2, r3
4.         r3         12.        r1, r3, r2
5.         r1, r2     13.        r2, r1, r3
6.         r1, r3     14.        r2, r3, r1
7.         r2, r1     15.        r3, r2, r1
8.         r2, r3     16.        r3, r1, r2
R. Morales et al.

[...] code-analyses with typically 100% precision and recall for associations and a high precision and recall for aggregations. Composition relationships cannot be entirely identified statically because they involve the lifetime of the instances of the classes involved in such relationships. Hence, idiom-level models include association and aggregation relationships and only the few composition relationships that can be identified with high precision and recall statically. A design-level model contains information about occurrences of design motifs, code smells, [...]
i.e., the (1) detection of classes that contain anti-patterns; (2) the generation of refactoring candidates to improve the design quality of the classes detected in (1); (3) the search for an optimal refactoring order; and (4) the application of the refactoring order from (3). To achieve this goal, we propose a new heuristic approach called RePOR (Refactoring approach based on Partial Order Reduction). Partial order reduction is a popular technique for controlling state space explosion in model checking (Lluch-Lafuente et al., 2002). The intuition is to reduce the number of refactoring sequences to be explored by removing equivalent sequences (i.e., refactoring sequences that lead to the same design). As a result, less search effort is required than when using metaheuristic algorithms. To evaluate RePOR, we conduct a series of experiments over a testbed of five open source software systems (OSS) and compare the results with Genetic Algorithm (GA) (Holland, 1975), Ant Colony optimization (ACO) (Dorigo et al., 2006), the conflict-aware refactoring scheduling approach proposed by Liu et al. (2008) (referred to as LIU in this paper), and a new optimizer based on sampling (SWAY) (Chen et al., 2018). We show that the solutions obtained by RePOR overcome the ones obtained by the above-mentioned state-of-the-art optimization techniques in terms of performance (i.e., execution time) and effort (i.e., number of refactorings applied).
Tool and Data Replication. The Eclipse Plug-in and all the data used in the experiments are available on the RePOR replication package (Morales et al., 2017b).
The remainder of the paper is organized as follows: Section 2 discusses the formulation of the refactoring scheduling problem, and describes how to reduce the search-space size using partial order reduction. Section 3 describes RePOR in detail. Section 4 presents the case study for evaluating our approach. Section 5 presents and discusses the results obtained in our case study. Section 7 discloses the threats to the validity of our study. Related work is discussed in Section 8. Finally, we present our conclusions and lay out directions for future work in Section 9.
2. Formulation of the refactoring scheduling problem
As a software system ages, its design quality deteriorates unless it is continually maintained (Parnas, 1994). Refactoring is a software maintenance activity that aims to keep the design quality of a software system at an acceptable level, in order to ensure a normal evolution of the system. Typically, refactoring is performed by applying small transformation operations (e.g., moving a method/field to another class) to a software system while preserving its original behavior. Since there is a wide range of candidate refactorings that can be applied on a system, depending on the domain of the system, an optimal solution may be comprised of several refactorings that improve different quality attributes. Hence, the refactoring scheduling problem consists of finding the best combination of refactorings that maximizes the design quality improvement of a software system. The problem of finding an optimal order can be solved using search-based techniques. Search algorithms start by generating one or more random sequences. Next, the quality of each sequence is computed by applying it to the software
the number of occurrences of an
The outcome of Q(SR) is a nega
moves anti-patterns; zero if the
same, and positive otherwise. T
lated to the presence and the or
Hence, we suggest that refac
on the classes that they affect. I
parately. Since the order of app
ferent classes in a sequence is irr
refactoring operations that we n
that we have a set of refact
According to Morales et al. (2
quences (S) that we could gener
given by Eq. (2).
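\[
S = \begin{cases} \lfloor e \cdot n! \rfloor & \forall n \geq 1 \\ 1 & n = 0 \end{cases} \tag{2}
\]
where $e$ is the Euler constant, and $n$ is the number of refactorings available.
Applying Eq. (2) to our example [...] ($\lfloor e \cdot 2! \rfloor = 5$): < >, < A >, < B [...]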
if (iff) we assume that each per
(here the term solution refers to
sequence to a system, i.e., the re
and < B, A > are two different
and only 4 different solutions ex
In the case of refactorings th
design may vary depending on
factorings, as the application of
the rest of refactorings. We can
factorings as an undirected graph
ru, rv ∈ Rk. k ∈ K, where K is the
set of refactorings that affect cla
graph, is linked to the structure o
refactorings modify a class, an
factorings affect the number of
after refactoring.
We use GB to find the conne
component is a maximal subgra
connected by a path. Connecte
over the refactoring operations.
reduction from model checking (L
the removal of sequences of refa
Partial order reduction (POR) i
tativity of asynchronous systems
concurrent models impose an a
events, refactoring scheduling im
refactoring operations. The orde
instructions is meaningless (as th
Hence, we can consider just o
property since the other ordering
to construct a reduced state gra
Are all permutations relevant?
Q(SR) = \sum_{k \in K} Q(sr_k); \quad \text{with } Q(sr_k) = AC(k') - AC(k) \qquad (1)
In Eq. (1), SR is a subset of R; R is the set of refactorin
applied in a system SYS; K is the set of classes in SYS, K ∈ SYS
subset of SR that modifies class k (k ∈ K). Each sub-function
computed by subtracting the number of occurrences of anti-pa
class k after applying srk to k (i.e., AC(k′)) and the number o
rences of anti-patterns before refactoring (i.e., AC(k)). Note tha
the number of occurrences of anti-patterns as a proxy of design
The outcome of Q(SR) is a negative value when applying SR
moves anti-patterns; zero if the number of anti-patterns rem
same, and positive otherwise. The quality effect of applying
Objective Function
Class after refactoring
Class before refactoring
Anti-patterns count
me conclusions and future work.
UDO-BOOLEAN OPTIMIZATION
hod for identifying improving moves in the radius
g ball can be applied to all k-bounded pseudo-
ptimization problems. This makes our method
al: every compressible pseudo-Boolean Optimiza-
m can be transformed into a quadratic pseudo-
ptimization problem with k = 2.
ily of k-bounded pseudo-Boolean Optimization
ave also been described as an embedded landscape.
ed landscape [3] with bounded epistasis k is de-
function f(x) that can be written as the sum
nctions, each one depending at most on k input
That is:
f(x) = \sum_{i=1}^{m} f^{(i)}(x), \qquad (1)
subfunctions f(i)
depend only on k components
dded Landscapes generalize NK-landscapes and
SAT problem. We will consider in this paper that
of subfunctions is linear in n, that is m ∈ O(n).
dscapes m = n and is a common assumption in
T that m ∈ O(n).
subfunctions f . Let us define w
such that the i-th element of wl is
on variable xi. The vector wl ca
that characterizes the variables t
has bounded epistasis k, the num
with |wl|, is at most k. By the
equalities immediately follow.
f^{(l)}(x \oplus v) = f^{(l)}(x) for all v
S^{(l)}_v(x) = \begin{cases} 0 & \text{if } w_l \wedge v = 0 \\ S^{(l)}_{v \wedge w_l}(x) & \text{otherwise} \end{cases}
Equation (5) claims that if n
change in the move characterize
f(l)
the Score of this subfunction
this subfunction will not change f
On the other hand, if f(l)
depend
we only need to consider for the
changed variables that affect f(l)
acterized by the mask vector v ∧
we can write (3) as:
S_v(x) = \sum_{l=1,\; w_l \wedge v \neq 0}^{m} S^{(l)}_{v \wedge w_l}(x)
f = f^{(1)}(x) + f^{(2)}(x) + f^{(3)}(x) + f^{(4)}(x)
x1 x2 x3 x4
The structure is well-known in optimization…
x4 x3
x1 x2
Variable
Interaction Graph
Objective Function
x1
x2
x4
x3
x5
x6
If the variable interaction graph has several connected components, we can
optimize each of them independently
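The decomposition idea can be sketched in a few lines of Python. This is only an illustration: the subfunction structure below (which variables each subfunction reads) is a hypothetical example, not the one drawn on the slide, and the function names are made up for this sketch.

```python
from collections import defaultdict

def variable_interaction_graph(subfunction_vars):
    """Return adjacency sets: two variables are adjacent when some
    subfunction depends on both of them."""
    adj = defaultdict(set)
    for variables in subfunction_vars:
        for u in variables:
            adj[u]  # touch the key so isolated variables appear as nodes
            for v in variables:
                if u != v:
                    adj[u].add(v)
    return adj

def connected_components(adj):
    """Depth-first search; returns a list of sorted node lists."""
    seen, components = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        components.append(sorted(comp))
    return components

# Hypothetical decomposition: f = f1(x1,x2) + f2(x2,x3) + f3(x4,x5) + f4(x6)
subfunctions = [("x1", "x2"), ("x2", "x3"), ("x4", "x5"), ("x6",)]
components = connected_components(variable_interaction_graph(subfunctions))
print(components)  # [['x1', 'x2', 'x3'], ['x4', 'x5'], ['x6']]
```

Each component can then be optimized separately, because no subfunction mixes variables from two different components.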
Dependency Graph (GB)
r1
r2
r4
r3
r5
r6
Two refactoring operations are adjacent in GB when both touch the same class
We can optimize each connected component of GB independently, exploring all the
possible sequences in the component
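A minimal sketch of how GB and its components could be computed, assuming we already know which classes each refactoring modifies. The refactoring ids, class names and the `touched` mapping are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def dependency_edges(touched):
    """Edges of GB: pairs of refactorings that modify at least one common class."""
    return {
        (a, b)
        for a, b in combinations(sorted(touched), 2)
        if touched[a] & touched[b]
    }

def components(nodes, edges):
    """Union-find grouping of nodes into connected components."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical data: classes modified by each refactoring.
touched = {
    "r1": {"A", "B"},
    "r2": {"B"},
    "r3": {"C"},
    "r4": {"C", "D"},
    "r5": {"D"},
    "r6": {"E"},
}
edges = dependency_edges(touched)
print(sorted(edges))               # [('r1', 'r2'), ('r3', 'r4'), ('r4', 'r5')]
print(components(touched, edges))  # [['r1', 'r2'], ['r3', 'r4', 'r5'], ['r6']]
```

With this grouping, sequences only need to be enumerated inside each component rather than over all six refactorings at once.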
Dependency Graph (GB): example
What is the dependency graph in our example?
kinds of conflicts: sequential dependency conflicts and mutual exclusion
conflicts. We elaborate on these two kinds of conflicts in the following.
• Given two refactorings ri and rj, ri has a sequential dependency
conflict with rj iff rj cannot be applied before ri. We represent
sequential dependency conflicts as follows: r1 → r2, which means that
r1 can be followed by r2, but r2 cannot be followed by r1. Note that
conflicts are directional, i.e., the fact that applying rj disables ri does
not necessarily mean that ri disables rj.
• Given two refactorings ri and rj, ri has a mutual exclusion conflict
with rj iff ri and rj cannot be applied together in any order. We
represent mutual exclusion with the following notation: r1 ¬↔ r2.
belongs to class B instead, if A is a subclass of B.
To better illustrate the refactoring scheduling problem, and the effect
that the consideration of dependencies and conflicts between refactorings
has on the size of the search space, we present an example of
the problem in Listing 1.
The refactorings presented in Table 1 can be applied to refactor the
classes described in Listing 1.
Table 1 contains three types of refactorings from Fowler (1999b) that
we describe below:
1. Move method. Move a method from one class to another (e.g., to one
of its parameter types (Seng et al., 2006)).
2. Inline Class. If a class contains few responsibilities, move all its
features to another class and remove it.
Listing 1. Example of classes to be refactored.
R. Morales et al.
reduce even more the search space by removing these permutations as
they lead to the same design (same solution). This occurs because they
affect different code segments (the method and target class are different
for r1 and r3), i.e., they are unrelated.
In addition, when a conflict exists between refactorings, it is possible
to reduce the size of the search space further. For example, consider
the sequential dependency conflict between r1, r2, that is r2 cannot
code-analyses with typically 100% precision and recall for associations
and a high precision and recall for aggregations. Composition relationships
cannot be entirely identified statically because they involve
the lifetime of the instances of the classes involved in such relationships.
Hence, idiom-level models include association and aggregation
relationships and only the few composition relationships that can be
identified with high precision and recall statically. A design-level model
contains information about occurrences of design motifs, code smells,
and anti-patterns. A code meta-model should provide methods to manipulate
the design model and generate other models. The objective of
this step is to manipulate the design model of a system programmatically.
Hence, the code meta-model is used to detect anti-patterns,
apply refactoring sequences and evaluate their impact on the design
quality of a system. More information related to code meta-models,
design motifs and micro-architecture identification can be found in
Guéhéneuc and Albin-Amiot (2004) and Guéhéneuc and
Antoniol (2008).
3.2. Step 2: detect anti-patterns
In this step we detect anti-patterns in the meta-model using any
Table 1
List of refactoring candidates for the example from Listing 1.
ID | Type | Source class | Method | Target class
r1 | Move method | Geometry | calcAreaRectangle | Rectangle
r2 | Inline Class | Rectangle | All fields and methods | Shape
r3 | Introduce Parameter Object | Geometry | longParameterListMethod | GeometryParamObj (new)
Table 2
Enumeration of possible refactoring sequences for the set of refactoring
operations {r1, r2, r3}.
Sequence | Elements | Sequence | Elements
1 | None | 9 | r3, r1
2 | r1 | 10 | r3, r2
3 | r2 | 11 | r1, r2, r3
4 | r3 | 12 | r1, r3, r2
5 | r1, r2 | 13 | r2, r1, r3
6 | r1, r3 | 14 | r2, r3, r1
7 | r2, r1 | 15 | r3, r2, r1
8 | r2, r3 | 16 | r3, r1, r2
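The 16 rows of Table 2 can be reproduced with a short enumeration, which also lets us check the count against the closed form ⌊e·n!⌋ quoted earlier from Morales et al. The function name `all_sequences` is an assumption made for this sketch.

```python
import math
from itertools import permutations

def all_sequences(refactorings):
    """Every ordered selection of distinct refactorings, including the empty one."""
    seqs = []
    for k in range(len(refactorings) + 1):
        seqs.extend(permutations(refactorings, k))
    return seqs

seqs = all_sequences(["r1", "r2", "r3"])
print(len(seqs))                               # 16
print(math.floor(math.e * math.factorial(3)))  # 16
```

The count grows factorially, so exhaustive enumeration is only practical for small sets of refactorings.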
R. Morales et al.
r1 r2
r3
Task: find the dependency graph
Conflict Graph (GC)
r1
r2
r4
r3
r5
r6
The conflict graph is used to reduce the number of sequences to explore in each
component
Sequential dependency conflict
Mutual exclusion conflict
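As a rough illustration of how the two kinds of conflicts prune the search space, the sketch below filters the sequences of three refactorings. The concrete conflicts (r1 → r2, and r2 mutually exclusive with r3) are hypothetical and are not claimed to be the conflicts of the workshop example.

```python
from itertools import permutations

def is_valid(seq, sequential, exclusive):
    """Check a sequence against both kinds of conflicts.

    sequential: pairs (a, b) meaning a -> b, i.e. b must not precede a.
    exclusive:  frozensets {a, b} that must not appear together.
    """
    position = {r: i for i, r in enumerate(seq)}
    for a, b in sequential:
        if a in position and b in position and position[b] < position[a]:
            return False
    for pair in exclusive:
        if all(r in position for r in pair):
            return False
    return True

def valid_sequences(refactorings, sequential, exclusive):
    """Enumerate all sequences and keep those without conflicts."""
    result = []
    for k in range(len(refactorings) + 1):
        for seq in permutations(refactorings, k):
            if is_valid(seq, sequential, exclusive):
                result.append(seq)
    return result

refs = ["r1", "r2", "r3"]
sequential = {("r1", "r2")}            # hypothetical: r1 -> r2
exclusive = {frozenset({"r2", "r3"})}  # hypothetical: r2 and r3 never together
print(len(valid_sequences(refs, set(), set())))         # 16
print(len(valid_sequences(refs, sequential, set())))    # 12
print(len(valid_sequences(refs, sequential, exclusive)))  # 7
```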
What is the conflict graph in our example?
Task: find the conflict graph
Conflict Graph (GC): example
Input: System to refactor (SYS), maximum number of refactoring operations in a connected component subgraph (threshold)
Output: An optimal sequence of refactoring operations (SR)
Require Proc: extractBestPermutation, getFirstValidSequenceFromccap
Steps RePOR(SYS, threshold)
    AM = code meta-model generation(SYS)
    A = Detect Anti-patterns(AM)
    R = Generate set of refactoring candidates(AM, A)
    GB = Build Graph of dependencies between refactorings and anti-patterns(AM, R, A)
    CCAP = Find connected components(GB)
    GC = Build Graph of conflicts between refactorings(AM, LR)
    SR = Schedule sequence of refactorings(CCAP, GC, AM)
Procedure Schedule sequence of refactorings(CCAP, GC, AM):
    SR = ∅
    for each ccap ∈ CCAP do
        ccap.RemoveInvalidRefactorings(SR)
        if ccap.size == 0 then
            continue
        else
            List permuts = enumeratePermutations(ccap)
            if permuts ≤ threshold then
                SR.addAll(extractBestPermutation(AM, GC, permuts))
            else
                SR.addAll(getFirstValidSequenceFromccap(AM, GC, ccap, R))
            end if
        end if
    end for
    return SR
end
Algorithm 1. RePOR.
RePOR
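The scheduling procedure of Algorithm 1 could be sketched in Python roughly as follows. This is not the authors' implementation: the helpers `is_valid` and `quality` are hypothetical stand-ins for the conflict graph GC and the anti-pattern count, the threshold is compared against the component size, and the fallback for large components is a simple greedy pass.

```python
from itertools import permutations

def schedule(components, is_valid, quality, threshold):
    """Sketch of the scheduling loop (helpers are hypothetical).

    components: lists of refactoring ids (connected components of GB).
    is_valid(seq): True when seq respects the conflict graph GC.
    quality(seq): change in anti-pattern count; lower is better.
    threshold: largest component size that is explored exhaustively.
    """
    scheduled = []
    for comp in components:
        # Drop refactorings that are invalid given what is already scheduled.
        comp = [r for r in comp if is_valid(tuple(scheduled) + (r,))]
        if not comp:
            continue
        if len(comp) <= threshold:
            # Exhaustive search over all orderings of all subsets.
            candidates = [
                seq
                for k in range(len(comp) + 1)
                for seq in permutations(comp, k)
                if is_valid(tuple(scheduled) + seq)
            ]
            best = min(candidates, key=quality)
        else:
            # Greedy fallback: keep refactorings in order while still valid.
            best = ()
            for r in comp:
                if is_valid(tuple(scheduled) + best + (r,)):
                    best += (r,)
        scheduled.extend(best)
    return scheduled

# Hypothetical effect of each refactoring on the anti-pattern count.
effect = {"r1": -2, "r2": -1, "r3": 0, "r4": -1}

def quality(seq):
    return sum(effect[r] for r in seq)

def is_valid(seq):
    # Hypothetical sequential conflict r1 -> r2: r2 must not come before r1.
    return not ("r1" in seq and "r2" in seq and seq.index("r2") < seq.index("r1"))

print(schedule([["r1", "r2"], ["r3", "r4"]], is_valid, quality, threshold=10))
# ['r1', 'r2', 'r4']
```

Because components do not share classes, the best sequence of each one can be appended to the schedule without re-evaluating the others.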
Experimental Setup
Subjects
Tools
• PADL to create a high level model of the software
• DECOR to detect and correct anti-patterns on the model
In Table 4 we describe the type of anti-patterns studied and
refactoring strategies used to remove them. Table 5 shows the num
of refactoring candidates that were automatically found in each sys
4.3. RePOR implementation
We instantiate RePOR as an eclipse plug-in and compared it
three refactoring approaches. Design improvement (DI) is meas
using Eq. (3). To determine the value of the parameter thres
Listing 2. Rule card of Blob anti-pattern from DECOR.
Table 3
Descriptive statistics about the studied systems.
System NOC KLOC BL LC LP SC SG Total
Apache Ant 1.8.2 697 191 57 40 35 3 6 141
ArgoUML 0.34 1754 183 131 25 281 1 19 457
GanttProject 1.10.2 188 44 47 4 68 5 6 130
JfreeChart 1.0.19 505 98 41 21 62 1 1 126
Xerces 2.7 540 71 56 25 119 2 3 205
Table 4
List of studied Anti-patterns and the refactorings used to correct them.
Type Description Refactoring(s) strategy
Blob (BL) (Brown et al., 1998) A large class that absorbs most of the functionality of the system with
very low cohesion between its constituents.
Move method (MM). Move the methods that does not seem to fit in
Blob class abstraction to more appropriate classes (Seng et al., 200
Lazy Class (LC) (Fowler, 1999a) Small classes with low complexity that do not justify their existence
in the system.
Inline class (IC). Move the attributes and methods of the LC to anot
class in the system.
Long Parameter List (LP)
(Fowler, 1999a)
A class with one or more methods having a long list of parameters,
specially when two or more methods are sharing a long list of
parameters that are semantically connected.
Introduce parameter object (IPO). Extract a new class with the long
of parameters and replace the method signature by a reference to
new object created. Then access to this parameters through the
parameter object.
Spaghetti Code (SC)
(Brown et al., 1998)
A class without structure that declares long methods without
parameters.
Replace method with method object (RMWO). Extract long methods i
new classes so all local variables become fields on that object.
Speculative Generality (SG) There is an abstract class created to anticipate further features, but it Collapse hierarchy (CH). Move the attributes and methods of the ch
described in Section 3.7, we executed 30 independent executions for
each of the systems studied on a Windows 10 64-bit, Intel Core 5 at 2.30
GHz, 12 GB of memory machine, and recorded the size of ccap where the
performance of RePOR is acceptable, and found threshold = 10 to be the
best trade-off. The value of threshold indicates that for our experiments, we
only exhaustively explore the permutations of a ccap containing 10 or
fewer refactoring operations, and evaluate the resultant permutations
only after removing any conflicted refactoring operation.
The directed graph of conflicts (GC) is used for the three metaheuristics
to avoid scheduling invalid refactorings. Due to the random
nature of the metaheuristics studied (i.e., ACO, GA, and SWAY) it is necessary
to perform several independent runs to have an idea of the
behavior of the algorithms. Hence, we execute 30 independent runs for
all the approaches studied and for each system. This is a typical
minimum value (i.e., 30 runs) used in the search-based research com-
Table 5
Number of refactoring candidates automatically generated for each studied
system.
System | CH | IC | IPO | MM | RMWO | Total
Ant | 6 | 9 | 35 | 4269 | 3 | 4322
ArgoUML | 19 | 25 | 281 | 2475 | 1 | 2800
Gantt Project | 6 | 4 | 68 | 3861 | 5 | 3944
JfreeChart | 1 | 21 | 62 | 4228 | 1 | 4313
Xerces | 3 | 25 | 119 | 4118 | 2 | 4267
R. Morales et al.
Experimental Setup
Performance measures
• Design Improvement
• Execution time (ET): runtime of algorithms
• Refactoring Effort (RE): number of refactoring operations in the sequence
ment.1
For all statistical tests, we consider a significance level of
For RQ1, we measure the effectiveness of RePOR at removing a
patterns in software systems using the following dependent variable
• Design Improvement (DI). DI represents the delta of anti-patte
occurrences between the refactored system (SYS′) and the orig
system (SYS) and it is computed using the following formulatio
DI(SYS) = \frac{AC(SYS) - AC(SYS')}{AC(SYS)} \times 100. \qquad (3)
Where AC(SYS) is the number of anti-patterns in a system SYS
AC(SYS) ≥ 0. DI, which is a positive real number, represents
improvement amount in percentage, and high positive values
desired. Note that Eq. (3) assumes that AC(SYS′) − AC(SYS) < 0
RePOR filters out solutions that make the design worse accordin
the desiredEffect threshold (cf., Algorithm 4).
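A small numeric sketch of the DI measure, read here as the percentage of anti-pattern occurrences removed (positive when refactoring helps). The function name is an assumption, and the count of 56 remaining anti-patterns is an assumed value used only for illustration; 141 is the Apache Ant total from Table 3.

```python
def design_improvement(ac_before, ac_after):
    """Percentage of anti-pattern occurrences removed by refactoring."""
    if ac_before <= 0:
        raise ValueError("the original system must contain at least one anti-pattern")
    return (ac_before - ac_after) / ac_before * 100

# 141 anti-patterns before refactoring, 56 after (assumed value).
print(round(design_improvement(141, 56), 2))  # 60.28
```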
The independent variable is the refactoring approach applied
each studied system. We statistically compare the number of
maining anti-patterns after refactoring a system using RePOR w
the number of remaining anti-patterns when using other refactor
approaches. Specifically, we test the following hypothesis H01: Th
is no difference between the number of remaining anti-patterns o
system refactored using RePOR, and a system refactored using o
refactoring approaches. We test the hypothesis using a non-p
metric test, i.e., the Mann–Whitney U test (Hollander et al., 201
For estimating the magnitude of the differences of means betw
Algorithms
• RePOR
• Conflict-aware scheduling of refactoring heuristic by Liu et al. (2008) (LIU)
• Ant Colony Optimization (ACO)
• Genetic Algorithm (GA)
• SWAY metaheuristic by Chen et al. (2018)
Results
RQ1: To what extent can RePOR remove anti-patterns?
We present in Table 7 the Design improvement (DI) in general and
the rest of the systems.
We reject the null hypothesis H01 for Ant, ArgoUML, Gantt,
JfreeChart, and Xerces. In these five systems, the number of remaining
anti-patterns after refactoring using RePOR is significantly
lower than the number of anti-patterns remaining in the systems
after refactoring using the other refactoring approaches (i.e., ACO,
Table 7
Design Improvement (%) in general and for different anti-pattern types.
Metaheuristic DI DI_BL DI_LC DI_LP DI_SC DI_SG
Ant
ACO 57.45 68.42 22.5 74.29 66.67 100
GA 58.16 68.42 22.5 74.29 66.67 100
LIU 58.87 54.39 22.5 100 66.67 100
RePOR 60.28 57.89 22.5 100 66.67 100
SWAY 45.36 57.89 20 60 66.67 83.33
ArgoUML
ACO 75.93 51.15 100 83.63 100 100
GA 76.59 51.15 100 84.7 100 100
LIU 81.40 50.38 100 92.88 100 100
RePOR 81.62 38.93 100 98.58 100 100
SWAY 62.91 48.09 84 66.01 100 86.84
Gantt Project
ACO 60 17.02 100 83.82 70 100
GA 60.77 14.89 100 85.29 80 100
LIU 63.85 14.89 100 92.65 60 100
RePOR 66.15 8.51 75 100 100 100
SWAY 50 8.51 100 70.59 60 100
JfreeChart
ACO 75.4 39.02 100 89.52 100 100
GA 75.4 39.02 100 90.32 100 100
LIU 72.22 31.71 100 88.71 100 100
RePOR 75.4 24.39 100 100 100 100
SWAY 61.90 36.59 90.48 73.39 100 100
Xerces
ACO 56.59 14.29 100 65.55 100 100
GA 57.56 14.29 100 67.23 100 100
LIU 64.39 16.07 100 78.99 50 100
RePOR 73.17 5.36 100 98.32 100 100
SWAY 41.87 14.29 68.00 49.58 50 100
Table 8
Pair-wise Mann–Whitney U Test for design improvement.
Pair p-value Cliff's δ Magnitude
Ant
ACO-RePOR 2.561349e−12 1 Large
GA-RePOR 1.431438e−11 1 Large
LIU-RePOR 1.685298e−14 1 Large
SWAY-RePOR 1.190193e−12 1 Large
ArgoUML
ACO-RePOR 1.176641e−12 1 Large
GA-RePOR 1.143381e−12 1 Large
LIU-RePOR 1.685298e−14 1 Large
SWAY-RePOR 1.206843e−12 1 Large
Gantt Project
ACO-RePOR 1.036681e−12 1 Large
GA-RePOR 1.086586e−12 1 Large
LIU-RePOR 1.685298e−14 1 Large
SWAY-RePOR 1.165138e−12 1 Large
JfreeChart
ACO-RePOR 0.06868602 0.2333333 Small
GA-RePOR 0.2771456 −0.1333333 Negligible
LIU-RePOR 1.685298e−14 1 Large
SWAY-RePOR 1.183399e−12 1 Large
Xerces
ACO-RePOR 1.0618e−12 1 Large
GA-RePOR 9.946555e−13 1 Large
LIU-RePOR 1.685298e−14 1 Large
SWAY-RePOR 1.193116e−12 1 Large
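For readers unfamiliar with the effect size reported in Table 8, Cliff's δ can be computed directly from two samples. The sketch below is a plain-Python illustration with made-up data; the magnitude thresholds (0.147, 0.33, 0.474) are the commonly used ones and are an assumption here, although they agree with the labels shown in the table.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (x, y)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))

def magnitude(delta):
    """Map |delta| to a label using commonly cited thresholds (assumed)."""
    d = abs(delta)
    if d < 0.147:
        return "Negligible"
    if d < 0.33:
        return "Small"
    if d < 0.474:
        return "Medium"
    return "Large"

# Hypothetical samples: every value in xs exceeds every value in ys.
xs, ys = [5, 6, 7], [1, 2, 3]
print(cliffs_delta(xs, ys), magnitude(cliffs_delta(xs, ys)))  # 1.0 Large
print(magnitude(0.2333333), magnitude(-0.1333333))            # Small Negligible
```

A δ of 1 means that every run of one approach beat every run of the other, which is what most rows of Table 8 report.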
R. Morales et al.
the rest of the systems.
We reject the null hypothesis H01 for Ant, ArgoUML, Gantt,
ble 7
sign Improvement (%) in general and for different anti-pattern types.
Metaheuristic DI DIBL DILC DILP DISC DISG
nt
CO 57.45 68.42 22.5 74.29 66.67 100
A 58.16 68.42 22.5 74.29 66.67 100
IU 58.87 54.39 22.5 100 66.67 100
ePOR 60.28 57.89 22.5 100 66.67 100
WAY 45.36 57.89 20 60 66.67 83.33
rgoUML
CO 75.93 51.15 100 83.63 100 100
A 76.59 51.15 100 84.7 100 100
IU 81.40 50.38 100 92.88 100 100
ePOR 81.62 38.93 100 98.58 100 100
WAY 62.91 48.09 84 66.01 100 86.84
antt Project
CO 60 17.02 100 83.82 70 100
A 60.77 14.89 100 85.29 80 100
IU 63.85 14.89 100 92.65 60 100
ePOR 66.15 8.51 75 100 100 100
WAY 50 8.51 100 70.59 60 100
freeChart
CO 75.4 39.02 100 89.52 100 100
A 75.4 39.02 100 90.32 100 100
IU 72.22 31.71 100 88.71 100 100
ePOR 75.4 24.39 100 100 100 100
WAY 61.90 36.59 90.48 73.39 100 100
erces
CO 56.59 14.29 100 65.55 100 100
A 57.56 14.29 100 67.23 100 100
IU 64.39 16.07 100 78.99 50 100
ePOR 73.17 5.36 100 98.32 100 100
WAY 41.87 14.29 68.00 49.58 50 100
Table 8
Pair-wise Mann–Whitney U Test for design improvement.
Pair −p value Cliff’s δ Magnitude
Ant
ACO-RePOR 2.561349e−12 1 Large
GA-RePOR 1.431438e−11 1 Large
LIU-RePOR 1.685298e−14 1 Large
SWAY-RePOR 1.190193e−12 1 Large
ArgoUML
ACO-RePOR 1.176641e−12 1 Large
GA-RePOR 1.143381e−12 1 Large
LIU-RePOR 1.685298e−14 1 Large
SWAY-RePOR 1.206843e−12 1 Large
Gantt Project
ACO-RePOR 1.036681e−12 1 Large
GA-RePOR 1.086586e−12 1 Large
LIU-RePOR 1.685298e−14 1 Large
SWAY-RePOR 1.165138e−12 1 Large
JfreeChart
ACO-RePOR 0.06868602 0.2333333 Small
GA-RePOR 0.2771456 −0.1333333 Negligible
LIU-RePOR 1.685298e−14 1 Large
SWAY-RePOR 1.183399e−12 1 Large
Xerces
ACO-RePOR 1.0618e−12 1 Large
GA-RePOR 9.946555e−13 1 Large
LIU-RePOR 1.685298e−14 1 Large
SWAY-RePOR 1.193116e−12 1 Large
Morales et al.

Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda

  • 1. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda (Seminar-workshop: Introduction to Search-Based Software Engineering). Francisco Chicano, Departamento de Lenguajes y Ciencias de la Computación.
  • 2. Universidad de Almería, 26 and 27 October 2020 (online). José Francisco Chicano García: chicano@lcc.uma.es, @francischicano, www.franciscochicano.es
  • 3. Schedule. Monday 26: 9:00-10:30 Introduction to SBSE and NRP; 10:30-10:45 break; 10:45-12:15 NRP (continued); 12:15-12:30 break; 12:30-14:00 Module clustering. Tuesday 27: 9:00-10:30 Test case minimization; 10:30-10:45 break; 10:45-12:15 Refactoring; 12:15-12:30 break; 12:30-14:00 Project scheduling and knowledge test. A short knowledge test will be held on Tuesday 27 in the last slot.
  • 4. Materials for following the workshop. Software: RStudio (online version at https://rstudio.cloud); Symphony (open-source ILP solver); Rsymphony (R package that connects to Symphony). Code and examples: available on GitHub at https://github.com/jfrchicanog/TallerUAL2020 and on RStudio Cloud at https://rstudio.cloud/project/1815713. Task: log in to RStudio and install Rsymphony.
  • 5. Outline: Introduction to SBSE; Next Release Problem (NRP); Integer Linear Programming; Multi-objective Optimization; Software Module Clustering; Test Case Minimization; Automatic Software Refactoring; Software Project Scheduling; Conclusion; Knowledge Test.
  • 7. Software Engineering
  • 8. Seminario-taller: Introducción a la Ingeniería del Software Guiada por Búsqueda Universidad de Almería, 26 y 27 de Octubre de 2020 (on-line) 8 Problemas de búsqueda Un problema de búsqueda es una relación binaria R ⊆ X×Y, tal que dado un x ∈ X (instancia) estamos interesados en encontrar y ∈ Y (solución) con (x,y) ∈ R Ejemplos de instancias de problemas de búsqueda: - Encontrar los factores primos de 15 - Encontrar una cadena que case con la expresión regular a*b - Encontrar un número real x que minimice la expresión (x-1)^2 Nos centraremos fundamentalmente en un subtipo de problemas de búsqueda: los problemas de optimización
  • 9. Optimization problems
      An optimization problem is a pair P = (S, f), where:
      - S is a set of solutions (the search space)
      - f: S → ℝ is an objective function to minimize or maximize
      If the goal is to minimize the function, we look for s' ∈ S such that f(s') ≤ f(s) for all s ∈ S.
      [Figure: a function curve with its global maximum, local maximum, global minimum and local minimum marked]
  • 10. Optimization algorithms
      Optimization techniques:
      • Exact
        - Calculus-based: gradient, Lagrange multipliers
        - Exhaustive: dynamic programming, branch and bound, ILP solver
      • Approximate
        - Ad hoc heuristics
        - Metaheuristics
          - Trajectory-based: SA (simulated annealing), VNS (variable neighborhood search), TS (tabu search)
          - Population-based: EA (evolutionary algorithms), ACO (ant colony optimization), PSO (particle swarm optimization)
      • Hybrids
  • 11. Search-Based Software Engineering (SBSE), in Spanish "Ingeniería del Software Guiada por Búsqueda"
      Search or optimization problem → Search or optimization algorithm → Solution
      [Figure: a function curve with its global and local maxima and minima marked]
  • 12. Search-Based Software Engineering (SBSE): problems covered
      • Next release requirements
      • Software module clustering
      • Test case minimization
      • Automatic refactoring
      • Project scheduling
  • 14. Next Release Problem (NRP)
      Given:
      • A set of requirements R = {r1, r2, ..., rn}, each with a cost c_j and a value s_j (in Bagnall et al. the value comes from the customers)
      • A set of functional interactions between requirements:
        - Implication or precedence (r_i before r_j), $r_i \Rightarrow r_j$: requirement r_j cannot be included unless r_i has been implemented previously
        - Combination or coupling (r_i together with r_j): r_i and r_j must be included jointly in the software
        - Exclusion (not together): r_i cannot be included together with r_j
      Find a subset of requirements that satisfies the interactions, minimizes the cost and maximizes the value.
      Each requirement r_j ∈ R has a cost c_j for the company. The value of requirement r_j for customer i is v_ij ∈ ℝ, and the value added by including r_j in the next release is the weighted sum $s_j = \sum_{i=1}^{m} w_i\, v_{ij}$.
      If X ⊆ R is the set of selected requirements, its cost and value are
      $\min\ \mathrm{cost}(X) = \sum_{j:\, r_j \in X} c_j \qquad \max\ \mathrm{value}(X) = \sum_{j:\, r_j \in X} s_j$
      and we consider a multi-objective version that minimizes the cost and maximizes the value of the selected set.
      In the customer-based version, a customer contributes its weight only when every requirement it requests is selected:
      $\mathrm{value}(\hat{R}) = \sum_{i=1}^{m} w_i \prod_{(j,i) \in Q} [\, j \in \hat{R}\, ]$
      where Q contains the pair (j, i) when customer i requests requirement j. As linear constraints over 0/1 variables: $s_j \le r_i\ \forall (i,j) \in Q$ and $r_j \le r_i\ \forall (i,j) \in P$.
      Formulations cited on the slide: Bagnall et al.; van der Akker et al.
  • 15. Next Release Problem (NRP): example
      Customers (importance): Customer 1 (4), Customer 2 (2), Customer 3 (5)
      Requirement | Cost | x marks in the customer columns
      r1          | 2    | x x
      r2          | 4    | x
      r3          | 3    | x x
      r4          | 5    | x
      To compute: cost({r1, r3}), value({r1, r3}), cost({r1, r2, r3}), value({r1, r2, r3})
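The cost and value computations asked for in this example can be sketched as follows. The costs and customer importances are the ones on the slide. Which customer requests which requirement is an assumption made here for illustration, since the column positions of the x marks do not survive in the slide text. The assignment below only respects the number of marks per row, so the printed values hold for this hypothetical assignment and not necessarily for the original slide.

```python
# Costs and customer importances as given on the example slide.
COST = {"r1": 2, "r2": 4, "r3": 3, "r4": 5}
WEIGHT = {"c1": 4, "c2": 2, "c3": 5}

# ASSUMPTION (hypothetical): the requests of each customer. Only the number
# of marks per requirement (2, 1, 2, 1) is taken from the slide.
REQUESTS = {"c1": {"r1", "r3"}, "c2": {"r1", "r4"}, "c3": {"r2", "r3"}}

def cost(selected):
    """Total cost of the selected requirements."""
    return sum(COST[r] for r in selected)

def value(selected):
    """Customer-based value: a customer counts only if all its requests are selected."""
    chosen = set(selected)
    return sum(w for c, w in WEIGHT.items() if REQUESTS[c] <= chosen)

for subset in ({"r1", "r3"}, {"r1", "r2", "r3"}):
    print(sorted(subset), cost(subset), value(subset))
```

Under the assumed requests, {r1, r3} costs 5 and satisfies only customer 1 (value 4), while {r1, r2, r3} costs 9 and satisfies customers 1 and 3 (value 9).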
  • 17. Introduction to linear programming
      A linear programming problem has the form
      $\max \sum_{j=1}^{n} c_j x_j$
      subject to: $\sum_{j=1}^{n} a_{ij} x_j \le b_i,\quad i = 1, 2, \ldots, m$
      $x_j \ge 0,\quad j = 1, 2, \ldots, n$
      or, in matrix form,
      $\max\ c \cdot x$ subject to: $Ax \le b,\ x \ge 0$
  • 18. Introduction to linear programming: example
      Maximize x1 + x2
      Subject to:
      -x1 + 9x2 ≤ 36
      9x1 + x2 ≤ 45
      x1, x2 ≥ 0
      [Figure: feasible region in the (x1, x2) plane, both axes from 0 to 5, with level lines x1 + x2 = constant]
  • 19. Introduction to linear programming
  • 20. Introduction to linear programming: with Rsymphony
      Maximize x1 + x2
      Subject to:
      -x1 + 9x2 ≤ 36
      9x1 + x2 ≤ 45
      x1, x2 ≥ 0
      By default, columns are filled first (R's matrix() fills its entries column by column).
      Task: solve the program with RStudio
      [Figure: feasible region in the (x1, x2) plane, both axes from 0 to 5]
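As a cross-check for the task above, the same program can be solved without a solver by enumerating the vertices of the feasible region. This works because a linear program with a bounded, non-empty feasible region attains its optimum at a vertex. The sketch is plain Python rather than the workshop's R/Rsymphony code, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a, b, c), meaning a*x1 + b*x2 <= c.
# The last two rows encode x1 >= 0 and x2 >= 0.
CONSTRAINTS = [(-1, 9, 36), (9, 1, 45), (-1, 0, 0), (0, -1, 0)]

def intersection(c1, c2):
    """Point where two constraint boundaries meet, or None if they are parallel."""
    (a1, b1, r1), (a2, b2, r2) = c1, c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    # Cramer's rule with exact rational arithmetic
    return Fraction(r1 * b2 - r2 * b1, det), Fraction(a1 * r2 - a2 * r1, det)

def feasible(point):
    x1, x2 = point
    return all(a * x1 + b * x2 <= c for a, b, c in CONSTRAINTS)

def solve_lp():
    """Return the feasible vertex that maximizes x1 + x2."""
    vertices = []
    for c1, c2 in combinations(CONSTRAINTS, 2):
        point = intersection(c1, c2)
        if point is not None and feasible(point):
            vertices.append(point)
    return max(vertices, key=lambda p: p[0] + p[1])

x1, x2 = solve_lp()
print(float(x1), float(x2), float(x1 + x2))
```

The optimum is the point where both main constraints are tight, x1 = x2 = 4.5, with objective value 9.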
  • 21. Integer linear programming
      We add the restriction that the variables can only take integer values.
      Example: Maximize x1 + x2
      Subject to:
      -x1 + 9x2 ≤ 36
      9x1 + x2 ≤ 45
      x1, x2 ≥ 0
      x1, x2 integer
      [Figure: the feasible solutions are the integer points of the region in the (x1, x2) plane, both axes from 0 to 5]
  • 22. Integer linear programming: with Rsymphony
      Maximize x1 + x2
      Subject to:
      -x1 + 9x2 ≤ 36
      9x1 + x2 ≤ 45
      x1, x2 ≥ 0
      x1, x2 integer
      Task: solve the program with RStudio
      [Figure: feasible integer solutions in the (x1, x2) plane, both axes from 0 to 5]
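Because the integer version has only a handful of candidate points, its solution can be checked by direct enumeration. This is a minimal sketch in plain Python, not the Rsymphony call used in the workshop. The search bounds follow from the constraints: 9x1 ≤ 45 gives x1 ≤ 5, and 9x2 ≤ 36 + x1 ≤ 41 gives x2 ≤ 4.

```python
def feasible(x1, x2):
    """Constraints of the example program (non-negativity is handled by the ranges)."""
    return -x1 + 9 * x2 <= 36 and 9 * x1 + x2 <= 45

def solve_ilp(upper=5):
    """Enumerate integer points in [0, upper]^2 and return the one maximizing x1 + x2."""
    candidates = [(x1, x2)
                  for x1 in range(upper + 1)
                  for x2 in range(upper + 1)
                  if feasible(x1, x2)]
    return max(candidates, key=lambda p: p[0] + p[1])

print(solve_ilp())
```

The integer optimum is (4, 4) with value 8. Rounding the fractional optimum (4.5, 4.5) up to (5, 5) would be infeasible, which is why the integrality restriction changes the problem and not only the answer.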
  • 25. ILP model of NRP: objective
      We will solve the single-objective version of Bagnall et al., with the cost limited to a fraction of the total cost of implementing all the requirements.
      We define n variables r_i for the requirements and m variables s_i for the customers, all taking values 0 or 1:
      - If r_i = 1, requirement i is implemented; if r_i = 0, it is not
      - If s_i = 1, customer i is satisfied (all of its requirements are implemented)
      - The value of customer i to the company is w_i
      - The cost of implementing requirement i is c_i
      - The budget is B
      General form of the program: $\max\ c \cdot x$ subject to $Ax \le b,\ x \ge 0$
      Task: find the objective expression
      $\max \sum_{i=1}^{m} w_i s_i$
  • 27. ILP model of NRP: cost constraint
      Task: find the cost constraint
      $\max \sum_{i=1}^{m} w_i s_i$ subject to $\sum_{i=1}^{n} c_i r_i \le B$
  • 29. ILP model of NRP: dependencies (implication)
      Task: find the constraints for dependencies between requirements (implication)
      $r_j \le r_i \quad \forall (i,j) \in P$
      where (i, j) ∈ P means that r_i must be implemented before r_j.
  • 31. ILP model of NRP: dependencies (combination)
      Task: find the constraints for dependencies between requirements (combination)
      $r_j = r_i \quad \forall (i,j) \in C$
      where (i, j) ∈ C means that r_i and r_j must be implemented together.
  • 33. ILP model of NRP: customer satisfaction
      Task: find the customer satisfaction constraints
      $s_j \le r_i \quad \forall (i,j) \in Q$
      where (i, j) ∈ Q means that customer j requests requirement i. The model shown on the slide is then:
      $\max \sum_{i=1}^{m} w_i s_i$
      subject to: $\sum_{i=1}^{n} c_i r_i \le B$
      $s_j \le r_i \quad \forall (i,j) \in Q$
      $r_j \le r_i \quad \forall (i,j) \in P$
  • 35. ILP model of NRP: implementation in R
      In the R implementation, the first n entries of the variable vector are the requirements and the remaining m entries are the customers.
      Relevant functions:
      • readNrpInstance(file): reads an instance file and returns a list with an internal representation
      • ilpModel(nrpInstance, budgetLimitFraction): takes a list holding an instance and a fraction (a real number) and builds an ILP model for the instance
      Task: solve some instances with R
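The variable layout described above (requirements first, customers after) can be illustrated with a small model builder. This is a plain Python sketch, not the workshop's `ilpModel` function. The names `ilp_model` and `solve_by_enumeration` are hypothetical, the instance reuses the example costs and importances, and the customer requests are an assumed assignment. Brute-force enumeration stands in for the ILP solver and is only viable for tiny instances.

```python
from itertools import product

def ilp_model(costs, weights, requests, precedences, budget_fraction):
    """Build (obj, rows, rhs) over variables ordered [r_1..r_n, s_1..s_m].

    requests: 0-based pairs (i, j), customer j requests requirement i (set Q).
    precedences: 0-based pairs (i, j), r_i must precede r_j (set P).
    Every row encodes the constraint row . x <= rhs.
    """
    n, m = len(costs), len(weights)
    obj = [0] * n + list(weights)            # maximize sum_i w_i * s_i
    rows = [list(costs) + [0] * m]           # sum_i c_i * r_i <= B
    rhs = [budget_fraction * sum(costs)]
    for i, j in requests:                    # s_j - r_i <= 0
        row = [0] * (n + m)
        row[n + j], row[i] = 1, -1
        rows.append(row)
        rhs.append(0)
    for i, j in precedences:                 # r_j - r_i <= 0
        row = [0] * (n + m)
        row[j], row[i] = 1, -1
        rows.append(row)
        rhs.append(0)
    return obj, rows, rhs

def solve_by_enumeration(obj, rows, rhs):
    """Try every 0/1 vector and keep the best feasible one."""
    best_val, best_x = None, None
    for x in product((0, 1), repeat=len(obj)):
        if all(sum(a * v for a, v in zip(row, x)) <= b for row, b in zip(rows, rhs)):
            val = sum(c * v for c, v in zip(obj, x))
            if best_val is None or val > best_val:
                best_val, best_x = val, x
    return best_val, best_x

# ASSUMPTION: hypothetical requests (requirement, customer) and no precedences.
Q = [(0, 0), (2, 0), (0, 1), (3, 1), (1, 2), (2, 2)]
model = ilp_model([2, 4, 3, 5], [4, 2, 5], Q, [], budget_fraction=0.5)
print(solve_by_enumeration(*model))
```

With half of the total cost as budget (B = 7) and the assumed requests, the best plan implements r2 and r3 and satisfies only customer 3, for a value of 5.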
  • 37. Multi-objective optimization
      In a multi-objective (MO) problem there are several objectives (functions) that we want to optimize.
      [Figure: objective space with axes f1 and f2, showing efficient (non-dominated) solutions, weakly efficient solutions, a non-supported solution and a dominated solution]
  • 38. Multi-objective optimization
      If we minimize both objectives:
      - Convex front: easy to solve with weighted sums of the objectives
      - Concave front: cannot be solved with weighted sums of the objectives
      [Figure: two plots with axes f1 and f2, one with a convex front and one with a concave front]
  • 39. What will the front look like in NRP?
      [Figure: two candidate plots, one with cost against value and one with value against cost]
  • 40. Some examples
      [Figure 1: Pareto front and approximations obtained by the metaheuristic algorithms ACS, NSGAII and GRASP, plotted as Value against Cost, for (a) dataset1 and (b) dataset2]
      Paper text visible next to the figure: "We must note that these times again refer to a different machine (Pentium 4 at 3.2 GHz) and the goal was not to find the complete front, ..."
      Source: C., Domínguez-Ríos, del Águila, del Sagrado, Alba, JISBD 2016
  • 41. Multi-objective NRP
      Task: find the Pareto front by hand for our example
      Customers (importance): Customer 1 (4), Customer 2 (2), Customer 3 (5)
      Requirement | Cost | x marks in the customer columns
      r1          | 2    | x x
      r2          | 4    | x
      r3          | 3    | x x
      r4          | 5    | x
  • 42. Multi-objective NRP
      Task: compute the front using R for the same example (customer importances 4, 2, 5; requirement costs 2, 4, 3, 5)
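For an instance this small, the Pareto front can be obtained by evaluating all 16 subsets of requirements and discarding the dominated points. The sketch below is plain Python, not the workshop's R code. As before, the costs and importances come from the slide, while the customer requests are an assumed assignment, so the resulting front is only illustrative.

```python
from itertools import combinations

COST = {"r1": 2, "r2": 4, "r3": 3, "r4": 5}
WEIGHT = {"c1": 4, "c2": 2, "c3": 5}
# ASSUMPTION (hypothetical): the requests are not recoverable from the slide text.
REQUESTS = {"c1": {"r1", "r3"}, "c2": {"r1", "r4"}, "c3": {"r2", "r3"}}

def evaluate(subset):
    """Return the (cost, value) pair of a set of requirements."""
    chosen = set(subset)
    cost = sum(COST[r] for r in chosen)
    value = sum(w for c, w in WEIGHT.items() if REQUESTS[c] <= chosen)
    return cost, value

def pareto_front():
    """Non-dominated (cost, value) points: lower cost and higher value are better."""
    reqs = sorted(COST)
    points = {evaluate(s) for k in range(len(reqs) + 1) for s in combinations(reqs, k)}

    def dominated(p):
        # q dominates p if it is no worse in both objectives and differs from p
        return any(q != p and q[0] <= p[0] and q[1] >= p[1] for q in points)

    return sorted(p for p in points if not dominated(p))

print(pareto_front())
```

Under the assumed requests the front has five points, from the empty release (cost 0, value 0) to the full release (cost 14, value 11).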
  • 44. Software module clustering
      We want to find a partition of a set of software modules such that the software is structured into subsystems that improve its development and maintainability.
  • 45. Software module clustering
      How to measure the quality of the solution obtained:
      - Intra-connectivity: measures the cohesion among modules that belong to the same subsystem.
      - Inter-connectivity: measures the coupling between modules that belong to different subsystems.
      The modularization quality of the system (Modularization Quality, MQ) combines both.
  • 46. Software module clustering
      Given a module dependency graph G = (V, A), we define a weight w for each edge. We call n the number of nodes (modules) and m the number of edges (relations or dependencies).
      The modularization quality of the system is defined as
      $MQ = \sum_{k} MF_k, \qquad MF_k = 0 \text{ if } i = 0, \qquad MF_k = \dfrac{i}{i + \frac{1}{2} j} \text{ otherwise}$
      where the sum runs over the subsystems and, for each subsystem:
      - The value i (intra-connectivity) is the sum of the weights of the edges whose two endpoints are inside the subsystem. It measures cohesion.
      - The value j (inter-connectivity) is the sum of the weights of the edges with one endpoint in the subsystem and the other outside it. It measures coupling.
  • 47. Software module clustering: example
      Task: find MQ
      $MF_1 = MF_3 = MF_6 = MF_7 = MF_8 = 0$
      $MF_5 = \dfrac{1}{1 + \frac{1}{2} \times 2} = \dfrac{1}{2} \qquad MF_2 = \dfrac{2}{2 + \frac{1}{2} \times 3} = \dfrac{4}{7} \qquad MF_4 = \dfrac{3}{3 + \frac{1}{2} \times 1} = \dfrac{6}{7}$
      $MQ = \dfrac{1}{2} + \dfrac{4}{7} + \dfrac{6}{7} = \dfrac{27}{14} = 1.928571\ldots$
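The arithmetic of this example can be reproduced with exact fractions. The sketch below assumes the modularization factor takes the form i / (i + j/2), which is what the worked numbers on the slide imply. The small graph used with `mq` is hypothetical, because the graph of the slide is an image and is not available in the text.

```python
from fractions import Fraction

def mf(intra, inter):
    """Modularization factor of one subsystem: i / (i + j/2), or 0 when i == 0."""
    if intra == 0:
        return Fraction(0)
    return Fraction(2 * intra, 2 * intra + inter)

def mq(edges, assignment):
    """MQ of a partition. edges: (u, v, weight); assignment: node -> subsystem id."""
    intra = {g: 0 for g in set(assignment.values())}
    inter = {g: 0 for g in intra}
    for u, v, w in edges:
        gu, gv = assignment[u], assignment[v]
        if gu == gv:
            intra[gu] += w
        else:
            # an edge between subsystems counts toward the coupling of both
            inter[gu] += w
            inter[gv] += w
    return sum(mf(intra[g], inter[g]) for g in intra)

# (i, j) pairs of the three non-trivial subsystems in the slide's example
print(mf(1, 2) + mf(2, 3) + mf(3, 1))

# Hypothetical 4-module chain a-b-c-d with unit weights, split into {a, b} and {c, d}
edges = [("a", "b", 1), ("b", "c", 1), ("c", "d", 1)]
print(mq(edges, {"a": 0, "b": 0, "c": 1, "d": 1}))
```

The first line reproduces 27/14 from the slide. On the hypothetical chain, putting every module in one subsystem gives MQ = 1 and putting each module alone gives MQ = 0, which matches two of the special cases discussed for the model.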
  • 49. Software module clustering: questions
      - What is the value of MQ if all the modules are in different groups?
      - What is the value of MQ if all the modules are in the same group?
      - What is the maximum value that MQ can take?
  • 50. Software module clustering: solving
      The number of partitions of a set of n elements is a Bell number:
      1, 1, 2, 5, 15, 52, 203, 877, 4140, 21147, 115975, …
      This grows very fast! Enumerative algorithms are infeasible when there are many modules.
      The problem is nonlinear (which rules out integer linear programming).
      - Exact algorithms: branch and bound
      - Approximate algorithms: heuristics and metaheuristics
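The sequence above can be regenerated with the Bell triangle, which makes the growth rate easy to explore for larger n. This is a minimal Python sketch. The function name `bell_numbers` is hypothetical.

```python
def bell_numbers(count):
    """Return the first `count` Bell numbers B_0, B_1, ... using the Bell triangle."""
    bells, row = [], [1]
    for _ in range(count):
        bells.append(row[0])
        # each new row starts with the last entry of the previous row;
        # every further entry adds the entry above-left to the entry on its left
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return bells

print(bell_numbers(11))
print(bell_numbers(16)[15])
```

The second call gives the number of partitions of a set of 15 modules, already above one billion, which is why plain enumeration stops being practical so quickly.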
  • 51. Software module clustering: solving
      Analysis of the model:
      - If n = 1, MQ* = 0
      - If n = 2, MQ* = 1
      - If all the nodes are isolated, MQ = 0
      - If there is a single subsystem (and more than one node), MQ = 1
      - For k subsystems and n-k subsystems: MQ ≤ k
      - Experimentally, the value MQ* is usually low compared with the number of modules
      - For fixed k, if there is a large difference in cardinality between the largest and the smallest group, a lower value of MQ is obtained
      Format of a solution: $x^* = (1, 2, 1, 3, 1, 2)$
      $MF_i \in [0, 1]$
      Why?
  • 52. Software module clustering: solving
  • 53. Software module clustering: solving
      Instance | MQ      | Enumerative: solutions visited | Enumerative: time (s) | B&B: solutions visited | B&B: time (s)
      MDG 8    | 1.92857 | 4140                           | 0.09                  | 6                      | 0.10
      MDG 10   | 2.5     | 115975                         | 0.14                  | 11                     | 0.13
      MDG 15   | 2.812   | 1382958545                     | 226.00                | 24                     | 23.00
      mtunis: 2.314*, 2.314*, 121.00*
      * Value obtained by the best heuristic algorithm of Praditwong et al.
  • 55. Test Suite Minimization
      Given:
      • A set of test cases T = {t1, t2, ..., tn}
      • A set of software elements to be covered (e.g., use cases) E = {e1, e2, ..., ek}
      • A coverage matrix M:
             e1  e2  e3  ...  ek
        t1    1   0   1   …   1
        t2    0   0   1   …   0
        …     …   …   …   …   …
        tn    1   1   0   …   0
      Find a subset of tests X ⊆ T maximizing coverage and minimizing the testing cost.
      Background from the paper excerpt shown on the slide (the excerpt writes m for the number of elements): when a piece of software is modified, the new software is tested using previous test cases in order to check whether new errors were introduced. This is known as regression testing. One problem related to regression testing is the Test Suite Minimization Problem (TSMP). This problem is equivalent to the Minimal Hitting Set Problem, which is NP-hard [17]. Let T = {t1, t2, ..., tn} be a set of tests for a program, where the cost of running test t_i is c_i, and let E = {e1, e2, ..., em} be a set of elements of the program that we want to cover with the tests. After running all the tests in T we find that each test can cover several program elements. This information is stored in a matrix M = [m_ij] of dimension n × m, defined as m_ij = 1 if element e_j is covered by test t_i and m_ij = 0 otherwise.
      The single-objective version of this problem consists in finding a subset of tests X ⊆ T with minimum cost covering all the program elements. In formal terms:
      $\text{minimize } \mathrm{cost}(X) = \sum_{i:\, t_i \in X} c_i$ (2)
      subject to: $\forall e_j \in E,\ \exists t_i \in X$ such that element e_j is covered by test t_i, that is, $m_{ij} = 1$.
      The multi-objective version of the TSMP does not impose the constraint of full coverage, but it defines the coverage as the second objective to optimize, leading to a bi-objective problem. In short, the bi-objective TSMP consists in finding a subset of tests X ⊆ T having minimum cost and maximum coverage. Formally:
      $\text{minimize } \mathrm{cost}(X) = \sum_{i:\, t_i \in X} c_i$ (3)
      $\text{maximize } \mathrm{cov}(X) = |\{ e_j \in E \mid \exists t_i \in X \text{ with } m_{ij} = 1 \}|$ (4)
  • 56. Test Suite Minimization: example
      We show through a small example how to model the TSMP with PB constraints, following the methodology described above. Let E = {e1, e2, e3, e4} and M:
             e1  e2  e3  e4
        t1    1   0   1   0
        t2    1   1   0   0
        t3    0   0   1   0
        t4    1   0   0   0
        t5    1   0   0   1
        t6    0   1   1   0
      Instantiating the constraints for the 2-obj TSMP (Eqs. (5), (6), ... in the source paper) gives:
      $e_1 \le t_1 + t_2 + t_4 + t_5 \le 4 e_1$ (10)
      $e_2 \le t_2 + t_6 \le 4 e_2$ (11)
      $e_3 \le t_1 + t_3 + t_6 \le 4 e_3$ (12)
      $e_4 \le t_5 \le 4 e_4$ (13)
      Assume unitary cost for tests: c_i = 1
      To compute: cost({t1, t5}), cov({t1, t5}), cost({t1, t2, t5}), cov({t1, t2, t5})
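The four quantities asked for in this example follow directly from the coverage matrix, and a few lines of Python are enough to check them. The matrix and the unit costs are the ones on the slide. The function names `cost` and `cov` mirror the notation of the formulas and are otherwise hypothetical.

```python
# Coverage matrix of the example: row = test, column = element e1..e4.
M = {
    "t1": [1, 0, 1, 0],
    "t2": [1, 1, 0, 0],
    "t3": [0, 0, 1, 0],
    "t4": [1, 0, 0, 0],
    "t5": [1, 0, 0, 1],
    "t6": [0, 1, 1, 0],
}

def cost(tests):
    """Unit cost per test, as assumed on the slide (c_i = 1)."""
    return len(tests)

def cov(tests):
    """Number of elements covered by at least one of the selected tests."""
    return sum(1 for j in range(4) if any(M[t][j] for t in tests))

for subset in ({"t1", "t5"}, {"t1", "t2", "t5"}):
    print(sorted(subset), cost(subset), cov(subset))
```

Adding t2 to {t1, t5} raises the cost from 2 to 3 and the coverage from 3 to 4 elements, so neither of the two subsets dominates the other.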
  • 57. Modelling the TSM Problem using ILP
    Problem statement (paper excerpt shown on the slide):
    Previous test cases are run again in order to check if new errors were introduced. This is known as regression testing. One problem related to regression testing is the Test Suite Minimization Problem (TSMP). This problem is equivalent to the Minimal Hitting Set Problem, which is NP-hard [17].
    Let $T = \{t_1, t_2, \dots, t_n\}$ be a set of tests for a program, where the cost of running test $t_i$ is $c_i$, and let $E = \{e_1, e_2, \dots, e_m\}$ be a set of elements of the program that we want to cover with the tests. After running all the tests in T we find that each test can cover several program elements. This information is stored in a matrix $M = [m_{ij}]$ of dimension $n \times m$ defined as:
    \[ m_{ij} = \begin{cases} 1 & \text{if element } e_j \text{ is covered by test } t_i \\ 0 & \text{otherwise} \end{cases} \]
    The single-objective version of the problem consists in finding a subset of tests $X \subseteq T$ with minimum cost covering all the program elements. In formal terms:
    \[ \text{minimize } \mathrm{cost}(X) = \sum_{\substack{i=1 \\ t_i \in X}}^{n} c_i \]
    subject to: $\forall e_j \in E,\ \exists t_i \in X$ such that element $e_j$ is covered by test $t_i$, that is, $m_{ij} = 1$.
    The multi-objective version of the TSMP does not impose the constraint of full coverage; instead, it defines coverage as a second objective to optimize, leading to a bi-objective problem. In short, the bi-objective TSMP consists in finding a subset of tests $X \subseteq T$ having minimum cost and maximum coverage.
    Coverage matrix M:
            e1  e2  e3  ...  em
        t1   1   0   1  ...   1
        t2   0   0   1  ...   0
        ...
        tn   1   1   0  ...   0
    Let us use n Boolean variables $t_i$ and m Boolean variables $e_j$:
    - $t_i = 1$ iff test i is selected
    - $e_j = 1$ iff element j is covered (it depends on the $t_i$)
    $c_i$ is the cost of test $t_i$.
    Task: constraints relating covered elements and tests.
    Translation (paper excerpt): the single-objective formulation of TSMP is a particular case of the bi-objective formulation, so we can translate the 2-obj TSMP and then infer the translation of the 1-obj TSMP. We introduce n binary variables $t_i \in \{0, 1\}$: if $t_i = 1$ the corresponding test case is included; otherwise it is not. We also introduce m binary variables $e_j$, one for each program element to cover: $e_j = 1$ if the element is covered by one of the selected test cases, and $e_j = 0$ if it is not. The values of the $e_j$ variables are not independent of the $t_i$ variables: $e_j$ must be 1 if and only if there exists a $t_i$ with $m_{ij} = 1$ and $t_i = 1$. This dependence is expressed by the following 2m PB constraints:
    \[ e_j \le \sum_{i=1}^{n} m_{ij} t_i \le n \cdot e_j, \qquad 1 \le j \le m. \]
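The claim that the double inequality forces e_j to equal the logical OR of the selected tests covering e_j can be checked exhaustively for a small column. The following is a minimal Python sketch under that reading; `linked_values` and `check_column` are hypothetical helper names, not part of the workshop material.

```python
from itertools import product


def linked_values(m_col, t, n):
    """Values e in {0, 1} that satisfy e <= sum_i m_ij * t_i <= n * e."""
    s = sum(m * ti for m, ti in zip(m_col, t))
    return [e for e in (0, 1) if e <= s <= n * e]


def check_column(m_col):
    """True if, for every test selection, the only feasible e_j is 'covered or not'."""
    n = len(m_col)
    for t in product((0, 1), repeat=n):
        covered = int(any(m and ti for m, ti in zip(m_col, t)))
        if linked_values(m_col, t, n) != [covered]:
            return False
    return True


# Column e1 of the six-test example: covered by t1, t2, t4 and t5.
print(check_column([1, 1, 0, 1, 1, 0]))
```

The left inequality rules out e_j = 1 when no selected test covers the element; the right one rules out e_j = 0 when at least one does.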
  • 59. Modelling the TSM Problem using ILP
    Task: expression for coverage.
    The linking constraints between tests and elements are
    \[ e_j \le \sum_{i=1}^{n} m_{ij} t_i \le n \cdot e_j, \qquad 1 \le j \le m. \]
    If the sum in the middle is zero (no selected test covers element $e_j$) then $e_j = 0$. However, if the sum is greater than zero then $e_j = 1$. The coverage of a selection is therefore $\sum_{j=1}^{m} e_j$.
    To transform the optimization problem into a decision problem (as described in Section 2.2 of the source paper), a bounding constraint is introduced for each objective. For coverage:
    \[ \sum_{j=1}^{m} e_j \ge P, \]
    where $P \in \{0, 1, \dots, m\}$ is the minimum required coverage.
  • 61. Modelling the TSM Problem using ILP
    Task: expression for cost.
    With $t_i = 1$ iff test i is selected, the cost of a selection is $\sum_{i=1}^{n} c_i t_i$. The corresponding bounding constraint is
    \[ \sum_{i=1}^{n} c_i t_i \le B, \]
    where $B \in \mathbb{Z}$ is the maximum allowed cost.
  • 63. Example
    Task: find the equations for this example.
    E = {e1, e2, e3, e4} and M:
            e1  e2  e3  e4
        t1   1   0   1   0
        t2   1   1   0   0
        t3   0   0   1   0
        t4   1   0   0   0
        t5   1   0   0   1
        t6   0   1   1   0
    For the 2-obj TSMP we instantiate the linking constraints and the two bounding constraints. With n = 6 tests the result is:
    \[ \begin{aligned}
    e_1 &\le t_1 + t_2 + t_4 + t_5 \le 6 e_1 \\
    e_2 &\le t_2 + t_6 \le 6 e_2 \\
    e_3 &\le t_1 + t_3 + t_6 \le 6 e_3 \\
    e_4 &\le t_5 \le 6 e_4 \\
    t_1 + t_2 + t_3 + t_4 + t_5 + t_6 &\le B \quad \text{(cost, to minimize)} \\
    e_1 + e_2 + e_3 + e_4 &\ge P \quad \text{(coverage, to maximize)}
    \end{aligned} \]
    where $P, B \in \mathbb{N}$. The paper excerpt on the slide uses the factor 4 instead of 6 in the upper bounds; that also works here, because no element is covered by more than four tests.
    For the 1-obj version (full coverage required, cost bounded by $f(x) \le B$) the formulation is:
    \[ \begin{aligned}
    t_1 + t_2 + t_4 + t_5 &\ge 1 \\
    t_2 + t_6 &\ge 1 \\
    t_1 + t_3 + t_6 &\ge 1 \\
    t_5 &\ge 1 \\
    t_1 + t_2 + t_3 + t_4 + t_5 + t_6 &\le B
    \end{aligned} \]
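For an instance this small, the 1-obj version can be solved by enumeration, which is a useful sanity check on the constraints. The following Python sketch assumes unit costs (so minimum cost equals minimum number of tests) and is a brute-force illustration, not an ILP solver and not the workshop's R code.

```python
from itertools import combinations

# Rows are t1..t6, columns are e1..e4 (the matrix M of the example).
M = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]


def covers_all(subset):
    """True if every element has at least one selected test covering it."""
    return all(any(M[i][j] for i in subset) for j in range(len(M[0])))


def min_cost_full_cover():
    """Smallest subset of test indices with full coverage (unit costs assumed)."""
    for size in range(len(M) + 1):
        for subset in combinations(range(len(M)), size):
            if covers_all(subset):
                return subset
    return None  # some element is not covered by any test


best = min_cost_full_cover()
print([f"t{i + 1}" for i in best])
```

Test t5 is forced by the constraint t5 >= 1, and t6 is the only single test that covers both e2 and e3, so two tests suffice.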
  • 65. Algorithm for Solving the 2-obj TSM
    [Figure: objective space with axes Cost and Coverage; the point of maximum coverage is marked.]
    1. Find the maximum coverage.
    2. Minimize the cost while keeping that coverage.
    3. Decrease the cost bound and find the maximum coverage again; repeat steps 2 and 3 until no cheaper solution exists.
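The loop on this slide can be sketched in Python as follows. This is a hedged illustration: the two single-objective subproblems are solved by brute-force enumeration instead of an ILP solver, costs are assumed to be integers (so "decrease cost" means lowering the bound by one unit), and all names are illustrative.

```python
from itertools import combinations

M = [  # rows t1..t6, columns e1..e4
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
COSTS = [1] * len(M)  # unit costs; any non-negative integers would do


def all_subsets():
    for size in range(len(M) + 1):
        yield from combinations(range(len(M)), size)


def cost(subset):
    return sum(COSTS[i] for i in subset)


def cov(subset):
    return sum(1 for j in range(len(M[0])) if any(M[i][j] for i in subset))


def max_coverage(cost_bound):
    """Stand-in for the ILP 'maximize coverage subject to cost <= bound'."""
    return max(cov(s) for s in all_subsets() if cost(s) <= cost_bound)


def min_cost(cov_bound):
    """Stand-in for the ILP 'minimize cost subject to coverage >= bound'."""
    return min(cost(s) for s in all_subsets() if cov(s) >= cov_bound)


def pareto_front():
    front = []
    bound = sum(COSTS)            # start with no effective cost restriction
    while bound >= 0:
        c = max_coverage(bound)   # find max coverage
        k = min_cost(c)           # min cost, keeping that coverage
        front.append((k, c))
        bound = k - 1             # decrease cost and repeat
    return front


print(pareto_front())
```

Each iteration yields one non-dominated (cost, coverage) pair, so the number of solver calls is twice the size of the front.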
  • 66. TSM Instances
    Instances from the Software-artifact Infrastructure Repository (SIR): http://sir.unl.edu/portal/index.php
        Instance       Tests   Elements to cover
        printtokens1   4130    189
        printtokens2   4115    199
        replace        5542    242
        schedule       2650    151
        schedule2      2710    128
        tcas           1608     65
        totinfo        1052    124
  • 67. Exercise
    In the R implementation, the first n variables of the variable vector are used for the tests and the remaining m variables for the elements to cover.
    Relevant functions:
    • readTsmInstance(file, unitaryCost=FALSE): reads an instance file and returns a list with an internal representation
    • ilpModel4Tsm(tsmInstance, costUpperBound=NULL, covLowerBound=NULL): takes an instance and a bound on either cost or coverage and creates an ILP model for the instance that optimizes the objective that is not bounded
    • solveModel(model): solves the ILP model passed as a parameter
    Task: solve some instances with R.
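The variable layout described on this slide (n test variables first, then m element variables) determines what each constraint row looks like. The following Python sketch builds the 2m linking rows in "row · x <= 0" form for that layout; it is an assumption-laden illustration and does not reproduce what `ilpModel4Tsm` actually does internally. `linking_rows` and `satisfies` are hypothetical names.

```python
def linking_rows(M):
    """Rows for e_j - sum_i m_ij t_i <= 0 and sum_i m_ij t_i - n e_j <= 0.

    Variable order: x = (t_1, ..., t_n, e_1, ..., e_m).
    """
    n, m = len(M), len(M[0])
    rows = []
    for j in range(m):
        col = [M[i][j] for i in range(n)]
        lower = [-c for c in col] + [1 if k == j else 0 for k in range(m)]
        upper = col + [-n if k == j else 0 for k in range(m)]
        rows.append(lower)
        rows.append(upper)
    return rows


def satisfies(rows, x):
    """True if every row gives row . x <= 0."""
    return all(sum(a * v for a, v in zip(row, x)) <= 0 for row in rows)


M = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
rows = linking_rows(M)
print(len(rows), len(rows[0]))
print(rows[0])
print(rows[1])
```

Selecting t5 and t6 with all four e_j set to 1 satisfies every row, whereas claiming coverage of e1 with no test selected violates the first row.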
  • 68. Exercise
    Complete the function computeParetoFront so that it computes the complete Pareto front of an instance.
    Task: complete computeParetoFront.
  • 69. Reduction in the Number of Test Cases
    We can reduce the number of test cases in the original test suite: if a test covers at least the same elements as another test and does not cost more, the other test can be removed.
            e1  e2  e3  ...  em
        t1   1   0   0  ...   1
        t2   1   0   1  ...   1
        ...
        tn   1   1   0  ...   0
    Here t2 covers every element that t1 covers, so test t1 can be removed if c1 >= c2.
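The dominance rule on this slide can be sketched as a pairwise filter. This Python version is a plain O(n²) illustration, not the workshop's `reduceInstance`; when two tests are identical in coverage and cost it keeps exactly one of them.

```python
def reduce_tests(M, costs):
    """Indices of tests that survive after removing dominated ones.

    Test a is dominated by test b if b covers every element a covers
    and costs[b] <= costs[a].
    """
    n = len(M)
    removed = set()
    for a in range(n):
        for b in range(n):
            if a == b or a in removed or b in removed:
                continue
            covers_superset = all(M[b][j] >= M[a][j] for j in range(len(M[a])))
            if covers_superset and costs[b] <= costs[a]:
                removed.add(a)
    return [i for i in range(n) if i not in removed]


M = [
    [1, 0, 1, 0],  # t1
    [1, 1, 0, 0],  # t2
    [0, 0, 1, 0],  # t3: dominated by t1
    [1, 0, 0, 0],  # t4: dominated by t1
    [1, 0, 0, 1],  # t5
    [0, 1, 1, 0],  # t6
]
kept = reduce_tests(M, [1] * 6)
print([f"t{i + 1}" for i in kept])
```

Skipping dominators that were already removed is what prevents two identical tests from eliminating each other.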
  • 70. Reduction in the Number of Test Cases
    Task: complete the table with the help of reduceInstance.
        Instance       Tests   Reduced tests
        printtokens1   4130
        printtokens2   4115
        replace        5542
        schedule       2650
        schedule2      2710
        tcas           1608
        totinfo        1052
    How long does it take now to compute the Pareto front? Is it the same?
  • 72. Refactoring
    Semantic-preserving change in the code.
    Book excerpt shown on the slide:
    G29: Avoid Negative Conditionals
    Negatives are just a bit harder to understand than positives. So, when possible, conditionals should be expressed as positives. For example:
        if (buffer.shouldCompact())
    is preferable to
        if (!buffer.shouldNotCompact())
    G30: Functions Should Do One Thing
    It is often tempting to create functions that have multiple sections that perform a series of operations. Functions of this kind do more than one thing, and should be converted into many smaller functions, each of which does one thing. For example:
        public void pay() {
          for (Employee e : employees) {
            if (e.isPayday()) {
              Money pay = e.calculatePay();
              e.deliverPay(pay);
            }
          }
        }
    This bit of code does three things. It loops over all the employees, checks to see whether each employee ought to be paid, and then pays the employee. This code would be better written as:
        public void pay() {
          for (Employee e : employees)
            payIfNecessary(e);
        }
        private void payIfNecessary(Employee e) {
          if (e.isPayday())
            calculateAndDeliverPay(e);
        }
        private void calculateAndDeliverPay(Employee e) {
          Money pay = e.calculatePay();
          e.deliverPay(pay);
        }
    Each of these functions does one thing. (See "Do One Thing" on page 35.)
    G31: Hidden Temporal Couplings
    Temporal couplings are often necessary, but you should not hide the coupling.
  • 73. Anti-pattern
    A common solution to a problem that has bad consequences.
  • 74. Automatic Refactoring
    Book excerpt shown on the slide:
    Boolean logic is hard enough to understand without having to see it in the context of an if or while statement. Extract functions that explain the intent of the conditional. For example:
        if (shouldBeDeleted(timer))
    is preferable to
        if (timer.hasExpired() && !timer.isRecurrent())
    The excerpt continues with guidelines G29 (Avoid Negative Conditionals), G30 (Functions Should Do One Thing) and G31 (Hidden Temporal Couplings), the same text as on slide 72.
  • 75. Example: are all permutations relevant?
    Excerpts from R. Morales et al. shown on the slide.

    Approach. The refactoring process comprises (1) the detection of classes that contain anti-patterns; (2) the generation of refactoring candidates to improve the design quality of the classes detected in (1); (3) the search for an optimal refactoring order; and (4) the application of the refactoring order from (3). To achieve this goal, the authors propose a new heuristic approach called RePOR (Refactoring approach based on Partial Order Reduction). Partial order reduction is a popular technique for controlling state space explosion in model checking (Lluch-Lafuente et al., 2002). The intuition is to reduce the number of refactoring sequences to be explored by removing equivalent sequences (i.e., refactoring sequences that lead to the same design). As a result, less search effort is required than when using metaheuristic algorithms. To evaluate RePOR, the authors conduct a series of experiments over a testbed of five open source software systems (OSS) and compare the results with a Genetic Algorithm (GA) (Holland, 1975), Ant Colony Optimization (ACO) (Dorigo et al., 2006), the conflict-aware refactoring scheduling approach proposed by Liu et al. (2008) (referred to as LIU in the paper), and a new optimizer based on sampling (SWAY) (Chen et al., 2018). They report that the solutions obtained by RePOR outperform those obtained by these techniques in terms of performance (i.e., execution time) and effort (i.e., number of refactorings applied). The Eclipse plug-in and all the data used in the experiments are available in the RePOR replication package (Morales et al., 2017b).

    Formulation of the refactoring scheduling problem. As a software system ages, its design quality deteriorates unless it is continually maintained (Parnas, 1994). Refactoring is a software maintenance activity that aims to keep the design quality of a software system at an acceptable level, in order to ensure a normal evolution of the system. Typically, refactoring is performed by applying small transformation operations (e.g., moving a method/field to another class) to a software system while preserving its original behavior. Since there is a wide range of candidate refactorings that can be applied to a system, an optimal solution may comprise several refactorings that improve different quality attributes. Hence, the refactoring scheduling problem consists of finding the best combination of refactorings that maximizes the design quality improvement of a software system. The problem of finding an optimal order can be solved using search-based techniques. Search algorithms start by generating one or more random sequences. Next, the quality of each sequence is computed by applying it to the software system.

    Size of the search space. According to Morales et al., the number of refactoring sequences S that can be generated from n available refactorings is given by Eq. (2):
    \[ S = \begin{cases} \lfloor e \cdot n! \rfloor & \forall n \ge 1 \\ 1 & n = 0 \end{cases} \]
    where e is the Euler constant and n is the number of refactorings available. For two refactorings A and B, $\lfloor e \cdot 2! \rfloor = 5$: <>, <A>, <B>, <A, B>, <B, A>. This count holds only if each permutation leads to a different solution (here, solution refers to the design that results from applying a sequence to a system); if <A, B> and <B, A> lead to the same design, only 4 different solutions exist.

    Table 1. List of refactoring candidates for the example from Listing 1.
        ID   Type                         Source class   Method                     Target class
        r1   Move method                  Geometry       calcAreaRectangle          Rectangle
        r2   Inline Class                 Rectangle      All fields and methods     Shape
        r3   Introduce Parameter Object   Geometry       longParameterListMethod    GeometryParamObj (new)

    Table 2. Enumeration of possible refactoring sequences for the set of refactoring operations {r1, r2, r3}.
        Sequence   Elements        Sequence   Elements
         1.        None             9.        r3, r1
         2.        r1              10.        r3, r2
         3.        r2              11.        r1, r2, r3
         4.        r3              12.        r1, r3, r2
         5.        r1, r2          13.        r2, r1, r3
         6.        r1, r3          14.        r2, r3, r1
         7.        r2, r1          15.        r3, r2, r1
         8.        r2, r3          16.        r3, r1, r2

    Reducing the search space. Permutations of unrelated refactorings can be removed, as they lead to the same design (same solution). This occurs because they affect different code segments (the method and target class are different for r1 and r3), i.e., they are unrelated. In addition, when a conflict exists between refactorings, it is possible to reduce the size of the search space further. For example, consider the sequential dependency conflict between r1 and r2: r2 cannot be applied before r1 (inlining class Rectangle invalidates any move method refactoring from/to that class). Hence, by removing redundant solutions and invalid solutions (solutions with conflicting elements), the search-space size of the motivating example is reduced by half (sequences 1, 2, 3, 4, 5, 6, 8 and 11 remain). Thus, the value obtained from Eq. (2) should be used as an upper bound of the search-space size.

    Grouping by class. When refactorings affect the same class, the resulting design may vary depending on the order in which they are applied. The relationship between refactorings can be represented as an undirected graph GB in which two refactorings $r_u, r_v$ are linked when both belong to $R_k$ for some class $k \in K$, where $R_k$ is the set of refactorings that affect class k. GB is used to find connected components; a connected component is a maximal subgraph in which any two vertices are connected by a path. As in partial order reduction from model checking, the order of independent operations is meaningless, so it is enough to consider just one ordering of them, which yields a reduced state graph.
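Eq. (2) can be checked against Table 2 by enumerating every ordered selection of distinct refactorings. The following Python sketch does that; it relies on floating-point arithmetic, so it is only meant for small n, and the function names are illustrative.

```python
import math
from itertools import permutations


def search_space_size(n):
    """Eq. (2): floor(e * n!) for n >= 1, and 1 for n = 0."""
    return 1 if n == 0 else math.floor(math.e * math.factorial(n))


def enumerate_sequences(ops):
    """All ordered sequences of distinct operations, including the empty one."""
    seqs = []
    for k in range(len(ops) + 1):
        seqs.extend(permutations(ops, k))
    return seqs


sequences = enumerate_sequences(["r1", "r2", "r3"])
print(search_space_size(3), len(sequences))
```

For three refactorings both counts agree with the 16 rows of Table 2; the closed form works because the number of sequences is the sum of n!/k! over k, which is the integer part of e·n!.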
  • 76. Objective Function
    Refactoring objective (excerpt from R. Morales et al.):
    \[ Q(SR) = \sum_{k \in K} Q(sr_k), \quad \text{with } Q(sr_k) = AC(k') - AC(k) \qquad (1) \]
    Here AC is the anti-patterns count, k' is the class after refactoring and k is the class before refactoring.
    In Eq. (1), SR is a subset of R; R is the set of refactorings that can be applied in a system SYS; K is the set of classes in SYS; $sr_k$ is the subset of SR that modifies class k ($k \in K$). Each sub-function $Q(sr_k)$ is computed by subtracting the number of occurrences of anti-patterns in class k before refactoring (i.e., AC(k)) from the number of occurrences after applying $sr_k$ to k (i.e., AC(k')). The number of occurrences of anti-patterns is used as a proxy of design quality. The outcome of Q(SR) is negative when applying SR removes anti-patterns, zero if the number of anti-patterns remains the same, and positive otherwise.

    The structure is well known in optimization (excerpt on pseudo-Boolean optimization):
    The method for identifying improving moves in a Hamming ball can be applied to all k-bounded pseudo-Boolean optimization problems. This makes the method quite general: every compressible pseudo-Boolean optimization problem can be transformed into a quadratic pseudo-Boolean optimization problem with k = 2. The family of k-bounded pseudo-Boolean optimization problems has also been described as an embedded landscape. An embedded landscape [3] with bounded epistasis k is defined as a function f(x) that can be written as the sum of m subfunctions, each one depending on at most k input variables. That is:
    \[ f(x) = \sum_{i=1}^{m} f^{(i)}(x), \]
    where the subfunctions $f^{(i)}$ depend only on k components of x. Embedded landscapes generalize NK-landscapes and the MAX-kSAT problem. The number of subfunctions is assumed to be linear in n, that is, $m \in O(n)$; for NK-landscapes m = n, and $m \in O(n)$ is a common assumption in MAX-kSAT.
    Let $w_l$ be the mask vector whose i-th element is 1 if $f^{(l)}$ depends on variable $x_i$; since f has bounded epistasis k, the number of ones in $w_l$, $|w_l|$, is at most k. If none of the variables changed by a move v affects $f^{(l)}$, the score of this subfunction is zero; otherwise only the changed variables that affect $f^{(l)}$, characterized by the mask vector $v \wedge w_l$, need to be considered:
    \[ S_v^{(l)}(x) = \begin{cases} 0 & \text{if } w_l \wedge v = 0 \\ S_{v \wedge w_l}^{(l)}(x) & \text{otherwise} \end{cases} \]
    Example: $f = f^{(1)}(x) + f^{(2)}(x) + f^{(3)}(x) + f^{(4)}(x)$ over the variables x1, x2, x3, x4.
    [Figure: Variable Interaction Graph over x1, x2, x3, x4.]
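Eq. (1) is a sum of per-class differences in anti-pattern counts, which is straightforward to evaluate once the counts are known. In the Python sketch below the counts are invented purely for illustration (they are not results from the paper), and `quality` is a hypothetical helper name.

```python
def quality(ac_before, ac_after):
    """Q(SR) = sum over classes k of AC(k') - AC(k).

    ac_before[k] is the anti-pattern count of class k before refactoring,
    ac_after[k] the count after applying the refactorings that modify k.
    """
    return sum(ac_after[k] - ac_before[k] for k in ac_before)


# Hypothetical counts for three classes (illustrative numbers only).
before = {"Geometry": 2, "Rectangle": 1, "Shape": 0}
after = {"Geometry": 1, "Rectangle": 0, "Shape": 0}

print(quality(before, after))
```

A negative value means anti-patterns were removed, so the search minimizes Q(SR).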
  • 77. Objective Function
    [Figure: variable interaction graph over x1, x2, x3, x4, x5, x6.]
    If the variable interaction graph has several connected components, we can optimize each of them independently.
  • 78. Dependency Graph (GB)
    [Figure: dependency graph over r1, r2, r3, r4, r5, r6.]
    Two refactoring operations are adjacent in GB when both touch the same class.
    We can optimize each connected component of GB independently, exploring all the possible sequences in the component.
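The decomposition described on this slide can be sketched as follows: build GB from the classes each refactoring touches, then split it into connected components. This is a minimal Python sketch; the mapping from refactorings to classes (`touches`, with class names A to E) is hypothetical and does not correspond to the figure or to the paper's example.

```python
def components(touches):
    """Connected components of GB, where two refactorings are adjacent
    when the sets of classes they touch intersect."""
    ops = list(touches)
    adj = {r: {s for s in ops if s != r and touches[r] & touches[s]} for r in ops}
    seen, comps = set(), []
    for r in ops:
        if r in seen:
            continue
        stack, comp = [r], []
        seen.add(r)
        while stack:  # depth-first traversal of one component
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(sorted(comp))
    return comps


# Hypothetical refactorings and the classes each one modifies.
touches = {
    "r1": {"A"},
    "r2": {"A", "B"},
    "r3": {"C"},
    "r4": {"B"},
    "r5": {"D"},
    "r6": {"D", "E"},
}
print(components(touches))
```

Each component can then be searched on its own, so the number of sequences to explore depends on the largest component instead of on the total number of refactorings.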
  • 79. Dependency Graph (GB): example. What is the dependency graph in our example? Task: find the dependency graph for r1, r2 and r3.
    Excerpt from R. Morales et al.:
    [...] kinds of conflicts: sequential dependency conflicts and mutual exclusion conflicts. We elaborate more on these two kinds of conflicts in the following.
    • Given two refactorings ri and rj, ri has a sequential dependency conflict with rj iff rj cannot be applied before ri. We represent sequential dependency conflicts as follows: r1 → r2, which means that r1 can be followed by r2, but r2 cannot be followed by r1. Note that conflicts are directional, i.e., the fact that applying rj disables ri does not necessarily mean that ri disables rj.
    • Given two refactorings ri and rj, ri has a mutual exclusion conflict with rj iff ri and rj cannot be applied together in any order. We represent mutual exclusion with the following notation: r1 ¬↔ r2.
    [...] belongs to class B instead, if A is a subclass of B.
    To better illustrate the refactoring scheduling problem, and the effect that the consideration of dependencies and conflicts between refactorings has on the size of the search space, we present an example of the problem in Listing 1. The refactorings presented in Table 1 can be applied to refactor the classes described in Listing 1. Table 1 contains three types of refactorings from Fowler (1999b) that we describe below:
    1. Move method. Move a method from one class to another (e.g., to one of its parameter types (Seng et al., 2006)).
    2. Inline Class. If a class contains few responsibilities, move all its features to another class and remove it.
    [Listing 1. Example of classes to be refactored.]
    [...] reduce even more the search space by removing these permutations as they lead to the same design (same solution). This occurs because they affect different code segments (the method and target class are different for r1 and r3), i.e., they are unrelated. In addition, when a conflict exists between refactorings, it is possible to reduce the size of the search space further. For example, consider the sequential dependency conflict between r1, r2, that is r2 cannot [...]
    [...] code-analyses with typically 100% precision and recall for associations and a high precision and recall for aggregations. Composition relationships cannot be entirely identified statically because they involve the lifetime of the instances of the classes involved in such relationships. Hence, idiom-level models include association and aggregation relationships and only the few composition relationships that can be identified with high precision and recall statically. A design-level model contains information about occurrences of design motifs, code smells, and anti-patterns. A code meta-model should provide methods to manipulate the design model and generate other models. The objective of this step is to manipulate the design model of a system programmatically. Hence, the code meta-model is used to detect anti-patterns, apply refactoring sequences and evaluate their impact on the design quality of a system. More information related to code meta-models, design motifs and micro-architecture identification can be found in Guéhéneuc and Albin-Amiot (2004) and Guéhéneuc and Antoniol (2008).
    3.2. Step 2: detect anti-patterns. In this step we detect anti-patterns in the meta-model using any [...]
    Table 1. List of refactoring candidates for the example from Listing 1.
      ID   Type                         Source class   Method                     Target class
      r1   Move method                  Geometry       calcAreaRectangle          Rectangle
      r2   Inline Class                 Rectangle      All fields and methods     Shape
      r3   Introduce Parameter Object   Geometry       longParameterListMethod    GeometryParamObj (new)
    Table 2. Enumeration of possible refactoring sequences for the set of refactoring operations {r1, r2, r3}.
      Sequence   Elements        Sequence   Elements
      1          None            9          r3, r1
      2          r1              10         r3, r2
      3          r2              11         r1, r2, r3
      4          r3              12         r1, r3, r2
      5          r1, r2          13         r2, r1, r3
      6          r1, r3          14         r2, r3, r1
      7          r2, r1          15         r3, r2, r1
      8          r2, r3          16         r3, r1, r2
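    The 16 sequences listed in Table 2 are all ordered selections of zero to three refactorings from {r1, r2, r3}. The following sketch reproduces that enumeration and the count, the sum over k of n!/(n-k)!. The function names are illustrative only.

```python
from itertools import permutations
from math import factorial

def enumerate_sequences(refactorings):
    """All ordered sequences that use each refactoring at most once."""
    sequences = []
    for k in range(len(refactorings) + 1):
        sequences.extend(permutations(refactorings, k))
    return sequences

def count_sequences(n):
    """Number of such sequences for n refactorings: sum of n!/(n-k)!."""
    return sum(factorial(n) // factorial(n - k) for k in range(n + 1))

sequences = enumerate_sequences(["r1", "r2", "r3"])
print(len(sequences))        # 16, as in Table 2
print(count_sequences(10))   # grows quickly with the component size
```

    The rapid growth of this count is the reason to shrink the search space with the dependency and conflict graphs before enumerating.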
  • 81. Conflict Graph (GC). [Figure: example conflict graph over the refactorings r1 to r6, with two kinds of edges: sequential dependency conflict and mutual exclusion conflict.] The conflict graph is used to reduce the number of sequences to explore in each component.
  • 82. Conflict Graph (GC): example. What is the conflict graph in our example? Task: find the conflict graph for r1, r2 and r3.
    Excerpt from R. Morales et al.:
    [...] kinds of conflicts: sequential dependency conflicts and mutual exclusion conflicts. We elaborate more on these two kinds of conflicts in the following.
    • Given two refactorings ri and rj, ri has a sequential dependency conflict with rj iff rj cannot be applied before ri. We represent sequential dependency conflicts as follows: r1 → r2, which means that r1 can be followed by r2, but r2 cannot be followed by r1. Note that conflicts are directional, i.e., the fact that applying rj disables ri does not necessarily mean that ri disables rj.
    • Given two refactorings ri and rj, ri has a mutual exclusion conflict with rj iff ri and rj cannot be applied together in any order. We represent mutual exclusion with the following notation: r1 ¬↔ r2.
    [...] belongs to class B instead, if A is a subclass of B.
    To better illustrate the refactoring scheduling problem, and the effect that the consideration of dependencies and conflicts between refactorings has on the size of the search space, we present an example of the problem in Listing 1. The refactorings presented in Table 1 can be applied to refactor the classes described in Listing 1. Table 1 contains three types of refactorings from Fowler (1999b) that we describe below:
    1. Move method. Move a method from one class to another (e.g., to one of its parameter types (Seng et al., 2006)).
    2. Inline Class. If a class contains few responsibilities, move all its features to another class and remove it.
    [Listing 1. Example of classes to be refactored.]
    [...] reduce even more the search space by removing these permutations as they lead to the same design (same solution). This occurs because they affect different code segments (the method and target class are different for r1 and r3), i.e., they are unrelated. In addition, when a conflict exists between refactorings, it is possible to reduce the size of the search space further. For example, consider the sequential dependency conflict between r1, r2, that is r2 cannot [...]
    [...] code-analyses with typically 100% precision and recall for associations and a high precision and recall for aggregations. Composition relationships cannot be entirely identified statically because they involve the lifetime of the instances of the classes involved in such relationships. Hence, idiom-level models include association and aggregation relationships and only the few composition relationships that can be identified with high precision and recall statically. A design-level model contains information about occurrences of design motifs, code smells, and anti-patterns. A code meta-model should provide methods to manipulate the design model and generate other models. The objective of this step is to manipulate the design model of a system programmatically. Hence, the code meta-model is used to detect anti-patterns, apply refactoring sequences and evaluate their impact on the design quality of a system. More information related to code meta-models, design motifs and micro-architecture identification can be found in Guéhéneuc and Albin-Amiot (2004) and Guéhéneuc and Antoniol (2008).
    3.2. Step 2: detect anti-patterns. In this step we detect anti-patterns in the meta-model using any [...]
    Table 1. List of refactoring candidates for the example from Listing 1.
      ID   Type                         Source class   Method                     Target class
      r1   Move method                  Geometry       calcAreaRectangle          Rectangle
      r2   Inline Class                 Rectangle      All fields and methods     Shape
      r3   Introduce Parameter Object   Geometry       longParameterListMethod    GeometryParamObj (new)
    Table 2. Enumeration of possible refactoring sequences for the set of refactoring operations {r1, r2, r3}.
      Sequence   Elements        Sequence   Elements
      1          None            9          r3, r1
      2          r1              10         r3, r2
      3          r2              11         r1, r2, r3
      4          r3              12         r1, r3, r2
      5          r1, r2          13         r2, r1, r3
      6          r1, r3          14         r2, r3, r1
      7          r2, r1          15         r3, r2, r1
      8          r2, r3          16         r3, r1, r2
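    A conflict graph prunes the enumeration of sequences as sketched below. The conflict set used here is an assumption made for illustration, not the stated answer to the task: a single sequential dependency conflict r1 → r2 (r2 may not be applied before r1) and no mutual exclusion conflicts.

```python
from itertools import permutations

def is_valid(sequence, sequential, exclusive):
    """Check a sequence against both kinds of conflicts.

    sequential: pairs (a, b) meaning a -> b, i.e. b may not appear before a.
    exclusive:  pairs (a, b) that may not appear together in any order.
    """
    position = {r: i for i, r in enumerate(sequence)}
    for a, b in sequential:
        if a in position and b in position and position[b] < position[a]:
            return False
    for a, b in exclusive:
        if a in position and b in position:
            return False
    return True

refactorings = ["r1", "r2", "r3"]
all_sequences = [p for k in range(len(refactorings) + 1)
                 for p in permutations(refactorings, k)]

# Assumed conflict graph (hypothetical): r1 -> r2 only.
sequential = {("r1", "r2")}
exclusive = set()

valid = [s for s in all_sequences if is_valid(s, sequential, exclusive)]
print(len(all_sequences), len(valid))
```

    Under this assumption the four sequences that place r2 before r1 are discarded, leaving 12 of the 16 sequences of Table 2 to evaluate.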
  • 84. RePOR (Algorithm 1).
    Input: System to refactor (SYS), maximum number of refactoring operations in a connected component subgraph (threshold)
    Output: An optimal sequence of refactoring operations (SR)
    Require Proc: extractBestPermutation, getFirstValidSequenceFromccap
    Steps RePOR(SYS, threshold)
        AM = code meta-model generation(SYS)
        A = Detect Anti-patterns(AM)
        R = Generate set of refactoring candidates(AM, A)
        GB = Build Graph of dependencies between refactorings and anti-patterns(AM, R, A)
        CCAP = Find connected components(GB)
        GC = Build Graph of conflicts between refactorings(AM, LR)
        SR = Schedule sequence of refactorings(CCAP, GC, AM)
    Procedure Schedule sequence of refactorings(CCAP, GC, AM):
        SR = ∅
        for each ccap ∈ CCAP do
            ccap.RemoveInvalidRefactorings(SR)
            if ccap.size == 0 then
                continue
            else
                List permuts = enumeratePermutations(ccap)
                if permuts ≤ threshold then
                    SR.addAll(extractBestPermutation(AM, GC, permuts))
                else
                    SR.addAll(getFirstValidSequenceFromccap(AM, GC, ccap, R))
                end if
            end if
        end for
        return SR
    end
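    The scheduling procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the RePOR implementation: the quality function `score` and the `gains` table are hypothetical, only sequential dependency conflicts are modelled, the threshold is compared against the component size, and the RemoveInvalidRefactorings step is omitted.

```python
from itertools import permutations

def is_valid(sequence, sequential):
    """(a, b) in `sequential` means a -> b: b may not appear before a."""
    position = {r: i for i, r in enumerate(sequence)}
    return all(not (a in position and b in position
                    and position[b] < position[a])
               for a, b in sequential)

def first_valid_sequence(component, sequential):
    """Greedy fallback: append each refactoring if the result stays valid."""
    sequence = []
    for refactoring in component:
        if is_valid(sequence + [refactoring], sequential):
            sequence.append(refactoring)
    return sequence

def schedule(components, sequential, score, threshold=10):
    """Build one sequence by handling each connected component in turn."""
    result = []
    for component in components:
        component = sorted(component)
        if not component:
            continue
        if len(component) <= threshold:
            # Small component: explore every valid ordered selection.
            candidates = [p for k in range(len(component) + 1)
                          for p in permutations(component, k)
                          if is_valid(p, sequential)]
            result.extend(max(candidates, key=score))
        else:
            # Large component: settle for the first valid sequence.
            result.extend(first_valid_sequence(component, sequential))
    return result

# Hypothetical data: gains stand in for the anti-patterns removed.
gains = {"r1": 1, "r2": 2, "r3": 1}
score = lambda sequence: sum(gains[r] for r in sequence)
plan = schedule([{"r1", "r2"}, {"r3"}], {("r1", "r2")}, score)
print(plan)
```

    Treating components independently keeps the exhaustive part small: the cost depends on the size of the largest component explored, not on the total number of refactoring candidates.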
  • 85. Experimental Setup.
    Subjects: see Table 3.
    Tools:
    • PADL to create a high-level model of the software
    • DECOR to detect and correct anti-patterns on the model
    Excerpt from R. Morales et al.:
    In Table 4 we describe the types of anti-patterns studied and the refactoring strategies used to remove them. Table 5 shows the number of refactoring candidates that were automatically found in each system.
    4.3. RePOR implementation. We instantiate RePOR as an Eclipse plug-in and compared it with three refactoring approaches. Design improvement (DI) is measured using Eq. (3). To determine the value of the parameter threshold described in Section 3.7, we performed 30 independent executions for each of the systems studied on a Windows 10 64-bit machine with an Intel Core 5 at 2.30 GHz and 12 GB of memory, recorded the size of ccap for which the performance of RePOR is acceptable, and found threshold = 10 to be the best trade-off. This value of threshold means that, in our experiments, we only exhaustively explore the permutations of a ccap containing 10 or fewer refactoring operations, and evaluate the resulting permutations only after removing any conflicting refactoring operation. The directed graph of conflicts (GC) is used for the three metaheuristics to avoid scheduling invalid refactorings. Due to the random nature of the metaheuristics studied (i.e., ACO, GA, and SWAY), it is necessary to perform several independent runs to get an idea of the behavior of the algorithms. Hence, we execute 30 independent runs for all the approaches studied and for each system. This is a typical minimum value (i.e., 30 runs) used in the search-based research community [...]
    [Listing 2. Rule card of the Blob anti-pattern from DECOR.]
    Table 3. Descriptive statistics about the studied systems.
      System                 NOC    KLOC   BL    LC   LP    SC   SG   Total
      Apache Ant 1.8.2       697    191    57    40   35    3    6    141
      ArgoUML 0.34           1754   183    131   25   281   1    19   457
      GanttProject 1.10.2    188    44     47    4    68    5    6    130
      JfreeChart 1.0.19      505    98     41    21   62    1    1    126
      Xerces 2.7             540    71     56    25   119   2    3    205
    Table 4. List of studied anti-patterns and the refactorings used to correct them.
    • Blob (BL) (Brown et al., 1998). Description: a large class that absorbs most of the functionality of the system, with very low cohesion between its constituents. Refactoring strategy: Move method (MM). Move the methods that do not seem to fit in the Blob class abstraction to more appropriate classes (Seng et al., 2006).
    • Lazy Class (LC) (Fowler, 1999a). Description: small classes with low complexity that do not justify their existence in the system. Refactoring strategy: Inline class (IC). Move the attributes and methods of the LC to another class in the system.
    • Long Parameter List (LP) (Fowler, 1999a). Description: a class with one or more methods having a long list of parameters, especially when two or more methods share a long list of parameters that are semantically connected. Refactoring strategy: Introduce parameter object (IPO). Extract a new class with the long list of parameters and replace the method signature by a reference to the new object created. Then access these parameters through the parameter object.
    • Spaghetti Code (SC) (Brown et al., 1998). Description: a class without structure that declares long methods without parameters. Refactoring strategy: Replace method with method object (RMWO). Extract long methods into new classes so that all local variables become fields on that object.
    • Speculative Generality (SG). Description: there is an abstract class created to anticipate further features, but it [...] Refactoring strategy: Collapse hierarchy (CH). Move the attributes and methods of the ch[...]
    Table 5. Number of refactoring candidates automatically generated for each studied system.
      System          CH   IC   IPO   MM     RMWO   Total
      Ant             6    9    35    4269   3      4322
      ArgoUML         19   25   281   2475   1      2800
      Gantt Project   6    4    68    3861   5      3944
      JfreeChart      1    21   62    4228   1      4313
      Xerces          3    25   119   4118   2      4267
  • 86. Experimental Setup.
    Performance measures:
    • Design Improvement (DI)
    • Execution time (ET): runtime of the algorithms
    • Refactoring Effort (RE): number of refactoring operations in the sequence
    Algorithms:
    • RePOR
    • Conflict-aware scheduling of refactoring heuristic by Liu et al. (2008) (LIU)
    • Ant Colony Optimization (ACO)
    • Genetic Algorithm (GA)
    • SWAY metaheuristic by Chen et al. (2018)
    Excerpt from R. Morales et al.:
    [...] For all statistical tests, we consider a significance level of [...]
    For RQ1, we measure the effectiveness of RePOR at removing anti-patterns in software systems using the following dependent variable:
    • Design Improvement (DI). DI represents the delta of anti-pattern occurrences between the refactored system (SYS′) and the original system (SYS), and it is computed using the following formulation:
      \[ DI(SYS) = \frac{AC(SYS') - AC(SYS)}{AC(SYS)} \times 100 \qquad (3) \]
      where AC(SYS) is the number of anti-patterns in a system SYS and AC(SYS) ≥ 0. DI, which is a positive real number, represents the improvement amount in percentage, and high positive values are desired. Note that Eq. (3) assumes that AC(SYS′) − AC(SYS) < 0, as RePOR filters out solutions that make the design worse according to the desiredEffect threshold (cf. Algorithm 4).
    The independent variable is the refactoring approach applied to each studied system. We statistically compare the number of remaining anti-patterns after refactoring a system using RePOR with the number of remaining anti-patterns when using other refactoring approaches. Specifically, we test the following hypothesis H01: There is no difference between the number of remaining anti-patterns of a system refactored using RePOR and a system refactored using other refactoring approaches. We test the hypothesis using a non-parametric test, i.e., the Mann–Whitney U test (Hollander et al., 201[...]). For estimating the magnitude of the differences of means between [...]
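    A small worked example of the DI measure follows. DI is reported as a positive percentage, so this sketch computes the share of anti-patterns removed, (AC(SYS) − AC(SYS′)) / AC(SYS) × 100; that sign convention is an assumption made here. The count of 56 remaining anti-patterns is hypothetical, while 141 is the total number of anti-patterns reported for Apache Ant.

```python
def design_improvement(ac_before, ac_after):
    """Percentage of anti-patterns removed by a refactoring sequence.

    ac_before: anti-patterns in the original system, AC(SYS).
    ac_after:  anti-patterns in the refactored system, AC(SYS').
    Sign convention (an assumption): positive when anti-patterns decrease.
    """
    if ac_before <= 0:
        raise ValueError("the original system must contain anti-patterns")
    return (ac_before - ac_after) / ac_before * 100

# 141 anti-patterns before refactoring; 56 remaining is a hypothetical value.
di = design_improvement(141, 56)
print(round(di, 2))
```

    Removing every anti-pattern gives DI = 100, and a sequence that removes none gives DI = 0.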
  • 87. Results. RQ1: To what extent can RePOR remove anti-patterns?
    Excerpt from R. Morales et al.:
    We present in Table 7 the Design improvement (DI) in general and [...] the rest of the systems. We reject the null hypothesis H01 for Ant, ArgoUML, Gantt, JfreeChart, and Xerces. In these five systems, the number of remaining anti-patterns after refactoring using RePOR is significantly lower than the number of anti-patterns remaining in the systems after refactoring using the other refactoring approaches (i.e., ACO, [...]
    Table 7. Design Improvement (%) in general and for different anti-pattern types.
      System          Metaheuristic   DI      DI_BL   DI_LC   DI_LP   DI_SC   DI_SG
      Ant             ACO             57.45   68.42   22.5    74.29   66.67   100
      Ant             GA              58.16   68.42   22.5    74.29   66.67   100
      Ant             LIU             58.87   54.39   22.5    100     66.67   100
      Ant             RePOR           60.28   57.89   22.5    100     66.67   100
      Ant             SWAY            45.36   57.89   20      60      66.67   83.33
      ArgoUML         ACO             75.93   51.15   100     83.63   100     100
      ArgoUML         GA              76.59   51.15   100     84.7    100     100
      ArgoUML         LIU             81.40   50.38   100     92.88   100     100
      ArgoUML         RePOR           81.62   38.93   100     98.58   100     100
      ArgoUML         SWAY            62.91   48.09   84      66.01   100     86.84
      Gantt Project   ACO             60      17.02   100     83.82   70      100
      Gantt Project   GA              60.77   14.89   100     85.29   80      100
      Gantt Project   LIU             63.85   14.89   100     92.65   60      100
      Gantt Project   RePOR           66.15   8.51    75      100     100     100
      Gantt Project   SWAY            50      8.51    100     70.59   60      100
      JfreeChart      ACO             75.4    39.02   100     89.52   100     100
      JfreeChart      GA              75.4    39.02   100     90.32   100     100
      JfreeChart      LIU             72.22   31.71   100     88.71   100     100
      JfreeChart      RePOR           75.4    24.39   100     100     100     100
      JfreeChart      SWAY            61.90   36.59   90.48   73.39   100     100
      Xerces          ACO             56.59   14.29   100     65.55   100     100
      Xerces          GA              57.56   14.29   100     67.23   100     100
      Xerces          LIU             64.39   16.07   100     78.99   50      100
      Xerces          RePOR           73.17   5.36    100     98.32   100     100
      Xerces          SWAY            41.87   14.29   68.00   49.58   50      100
    Table 8. Pair-wise Mann–Whitney U test for design improvement.
      System          Pair         p-value        Cliff's δ     Magnitude
      Ant             ACO-RePOR    2.561349e−12   1             Large
      Ant             GA-RePOR     1.431438e−11   1             Large
      Ant             LIU-RePOR    1.685298e−14   1             Large
      Ant             SWAY-RePOR   1.190193e−12   1             Large
      ArgoUML         ACO-RePOR    1.176641e−12   1             Large
      ArgoUML         GA-RePOR     1.143381e−12   1             Large
      ArgoUML         LIU-RePOR    1.685298e−14   1             Large
      ArgoUML         SWAY-RePOR   1.206843e−12   1             Large
      Gantt Project   ACO-RePOR    1.036681e−12   1             Large
      Gantt Project   GA-RePOR     1.086586e−12   1             Large
      Gantt Project   LIU-RePOR    1.685298e−14   1             Large
      Gantt Project   SWAY-RePOR   1.165138e−12   1             Large
      JfreeChart      ACO-RePOR    0.06868602     0.2333333     Small
      JfreeChart      GA-RePOR     0.2771456      −0.1333333    Negligible
      JfreeChart      LIU-RePOR    1.685298e−14   1             Large
      JfreeChart      SWAY-RePOR   1.183399e−12   1             Large
      Xerces          ACO-RePOR    1.0618e−12     1             Large
      Xerces          GA-RePOR     9.946555e−13   1             Large
      Xerces          LIU-RePOR    1.685298e−14   1             Large
      Xerces          SWAY-RePOR   1.193116e−12   1             Large
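    The effect size column of Table 8 can be illustrated with a plain implementation of Cliff's δ. The magnitude thresholds below (0.147, 0.33 and 0.474) are commonly used cut-offs and are an assumption here; they are consistent with the labels shown in Table 8, but the excerpt does not state them. The two samples are hypothetical.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (x, y)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))

def magnitude(delta):
    """Label an effect size using commonly used cut-offs (assumed here)."""
    d = abs(delta)
    if d < 0.147:
        return "Negligible"
    if d < 0.33:
        return "Small"
    if d < 0.474:
        return "Medium"
    return "Large"

# Hypothetical samples: remaining anti-patterns over a few runs of two approaches.
other_approach = [60, 61, 59, 62]
reference = [56, 56, 56, 56]
delta = cliffs_delta(other_approach, reference)
print(delta, magnitude(delta))
```

    A value of δ = 1 means that every observation of the first sample is larger than every observation of the second, which is the situation in most rows of Table 8.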