The migration and reengineering of existing variants into a software product line (SPL) is an error-prone and time-consuming activity. Many extractive approaches have been proposed, spanning different activities from feature identification and naming to the synthesis of reusable artefacts. In this paper, we explore how large language model (LLM)-based assistants can support domain analysts and developers. We revisit four illustrative cases from the literature where the challenge is to migrate variants written in different formalisms (UML class diagrams, Java, GraphML, statecharts). We systematically report on our experience with ChatGPT-4, describing our strategy to prompt LLMs and documenting positive aspects as well as failures. We compare the use of LLMs with a state-of-the-art approach, BUT4Reuse. While LLMs offer potential in assisting domain analysts and developers in transitioning software variants into SPLs, their intrinsic stochastic nature and restricted ability to manage large variants or complex structures necessitate a semi-automatic approach, complete with careful review, to counteract inaccuracies.
3. Variants of code (e.g., Java or C)
Variants of user interfaces
Variants of video sequences
Variants of models (e.g., UML or SysML)
Variants of « things » (3D models)
…
Variability models (feature models)
Reverse engineering variability and reusable assets
e4CompareFramework
ECCO tool
4. Problem: Given a set of variants, how to synthesize a program that can be configured to retrieve original variants?
5. Problem: Given a set of variants, how to synthesize a program that can be configured to retrieve original variants?
SPL = feature model + annotated program (e.g., a template-based generator)
Expected properties: soundness/completeness/meaningful set of features (for configuration or maintenance/evolution/expansion)
Reverse engineering variability and reusable assets is error-prone and time-consuming
6. Problem: Given a set of variants, how to synthesize a program that can be configured to retrieve original variants?
SPL = feature model + annotated program (e.g., a template-based generator)
Expected properties: soundness/completeness/meaningful set of features (for configuration or maintenance/evolution/expansion)
Hypothesis: large language models (LLMs) can assist domain experts and developers in some re-engineering activities
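The "SPL = feature model + annotated program" idea above can be sketched as a toy template-based generator: a 150% artefact whose lines are guarded by features, plus a configuration that selects which lines to keep. The stack example and the feature names (POP, PEEK) are illustrative assumptions, not taken from the talk.

```python
# Minimal sketch of an SPL as "feature model + annotated program".
# Each template line is (guarding feature or None, text); a None
# guard marks code common to all variants.
TEMPLATE = [
    (None,   "class Stack:"),
    (None,   "    def __init__(self): self.items = []"),
    (None,   "    def push(self, x): self.items.append(x)"),
    ("POP",  "    def pop(self): return self.items.pop()"),
    ("PEEK", "    def peek(self): return self.items[-1]"),
]

def derive(config):
    """Derive one variant: keep common lines and lines whose feature is on."""
    return "\n".join(text for feat, text in TEMPLATE
                     if feat is None or config.get(feat, False))

# Completeness check: each original variant must be reproducible
# from some configuration of the feature model.
variant_basic = derive({"POP": True})
variant_full  = derive({"POP": True, "PEEK": True})
assert "peek" not in variant_basic
assert "peek" in variant_full
```

The extractive-SPL challenge discussed in these slides is the inverse direction: starting from `variant_basic` and `variant_full`, recover the annotated template and a meaningful set of features.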
7. Research method
LLM = ChatGPT-4 (no API)
Session: a sequence of prompts until getting a result
Within a session, we, as users, asked for different tasks such as:
● domain analysis, summary of commonalities and differences, as plain language or as tables
● identification of features
● synthesis of an integrated, 150% model or code (and possibly a visualization)
● synthesis of a template-based generator
● refactoring of models or code, corrections of code, models, or explanations that were perceived as inaccurate or simply unusable
Highly interactive
Repetition of some sessions (with some prompt variations)
Qualitative assessment
8. Research method: 5 cases with ChatGPT-4
We consider five cases:
● Java variants
● UML variants
● statechart variants
● state machine variants
● PNG image variants
We revisit some of the cases considered by https://github.com/but4reuse/but4reuse/wiki/Examples, this time with generative AI
Qualitative assessment
19. Case #5: PNG VARIANTS
With current LLMs, we confirm (1) the impossibility of analyzing PNG binary files and computing a relevant diff – there are no magical solutions involved; (2) providing pixel locations to the LLM does not scale: there is simply too much information to feed in; (3) the LLM is not helpful for finding meaningful features: the information is too low-level and there are no semantics to infer.
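Point (2) above can be made concrete with a back-of-envelope estimate: even a small image, serialized as per-pixel values, rivals or exceeds typical LLM context windows. The numbers below (a 256×256 RGBA image, roughly 4 characters per serialized value, roughly 4 characters per token) are rough assumptions, not measurements from the talk.

```python
# Back-of-envelope: token volume needed to describe one small image
# pixel by pixel. All rates are crude assumptions for illustration.
width, height, channels = 256, 256, 4      # small RGBA image
values = width * height * channels         # raw channel values to describe
chars = values * 4                         # e.g. "255," per value
approx_tokens = chars // 4                 # rough characters-per-token rate

print(values, approx_tokens)               # 262144 values, ~262144 tokens
```

At roughly a quarter-million tokens for a 256×256 image, a single variant already saturates even large context windows, before any second variant is supplied for diffing.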
20. Discussion
Limitations:
● size of input variants
○ limited context window and long-term memory
○ impossible to feed a large XML file or an entire project (e.g., the ArgoUML case)
● inaccuracies
○ see, e.g., Case 2
● the similarity function can work “out of the box” (language agnostic)…
● …but we know that a similarity function is usually highly specific to a programming/modelling language
Threats: prompts used, LLM (= ChatGPT-4)
Challenge: a re-engineering benchmark for generative AI/LLMs
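The similarity trade-off in the last two bullets can be illustrated with a minimal sketch: a purely line-based Jaccard similarity is language agnostic and works "out of the box", but it ignores language structure, so semantics-preserving edits (renaming, reordering) can sharply lower the score. The code snippets compared below are illustrative inputs, not artefacts from the study.

```python
# Sketch of a language-agnostic similarity function: Jaccard overlap
# of the stripped, non-empty lines of two artefacts.
def jaccard_similarity(variant_a: str, variant_b: str) -> float:
    lines_a = {l.strip() for l in variant_a.splitlines() if l.strip()}
    lines_b = {l.strip() for l in variant_b.splitlines() if l.strip()}
    if not (lines_a or lines_b):
        return 1.0
    return len(lines_a & lines_b) / len(lines_a | lines_b)

v1 = "int pop() { return data[--top]; }\nint size() { return top; }"
v2 = "int pop() { return data[--top]; }\nint peek() { return data[top-1]; }"
print(round(jaccard_similarity(v1, v2), 2))  # 0.33
```

A language-specific similarity (e.g., over parsed methods or model elements rather than raw lines) would recognize far more structure, which is why mature frameworks like BUT4Reuse rely on per-language adapters.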
21. Problem: Given a set of variants, how to synthesize a program that can be configured to retrieve original variants?
Positive and negative experiences
Potential: automation of template synthesis, feature identification/naming, refactoring, etc.
Strong limitations: prompt sensitivity, size of the input, inaccuracies
Instead of thinking of ChatGPT as a replacement for BUT4Reuse or similar frameworks, the integration of both could be a more beneficial solution for the moment.
Hypothesis: large language models (LLMs) can assist domain experts and developers in some re-engineering activities
27. Generative programming [Czarnecki2000], model-driven engineering: automatically generate variants from a specification written in one or more textual or graphical domain-specific languages
Accidental and essential complexity
Variability further increases this software complexity: multiple features, code variations, and an exponential number of possible variants
28. LLM
Hypothesis: large language models (LLMs) act as a new variability compiler capable of transforming a high-level specification (“prompt”) into variable code, features, generators, configurable systems, etc., written in a given technological space.
Motto: “features as prompts”