AIPyCraft
AI-Assisted Software Development Lifecycle for 6G Blockchain Oracle
Validation
Antonio M. Alberti - School of Computer Science, Univ. of Leeds, UK
Alexis Leal, Ariel Dalla Costa, Cristiano Bonato Both - Unisinos, Brazil
Outline
• Motivations
• Contextualization
• Contributions
• Architecture and Workflow
• Evaluation
• Lessons Learned
• Conclusion and Future Work
Antonio Alberti 2025 AIPyCraft 2 / 16
Motivations
• ChatGPT 3.5 sparked public interest in LLMs for code generation.
• However, manual prompting, copying, and testing is tedious and unscalable.
• Need for autonomous agents to generate, test, and refine code incrementally.
• AIPyCraft proposes an LLM-powered tool for modular, incremental development.
• Research question:
◦ Can a tool like this actually help in the development of code components for 6G?
Antonio Alberti 2025 AIPyCraft 3 / 16
Disrutive 6G (D6G) Architecture to Integrate On-chain and Off-chain 6G
Services
Operator 1
Node
Operator N
Node
Operator 3
Node
Operator 2
Node
DLT (On-Chain)
Off-Chain
Oracle
Jobs
External
Data
External
Data
External
Data
External
Data
External
Data
External
Data
Figure: On the left, public DLT nodes register smart contracts that run code in an immutable way
when monetized. On the center, an off-chain oracle receives data requests (“jobs”) from the DLT
ledger and dispatches them to various external off-chain data sources in the right — including
network slicing as a service. The oracle then returns verified responses back to the DLT.
Antonio Alberti 2025 AIPyCraft 4 / 16
Contributions
• Open-source development automation: uses LLM APIs to generate code, manage
virtual environments, iteratively correct errors, and integrate components.
• Innovative application on 6G component design: explores the role of LLMs on the
development of next generation virtual network functions.
• New evaluation tools and methodology: assesses whether LLMs can indeed help on
6G software components development.
Antonio Alberti 2025 AIPyCraft 5 / 16
AIPyCraft Architecture
AIConnector
GPT-o3-
mini
Gemini
2.5
LLM APIs
CLI
Queries and
responses
from LLM API
AICodeParser
InstallationScriptGenerator
SolutionCreator SolutionLoader
SolutionRunner
SolutionDisplay
SolutionCorrecting,
ComponentCorrector
SolutionImporter
SolutionUpdater
SolutionFeatureAdding
Dispatcher
Operating system
call for main.py
and feedback
Second method
of correcting a solution.
Install Python
virtual
environment in a
solution.
Get user prompts
and show results.
User asks for a new
feature in a solution.
First method of
correcting a solution
or component.
Show the solution code
in the screen.
Creates a new solution
and its components.
...
Solutions
Folder
AIPyCraft
Load a solution from
file.
Import an existing
folder as a solution
Component
Solution
Component
Component
...
Component
Solution
Component
Component
...
Antonio Alberti 2025 AIPyCraft 6 / 16
AIPyCraft Workflow
Start
SolutionCreator
No Approve?
Create components
Determine required virtual
environment
InstallationScriptGenerator
SolutionRunner
No
Yes, user explains
the problem
Yes
Problem?
SolutionCorrecting,
SolutionUpdate or
ComponentCorrector
Solution
Feature
Adding
End
No
Yes
New
feature?
• User inputs solution/feature
description.
• LLM generates components
incrementally.
• Virtual environments created per
solution.
• Code run and errors corrected
iteratively.
• Human-in-the-loop for validation and
feature additions.
Antonio Alberti 2025 AIPyCraft 7 / 16
Proof-of-Concept: TOML Script Correction
• Tested with Google Gemini 2.5 Pro on Chainlink off-chain oracle jobs.
• Automated multi-trial testing with syntax and semantic scenarios.
• Two TOML Script Tester were developed to test:
◦ Scenario 1: TOML syntax evaluation using Tomli Python package.
◦ Scenario 2: TOML semantic evaluation using a real Chainlink node.
Antonio Alberti 2025 AIPyCraft 8 / 16
Proof-of-Concept Evaluation
ComponentCorrector
CLI
run_tester_multiple
Run
experiment
tester.py
20 trials
AIConnector
SolutionLoader
SolutionRunner
SolutionUpdater
AIPyCraft
Solution
config.toml
main.py
main.py
tomli
1
2
Scenario 1
Scenario 2
TOML Script Tester
1
2
Job
Chainlink
Node
LLM APIs
Up to 20 correction
tentatives
Figure: Automated testing setup with AIPyCraft, Gemini 2.5 Pro, and Chainlink node.
Antonio Alberti 2025 AIPyCraft 9 / 16
Proof-of-Concept Results: Scenario 1 - Syntax Correction
1
2
3
4
5
6
7
8
9
1
0
1
1
1
2
1
3
1
4
1
5
1
6
1
7
1
8
1
9
2
0
Trial Number
0.0
0.2
0.4
0.6
0.8
1.0
Iterations
to
Success
Failed Trials: 0
Mean Iterations (95% CI)
Figure: Number of interactions required to correct TOML job syntax using tomli.
Antonio Alberti 2025 AIPyCraft 10 / 16
Proof-of-Concept Results: Scenario 1 - Time required to correct
1
2
3
4
5
6
7
8
9
1
0
1
1
1
2
1
3
1
4
1
5
1
6
1
7
1
8
1
9
2
0
Trial Number
0
10
20
30
40
50
Total
Test
Duration
(s)
Failed Trials: 0
Mean Duration (95% CI)
Figure: Time required to correct TOML job syntax using tomli.
Antonio Alberti 2025 AIPyCraft 11 / 16
Proof-of-Concept Results: Scenario 2 - Semantic Correction
1
2
3
4
5
6
7
8
9
1
0
1
1
1
2
1
3
1
4
1
5
1
6
1
7
1
8
1
9
2
0
Trial Number
0.0
0.2
0.5
0.8
1.0
Iterations
to
Success
Failed Trials: 0
Mean Iterations (95% CI)
Figure: Number of interactions required to correct TOML job semantics on Chainlink node.
Antonio Alberti 2025 AIPyCraft 12 / 16
Proof-of-Concept Results: Scenario 2 - Time required to correct
1
2
3
4
5
6
7
8
9
1
0
1
1
1
2
1
3
1
4
1
5
1
6
1
7
1
8
1
9
2
0
Trial Number
0
50
100
150
200
250
Total
Test
Duration
(s)
Failed Trials: 0
Mean Duration (95% CI)
Figure: Time required to correct TOML job semantics on Chainlink node.
Antonio Alberti 2025 AIPyCraft 13 / 16
Lessons Learned
• Prompting LLMs to create modular solutions is challenging.
• Hallucinations and redundant code require iterative/interative corrections.
• To employ modular design with LLMs enforces reusability and debugging.
• Human validation remains critical in AI-assisted coding. At least for now... Who
knows?
• Virtual environments separation prevents dependency conflicts, which is difficult to be
solve by LLMs.
Antonio Alberti 2025 AIPyCraft 14 / 16
Conclusion
• AIPyCraft successfully integrated LLMs for modular 6G component development.
• Simple TOML jobs generated and corrected with acceptable performance.
• LLMs supports human-machine collaboration and incremental development.
• Future work: compare LLMs, more complex Jobs, other D6G components, security,
other languages.
• Emergence of an ’understand-by-building’ software engineering, in which strategy,
design, coding, debugging, and optimization are rapidly co-created with machines.
Antonio Alberti 2025 AIPyCraft 15 / 16
Thank you!
Thank you!
Antonio Alberti
A.M.Alberti@leeds.ac.uk
Antonio Alberti 2025 AIPyCraft 16 / 16

AIPyCraft: AI-Assisted Software Development Lifecycle for 6G Blockchain Oracle Validation

  • 1.
    AIPyCraft AI-Assisted Software DevelopmentLifecycle for 6G Blockchain Oracle Validation Antonio M. Alberti - School of Computer Science, Univ. of Leeds, UK Alexis Leal, Ariel Dalla Costa, Cristiano Bonato Both - Unisinos, Brazil
  • 2.
    Outline • Motivations • Contextualization •Contributions • Architecture and Workflow • Evaluation • Lessons Learned • Conclusion and Future Work Antonio Alberti 2025 AIPyCraft 2 / 16
  • 3.
    Motivations • ChatGPT 3.5sparked public interest in LLMs for code generation. • However, manual prompting, copying, and testing is tedious and unscalable. • Need for autonomous agents to generate, test, and refine code incrementally. • AIPyCraft proposes an LLM-powered tool for modular, incremental development. • Research question: ◦ Can a tool like this actually help in the development of code components for 6G? Antonio Alberti 2025 AIPyCraft 3 / 16
  • 4.
    Disrutive 6G (D6G)Architecture to Integrate On-chain and Off-chain 6G Services Operator 1 Node Operator N Node Operator 3 Node Operator 2 Node DLT (On-Chain) Off-Chain Oracle Jobs External Data External Data External Data External Data External Data External Data Figure: On the left, public DLT nodes register smart contracts that run code in an immutable way when monetized. On the center, an off-chain oracle receives data requests (“jobs”) from the DLT ledger and dispatches them to various external off-chain data sources in the right — including network slicing as a service. The oracle then returns verified responses back to the DLT. Antonio Alberti 2025 AIPyCraft 4 / 16
  • 5.
    Contributions • Open-source developmentautomation: uses LLM APIs to generate code, manage virtual environments, iteratively correct errors, and integrate components. • Innovative application on 6G component design: explores the role of LLMs on the development of next generation virtual network functions. • New evaluation tools and methodology: assesses whether LLMs can indeed help on 6G software components development. Antonio Alberti 2025 AIPyCraft 5 / 16
  • 6.
    AIPyCraft Architecture AIConnector GPT-o3- mini Gemini 2.5 LLM APIs CLI Queriesand responses from LLM API AICodeParser InstallationScriptGenerator SolutionCreator SolutionLoader SolutionRunner SolutionDisplay SolutionCorrecting, ComponentCorrector SolutionImporter SolutionUpdater SolutionFeatureAdding Dispatcher Operating system call for main.py and feedback Second method of correcting a solution. Install Python virtual environment in a solution. Get user prompts and show results. User asks for a new feature in a solution. First method of correcting a solution or component. Show the solution code in the screen. Creates a new solution and its components. ... Solutions Folder AIPyCraft Load a solution from file. Import an existing folder as a solution Component Solution Component Component ... Component Solution Component Component ... Antonio Alberti 2025 AIPyCraft 6 / 16
  • 7.
    AIPyCraft Workflow Start SolutionCreator No Approve? Createcomponents Determine required virtual environment InstallationScriptGenerator SolutionRunner No Yes, user explains the problem Yes Problem? SolutionCorrecting, SolutionUpdate or ComponentCorrector Solution Feature Adding End No Yes New feature? • User inputs solution/feature description. • LLM generates components incrementally. • Virtual environments created per solution. • Code run and errors corrected iteratively. • Human-in-the-loop for validation and feature additions. Antonio Alberti 2025 AIPyCraft 7 / 16
  • 8.
    Proof-of-Concept: TOML ScriptCorrection • Tested with Google Gemini 2.5 Pro on Chainlink off-chain oracle jobs. • Automated multi-trial testing with syntax and semantic scenarios. • Two TOML Script Tester were developed to test: ◦ Scenario 1: TOML syntax evaluation using Tomli Python package. ◦ Scenario 2: TOML semantic evaluation using a real Chainlink node. Antonio Alberti 2025 AIPyCraft 8 / 16
  • 9.
    Proof-of-Concept Evaluation ComponentCorrector CLI run_tester_multiple Run experiment tester.py 20 trials AIConnector SolutionLoader SolutionRunner SolutionUpdater AIPyCraft Solution config.toml main.py main.py tomli 1 2 Scenario1 Scenario 2 TOML Script Tester 1 2 Job Chainlink Node LLM APIs Up to 20 correction tentatives Figure: Automated testing setup with AIPyCraft, Gemini 2.5 Pro, and Chainlink node. Antonio Alberti 2025 AIPyCraft 9 / 16
  • 10.
    Proof-of-Concept Results: Scenario1 - Syntax Correction 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 2 0 Trial Number 0.0 0.2 0.4 0.6 0.8 1.0 Iterations to Success Failed Trials: 0 Mean Iterations (95% CI) Figure: Number of interactions required to correct TOML job syntax using tomli. Antonio Alberti 2025 AIPyCraft 10 / 16
  • 11.
    Proof-of-Concept Results: Scenario1 - Time required to correct 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 2 0 Trial Number 0 10 20 30 40 50 Total Test Duration (s) Failed Trials: 0 Mean Duration (95% CI) Figure: Time required to correct TOML job syntax using tomli. Antonio Alberti 2025 AIPyCraft 11 / 16
  • 12.
    Proof-of-Concept Results: Scenario2 - Semantic Correction 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 2 0 Trial Number 0.0 0.2 0.5 0.8 1.0 Iterations to Success Failed Trials: 0 Mean Iterations (95% CI) Figure: Number of interactions required to correct TOML job semantics on Chainlink node. Antonio Alberti 2025 AIPyCraft 12 / 16
  • 13.
    Proof-of-Concept Results: Scenario2 - Time required to correct 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 2 0 Trial Number 0 50 100 150 200 250 Total Test Duration (s) Failed Trials: 0 Mean Duration (95% CI) Figure: Time required to correct TOML job semantics on Chainlink node. Antonio Alberti 2025 AIPyCraft 13 / 16
  • 14.
    Lessons Learned • PromptingLLMs to create modular solutions is challenging. • Hallucinations and redundant code require iterative/interative corrections. • To employ modular design with LLMs enforces reusability and debugging. • Human validation remains critical in AI-assisted coding. At least for now... Who knows? • Virtual environments separation prevents dependency conflicts, which is difficult to be solve by LLMs. Antonio Alberti 2025 AIPyCraft 14 / 16
  • 15.
    Conclusion • AIPyCraft successfullyintegrated LLMs for modular 6G component development. • Simple TOML jobs generated and corrected with acceptable performance. • LLMs supports human-machine collaboration and incremental development. • Future work: compare LLMs, more complex Jobs, other D6G components, security, other languages. • Emergence of an ’understand-by-building’ software engineering, in which strategy, design, coding, debugging, and optimization are rapidly co-created with machines. Antonio Alberti 2025 AIPyCraft 15 / 16
  • 16.
    Thank you! Thank you! AntonioAlberti A.M.Alberti@leeds.ac.uk Antonio Alberti 2025 AIPyCraft 16 / 16