How AI supports
software testing
TestFormation 2025
MARCH 27TH, 2025
Kari Kakkonen
SERVICE OWNER, CUSTOMER EXPERTISE DEVELOPMENT AT GOFORE
SECRETARY, TMMI FOUNDATION, BOARD OF DIRECTORS
https://www.linkedin.com/in/karikakkonen
TestFormation 2025 | TMMi Foundation America | #testformation2025
Agenda
• Why we need AI
• How AI works in the NASA case
• Journey through AI opportunities for testing
ROLES
• Gofore, Service Owner, Customer Expertise
Development
• Dragons Out Oy, CEO, author of children's and testing books
• TMMi, Secretary, Board of Directors
• Finnish Software Testing Board (FiSTB),
Treasurer
ACHIEVEMENTS
• Tester of the Year in Finland 2021
• EuroSTAR Testing Excellence Award 2021
• Exemplary DevOps Instructor Award 2023 by
DASA
• ISTQB Executive Committee 2015-2021
• Influencing testing since 1996
• Ranked among the 100 most influential IT people in Finland (Tivi magazine)
• Numerous presentations at Finnish and international conferences
• TestausOSY/FAST founding member.
• Author of Dragons Out testing book
• Co-author of ACT2LEAD software testing
leadership handbook
• Co-author of Agile Testing Foundations book
• Regular blogger in Tivi-magazine
Kari Kakkonen
Service Owner, Customer Expertise Development
SERVICES
• ISTQB Advanced, Foundation, Agile Testing, AI Testing
• Quality Professional
• DASA DevOps, Agile, Scrum, Product Management
• Quality & Test process and organization development,
Metrics, TMMi and other assessments
• Agile testing, Scrum, Kanban, Lean
• Leadership
• Test automation, Mobile, Cloud, DevOps, AI
• Quality, cost, benefits
EDUCATION
• ISTQB Expert Level Test Management & Advanced Full
& Agile Tester certified
• DASA DevOps, Scrum Master and SAFe certified
• TMMi Professional, Assessor, Process Improver certified
• SPICE provisional assessor certified
• M.Sc. (Eng.), Helsinki University of Technology (now Aalto University), Otaniemi, Espoo
• Marketing studies, University of Wisconsin-Madison, USA
BUSINESS DOMAINS
A wide range of business domain knowledge: embedded, industry, public sector, training, telecommunications, commerce, insurance, banking, pension.
www.gofore.com
www.dragonsout.com
www.act2lead.net
MORE INFORMATION
linkedin.com/in/karikakkonen/
The book project ”Dragons Out!”
• Mission
• “Software testing brought to children”
• Book
• Author Kari Kakkonen
• Illustrator Adrienn Széll
• Text and illustration rights Dragons Out Oy
• In Finnish, English, Polish, French and growing
• For ages 10-99
• Free “Dragon lesson in software testing” presentation under a Creative Commons license
• Translated into 20 languages!
• More info: www.dragonsout.com
ACT 2 LEAD software testing leadership handbook
● Easy to read - chapters can be read in any order.
● Structure: questions, answers and cases.
○ 34 main chapters = questions, see next page.
● 270+ pages,
○ in Finnish (softback, e-book)
○ in English (e-book).
● For people like CxO, director, head of, manager, product
owner, designer, developer, test manager, tester and
student.
● Teaches how to lead testing, not how to test.
Buy the book: https://leanpub.com/act2lead
Book website: www.act2lead.net
ISTQB GLOBAL PRESENCE
• Number of exams administered: over 1.3 million
• Number of certifications
issued: 957,000
• In 130 countries
Why we need AI
CONFIDENTIAL
• Computation growth due to general-purpose GPUs
• The rise of big data
• Community-based achievements in deep learning
• Open-source tools and frameworks
• “Software engineering leaders are now prioritizing development productivity to
enhance market responsiveness and build software applications more
efficiently, while also aiming to maintain high quality. To meet this challenge
they are increasingly turning to AI-augmented testing tools.”
How AI works in a NASA case
Figure: Margaret Hamilton standing next to the source code of the Apollo spacecraft guidance system.
www.nasa.gov/
• Several code quality metrics are currently in use. The most popular ones were introduced in the 1970s.[1,2]
• The goal is to provide feedback:
1. to developers about code quality,
2. to testers about needed test coverage,
3. to management about the development project.
• Although traditional metrics are extensively used, the benefit of using them has been questioned.[3]
1 T.J. McCabe, A Complexity Measure, IEEE Transactions on Software Engineering (1976) SE-2, 4, 308.
2 M.H. Halstead, Elements of Software Science, Elsevier (1977).
3 N.E. Fenton, S.L. Pfleeger, Software Metrics: A Rigorous & Practical Approach, International Thompson Press (1997).
4 S.J. Sayyad, T.J. Menzies, The PROMISE Repository of Software Engineering Databases (2005).
Figure: Scattering of defects in the NASA CM1 project (x-axis: cyclomatic complexity, 0-120; y-axis: number of lines, 0-450; series: OK vs. defected modules). NASA Metrics Data Program, CM1 project.[4]
In real applications, defected modules do not separate from non-defected ones by standard metrics.
Minimize (1/2)‖w‖² + C Σᵢ ξᵢ (i = 1…N), subject to 1 − yᵢ(⟨w, f(xᵢ)⟩ + b) ≤ ξᵢ for all values of i.
Figure: separating modules in the space of two quality metrics (axes: quality metric 1, quality metric 2).
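The optimization above is the standard soft-margin SVM. A hedged sketch of how it could separate defect-prone modules by quality metrics (hypothetical metric values; plain batch subgradient descent rather than a production solver):

```python
# Soft-margin linear SVM by batch subgradient descent:
# minimize 0.5*||w||^2 + C * sum(max(0, 1 - y_i*(w.x_i + b)))
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_svm(X, y, C=1.0, lr=0.01, epochs=2000):
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        gw, gb = list(w), 0.0                 # gradient of 0.5*||w||^2 is w
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:     # margin violated: hinge subgradient
                gw = [gj - C * yi * xij for gj, xij in zip(gw, xi)]
                gb -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b

def predict(w, b, x):
    return 1 if dot(w, x) + b >= 0 else -1

# Hypothetical modules: (cyclomatic complexity, kLOC); +1 = defected, -1 = OK
X = [(1, 1), (2, 1), (1, 2), (8, 8), (9, 8), (8, 9)]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_svm(X, y)
```

In practice a library solver (e.g. an off-the-shelf SVM implementation) would replace this loop; the sketch only mirrors the objective on the slide.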
Figure: performance (F) tested on real data against the NASA CM1 project[1] (y-axis 0.0-0.8; training set: 1595, test set: 400).
F = 2 · accuracy · recall / (accuracy + recall)
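The slide's F score is a plain harmonic mean (note it combines accuracy and recall, where the textbook F1 uses precision and recall):

```python
def f_score(accuracy, recall):
    """Harmonic mean of accuracy and recall, as defined on the slide."""
    if accuracy + recall == 0:
        return 0.0
    return 2 * accuracy * recall / (accuracy + recall)
```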
Anomaly detection
Figure: Example of time series
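One hedged sketch of anomaly detection on such a time series (a rolling z-score over a hypothetical metric stream; real tooling would use more robust models):

```python
from statistics import mean, stdev

def detect_anomalies(series, window=10, threshold=3.0):
    """Flag points deviating more than `threshold` standard deviations
    from the mean of the preceding `window` observations."""
    anomalies = []
    for i in range(window, len(series)):
        hist = series[i - window:i]
        mu, sigma = mean(hist), stdev(hist)
        if sigma > 0 and abs(series[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies
```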
Source: STA Consulting
Ref: Kim, D.; Wang, X.; Kim, S.; Zeller, A.; Cheung, S.C.; Park, S. (2011). “Which Crashes Should I Fix First? Predicting Top Crashes at an
Early Stage to Prioritize Debugging Efforts,” in the IEEE Transactions on Software Engineering, volume 37
Source: STA Consulting
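Kim et al. train a machine-learning model on early crash data; as a much simpler hedged baseline, grouping crash reports by signature and ranking by early frequency can be sketched as:

```python
from collections import Counter

def prioritize_crashes(early_reports):
    """Rank crash signatures by how often they occurred in the early period.
    `early_reports` is a list of crash signatures (e.g. top stack frames)."""
    return [sig for sig, _ in Counter(early_reports).most_common()]

# Hypothetical first-week crash signatures
reports = ["NullPointer@login", "Timeout@sync", "NullPointer@login",
           "OOM@render", "NullPointer@login", "Timeout@sync"]
```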
Prediction
Figure: AI review combining data from requirement management, code quality analysis, version control, test management, and defect management.
Figure: AI review results shared with the developer, product owner, Scrum team, and management.
Requires links from components (version control) and tests (test management) to feature descriptions, and from reported defects to modules (observation control)
SCRUM TEAM
Interpretation:
• Module 5 has a high defect probability and is related to the five most important features of the application. This results in a high release risk at this moment.
• The defect probability of module 10 has fallen to an acceptable level.
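A hedged sketch of how such an interpretation could be scored (module and feature names are hypothetical, and the real tool's weighting is not shown on the slide):

```python
def release_risk(defect_prob, links, feature_importance):
    """Score each module as defect probability times the total importance
    of the features it is linked to (via version control / test management)."""
    return {m: p * sum(feature_importance.get(f, 0.0) for f in links.get(m, ()))
            for m, p in defect_prob.items()}

# Hypothetical project state: module 5 risky and tied to key features,
# module 10 now at an acceptable defect probability
defect_prob = {"module5": 0.9, "module10": 0.1}
links = {"module5": ["login", "checkout"], "module10": ["help"]}
feature_importance = {"login": 1.0, "checkout": 0.8, "help": 0.2}
risk = release_risk(defect_prob, links, feature_importance)
```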
(Timeline phases: start-up, development, maturing)
MANAGEMENT
Fault-tolerant test automation
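Fault tolerance in UI automation is often implemented as locator fallback ("self-healing"): if the primary selector breaks after a UI change, the script tries alternatives instead of failing. A minimal sketch (the `page` lookup is a stand-in for a real driver API):

```python
def find_element(page, locators):
    """Try locators in priority order and return (element, locator used)."""
    for loc in locators:
        element = page.get(loc)            # stand-in for driver.find_element(...)
        if element is not None:
            return element, loc
    raise LookupError(f"no locator matched: {locators}")

# The id changed after a redesign, but the xpath fallback still works
page = {"xpath://button[@type='submit']": "loginButton"}
locators = ["id:login", "css:.login-btn", "xpath://button[@type='submit']"]
```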
Faster test automation scripting
with AI assistants
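The assistant workflow usually starts from a structured prompt. A hedged sketch of such a prompt builder (the template and field names are illustrative; the actual call to an assistant API is omitted):

```python
def build_test_prompt(feature, steps, framework="pytest"):
    """Assemble a test-generation prompt from a feature name and manual steps."""
    lines = [f"Write a {framework} test for the feature: {feature}.", "Steps:"]
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    lines.append("Use explicit assertions and stable selectors.")
    return "\n".join(lines)

prompt = build_test_prompt("user login",
                           ["open the login page",
                            "submit valid credentials",
                            "verify the dashboard is shown"])
```

The generated script still needs human review: assistants routinely produce tests with weak or missing assertions.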
• Gehtsoft USA
Autonomous testing
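Autonomous testing tools typically explore the application on their own. A toy sketch (a seeded random walk over a hypothetical screen-transition model, i.e. a "monkey test" with a map):

```python
import random

def explore(transitions, start, steps, seed=0):
    """Randomly walk the app model and record the screens visited."""
    rng = random.Random(seed)
    state, visited = start, [start]
    for _ in range(steps):
        actions = transitions.get(state)
        if not actions:
            break                          # dead end: nothing left to try here
        action = rng.choice(sorted(actions))
        state = actions[action]
        visited.append(state)
    return visited

# Hypothetical app: three screens reachable from each other
transitions = {
    "home":     {"open_settings": "settings", "open_profile": "profile"},
    "settings": {"back": "home"},
    "profile":  {"back": "home"},
}
visited = explore(transitions, "home", steps=20)
```

Real autonomous tools build the model from crawling the UI and add oracles (crashes, visual diffs) on top; this only shows the exploration loop.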
AI integrated into all testing
(AI augmented testing)
Read more
• Devs gaining little (if anything) from AI coding assistants
- https://www.cio.com/article/3540579/devs-gaining-little-if-anything-from-ai-coding-assistants.html
• AI in Testing: Impact, Problems, Challenges and Prospect
- https://www.researchgate.net/publication/357876318_Artificial_Intelligence_in_Software_Testing_Impact_Problems_Challenges_and_Prospect
TMMi & AI
Testing AI-systems
▪ Ensure quality of AI-driven software
▪ Testing AI algorithms
▪ Input data testing
▪ Model testing
▪ Integration testing
▪ Ethical considerations:
• Bias
• Fairness
Using AI in testing
▪ Leverage AI to improve efficiency and
effectiveness of the testing process
▪ Test estimation
▪ Defect prediction
▪ Test Design Automation
▪ More intelligent Test execution
automation
▪ Test result analysis
▪ Prioritize based on risk and historical data
Testing AI-based systems vs. using AI for testing
Using AI in Testing (next step in Test Automation?)
Figure: test improvement roadmap (± 9 months, ± 6 months) used to set priorities and focus the test improvement process.
Challenges with testing AI systems
▪ AI systems are used to deal with complex problems where the number of possibilities is huge (even compared to existing software)
▪ Data has biases, and these can easily end up embedded in AI systems
▪ AI-based systems may experience model drift, data drift and/or concept drift
▪ Lack of test oracles that can determine expected results
▪ Performance metrics for the AI model are typically very different from traditional system performance metrics and may sometimes not be positively related
▪ Traditional test techniques will often no longer be applicable. We need to rethink our test techniques, and also how we measure test coverage.
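The missing-oracle problem is often attacked with metamorphic testing: instead of asserting an exact expected output, assert a relation that must hold between two runs. A hedged sketch using a tiny 1-nearest-neighbour model (translating all points by the same offset must not change the prediction):

```python
def predict_1nn(train, x):
    """1-nearest-neighbour by squared Euclidean distance.
    `train` is a list of (feature_tuple, label) pairs."""
    return min(train, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], x)))[1]

def translation_mr_holds(train, x, offset):
    """Metamorphic relation: shifting every vector by the same offset
    must leave the 1-NN prediction unchanged."""
    shift = lambda v: tuple(a + o for a, o in zip(v, offset))
    shifted_train = [(shift(f), label) for f, label in train]
    return predict_1nn(train, x) == predict_1nn(shifted_train, shift(x))

train = [((0.0, 0.0), "ok"), ((10.0, 10.0), "defect")]
```

The relation replaces the oracle: we never need to know the "correct" label, only that the two runs agree.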
Examples: AI & TMMi LEVEL 2 MANAGED
▪ Test Policy and Strategy
▪ More explicit focus on ethics, bias, data & regulations
▪ Distinction between off-line and on-line testing
▪ Define specific test performance indicators
▪ Test Planning
▪ More specialized testing, addressing the specific risks of AI components
▪ Off-line phase with data preparation, data filtering and data preprocessing
▪ On-line phase focusing on non-AI components and their interaction with AI
▪ More complex test effort estimation with AI-specific factors
▪ Test Monitoring and Control
▪ Monitor model, data & concept drift
▪ Continuously monitor model performance metrics
▪ Test Design and Execution
▪ Focus on data preparation
▪ Test cases that train, validate and test the tuned model
▪ Test Environment
▪ Complex: self-learning, autonomy, multi-agency and big data
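Monitoring data drift can be as simple as comparing the live input distribution against the training distribution. A hedged sketch using the population stability index (PSI) over equal-width bins (thresholds such as 0.2 are a common rule of thumb, not part of TMMi):

```python
import math

def psi(expected, actual, bins=5):
    """Population stability index between a reference sample and a live sample."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / bins for i in range(1, bins)]
    def fractions(data):
        counts = [0] * bins
        for v in data:
            counts[sum(v > e for e in edges)] += 1     # index = edges exceeded
        return [max(c / len(data), 1e-6) for c in counts]  # avoid log(0)
    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```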
Thank You
