1. Introduction Modeling Alchemy Training Inference Case Study Conclusion
A Goal Driven Framework for Software Project Data Analytics
George Chatzikonstantinou (1), Kostas Kontogiannis (1), Ioanna-Maria Attarian (2)
(1) National Technical University of Athens, Greece
(2) IBM Toronto Laboratory, Canada
CAiSE’13, Valencia, Spain
MINISTRY OF EDUCATION & RELIGIOUS AFFAIRS, CULTURE & SPORTS
2. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Problem Description (Software Development Analytics)
Software engineering is a data-rich/data-intensive activity
Large collections of project related information are stored in
specialized repositories
How can those data be leveraged to help managers identify
possible risks in order to better plan a software project?
[Figure: Software Project Data → ? → draw conclusions about the project (e.g. budget overruns, schedule delays)]
3. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Quantitative Approaches
[Figure: Software Project Data → cost = f(x1, x2, …, xn) → draw conclusions about the project (e.g. budget overruns, schedule delays)]
Most software analytics models are based on numerical formulas
(e.g. COCOMO II by B. Boehm et al.)
Such approaches fail to take into account:
experience captured from past similar projects
contextual information that leads to different views of analysis
qualitative assessment of project data
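As an illustration of the numerical formulas mentioned above, the COCOMO II post-architecture effort equation has the following general shape. This is a sketch from the published COCOMO II model, not something the slides spell out: PM is effort in person-months, Size is in thousands of source lines, EM_i are effort multipliers, SF_j are scale factors, and A and B are calibration constants.

```latex
\[
PM = A \cdot \mathit{Size}^{E} \cdot \prod_{i=1}^{n} EM_i ,
\qquad
E = B + 0.01 \sum_{j=1}^{5} SF_j
\]
```

Every input is a number, which is why past-project experience, context-dependent views, and qualitative judgments have no direct place in such a formula.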
5. Introduction Modeling Alchemy Training Inference Case Study Conclusion
The Proposed Approach
[Figure: Software Project Data and Past Project Data → Project Analytics Model → draw conclusions about the project (e.g. budget overruns, schedule delays). Steps: i) modeling, ii) training, iii) inference]
Uses qualitative models that can capture different views of
analysis
Allows for past cases to be used for training the models
Can yield results even with incomplete or partial data
6. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Modeling Project Analytics
Project Analytics are modeled in terms of AND/OR Goal Trees:
used extensively in requirements engineering (RE)
a visual notation with well-defined semantics
Advantages of the selected notation:
can capture the views of different stakeholders
can capture various dependency types
is extensible and customizable for different project types and
organizations
7. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Modeling Project Analytics (Example & Semantics)
[Figure: two root goal nodes: a = Low Effort, b = High Software Product Complexity]
Each root node corresponds to
a desired state/risk
8. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Modeling Project Analytics (Example & Semantics)
[Figure: goal tree
a = Low Effort (AND-decomposed into c and d)
b = High Software Product Complexity
c = Clarity of Project Team Roles and Responsibilities
d = High Level of Experience and Knowledge (OR-decomposed into e and f)
e = Application Domain Experience and Knowledge
f = Platform Experience and Knowledge]
Nodes are reduced to simpler
ones with:
AND-decompositions
Sat(c) ∧ Sat(d) → Sat(a)
OR-decompositions
Sat(e) → Sat(d)
Sat(f) → Sat(d)
Sat(a) : goal node a is satisfied
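The decomposition semantics can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `Goal` class and `satisfied` function are hypothetical names, and the sketch treats a decomposition as the only way to satisfy a parent, which is stronger than the one-directional implications on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """A node of an AND/OR goal tree (hypothetical helper type)."""
    name: str
    kind: str = "leaf"  # "leaf", "and", or "or"
    children: list = field(default_factory=list)


def satisfied(goal, facts):
    """Return True if `goal` holds given the set of satisfied leaf names."""
    if goal.kind == "leaf":
        return goal.name in facts
    results = [satisfied(child, facts) for child in goal.children]
    # AND needs every child, OR needs at least one.
    return all(results) if goal.kind == "and" else any(results)


# The example tree from the slide: a = AND(c, d), d = OR(e, f).
c, e, f = Goal("c"), Goal("e"), Goal("f")
d = Goal("d", "or", [e, f])
a = Goal("a", "and", [c, d])
```

For instance, `satisfied(a, {"c", "e"})` holds because c is satisfied and e satisfies d, whereas `satisfied(a, {"e", "f"})` does not because c is missing.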
9. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Modeling Project Analytics (Example & Semantics)
[Figure: goal tree with contribution links
a = Low Effort (AND-decomposed into c and d)
b = High Software Product Complexity (−−S / −−D link to a)
c = Clarity of Project Team Roles and Responsibilities
d = High Level of Experience and Knowledge (OR-decomposed into e and f)
e = Application Domain Experience and Knowledge
f = Platform Experience and Knowledge
g = Support by Technical People (++S / ++D link to d)]
Dependencies are depicted as
contribution links :
++S(g, d)
p1 : Sat(g) → Sat(d)
++D(g, d)
p2 : ¬Sat(g) → ¬Sat(d)
−−S(b, a)
p3 : Sat(b) → ¬Sat(a)
−−D(b, a)
p4 : ¬Sat(b) → Sat(a)
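The four contribution link types can be read as a small lookup table from the state of the source node to the state they imply for the target. The sketch below is illustrative only; `LINK_SEMANTICS` and `propagate` are hypothetical names, and `None` stands for "this link says nothing in this case".

```python
# Each link type maps the source's state to the state implied for the target.
# None means the implication does not fire for that source state.
LINK_SEMANTICS = {
    "++S": lambda src: True if src else None,   # Sat(src) -> Sat(dst)
    "++D": lambda src: None if src else False,  # not Sat(src) -> not Sat(dst)
    "--S": lambda src: False if src else None,  # Sat(src) -> not Sat(dst)
    "--D": lambda src: None if src else True,   # not Sat(src) -> Sat(dst)
}


def propagate(link_type, source_satisfied):
    """Return the target state implied by one link, or None if it is silent."""
    return LINK_SEMANTICS[link_type](source_satisfied)
```

With this reading, `propagate("--S", True)` corresponds to p3 (a satisfied b denies a), and `propagate("++D", False)` corresponds to p2 (an unsatisfied g denies d).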
11. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Modeling Project Analytics (Example & Semantics)
[Figure: goal tree with conditional contribution links
a = Low Effort (AND-decomposed into c and d)
b = High Software Product Complexity (−−S / −−D link to a)
c = Clarity of Project Team Roles and Responsibilities
d = High Level of Experience and Knowledge (OR-decomposed into e and f)
e = Application Domain Experience and Knowledge
f = Platform Experience and Knowledge
g = Support by Technical People (++S / ++D link to d)
h = Requirements Controllability (−−S {PDR} link to a)
i = Development Schedule Constraints (−−S {PSS} link to a)
PSS: Strict Schedule Compliance
PDR: Disciplined Requirements Management]
Multiple views are modeled
using conditional contributions
−−S(h, a){PDR}
if policy PDR holds
q1 : Sat(h) → ¬Sat(a)
−−S(i, a){PSS}
if policy PSS holds
q2 : Sat(i) → ¬Sat(a)
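One way to picture conditional contributions is as links tagged with the policy under which they apply, so that a view is obtained by filtering on the active policies. The sketch below assumes this representation; `Link`, `LINKS`, and `active_links` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Link(NamedTuple):
    """A contribution link, optionally guarded by a policy."""
    kind: str
    source: str
    target: str
    policy: Optional[str] = None  # None means the link is unconditional


LINKS = [
    Link("--S", "b", "a"),                # always applies
    Link("--S", "h", "a", policy="PDR"),  # q1, only under PDR
    Link("--S", "i", "a", policy="PSS"),  # q2, only under PSS
]


def active_links(links, active_policies):
    """Keep unconditional links and those whose policy is active."""
    return [
        link for link in links
        if link.policy is None or link.policy in active_policies
    ]
```

Under `{"PSS"}` only the links from b and i remain, so the requirements-controllability node h has no influence on a in that view.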
13. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Leaf Nodes
[Figure: goal tree with conditional contribution links
a = Low Effort (AND-decomposed into c and d)
b = High Software Product Complexity (−−S / −−D link to a)
c = Clarity of Project Team Roles and Responsibilities
d = High Level of Experience and Knowledge (OR-decomposed into e and f)
e = Application Domain Experience and Knowledge
f = Platform Experience and Knowledge
g = Support by Technical People (++S / ++D link to d)
h = Requirements Controllability (−−S {PDR} link to a)
i = Development Schedule Constraints (−−S {PSS} link to a)
PSS: Strict Schedule Compliance
PDR: Disciplined Requirements Management]
Some nodes in the model
have zero in-degree (leaves)
Leaf nodes in the model are
facts and should be:
either added as input by
the user
or obtained from the
available repositories
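Finding the leaves is a simple graph computation: collect every node that never appears as the target of an edge. The sketch below encodes the example model as an edge list, with edges pointing from a contributing node to the node it affects; `EDGES` and `leaf_nodes` are hypothetical names.

```python
# Edges point from a contributing node to the node it affects.
EDGES = [
    ("c", "a"), ("d", "a"),  # AND-decomposition of a
    ("e", "d"), ("f", "d"),  # OR-decomposition of d
    ("g", "d"),              # ++S / ++D
    ("b", "a"),              # --S / --D
    ("h", "a"), ("i", "a"),  # conditional --S links
]


def leaf_nodes(edges):
    """Return the nodes with zero in-degree, sorted by name."""
    nodes = {node for edge in edges for node in edge}
    targets = {dst for _, dst in edges}
    return sorted(nodes - targets)
```

For the example model this yields b, c, e, f, g, h, and i; these are the values that must come from the user or from the project repositories.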
14. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Learning/Inference Engine
Having expressed Project Analytics models as rules, we need an
inference engine to make deductions
Alchemy (http://alchemy.cs.washington.edu/)
A statistical learning and probabilistic inference engine based on
Markov Logic Networks (MLNs).
Markov Logic
A probabilistic logic that combines first-order logic (FOL) and Markov
networks, enabling inference under uncertainty.
An assignment may hold with a non-zero probability even if
some of the formulas in the underlying KB are violated.
Weights on formulas reflect the strength of the corresponding
constraint.
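For reference, the standard Markov Logic formulation assigns a probability to each possible world x as follows, where w_i is the weight of formula i and n_i(x) is the number of its true groundings in x. The symbols are the usual ones from the Markov Logic literature, not notation taken from the slides.

```latex
\[
P(X = x) = \frac{1}{Z} \exp\Big( \sum_{i} w_i \, n_i(x) \Big),
\qquad
Z = \sum_{x'} \exp\Big( \sum_{i} w_i \, n_i(x') \Big)
\]
```

A world that violates a formula simply loses that formula's weight in the exponent, so it becomes less probable rather than impossible; the larger the weight, the stronger the penalty.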
15. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Alchemy as a Learning Engine
[Figure: training pipeline with boxes Project Analytics Goal Model, MLN Rules Generation, Interpretations, Alchemy (Training), and PAG Model with Weights on Contributions.
Example goal model:
a = Low Effort (AND-decomposed into c and d)
c = Clarity of Project Team Roles and Responsibilities
d = High Level of Experience and Knowledge
g = Support by Technical People (++S / ++D link to d)
i = Development Schedule Constraints (−−S {PSS} link to a)]
Sat(c) ∧ Sat(d) → Sat(a).
p1 : Sat(g) → Sat(d)
p2 : ¬Sat(g) → ¬Sat(d)
q1 : Sat(i) → ¬Sat(a)
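The rule generation step can be sketched as string templating over the goal model. The output below uses an ASCII syntax (`^`, `=>`, `!`, trailing period for a hard rule) that resembles Alchemy's formula syntax; the function names are hypothetical, node names are kept lowercase as on the slide, and the exact file format should be checked against the Alchemy manual.

```python
CONTRIBUTION_TEMPLATES = {
    "++S": "Sat({s}) => Sat({d})",
    "++D": "!Sat({s}) => !Sat({d})",
    "--S": "Sat({s}) => !Sat({d})",
    "--D": "!Sat({s}) => Sat({d})",
}


def and_rule(parent, children):
    """Hard rule for an AND-decomposition (note the trailing period)."""
    body = " ^ ".join(f"Sat({child})" for child in children)
    return f"{body} => Sat({parent})."


def contribution_rule(kind, source, target):
    """Soft rule for a contribution link; its weight is learned later."""
    return CONTRIBUTION_TEMPLATES[kind].format(s=source, d=target)


rules = [
    and_rule("a", ["c", "d"]),
    contribution_rule("++S", "g", "d"),  # p1
    contribution_rule("++D", "g", "d"),  # p2
    contribution_rule("--S", "i", "a"),  # q1
]
```

The decomposition is emitted as a hard rule while the contributions are left without a period, which mirrors the slide: only the contribution links carry weights to be learned.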
16. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Alchemy as a Learning Engine
[Figure: training pipeline with boxes Past Project Data, Project Analytics Goal Model, MLN Rules Generation, Ground Atoms Generation, Alchemy (Training), and PAG Model with Weights on Contributions.
Example goal model:
a = Low Effort (AND-decomposed into c and d)
c = Clarity of Project Team Roles and Responsibilities
d = High Level of Experience and Knowledge
g = Support by Technical People (++S / ++D link to d)
i = Development Schedule Constraints (−−S {PSS} link to a)]
Ground atoms generated per past project:
Pr1 : Sat(c), Sat(g), Sat(i)
Pr2 : Sat(c), !Sat(g), Sat(i)
...
Prn : Sat(c), Sat(g), Sat(i)
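Ground atom generation amounts to turning each past project's recorded leaf values into positive or negated atoms. A minimal sketch, assuming each project is available as a mapping from node name to a boolean; `ground_atoms` and the sample data are hypothetical, and whether one atom per line with `!` for negation matches Alchemy's evidence file format exactly is an assumption.

```python
def ground_atoms(observations):
    """Turn {node: bool} into evidence atoms, negating the false ones."""
    return [
        ("" if value else "!") + f"Sat({node})"
        for node, value in observations.items()
    ]


# Illustrative records only; not taken from any real dataset.
past_projects = {
    "Pr1": {"c": True, "g": True, "i": True},
    "Pr2": {"c": True, "g": False, "i": True},
}

evidence = {name: ground_atoms(obs) for name, obs in past_projects.items()}
```

Nodes with no recorded value are simply left out of the mapping, which is how partial project data would enter the training set.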
17. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Alchemy as a Learning Engine
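[Figure: training pipeline with boxes Past Project Data, Project Analytics Goal Model, MLN Rules Generation, Ground Atoms Generation, Alchemy (Training), and PAG Model with Weights on Contributions.
Resulting goal model with learned weights:
a = Low Effort (AND-decomposed into c and d)
c = Clarity of Project Team Roles and Responsibilities
d = High Level of Experience and Knowledge
g = Support by Technical People (++S with weight p1 / ++D with weight p2, link to d)
i = Development Schedule Constraints (−−S with weight q1 {PSS}, link to a)]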
18. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Alchemy as an Inference Engine
[Figure: inference pipeline with boxes Current Project Data, Active Policies Set, PAG Model with Weights on Contributions, MLN Rules Generation, Ground Atoms Generation, Alchemy, and Project Analytics Satisfaction Probabilities.
Example goal model with weights:
a = Low Effort (AND-decomposed into c and d)
c = Clarity of Project Team Roles and Responsibilities
d = High Level of Experience and Knowledge
g = Support by Technical People (++S with weight p1 / ++D with weight p2, link to d)
i = Development Schedule Constraints (−−S with weight q1 {PSS}, link to a)]
Sat(c) ∧ Sat(d) → Sat(a).
p1 : Sat(g) → Sat(d)
p2 : ¬Sat(g) → ¬Sat(d)
Sat(i) ∧ Uses(PSS) → Sat(a').
q1 : Sat(a') → ¬Sat(a)
19. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Alchemy as an Inference Engine
[Figure: inference pipeline with boxes Current Project Data, Active Policies Set, PAG Model with Weights on Contributions, MLN Rules Generation, Ground Atoms Generation, Alchemy, and Project Analytics Satisfaction Probabilities; example goal model with nodes a, c, d, g, i and weights p1, p2, q1]
Current Project Data: Sat(c), Sat(i)
Active Policies: Uses(PDR)
20. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Alchemy as an Inference Engine
[Figure: inference pipeline with boxes Current Project Data, Active Policies Set, PAG Model with Weights on Contributions, MLN Rules Generation, Ground Atoms Generation, Alchemy, and Project Analytics Satisfaction Probabilities; example goal model with nodes a, c, d, g, i and weights p1, p2, q1]
Calculate Satisfaction Probability
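For a model this small, the satisfaction probability can be computed by brute force: enumerate every truth assignment consistent with the evidence, score each one by the exponentiated sum of the weights of the formulas it satisfies, and normalize. The sketch below does this for the five-node example; it is not how Alchemy performs inference, the weights are made-up placeholders rather than learned values, and the policy condition on q1 is omitted for brevity.

```python
import itertools
import math

ATOMS = ["a", "c", "d", "g", "i"]


def implies(p, q):
    return (not p) or q


# (weight, formula). A weight of None marks a hard constraint.
# The numeric weights are illustrative assumptions, not learned values.
FORMULAS = [
    (None, lambda w: implies(w["c"] and w["d"], w["a"])),  # AND-decomposition
    (1.5, lambda w: implies(w["g"], w["d"])),              # p1
    (1.0, lambda w: implies(not w["g"], not w["d"])),      # p2
    (2.0, lambda w: implies(w["i"], not w["a"])),          # q1
]


def world_score(world):
    """Unnormalized weight of one truth assignment."""
    total = 0.0
    for weight, formula in FORMULAS:
        holds = formula(world)
        if weight is None:
            if not holds:
                return 0.0  # worlds violating a hard rule are excluded
        elif holds:
            total += weight
    return math.exp(total)


def probability(query, evidence):
    """P(Sat(query) | evidence), enumerating all unobserved atoms."""
    free = [atom for atom in ATOMS if atom not in evidence]
    numerator = denominator = 0.0
    for values in itertools.product([False, True], repeat=len(free)):
        world = dict(evidence, **dict(zip(free, values)))
        score = world_score(world)
        denominator += score
        if world[query]:
            numerator += score
    return numerator / denominator
```

Calling `probability("a", {"c": True, "i": True})` marginalizes over the unobserved nodes d and g, which illustrates how a result can still be produced from partial project data.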
21. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Dataset
The ISBSG Dataset
ISBSG (http://www.isbsg.org/)
A non-profit organization that maintains and exploits a repository
of historical data related to software projects.
The ISBSG Dataset in numbers
data for 5,000 software projects
submitted from 24 countries
covers 15 major industry types (e.g. banking, insurance)
over 100 features for each project
22. Introduction Modeling Alchemy Training Inference Case Study Conclusion
PAG Modeling
Compiling the PAG Model
We considered information from the following sources :
assertions from related literature
existing standards and tools (e.g. ISO 9126, COCOMO II)
data available from ISBSG
The PAG model of the case study has :
3 root goals : “High Effort”, “Low Cost”, “High Product
Quality”
96 nodes (50 leaf nodes)
12 OR-decompositions / 10 AND-decompositions
25 contribution links (12 conditional)
23. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Evaluation
Correctness
Objective              Correct   FP       FN
High Effort            73.6 %    11.8 %   14.6 %
Low Cost               67.9 %    14.5 %   17.6 %
High Product Quality   60.6 %    11.4 %   28.0 %
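In each row of the table the three percentages add up to 100 %, which suggests that all three are fractions of the total number of evaluated projects. A minimal sketch of that computation, assuming boolean predictions and boolean ground truth per project (how probabilities were thresholded into predictions is not stated on the slide; `correctness` is a hypothetical name):

```python
def correctness(predicted, actual):
    """Return Correct, FP and FN rates as fractions of all cases."""
    total = len(actual)
    pairs = list(zip(predicted, actual))
    correct = sum(p == a for p, a in pairs)
    false_pos = sum(p and not a for p, a in pairs)
    false_neg = sum(a and not p for p, a in pairs)
    return {
        "Correct": correct / total,
        "FP": false_pos / total,
        "FN": false_neg / total,
    }
```

By construction the three values returned always sum to 1, matching the structure of the table.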
24. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Evaluation
Stability
[Figure: probability of an objective being true (y-axis, 0.4 to 1) against number of errors (x-axis, 0 to 22), with one curve each for Low Cost, High Effort, and High Product Quality]
26. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Conclusion & Future Work
The proposed approach :
uses qualitative models that can capture different views of
analysis
allows for past cases to be used for training the models
allows for reasoning under uncertainty or partial information
Future work :
compilation of goal models that relate to specific standards
(e.g. SMART, SCRUM)
increase the expressiveness of PAG models
27. Introduction Modeling Alchemy Training Inference Case Study Conclusion
Acknowledgements
This research has been co-financed by the European Union (European
Social Fund, ESF) and Greek national funds through the
Operational Program "Education and Lifelong Learning" of the National
Strategic Reference Framework (NSRF) - Research Funding
Program: Heracleitus II. Investing in knowledge society through the
European Social Fund.