2. Outline
Motivation
Different Techniques
◦ Clustering-based
FOCUS
ROMANTIC
◦ Pattern-based
Conclusion
References
3. Motivation
Old software systems often have little or no
documentation. Even where documentation exists,
it rarely states the architecture of the code
explicitly.
Changing such a system requires knowledge of
its implicit architecture.
4. Motivation
Legacy Transformation
Converting a 10,000-line COBOL program to
C/C++ is a tough task if the programmer is
unaware of the underlying architecture.
5. Motivation
System evolution
As a system evolves, it tends to drift from
its original architecture.
It is therefore important to recover or
reconstruct the architecture, so that new
changes to the system do not break the
existing working model.
6. Different Techniques
Approaches to architecture extraction fall mainly
into clustering-based techniques and pattern-based
techniques.
Clustering-based techniques build the architecture
gradually by grouping components. This is a
bottom-up process that starts from low-level
knowledge, such as source code, and gradually
discovers the complete architecture.
Pattern-based techniques, on the other hand, are
top-down: a conceptual architecture of the system
is first built in terms of some pattern, and the
software system is then searched for instances of
that pattern.
7. Clustering based Techniques.
These techniques group different parts of
the source code into single components
(clusters), then use a hierarchical approach
to find the architecture.
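The bottom-up idea can be illustrated with a small agglomerative clustering sketch. It assumes a hypothetical map from each class to the set of entities it uses, and it uses Jaccard similarity of those sets as the grouping criterion. Neither the measure nor the names `jaccard` and `cluster` come from the slides; they are only one plausible way to make the grouping step concrete.

```python
from itertools import combinations


def jaccard(a, b):
    """Similarity of two dependency sets (an assumed measure, not from the slides)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(deps, target):
    """Merge the two most similar clusters until `target` clusters remain.

    deps: dict mapping an entity name to the set of entities it uses.
    Returns a list of frozensets, each one a candidate component.
    """
    # Start with one cluster per entity; a cluster's profile is the union
    # of the dependencies of its members.
    clusters = {frozenset([name]): set(used) for name, used in deps.items()}
    while len(clusters) > max(target, 1):
        # Sort keys so that the choice among pairs is deterministic.
        keys = sorted(clusters, key=sorted)
        a, b = max(combinations(keys, 2),
                   key=lambda pair: jaccard(clusters[pair[0]], clusters[pair[1]]))
        merged = clusters.pop(a) | clusters.pop(b)
        clusters[a | b] = merged
    return list(clusters)


if __name__ == "__main__":
    deps = {
        "OrderForm": {"Widget", "Order"},
        "OrderList": {"Widget", "Order"},
        "OrderDao": {"Sql", "Connection", "Order"},
        "CustomerDao": {"Sql", "Connection", "Customer"},
    }
    for component in cluster(deps, 2):
        print(sorted(component))
```

Repeating the merge with smaller values of `target` yields coarser groupings, which is the sense in which the result is hierarchical.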
10. FOCUS
In software evolution, architecture erosion
is a common problem: the architecture is
modified to the point where its basic
properties no longer hold.
FOCUS is an approach to recovering the
architecture of such applications.
It lets engineers focus their attention
directly on the part most affected by a
change and recover the architecture
incrementally.
14. Architecture recovery
1. Identify components
• Use call graphs and class diagrams
• Use inheritance and aggregation relationships
2. Propose an idealized architectural model
• Serves as the basis for the requirements of future evolution
• May incorrectly characterize some aspects of the application
3. Map identified components to architectural elements
• Yields an intermediate architecture containing the components
that fit the idealized architecture
4. Identify key use cases
• UML use cases express the application requirements
• The use cases fall into three categories:
• First, cases unaffected by the desired modification
• Second, cases corresponding to the new changes
• Third, existing cases that must now be modified to
accommodate the changes
15. Architecture recovery
5. Analyze component interactions
• Besides the static relationships between classes, their interactions
and control flow must also be analyzed
• UML sequence diagrams
6. Generate the refined architecture
• The control flow identified in the previous step is used to map the
remaining components to architectural elements and to relate them
to the components discovered earlier.
• It also exposes inconsistencies (missing component interactions)
introduced in step 2.
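Steps 3, 5 and 6 can be sketched as a comparison between the idealized model and the interactions observed in the code. The function `refine`, its parameters and the example names are hypothetical; the slides do not prescribe a data format, so this is only a minimal illustration under the assumption that the idealized model is a set of allowed links between architectural elements.

```python
def refine(idealized_links, mapping, observed_calls):
    """Compare an idealized model with interactions observed in the code.

    idealized_links: set of (element, element) pairs the idealized model allows.
    mapping: dict from component (class) name to architectural element.
    observed_calls: iterable of (caller, callee) component pairs.
    Returns (components not yet mapped, links missing from the idealized model).
    """
    unmapped = set()
    missing = set()
    for caller, callee in observed_calls:
        for component in (caller, callee):
            if component not in mapping:
                unmapped.add(component)
        if caller in mapping and callee in mapping:
            link = (mapping[caller], mapping[callee])
            # Calls inside one element are not architectural links.
            if link[0] != link[1] and link not in idealized_links:
                missing.add(link)
    return unmapped, missing


if __name__ == "__main__":
    idealized = {("UI", "Logic"), ("Logic", "Storage")}
    mapping = {"LoginForm": "UI", "AuthService": "Logic", "UserTable": "Storage"}
    calls = [
        ("LoginForm", "AuthService"),
        ("AuthService", "UserTable"),
        ("LoginForm", "UserTable"),   # bypasses the Logic element
        ("AuthService", "Logger"),    # Logger has not been mapped yet
    ]
    print(refine(idealized, mapping, calls))
```

The unmapped components are candidates for the remaining mapping work in step 6, and the missing links are the inconsistencies that the idealized model from step 2 failed to anticipate.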
17. System evolution
1. Propose an idealized architecture evolution plan
• A high-level plan, such as moving to a distributed client-server
model
2. Add/modify components
• Identify which components need modification and which new
components must be added.
3. Update component interactions
• Modify the existing component interactions as well; adding a new
feature may change earlier interactions.
4. Generate the evolved architecture
• Integrate the changes made in the previous steps with the
recovered original architecture.
• The resulting architecture serves as the basis for all design issues
related to the new modifications to the system.
5. Set the new focus
• The components affected by the previous iteration become the new
focus, so that they can be refined further.
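Step 5 can be sketched as follows. The slides do not define "affected" precisely, so this sketch assumes that a component is affected if it was changed or interacts directly with a changed component. The function name `next_focus` and the example component names are hypothetical.

```python
def next_focus(interactions, changed):
    """Return the changed components plus their direct interaction partners.

    interactions: iterable of (component, component) pairs.
    changed: set of components added or modified in this iteration.
    """
    focus = set(changed)
    for a, b in interactions:
        if a in changed or b in changed:
            focus.update((a, b))
    return focus


if __name__ == "__main__":
    interactions = [("UI", "Auth"), ("Auth", "Db"), ("Report", "Db")]
    print(sorted(next_focus(interactions, {"Auth"})))
```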
19. FOCUS
In summary, FOCUS is an approach to
recovering and evolving the architecture
of moderately sized OO applications.
It reduces the complexity of the recovery
process by letting engineers focus on a
particular segment of the architecture.
Recovery is not a one-time process: the
architecture must be maintained as the
system evolves.
20. ROMANTIC
The main idea behind this approach is to use
the structural and semantic properties of the
system to obtain a quasi-automatic process for
extracting architecture from OO systems.
Semantic information about the system, such as
architectural patterns and architectural
quality, is used to reduce the need for human
interaction.
The approach is based on two principles.
21. Principle 1 - Extraction of architecture from
object-oriented systems
Define the mapping between object concepts and
architectural elements.
Extract the architecture in terms of classes, shapes,
components and connectors.
A shape is a collection of components, where each
component is composed of several classes that may
belong to different packages.
Each shape is partitioned into two parts: 1) the shape
interface and 2) the center.
The “shape interface” contains all classes (components)
that have links to the outside of the shape; for example,
a class that calls a function defined in a class
(component) outside the shape. The remaining classes
of the shape form the “center”.
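The interface/center partition can be written down directly. This sketch assumes that dependencies are available as (source, target) pairs and that a link in either direction across the shape boundary puts a class in the interface; the slides only give the outgoing-call example, so the incoming case is an assumption. The name `split_shape` and the example classes are hypothetical.

```python
def split_shape(shape, uses):
    """Partition a shape's classes into (interface, center).

    shape: set of class names that make up the shape.
    uses: iterable of (source, target) dependencies, e.g. method calls.
    A class is in the interface if it is linked to a class outside the shape.
    """
    shape = set(shape)
    interface = set()
    for source, target in uses:
        if source in shape and target not in shape:
            interface.add(source)
        elif target in shape and source not in shape:
            interface.add(target)
    return interface, shape - interface


if __name__ == "__main__":
    shape = {"Parser", "Lexer", "Token"}
    uses = [
        ("Parser", "Lexer"),
        ("Lexer", "Token"),
        ("Parser", "Logger"),   # link leaving the shape
        ("Main", "Parser"),     # link entering the shape
    ]
    print(split_shape(shape, uses))
```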
24. Principle 2 - Guides of the architecture
extraction process
An architecture is considered relevant if it satisfies
these four guides.
First, it must be semantically correct, meaning
that it must not convey meaningless information.
Second, its architectural quality must be good.
Third, it should respect, as far as possible, the
recommendations of the architect and the
constraints and specifications given in the system
documentation.
Last, it must be adaptable to the specifics of the
deployment hardware architecture.
25. How to evaluate software architecture
semantics
The semantic sub-characteristics defined are
composability, autonomy and specificity.
These sub-characteristics are refined into
component properties such as cohesion and
coupling, and these properties are linked
to shape properties.
Shape properties are measured mainly by
two parameters: cohesion and coupling.
26. How to evaluate software architecture
semantics
A semantic measurement function is built from
the three sub-characteristics specificity,
composability and autonomy, denoted Spe, C
and Auto respectively.
Its coefficients Ai are set by the engineers.
The function is used in a hierarchical
clustering algorithm to obtain a shape (a
partition of the classes) and, consequently,
a semantically correct architecture.
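The formula itself is not reproduced on the slide. As an illustration only, the sketch below assumes that the function is a weighted average of the three scores, with the engineer-chosen coefficients acting as weights and each score lying in [0, 1]. The name `semantic_score` and the default weights are hypothetical.

```python
def semantic_score(spe, comp, auto, weights=(1.0, 1.0, 1.0)):
    """Weighted mean of specificity, composability and autonomy scores.

    The weighted-average form is an assumption made for illustration;
    the slides only say that the coefficients are set by the engineers.
    """
    w_spe, w_comp, w_auto = weights
    total = w_spe + w_comp + w_auto
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (w_spe * spe + w_comp * comp + w_auto * auto) / total


if __name__ == "__main__":
    # Equal weights, then a weighting that favors specificity.
    print(semantic_score(0.8, 0.5, 0.2))
    print(semantic_score(0.8, 0.5, 0.2, weights=(2.0, 1.0, 1.0)))
```

In a hierarchical clustering loop, a score of this kind would be evaluated on each candidate merge, and the merge with the highest score would be kept.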
28. ROMANTIC
In summary, this approach is based on the
semantic characteristics of components.
These characteristics help to partition
the system's classes and to extract
components from them.
30. Pattern based techniques
Top-down approach.
Relies extensively on human interaction, both to
supply a mental model of the architecture and
to verify the recovery results.
Two phases
◦ Offline phase
Builds call graphs, class diagrams and dependency graphs
◦ Online phase
The user provides architectural patterns in the form of a
query
An A* search algorithm matches the sub-graphs that
satisfy the pattern given by the user.
31. System Representation
The software system is represented as an
entity-relationship source graph Gs = (Ns, Rs).
Nodes of the graph (ni) represent system
entities such as data types, files, classes,
variables, functions and interfaces.
Edges (ej) represent interactions between
system entities, such as a function call
from a class or inheritance.
The graph is decomposed into sub-graphs, each
of which serves as a separate search space.
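A source graph of this kind can be modeled with a few small types. The class and field names below (`Entity`, `Relation`, `SourceGraph`) are hypothetical and chosen only to mirror Gs = (Ns, Rs); the slides do not specify a concrete data structure.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    name: str
    kind: str  # e.g. "class", "function", "file", "variable"


@dataclass(frozen=True)
class Relation:
    source: str
    target: str
    kind: str  # e.g. "calls", "inherits", "uses-type"


@dataclass
class SourceGraph:
    nodes: dict = field(default_factory=dict)  # name -> Entity   (Ns)
    edges: set = field(default_factory=set)    # Relation values  (Rs)

    def add_entity(self, name, kind):
        self.nodes[name] = Entity(name, kind)

    def relate(self, source, target, kind):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added first")
        self.edges.add(Relation(source, target, kind))

    def subgraph(self, names):
        """Return the sub-graph induced by `names`, usable as one search space."""
        names = set(names) & set(self.nodes)
        sub = SourceGraph({n: self.nodes[n] for n in names})
        sub.edges = {e for e in self.edges
                     if e.source in names and e.target in names}
        return sub


if __name__ == "__main__":
    g = SourceGraph()
    g.add_entity("Shape", "class")
    g.add_entity("Circle", "class")
    g.add_entity("draw", "function")
    g.relate("Circle", "Shape", "inherits")
    g.relate("Circle", "draw", "calls")
    print(g.subgraph({"Circle", "Shape"}).edges)
```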
32. Pattern representation
Abstract patterns are expressed in a
query language (AQL).
An AQL query can be represented as a
graph consisting of composite nodes
and links.
Each node is expanded into a pattern region
Gipr, and each link is expanded into a group
of edges Ri m<->pr.
In the ith matching phase, a pattern region is
matched with the source graph Gg(i)sr, and the
edges Ri m<->pr are matched with the
corresponding connector edges Ri m<->sr.
34. Graph matching
Matches the query graph (pattern graph)
Gip with the input graph GiI.
Matching runs in k phases, where k is the
number of different AQL queries.
The results obtained in each phase must
satisfy the constraints specified by the AQL
query.
35. Graph matching
A* search algorithm.
The algorithm generates a search tree that represents the recovery
of module Mi (corresponding to the ith AQL query). The root of the
tree is the match between the main seed ni in the source graph
and the first placeholder (node) ni in Gipr.
The non-leaf nodes at one level of the search tree represent
alternative matches of a placeholder against the source graph;
the leaf nodes represent the solutions obtained from the
matching process.
As in A*, at each step the cost of matching a source node with the
corresponding pattern region is evaluated, and the node with the
least cost is expanded.
Unlike standard A*, the depth of each search tree is bounded by
the number of placeholder nodes in the particular pattern region,
or by the number of modules in the AQL query.
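The search described above can be approximated by a small best-first search. This is a simplified sketch, not the algorithm from the slides: it uses a zero heuristic (so it behaves like uniform-cost search, a special case of A*), it represents the pattern as required edges between placeholder indices, and it takes the cost of a match to be the number of required edges absent from the source graph. The function name `match_pattern` and the example graph are hypothetical.

```python
import heapq


def match_pattern(source_edges, pattern_edges, n_placeholders, seed):
    """Find the cheapest assignment of source nodes to pattern placeholders.

    source_edges: set of (u, v) pairs in the source graph.
    pattern_edges: set of (i, j) placeholder-index pairs the pattern requires.
    seed: source node matched to placeholder 0 (the root of the search tree).
    Cost = number of required pattern edges absent from the source graph.
    Returns (cost, assignment tuple), or None if no full assignment exists.
    """
    nodes = sorted({n for edge in source_edges for n in edge})
    frontier = [(0, (seed,))]
    while frontier:
        cost, assigned = heapq.heappop(frontier)  # expand the least-cost node
        depth = len(assigned)
        if depth == n_placeholders:
            # Leaf: the tree depth is bounded by the number of placeholders.
            return cost, assigned
        for candidate in nodes:
            if candidate in assigned:
                continue
            trial = assigned + (candidate,)
            # Count only the edges that become checkable at this depth.
            extra = sum(
                1
                for i, j in pattern_edges
                if max(i, j) == depth and (trial[i], trial[j]) not in source_edges
            )
            heapq.heappush(frontier, (cost + extra, trial))
    return None


if __name__ == "__main__":
    source = {("main", "parse"), ("parse", "lex"), ("main", "log")}
    pattern = {(0, 1), (1, 2)}  # a chain of three placeholders
    print(match_pattern(source, pattern, 3, "main"))
```

Because step costs are never negative, the first complete assignment taken from the queue has minimal cost. A real implementation would add a heuristic estimate and the AQL constraints to prune alternatives earlier.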
36. CONCLUSION
Legacy systems, architectural erosion and the need for
modularity are a few of the reasons engineers need
architecture recovery.
The report compares different approaches to recovering
software architecture. The comparison is based on the
inputs each approach takes: user cooperation,
knowledge of the previous architecture and
available documentation.
A common characteristic of all these approaches is
that they recover the architecture incrementally.
Both clustering-based and pattern-based techniques
have proven to be efficient in recovering architecture.
37. References/Readings
Lei Ding; Medvidovic, N., "Focus: a light-weight, incremental approach
to software architecture recovery and evolution," Software Architecture,
2001. Proceedings. Working IEEE/IFIP Conference on , vol., no., pp.191-200,
Sartipi, K.; Kontogiannis, K., "A graph pattern matching approach to
software architecture recovery ," Software Maintenance, 2001.
Proceedings. IEEE International Conference on , vol., no., pp.408-419, 2001
Chardigny, S.; Seriai, A.; Oussalah, M.; Tamzalit, D., "Extraction of
Component-Based Architecture from Object-Oriented Systems,"
Software Architecture, 2008.WICSA 2008. Seventh Working IEEE/IFIP
Conference on , vol., no., pp.285-288, 18-21 Feb. 2008
D Pollet, S Ducasse, L Poyet, I Alloui, S Cimpan, … - … European
Conference on Software Maintenance and Reengineering -
people.untyped.org
http://www.sparxsystems.com.au/uml-tutorial.html
Czeranski, J.; Eisenbarth, T.; Kienle, H.; Koschke, R.; Simon, D., "Analyzing
xfig using the Bauhaus tool," Reverse Engineering, 2000. Proceedings.
Seventh Working Conference on , vol., no., pp.197-199, 2000
38. References/Readings
Eixelsberger, W.; Ogris, M.; Gall, H.; Bellay, B., "Software architecture
recovery of a program family," Software Engineering, 1998.
Proceedings of the 1998 International Conference on , vol., no., pp.508-
511, 19-25 Apr 1998
J. M. Bieman and B.-K. Kang Cohesion and reuse in an object-
oriented system. In Proc. of the Symp. on Software reusability,SSR ’95,
pages 259–262, 1995.
R. Agrawal and R. Srikant. Fast algorithms for mining association
rules. In Proceedings of the 20th International Conference on Very Large
Databases, pages 487–499, 1994
ISO/IEC-9126-1. In Software engineering - Product quality - Part 1:
Quality Model. ISO-IEC, 2001.
C. Szyperski. Component Software. ISBN: 0-201-17888-5. Addison-
Wesley, 1998