Extended Abstract of Thesis:
“Workflow Extraction from UML Activity
Diagrams”
Thesis Summary – unaltered
http://vivliothmmy.ee.auth.gr/2262/
In Software Engineering, every system has two aspects: a static and a dynamic one. When investigating the dynamic part of a system, every engineer seeks to automate the extraction of its Workflow in a form processable by both humans and machines, since the Workflow essentially constitutes all the available information about the system in light of its dynamic aspect.
One problem often faced by Software Engineers during the development of a new software project is that they cannot easily find (online) the workflow of similar types of systems. All they can find are .jpg images of UML Activity Diagrams from corresponding software projects, which graphically represent the system's workflow.
As part of this diploma thesis, a potential solution to the above problem is developed: a program named UADxTractor. This software-based tool receives images of Activity Diagrams as input and subjects them to several levels of processing. The first level aims to identify the main entities of each diagram and the relationships between them. The second level involves detecting and storing the text included in each diagram, as well as detecting the direction of the workflow between its entities. In the final level, the proposed system, having already obtained all the information graphically encoded in the input image, stores it in a semantically aware structure (an Ontology) named Workflow_RDF. To verify the proper operation of UADxTractor, several experiments were performed; their results are presented, along with the necessary conclusions, at the end of this paper.
Skoumpakis Spyridon,
Thessaloniki, August 2013
Problem Analysis
The software engineers' inability to find the workflow of similar systems in any form beyond images was the main motive of this dissertation. The solution proposed in this work will hopefully contribute towards the establishment of the Semantic Web.
In the past, the exchange of information between two people (software engineers or not) regarding the Workflow of their respective projects took place, if not hand to hand, through direct contact between the two parties.
At present, progress has been made and any interested party can search the internet for information concerning the Workflow of similar projects, but all they can find are images of UML Activity Diagrams.
The ultimate goal of the thesis is the creation of a practical engineering tool, not only for better deconstruction and understanding of the dynamic part of each system, but also for facilitating the sharing of the information acquired.
The Unified Modeling Language (UML) is a general-purpose modeling language in
the field of software engineering, which is designed to provide a standard way to
visualize the design of a system. Activity Diagrams are a specific category of UML
Diagrams and represent the Dynamic view of a system model. This Dynamic view
emphasizes the dynamic behavior of the system by showing collaborations among
objects and changes to the internal states of objects. Activity Diagrams are graphical
representations of Workflows of stepwise activities and actions. Unsurprisingly, we
focus on the study of this particular category of UML diagrams.
Figure 1 – Definition of Problem and Goal
Figure 2 – Chain of causality
Approach
It is by now more than clear that any proposed solution will have to deal with Image Processing. The only data every engineer has concerning the stated problem are images (.jpg being the most common format). Consequently, these images will be the inputs to our tool.
After analyzing the problem and identifying the input, the very next subject we have to tackle is the development environment. In order to decide which programming tools and languages to use, we need to clarify our field of interest. The field that includes methods for acquiring, processing, analyzing and understanding images (in general, high-dimensional data from the real world) in order to produce numerical or symbolic information is Computer Vision.
Once the field is identified, we can proceed to choose the programming tools. After an in-depth search of the existing alternatives, we chose the OpenCV library and its Java interface. The two main reasons justifying our choice are i) the extensive capabilities of the library in question combined with the platform independence of Java, and ii) the compatibility of the API with an existing Java tool for use case diagrams. The code was developed in the Eclipse IDE, the author's personal preference.
In order to verify the proper functioning of our tool, we had to create instances of Activity Diagrams. Some of them were common, while others were less common and could not be found online. We used two well-known modeling tools (StarUML and ArgoUML) to draw these instances and export them in .jpg format. Therefore, some input images came from the Internet and some from the two modeling tools.
The main objective of our software-based tool is the decomposition of Activity Diagrams into a form that can be processed by both humans and machines. The steps of the decomposition are presented in detail later on.
Another main concern of our proposed solution is making it easy to share and reuse all the information acquired from the decomposition. In this context, the last important choice we had to make in our stepwise approach was the type of storage construct. There were many alternatives, but for reasoning (semantic awareness) and compatibility purposes an Ontology prevailed. The platform used to create, edit and view Ontologies was Protégé OWL, and the API responsible for transferring the decomposed data to the storage construct was Jena.
A general scheme of the solution along with the exact version of each used program
is depicted in Figure 3.
Architecture & Outline of System
This section presents a brief, step-by-step outline of the system. Our software-based tool was named UADxTractor, from "Uml Activity Diagram eXtractor", which reflects its function. The system receives images as input and subjects them to various levels of processing. These levels are shown in Figure 4.
Figure 3 – Tools and Languages used
Figure 4 – Levels of Processing
1. Image pre-processing:
This level takes place once the user has located the input image, and it precedes the main code. The OpenCV functions used in our project (via the library's Java interface, known as JavaCV) impose some restrictions on the format of the input. The ultimate goal of our tool is the decomposition of any diagram, and to accomplish that we mainly use OpenCV shape detection functions. These functions have strict specifications regarding the images they can process. For example, most of the images that can be found online are colored (RGB), with three channels, whereas one of the most common OpenCV requirements is that the image be grayscale and single-channel. The role of this level is to transform the input image into the proper format for the OpenCV functions of the next levels.
The procedure of pre-processing is depicted in Figure 5.
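The grayscale conversion at the heart of this level can be sketched in plain Java. This is a minimal illustration of the idea only, not the thesis code (which relies on OpenCV's own conversion functions); the class and method names are illustrative.

```java
/** Minimal sketch of RGB-to-grayscale conversion, as performed conceptually
 *  by the pre-processing level (the tool itself uses OpenCV for this). */
public class GrayscaleSketch {

    /** Convert one RGB pixel to a single luminance value (0-255),
     *  using the standard ITU-R BT.601 weights. */
    public static int toGray(int r, int g, int b) {
        return (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
    }

    /** Convert a whole 3-channel image (h x w x 3) to single-channel. */
    public static int[][] toGray(int[][][] rgb) {
        int h = rgb.length, w = rgb[0].length;
        int[][] gray = new int[h][w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                gray[y][x] = toGray(rgb[y][x][0], rgb[y][x][1], rgb[y][x][2]);
        return gray;
    }

    public static void main(String[] args) {
        // White stays white, black stays black.
        System.out.println(toGray(255, 255, 255)); // 255
        System.out.println(toGray(0, 0, 0));       // 0
    }
}
```

The single-channel result is exactly the shape the detection functions of the next levels expect.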
2. Shape & Entities Detection:
In UML, every shape in a diagram means something. Consequently, after the preparation of the input image, the next task is to detect every meaningful object depicted in it. OpenCV has ready-to-use functions for some of the known (geometric) shapes, but there are also many objects that cannot be detected by the simple use of these functions. As a solution to this problem, we combined these functions with some customized algorithms (created ad hoc). The visualization of the (numeric) outcome of each detection function is performed using the drawing functions of the library. The basic entities in the context of Activity Diagrams are:
Figure 5 – Level of pre-processing explained
Initial Node (& Final Node)
After numerous experiments we concluded that the Contour and Circle Detection functions of OpenCV fail to systematically detect this entity, so we applied the Template Matching algorithm. Template Matching is a ready-to-use method of the library which searches the input image using template images from a user-defined pool as references. Template images are usually small icons depicting entities. If the algorithm finds, in the input image, an entity similar (nearly identical) to the one depicted in the template image, it returns its coordinates (and a colored rectangle is drawn around it).
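The idea behind Template Matching can be sketched as follows: slide the template over the image, score each position, and keep the best match. This plain-Java sketch uses the sum of squared differences as the score (one of the several scoring variants that OpenCV's matching function supports); the arrays and names here are illustrative, not the thesis code.

```java
/** Illustrative sketch of template matching on grayscale images stored
 *  as int[row][col]: returns {bestY, bestX} of the window whose sum of
 *  squared differences (SSD) against the template is smallest. */
public class TemplateMatchSketch {

    public static int[] match(int[][] image, int[][] tmpl) {
        int ih = image.length, iw = image[0].length;
        int th = tmpl.length, tw = tmpl[0].length;
        long bestScore = Long.MAX_VALUE;
        int bestY = -1, bestX = -1;
        for (int y = 0; y + th <= ih; y++) {
            for (int x = 0; x + tw <= iw; x++) {
                long ssd = 0;
                for (int ty = 0; ty < th; ty++)
                    for (int tx = 0; tx < tw; tx++) {
                        long d = image[y + ty][x + tx] - tmpl[ty][tx];
                        ssd += d * d;   // penalize every pixel mismatch
                    }
                if (ssd < bestScore) { bestScore = ssd; bestY = y; bestX = x; }
            }
        }
        return new int[] { bestY, bestX };
    }

    public static void main(String[] args) {
        int[][] image = {
            { 0, 0, 0, 0 },
            { 0, 9, 9, 0 },
            { 0, 9, 9, 0 },
            { 0, 0, 0, 0 } };
        int[][] tmpl = { { 9, 9 }, { 9, 9 } };   // a tiny "Initial Node" icon
        int[] pos = match(image, tmpl);
        System.out.println(pos[0] + "," + pos[1]); // 1,1
    }
}
```

A real image would of course require a similarity threshold as well, so that a poor "best" match is rejected instead of reported.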
Line/Transition
The largest and perhaps most important part of the Entities Detection level is Line Detection. Its importance lies in its complexity, but also in the fact that practically all other entities of an Activity Diagram are related through lines, and it is through them that the essence of each diagram (i.e., the workflow) is vividly depicted.
In terms of programming, we used the cvHoughLines2() function for the detection of all lines in an Activity Diagram. This ready-to-use OpenCV function is based on the Progressive Probabilistic Hough Transform. A usage example is shown below.
CvSeq linesSeq = cvHoughLines2(
    img_canny,              // (1) image: 8-bit, single-channel binary input; the probabilistic method modifies it
    store,                  // (2) lineStorage: the storage for the detected lines
    CV_HOUGH_PROBABILISTIC, // (3) method: the Hough transform variant
    1,                      // (4) rho: distance resolution in pixel-related units
    Math.PI / 360,          // (5) theta: angle resolution of the accumulator, in radians
    40,                     // (6) threshold: only lines gathering more than this many accumulator votes are returned
    40,                     // (7) param1: the minimum line length
    7);                     // (8) param2: the maximum gap between segments lying on the same line for them to be joined into one
No matter how we tune the arguments of the aforementioned method, the results will not be perfect. On the contrary, many imperfections appeared during the testing period, such as the failure to detect some lines, the double detection of others, oblique lines detected as two or more separate lines, lines incorrectly detected within parts of other entities, and many other problems. In order to address all of them, we created some customized checks, using mostly arithmetic boundaries and conditionals.
Action
For the detection of Actions we used the cvFindContours() method combined with the cvBoundingRect() function. The first finds all the contours of the input image within some specified boundaries, and the latter bounds each detected contour by creating a rectangle around it. We then used the cvRectangle() method to draw and visualize the result. Corrective customized checks are needed again, mostly because many objects in the image can be perceived as contours by the primary algorithm.
CvMemStorage mem = CvMemStorage.create();
CvSeq contours = new CvSeq();

// Find all contours of the (binary) input image.
cvFindContours(img_canny, mem, contours, Loader.sizeof(CvContour.class),
               CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE, cvPoint(0, 0));

// Bound the detected contour with a rectangle and draw it for visualization.
CvRect boundbox = cvBoundingRect(contours, 0);
cvRectangle(out_image,
            cvPoint(boundbox.x(), boundbox.y()),
            cvPoint(boundbox.x() + boundbox.width(),
                    boundbox.y() + boundbox.height()),
            CvScalar.MAGENTA, 1, 0, 0);
Diamond (Decision & Merge Node)
The same methodology is applied here, except for the visualization part. To draw the diamond we used four cvLine() calls, because OpenCV lacks a diamond-drawing method. Since Decision and Merge nodes are graphically identical, we detect them as a single entity and subsequently make an explicit conceptual separation between them.
Synchronization Box (Fork & Join Node)
We constructed two equivalent solutions for the detection of Synchronization Boxes. The first uses the Multiple Template Matching method already described, and the second uses the line detection algorithm with altered arguments, taking advantage of the graphical similarity between the two entities (a SynchBox is essentially a very small rectangle, i.e. four lines).
A successful detection example is shown in Figure 7. The pink lines visualize the customized check which tackles the problem of broken lines: the two long lines, because they contain a break point, are detected as two or more separate segments by the cvHoughLines2() method, and we have to merge them into one.
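The broken-line check just described can be sketched as: two roughly collinear segments whose ends are within a small gap are merged into one longer segment. The segment representation and the thresholds below are illustrative (the thesis tunes such bounds experimentally); this is the horizontal case only.

```java
/** Sketch of the "broken line" check: two roughly collinear horizontal
 *  segments whose ends are close are merged into one longer segment. */
public class LineMergeSketch {

    static final int GAP = 10;    // max pixel gap between segment ends (illustrative)
    static final int SLACK = 3;   // max vertical offset to call them collinear

    /** Merge two horizontal segments {x1, y, x2, y} if they are collinear
     *  and close; returns the merged segment, or null if they cannot merge. */
    public static int[] mergeHorizontal(int[] a, int[] b) {
        if (Math.abs(a[1] - b[1]) > SLACK) return null;        // different rows
        int aLeft = Math.min(a[0], a[2]), aRight = Math.max(a[0], a[2]);
        int bLeft = Math.min(b[0], b[2]), bRight = Math.max(b[0], b[2]);
        // Reject if the horizontal gap between the segments is too large.
        if (bLeft - aRight > GAP || aLeft - bRight > GAP) return null;
        int y = (a[1] + b[1]) / 2;
        return new int[] { Math.min(aLeft, bLeft), y, Math.max(aRight, bRight), y };
    }

    public static void main(String[] args) {
        int[] merged = mergeHorizontal(new int[] { 10, 50, 60, 50 },
                                       new int[] { 65, 51, 120, 51 });
        // A 5 px gap and a 1 px offset: the two segments are joined into one.
        System.out.println(java.util.Arrays.toString(merged)); // [10, 50, 120, 50]
    }
}
```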
3. Associations/Relationships of Entities Recognition:
Having detected all entities successfully, we move on to the next level of processing, which is the recognition of all the relationships between them. We examine the associations between two entities at a time. According to the rules of UML and the assumptions we have made, there are eight cases:
1) Initial Node - Action
2) Final Node - Action
3) Final Node - Diamond
4) Final Node - Synchronization
5) Diamond - Synchronization
6) Action - Synchronization
7) Action - Diamond
8) Action – Action
Essentially, any relationship consists of three entities, if we take into account the line joining the two basic ones. The algorithm we created is quite similar to the customized checks mentioned before. We search a rectangular area around each entity for line ends. If a line end is detected next to an entity, we then search similarly around all the other entities for the second end of the line in question. Once the second line end is detected, we have identified a relationship, which we initially store in a table and later in a vector (the coordinates of the two entities and the line). Simultaneously, we perform some checks to reject duplicate records.
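The search just described can be sketched as: for each detected line, look for an entity whose bounding box lies within a small window around each line end; when both ends land next to distinct entities, record a relationship. The window size, data shapes and names below are illustrative, not the thesis code.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the relationship-recognition level: a relationship is recorded
 *  when the two ends of a line each fall near a distinct entity's box. */
public class RelationSketch {

    static final int WIN = 15; // search margin around an entity box (illustrative)

    /** Is point (px, py) inside box {x, y, w, h} grown by WIN on each side? */
    static boolean near(int px, int py, int[] box) {
        return px >= box[0] - WIN && px <= box[0] + box[2] + WIN
            && py >= box[1] - WIN && py <= box[1] + box[3] + WIN;
    }

    /** For each line {x1,y1,x2,y2}, return "i-j" for the entity indexes it joins. */
    public static List<String> findRelations(int[][] boxes, int[][] lines) {
        List<String> rels = new ArrayList<>();
        for (int[] ln : lines) {
            int from = -1, to = -1;
            for (int i = 0; i < boxes.length; i++) {
                if (near(ln[0], ln[1], boxes[i])) from = i;
                if (near(ln[2], ln[3], boxes[i])) to = i;
            }
            if (from >= 0 && to >= 0 && from != to)
                rels.add(from + "-" + to);   // duplicate-rejection checks omitted
        }
        return rels;
    }

    public static void main(String[] args) {
        int[][] boxes = { { 0, 0, 40, 20 },     // entity 0 (e.g. an Action)
                          { 0, 100, 40, 20 } }; // entity 1
        int[][] lines = { { 20, 25, 20, 95 } }; // a line running between them
        System.out.println(findRelations(boxes, lines)); // [0-1]
    }
}
```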
4. Transitions/Lines Direction Detection:
Every line in an Activity Diagram is part of an association between two entities and has a specific direction towards one of them. The detection of this direction is the next processing level, and it is very important because it carries the information about the workflow depicted in each diagram.
This level was the most demanding task of the Thesis. Many solutions can be devised, but the complexity of the problem increases exponentially with the number of entities and the rules governing their relationships. After many tests, we created two solutions: a primary and a complementary one.
Our goal was to find a solution as general as possible. This solution has to obey the following rules, which are derived from UML theory and from assumptions made by the author due to the limitations of the Thesis.
The Initial Node is connected via only one line, and only to an action/activity. This line is necessarily outgoing from the Initial Node.
The Final Node can be connected to an unlimited number of entities, but in all cases the direction of the lines will be incoming to the Final Node.
Every Action can have only one outgoing line and unlimited incoming.
Every Diamond is connected to at least three lines.
o 1 ingoing, 2 outgoing = Decision Node
o 2 ingoing, 1 outgoing = Merge Node
Every Synchronization Box is also connected with at least three lines.
o 1 ingoing, 2 outgoing = Fork Node
o 2 ingoing, 1 outgoing = Join Node
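The last two rules amount to a simple classification by the counts of ingoing and outgoing lines. A minimal sketch of that classification (the class and method names are illustrative, not the thesis code):

```java
/** Sketch of the conceptual separation implied by the rules above:
 *  a node with 1 ingoing and 2+ outgoing lines splits the flow,
 *  while one with 2+ ingoing and 1 outgoing joins it. */
public class NodeClassifierSketch {

    /** Classify a Diamond by its line directions. */
    public static String classifyDiamond(int ingoing, int outgoing) {
        if (ingoing == 1 && outgoing >= 2) return "DecisionNode";
        if (ingoing >= 2 && outgoing == 1) return "MergeNode";
        return "Unknown"; // violates the rules assumed by the thesis
    }

    /** Classify a Synchronization Box the same way. */
    public static String classifySynchBox(int ingoing, int outgoing) {
        if (ingoing == 1 && outgoing >= 2) return "ForkNode";
        if (ingoing >= 2 && outgoing == 1) return "JoinNode";
        return "Unknown";
    }

    public static void main(String[] args) {
        System.out.println(classifyDiamond(1, 2));   // DecisionNode
        System.out.println(classifySynchBox(2, 1));  // JoinNode
    }
}
```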
A very useful fact derived from these rules is that the first four associations are unidirectional, so we know the direction of the arrow a priori. The other four are bidirectional, so we need to determine the direction of the arrow in each case.
Our primary solution is an algorithm that searches the neighbors of the two entities comprising each relationship. In most cases a one-step neighbor search (a search of the neighbors immediately adjacent to each of the two entities of the association) is sufficient to extract all the information needed to determine the right direction.
During our (internet) research on candidate images that could be input to the tool
in question some design patterns began to emerge.
The main pattern was about the development of the diagram. The vast majority of
the depicted diagrams were following a top-down (vertical) development. Some other
patterns that we came across were:
The first entity of the diagram is the Initial Node, which is also located higher than any other entity. (The (0,0) point for OpenCV is the upper left corner of every image.)
The last and lowest entity of each diagram is the Final Node.
When a line is far from horizontal (|y1 - y2| much greater than |x1 - x2|), the arrow points to the entity located lower in the diagram (the one with the greater y). We coined the term "Verticality" for these cases.
All of the above are closely related to the top-down development pattern. Taking advantage of the frequency with which these patterns occur, we simplified our algorithm.
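The "Verticality" pattern can be sketched as a small heuristic: when a segment is much taller than it is wide, point the arrow at its lower end (the one with the greater y, since the image origin is the upper left corner). The ratio threshold below is illustrative, not a value from the thesis.

```java
/** Sketch of the "Verticality" heuristic: for clearly non-horizontal lines
 *  in a top-down diagram, the arrow is assumed to point to the lower end. */
public class VerticalitySketch {

    static final int RATIO = 3; // "far from horizontal": height at least 3x width (illustrative)

    /** Returns {x, y} of the end the arrow is assumed to point at,
     *  or null when the line is not vertical enough to decide. */
    public static int[] arrowEnd(int x1, int y1, int x2, int y2) {
        int dx = Math.abs(x1 - x2), dy = Math.abs(y1 - y2);
        if (dy < RATIO * Math.max(dx, 1)) return null;   // not "vertical" enough
        // Greater y means lower in the image: (0,0) is the upper left corner.
        return (y1 > y2) ? new int[] { x1, y1 } : new int[] { x2, y2 };
    }

    public static void main(String[] args) {
        int[] end = arrowEnd(100, 20, 102, 180);
        System.out.println(end[0] + "," + end[1]); // 102,180 (the lower end)
    }
}
```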
Our complementary solution is less reliable and is applied rarely. In the context of this new algorithm, in order to determine which end of each line carries the direction arrow, we examine a rectangular/square area around each end of each relationship for very small lines (using the Line Detection Algorithm). Based on how many small lines we find next to each end, we infer the direction of the arrow: whichever end is found to have a greater number of small lines around it is the one with the arrowhead, and thus defines the direction.
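The complementary check reduces to counting short segments near each line end and comparing the two counts. A hedged sketch, with illustrative window and length thresholds (not values from the thesis):

```java
/** Sketch of the complementary direction check: the end of a transition
 *  with more tiny segments detected around it is taken to carry the arrowhead. */
public class ArrowheadSketch {

    static final int WIN = 12;   // window half-size around a line end (illustrative)
    static final int SMALL = 15; // max length of an "arrowhead" segment (illustrative)

    /** Count detected segments {x1,y1,x2,y2} that are short and lie
     *  entirely within WIN pixels of point (px, py). */
    public static int smallLinesNear(int px, int py, int[][] segments) {
        int count = 0;
        for (int[] s : segments) {
            int len = Math.abs(s[0] - s[2]) + Math.abs(s[1] - s[3]); // cheap length proxy
            boolean inWin = Math.abs(s[0] - px) <= WIN && Math.abs(s[1] - py) <= WIN
                         && Math.abs(s[2] - px) <= WIN && Math.abs(s[3] - py) <= WIN;
            if (len <= SMALL && inWin) count++;
        }
        return count;
    }

    /** The end with more small segments around it is assumed to hold the arrow:
     *  returns 1 for the first end, 2 for the second, 0 when undecidable. */
    public static int arrowEnd(int[] line, int[][] segments) {
        int n1 = smallLinesNear(line[0], line[1], segments);
        int n2 = smallLinesNear(line[2], line[3], segments);
        return (n1 == n2) ? 0 : (n1 > n2 ? 1 : 2);
    }

    public static void main(String[] args) {
        int[] line = { 50, 10, 50, 100 };
        int[][] segs = { { 45, 95, 50, 100 }, { 55, 95, 50, 100 } }; // arrowhead strokes
        System.out.println(arrowEnd(line, segs)); // 2, the arrow is at (50,100)
    }
}
```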
The result of the two direction detection algorithms is visualized by drawing colored dots at the correct end of each line. This new image is given to the user as output. An example is shown in Figure 8 (with the unidirectional transitions omitted). The black dots come from the primary algorithm and the pink ones from the complementary.
5. OCR Emulation:
Every Activity Diagram contains text, and this text is crucial for the complete deconstruction of the diagram. After an in-depth internet search for an open-source OCR (optical character recognition) API, we found only one suitable for our problem: Asprise OCR. This tool is available to the public as a free trial version. We tested it on many different diagrams, but its performance was disappointing; the results were not worth showcasing.
The diagrams under study contain characters in only two places: the names of actions/activities and the guard conditions of decision nodes. This is all the textual information we want to extract from the image and bring into our program. We therefore decided to construct an alternative way of retrieving it. This new path involves user intervention: we created two classes which guide the user in entering the required information from the keyboard. Interactivity is achieved through messages on the (Eclipse) console, combined with explanatory colored images.
Figure 8 – Direction Detection Algorithms combined
Sorting Entities
This is a transitional, intermediate level that allows the complete extraction of the information depicted in the input image via the classification of entities.
Most of the information available after extraction from the input image consists of (entity) coordinates. Coordinates are a kind of data that cannot, and should not, be stored as such. Apart from the actions and the guard conditions, we had to find a way to give the remaining entities some distinguishing characteristics, in order to ultimately enable the complete storage of every Activity Diagram.
The devised method was the serial naming of entities, combining their names with natural numbers (e.g. Fork1, Fork2, etc.). To accomplish that in a consistent way, the entities had to be sorted. The OpenCV detection algorithms automatically sort all entities (by their upper left point) except lines, so all we had to do was sort the lines "manually".
At this level of processing we also performed the conceptual separation of Diamonds into Decision and Merge Nodes, and of Synchronization Boxes into Fork and Join Nodes.
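The manual sorting and serial naming of lines can be sketched as: order the lines by their topmost (then leftmost) point and append an increasing index to the entity name. The comparator and naming scheme below illustrate the approach; they are not the thesis code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of the "Sorting Entities" level: lines {x1,y1,x2,y2} are sorted
 *  by their topmost (then leftmost) point and named serially, mirroring
 *  the order OpenCV already gives the other entity types. */
public class EntityNamingSketch {

    public static List<String> nameLines(int[][] lines) {
        int[][] sorted = lines.clone();
        Arrays.sort(sorted, (a, b) -> {
            int ay = Math.min(a[1], a[3]), by = Math.min(b[1], b[3]);
            if (ay != by) return Integer.compare(ay, by);       // higher in the image first
            return Integer.compare(Math.min(a[0], a[2]), Math.min(b[0], b[2]));
        });
        List<String> named = new ArrayList<>();
        for (int i = 0; i < sorted.length; i++)
            named.add("Line" + (i + 1) + "@(" + Math.min(sorted[i][0], sorted[i][2])
                      + "," + Math.min(sorted[i][1], sorted[i][3]) + ")");
        return named;
    }

    public static void main(String[] args) {
        int[][] lines = { { 10, 200, 10, 260 }, { 40, 30, 40, 90 } };
        // The line starting at y=30 is higher in the image, so it becomes Line1.
        System.out.println(nameLines(lines)); // [Line1@(40,30), Line2@(10,200)]
    }
}
```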
6. Connection with Ontology
“Ontology is a formal, explicit and detailed definition of a domain” (general),
“Ontology is a common model for representing the basic concepts of Activity
Diagrams” (specific)
For the development of the Ontology we used the tool Protégé OWL. Our methodology is based on design rules found in the literature and on the OWL-S model Ontology. The concepts ultimately selected to form the semantic model Workflow_RDF are depicted in Figure 9.
The last step before inserting the aggregate data into the Ontology is a grouping stage, whose purpose is the complete conceptual reconstruction of the diagram and the representation of the Workflow.
We developed an algorithm that traverses all the existing relationships of every diagram, taking the direction of the arrows into account, in order to record the flow of information.
In the context of our Thesis, eight relationships between entities are recognized. Four of them are unidirectional and the other four bidirectional. In addition, three of them (all the relationships of Decision Nodes) may contain binary conditions known as Guard Conditions. In short, we created eight functions for one direction of all relationships, plus another three for the other direction of the bidirectional ones (Action-Action can be used for both directions), plus another three for those with Guard Conditions. In total, we created fourteen functions, separating the original information into groups of two entities with a specified flow direction. These groups are called Sub-Activities, as they constitute parts of the overall Activity which each Diagram represents.
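The grouping into Sub-Activities can be sketched as formatting each directed relationship (plus an optional Guard Condition) into a "source->line->target" record, which is the shape of the console output shown in the Implementation section. The method below is an illustrative sketch, not one of the fourteen thesis functions.

```java
/** Sketch of the Sub-Activity grouping: each directed relationship becomes
 *  a small record "source->LineN[->guard]->target", as in the tool's output. */
public class SubActivitySketch {

    /** Build one Sub-Activity string; guard may be null for relationships
     *  that carry no Guard Condition. */
    public static String subActivity(String source, String line,
                                     String guard, String target) {
        return (guard == null)
            ? source + "->" + line + "->" + target
            : source + "->" + line + "->" + guard + "->" + target;
    }

    public static void main(String[] args) {
        System.out.println(subActivity("InitialNode", "Line1", null,
                                       "Calculate_numbers_of_knocked_out_pins"));
        System.out.println(subActivity("Decision0", "Line4", "Strike",
                                       "Show_Animation"));
        // InitialNode->Line1->Calculate_numbers_of_knocked_out_pins
        // Decision0->Line4->Strike->Show_Animation
    }
}
```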
To connect our main program with the constructed Ontology and transfer all the available information to it, we used an open-source Java framework called the Jena API.
Implementation
It is now time to verify the proper operation of UADxTractor through experiments. Every user of this tool will probably need to change only a limited number of variables: the name of the input image and the arguments of the detection algorithms. The values of those variables are defined externally in a text (.txt) file, so the user does not need to come into contact with the main code.
Figure 9 - Hierarchical Structure of Ontology
We use a simple yet representative diagram, found online, for our experiments. This diagram depicts the activity of displaying a bowling score.
The pre-processing level comes next, and its results are shown in Figure 11.
Figure 10 - Display Bowling Score Diagram
Figure 11 – Example of pre-processing Level
Now the image is ready to be used by the OpenCV functions. The detection of the entities and their relationships follows. In the big diagram on the left we can see the overall results of the detection algorithms, while the four diagrams on the right depict the detection of the relationships between the entities.
Throughout the whole procedure, explanatory messages are printed on the Eclipse console. Examples of these messages are shown in Figure 13.
Figure 12 – Detection of Entities and Relationships
Figure 13 – Samples of Console Messages
The most complex level of processing is direction detection. The outcome of our algorithm is presented in Figure 14.
The last step of our proposed solution is the connection of the main tool with the Ontology, the storage construct that we chose. There are two ways to verify the unimpeded transfer of all the available information: the first is through console messages and the second is through Protégé. Both are shown below.
Method of Finding Sub - Activities has started...
InitialNode->Line1->Calculate_numbers_of_knocked_out_pins
Do you agree with this activity? Press y(Yes) or n(No)
y
So, the [Sub - Activities] of 2 entities are:
---------------------------------------------
Sub-Act[0]:InitialNode->Line1->Calculate_numbers_of_knocked_out_pins
Sub-Act[1]:Display_Score->Line7->FinalNode
Sub-Act[2]:Decision0->Line6->Non_Strike->Display_Score
Sub-Act[3]:Decision0->Line4->Strike->Show_Animation
Sub-Act[4]:Calculate_the_score->Line3->Decision0
Sub-Act[5]:Show_Animation->Line5->Display_Score
Sub-Act[6]:Calculate_numbers_of_knocked_out_pins->Line2->Calculate_the_score
--------------------
Number of Sub - Activities = 7
THE END!
Figure 14 – Example of the Direction Detection Algorithm
Conclusions
Through this work I discovered some of my strengths and put myself to the test to see how I deal with adversity. The lessons I took from this process are invaluable. My supervisor, Dr. Andreas Symeonidis, stood by me and shared his insight whenever it was needed. We kept in close contact, dividing my work into deliverables and discussing every issue.
One of the main goals of the current work is to continue the effort started with the Diploma Thesis of Anastasia Mourka (on Use Case Diagrams), automating the conceptual reconstruction of every UML diagram and thus preparing the ground for the future creation of a semantic (UML) search engine. Anastasia was an excellent advisor at the beginning of this effort, providing valuable guidance as a former student of the same department.
This implementation was based on the most common design patterns, but due to the vast number of existing alternative models it was impossible to cover all eventualities. For example, the analysis of images depicting Activity Diagrams with horizontal development exhibits some deficiencies. Nevertheless, the overall result, in the context of analysis, is quite satisfactory, as the vertical development of these diagrams is the usual practice.
The images used in the experiments were chosen to cover the widest possible range of the diagrams available online. They are of average complexity, and most of them came from the well-known tools StarUML and ArgoUML.
In conclusion, the main contribution of UADxTractor, in addition to saving time for Software Engineers, is that it provides direct access to the Workflow of similar types of systems. This is something that can be useful both for humans (engineers) and for other programs.
Figure 15 – Display Bowling Score Instance in Protégé