Extended Abstract of Thesis:
“Workflow Extraction from UML Activity
Diagrams”
Thesis Summary – unaltered
http://vivliothmmy.ee.auth.gr/2262/
In Software Engineering every system has two aspects, a static and a dynamic one. When
investigating the dynamic part of a system, engineers seek to automate the extraction of its
Workflow in a form that is processable by both humans and machines, since the Workflow
essentially constitutes all the available information about the system's dynamic aspect.
One problem Software Engineers often face during the development of a new software
project is that they cannot easily find (online) the workflow of similar types of systems. All
they can find are .jpg images of UML Activity Diagrams from corresponding software
projects, which graphically represent each system's workflow.
As part of this diploma thesis a potential solution to the above problem is developed: a
program named UADxTractor. This software-based tool receives images of Activity
Diagrams as input and subjects them to several levels of processing. The first level enables
the identification of the main entities of each diagram and the relationships between them.
The second level involves the detection and storage of the text included in each diagram, as
well as the detection of the workflow's direction between its entities. In the final level, the
proposed system, having obtained all the information contained (graphically) in the input
image, stores it in a semantically aware structure (an Ontology) named Workflow_RDF. To
verify the proper operation of UADxTractor several experiments were performed; their
results are presented, along with the necessary conclusions, at the end of this paper.
Skoumpakis Spyridon,
Thessaloniki, August 2013
Problem Analysis
The software engineers' inability to find the workflow of similar systems in any
form beyond images was the main motive of this dissertation. The solution proposed
in this work will hopefully contribute towards the establishment of the Semantic
Web.
In the past, the exchange of information between two people (software engineers
or not) regarding the Workflow of their respective projects took place, if not hand
to hand, through direct contact between the two parties.
At present, progress has been made: any interested party can search the internet for
information concerning the Workflow of similar projects, but all they can find are
images of UML Activity Diagrams.
The ultimate goal of the thesis is the creation of a practical engineering tool, not
only for a better deconstruction and understanding of the dynamic part of each system,
but also for facilitating the sharing of the acquired information.
The Unified Modeling Language (UML) is a general-purpose modeling language in
the field of software engineering, which is designed to provide a standard way to
visualize the design of a system. Activity Diagrams are a specific category of UML
Diagrams and represent the Dynamic view of a system model. This Dynamic view
emphasizes the dynamic behavior of the system by showing collaborations among
objects and changes to the internal states of objects. Activity Diagrams are graphical
representations of Workflows of stepwise activities and actions. Unsurprisingly, we
focus on the study of this particular category of UML diagrams.
Figure 1 – Definition of Problem and Goal
Figure 2 – Chain of causality
Approach
It is by now clear that any proposed solution will have to deal with Image
Processing. The only data every engineer has concerning the stated problem are
images (.jpg being the most common format); consequently, those images will be
the inputs to our tool.
After analyzing the problem and determining the input, the next subject we have
to tackle is the development environment. In order to decide which programming
tools and languages to use, we need to clarify our field of interest. The field that
includes methods for acquiring, processing, analyzing and understanding images (in
general, high-dimensional data from the real world) in order to produce numerical or
symbolic information is Computer Vision.
Once the field was identified, we could proceed to choosing the programming
tools. After an in-depth search of the existing alternatives we chose the OpenCV
library and its Java interface. The two main reasons justifying our choice are i) the
extensive capabilities of the library combined with the platform independence of
Java, and ii) the compatibility of the API with an existing Java tool for Use Case
Diagrams. The code was developed in the Eclipse IDE, the author's personal
preference.
In order to verify the proper function of our tool we had to create instances of
Activity Diagrams. Some of them were common, while others were less common and
could not be found online. We used two well-known modeling tools (StarUML and
ArgoUML) to draw these instances and export them in .jpg format. Therefore, some
input images came from the Internet and some from the two modeling tools.
The main objective of our software-based tool is the decomposition of Activity
Diagrams into a form that can be processed by both humans and machines. The steps
of the decomposition are presented in detail later on.
Another main concern of our proposed solution is the easy sharing and reuse of
all the information acquired from the decomposition. In this context, the last
important choice we had to make in our stepwise approach was the type of storage
construct. There were many alternatives, but for reasoning (semantic awareness) and
compatibility purposes an Ontology prevailed. The platform used to create, edit and
view Ontologies was Protégé OWL, and the API responsible for transferring the
decomposed data to the storage construct was Jena.
A general scheme of the solution along with the exact version of each used program
is depicted in Figure 3.
Architecture & Outline of System
This section presents a brief outline of the system, step by step. Our software-
based tool was named UADxTractor, from the phrase Uml Activity Diagram eXtractor,
which reflects its function. The system receives images as input and subjects them to
several levels of processing, shown in Figure 4.
Figure 3 – Tools and Languages used
Figure 4 – Levels of Processing
1. Image pre-processing:
This level takes place once the user has located the input image, and precedes the
main code. The functions of the OpenCV library (via its Java interface, known as
JavaCV) used in our project impose some restrictions on the format of the input. The
ultimate goal of our tool is the decomposition of any diagram; to accomplish that we
mainly use OpenCV shape-detection functions. These functions have some strict
specifications about the images they can process. For example, most images found
online are colored (RGB) with three channels, whereas one of the most common
OpenCV requirements is that the image be greyscale and single-channel. The role of
this level is to transform the input image into the proper format for the OpenCV
functions of the next levels.
The procedure of pre-processing is depicted in Figure 5.
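The core of this level (collapsing a three-channel RGB image into a single greyscale channel) can be illustrated without OpenCV. The following is a minimal Python sketch using the standard BT.601 luminance weights, which OpenCV's colour-conversion routine also applies; it is an illustration, not the JavaCV code of the tool itself.

```python
def to_greyscale(rgb_image):
    """Collapse a 3-channel RGB image (nested lists of (r, g, b) tuples)
    into a single-channel greyscale image using the ITU-R BT.601
    luminance weights, mirroring what OpenCV's RGB-to-grey
    conversion does."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]

# A 1x2 "image": one pure-white pixel and one pure-red pixel.
img = [[(255, 255, 255), (255, 0, 0)]]
grey = to_greyscale(img)
```

After this step each pixel is a single intensity value, the format the shape-detection functions of the next levels require.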
2. Shape & Entities Detection:
In UML, each shape of each diagram means something. Consequently, after the
preparation of the input image, our next task is to detect every meaningful object
depicted in it. OpenCV has some ready-to-use functions for some of the known
(geometric) shapes, but there are also many objects that cannot be detected by the
simple use of these functions. As a solution to this problem we combined these
functions with some customized algorithms (created ad hoc). The visualization of the
(numeric) outcome of each detection function is performed using the drawing
functions of the library. The basic entities in the context of Activity Diagrams are:
Figure 5 – Level of pre-processing explained
• Initial Node (& Final Node)
After numerous experiments we concluded that the Contour and Circle Detection
functions of OpenCV fail to systematically detect this entity, so we applied the
Template Matching algorithm. Template Matching is a ready-to-use method of the
library which searches the input image using, as reference, template images from a
user-defined pool. Template images are usually small icons depicting entities. If the
algorithm finds in the input image an entity similar (nearly identical) to the one
depicted in the template image, it returns its coordinates (and a colored rectangle is
drawn around it).
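The idea behind Template Matching can be sketched independently of the library. The following hypothetical Python snippet slides a small template over a greyscale image (both represented as nested lists) and returns the position of the best match by sum of squared differences, one of the similarity measures OpenCV's template matching supports:

```python
def match_template(image, template):
    """Naive template matching: slide the template over the image and
    return the (x, y) of the window with the smallest sum of squared
    differences (the idea behind OpenCV's square-difference mode)."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best, best_xy = None, None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            ssd = sum((image[y + j][x + i] - template[j][i]) ** 2
                      for j in range(th) for i in range(tw))
            if best is None or ssd < best:
                best, best_xy = ssd, (x, y)
    return best_xy

# A 4x4 image containing an exact copy of a 2x2 template at (2, 1).
image = [[0, 0, 0, 0],
         [0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 0, 0]]
template = [[9, 9],
            [9, 9]]
```

Here `match_template(image, template)` locates the template at (2, 1), the upper-left corner of the matching window.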
• Line/Transition
The largest and perhaps most important part of the Entities Detection level is Line
Detection. Its importance lies in its complexity, but also in the fact that practically all
other entities of an Activity Diagram are related through lines, and it is through them
that the essence of each diagram (i.e., the workflow) is vividly depicted.
In programming terms, for the detection of all lines in an Activity Diagram we
used the cvHoughLines2() function. This ready-to-use function of OpenCV is based
on the Progressive Probabilistic Hough Transform. A usage example is shown below.
LinesSeq = cvHoughLines2(img_canny, store, CV_HOUGH_PROBABILISTIC, 1,
                         Math.PI/360, 40, 40, 7);
// (1) image:       the input 8-bit single-channel binary image; with the
//                  probabilistic method the image is modified by the function
// (2) lineStorage: the storage for the detected lines
// (3) method:      the Hough transform variant
// (4) rho:         distance resolution in pixel-related units
// (5) theta:       angle resolution of the accumulator in radians
// (6) threshold:   accumulator threshold parameter; only lines that get
//                  enough votes (> threshold) are returned
// (7) param1:      the minimum line length
// (8) param2:      the maximum gap between line segments lying on the same
//                  line for them to be joined into a single segment
However we change the arguments of the aforementioned method, the results
will never be perfect. On the contrary, many imperfections appeared during the
testing period, such as the failure to detect some lines, the double detection of other
lines, oblique lines detected as two or more separate lines, lines incorrectly detected
out of parts of other entities, and several other problems. In order to address them
we created some customized checks, using mostly numeric bounds and conditional
(if) statements.
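As an illustration of one such customized check, the sketch below (Python, not the tool's Java code) merges horizontal segments that lie on the same row and are separated by a small gap, the situation that produces broken or doubly detected Hough lines; the max_gap threshold is an assumed value:

```python
def merge_collinear(segments, max_gap=10):
    """Merge axis-aligned horizontal segments that lie on the same row and
    are separated by at most max_gap pixels. Segments are (x1, y, x2, y)
    tuples. Illustrative sketch of the broken-line repair check."""
    by_row = {}
    for x1, y1, x2, y2 in segments:
        by_row.setdefault(y1, []).append((min(x1, x2), max(x1, x2)))
    merged = []
    for y, spans in by_row.items():
        spans.sort()
        cur_a, cur_b = spans[0]
        for a, b in spans[1:]:
            if a - cur_b <= max_gap:     # overlap or small break: join
                cur_b = max(cur_b, b)
            else:                        # a real gap: emit and start anew
                merged.append((cur_a, y, cur_b, y))
                cur_a, cur_b = a, b
        merged.append((cur_a, y, cur_b, y))
    return merged
```

For example, two fragments (0, 5, 40, 5) and (48, 5, 100, 5) of the same broken line are joined into one segment, while a line on another row is left untouched.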
• Action
For the detection of Actions we used the cvFindContours() method combined with
the cvBoundingRect() function. The first finds all the contours of the input image
within some specified boundaries, and the latter bounds each detected contour by
creating a rectangle around it. We then used the cvRectangle() method to draw and
visualize the result. Corrective customized checks are needed again, mostly because
many objects of the image can be perceived as contours by the primary algorithm.
CvMemStorage mem = CvMemStorage.create();
CvSeq contours = new CvSeq();
cvFindContours(img_canny, mem, contours, Loader.sizeof(CvContour.class),
               CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE, cvPoint(0, 0));

CvRect boundbox = cvBoundingRect(contours, 0);
cvRectangle(out_image, cvPoint(boundbox.x(), boundbox.y()),
            cvPoint(boundbox.x() + boundbox.width(),
                    boundbox.y() + boundbox.height()),
            CvScalar.MAGENTA, 1, 0, 0);
• Diamond (Decision & Merge Node)
The same methodology is applied here too, except for the visualization part. In
order to draw the diamond we used four cvLine() calls, because OpenCV lacks a
diamond-drawing method. Since Decision and Merge nodes are graphically identical,
we detect them as one entity and subsequently make an explicit conceptual
separation between them.
• Synchronization Box (Fork & Join Node)
We constructed two equivalent solutions for the detection of Synchronization
Boxes. The first uses the Multiple Template Matching method already described, and
the second uses the line-detection algorithm with altered arguments, taking
advantage of the graphical similarity between the two entities (a SynchBox is a very
small rectangle, i.e., four lines).
A successful detection example is shown in Figure 7. The pink lines visualize the
customized check which tackles the problem of broken lines: the two long lines,
having a break point, are detected as two or more separate lines by the
cvHoughLines2() method, and we have to merge them into one.
Figure 6 – Input Image
Figure 7 – Exported Image
3. Associations/Relationships of Entities Recognition:
Having detected all entities successfully, we move on to the next level of
processing, which is the recognition of all the relationships between them. We
examine the associations between two entities at a time. According to the rules of
UML and the assumptions we have made, there are eight cases:
1) Initial Node - Action
2) Final Node - Action
3) Final Node - Diamond
4) Final Node - Synchronization
5) Diamond - Synchronization
6) Action - Synchronization
7) Action - Diamond
8) Action – Action
Essentially, any relationship consists of three entities, if we take into account the
line joining the two basic ones. The algorithm we created is quite similar to the
customized checks mentioned before: we search a rectangular area around each
entity for line ends. If a line end is detected next to an entity, we then search similarly
around all the other entities for the second end of the line in question. Once the
second line end is detected we have identified a relationship, which we initially store
in a table and later in a vector (the coordinates of the two entities and the line).
Simultaneously, we perform some checks to reject duplicate records.
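The endpoint search described above can be sketched as follows. This is a simplified Python illustration with assumed box/line representations and an assumed search margin, not the actual implementation:

```python
def near(box, point, margin=15):
    """True if point falls inside box expanded by margin on every side.
    Boxes are (x, y, w, h) with the origin at the upper-left corner."""
    x, y, w, h = box
    px, py = point
    return (x - margin <= px <= x + w + margin and
            y - margin <= py <= y + h + margin)

def find_relationships(entities, lines, margin=15):
    """For each line, search the rectangular area around every entity for
    one of its two ends; a line whose ends land next to two different
    entities yields one relationship. Storing results in a set rejects
    duplicate records, as described in the text."""
    relationships = set()
    for p1, p2 in lines:
        for name_a, box_a in entities.items():
            if not near(box_a, p1, margin):
                continue
            for name_b, box_b in entities.items():
                if name_b != name_a and near(box_b, p2, margin):
                    relationships.add((name_a, name_b))
    return relationships

# Hypothetical example: a vertical line from an Initial Node to an Action.
entities = {"InitialNode": (50, 0, 20, 20), "Action1": (40, 80, 60, 30)}
lines = [((60, 22), (60, 78))]
rels = find_relationships(entities, lines)
```

Here the single line's ends fall inside the expanded areas of the two entities, so exactly one relationship is recorded.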
4. Transitions/Lines Direction Detection:
Every line in an Activity Diagram is part of an association between two entities and
has a specific direction towards one of them. The detection of this direction is the
next processing level, and it is very important because it carries the information about
the workflow depicted in each diagram.
This level was the most demanding task of the Thesis. Many solutions can be
devised, but the complexity of the problem increases exponentially with the number
of entities and the rules governing their relationships. After many tests, we created
two solutions: a primary and a complementary one.
Our goal was to find a solution as general as possible. This solution has to obey the
following rules, derived from UML theory and some assumptions made by the author
due to the limitations of the Thesis:
• The Initial Node is connected via only one line, and only with an action/activity.
This line is necessarily outgoing from the Initial Node.
• The Final Node can be connected with an unlimited number of entities, but in all
cases the direction of the lines is incoming to the Final Node.
• Every Action can have only one outgoing line and unlimited incoming ones.
• Every Diamond is connected to at least three lines.
o 1 ingoing, 2 outgoing = Decision Node
o 2 ingoing, 1 outgoing = Merge Node
• Every Synchronization Box is also connected to at least three lines.
o 1 ingoing, 2 outgoing = Fork Node
o 2 ingoing, 1 outgoing = Join Node
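The last four counting rules translate directly into a small classifier that separates the two graphically identical node pairs. A hypothetical sketch:

```python
def classify_node(shape, ingoing, outgoing):
    """Apply the UML counting rules above: a diamond is a Decision or a
    Merge Node, a synchronization box is a Fork or a Join Node, depending
    on how many of its lines point in versus out."""
    if shape == "diamond":
        return "Decision" if outgoing >= 2 else "Merge"
    if shape == "syncbox":
        return "Fork" if outgoing >= 2 else "Join"
    raise ValueError("unclassified shape: " + shape)
```

This is the conceptual separation performed later, at the entity-sorting level, once the line directions are known.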
A very useful consequence of these rules is that the first four of the associations
are unidirectional, so we know the direction of the arrow a priori. The other four
associations are bidirectional, so we need to determine the direction of the arrow
each time.
Our primary solution is an algorithm that searches the neighbors of the two
entities which comprise each relationship. In most cases a one-step neighbor search
(a search of the neighbors immediately adjacent to each of the two entities of the
association) is sufficient to extract all the information needed to determine the right
direction.
During our (internet) research on candidate input images for the tool in question,
some design patterns began to emerge.
The main pattern concerned the development of the diagram: the vast majority of
the depicted diagrams followed a top-down (vertical) development. Some other
patterns we came across were:
• The first entity of the diagram is the Initial Node, which is also located higher
than any other entity (the (0,0) point for OpenCV is the upper-left corner of
every image).
• The last and lowest entity of each diagram is the Final Node.
• When a line is far from horizontal (y1 >> y2), the arrow is directed to the
entity located lower in the diagram (having the greater y). We coined the
term "Verticality" for these cases.
All of the above are closely related to the top-down development pattern. Taking
advantage of the frequency of occurrence of these patterns, we simplified our algorithm.
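The "Verticality" pattern, for instance, can be expressed in a few lines of logic. This Python sketch assumes a min_dy threshold for "far from horizontal":

```python
def direction_by_verticality(line, min_dy=25):
    """Apply the "Verticality" pattern: when a line is far from horizontal,
    assume the arrow points at whichever end lies lower in the image
    (greater y, since OpenCV's origin is the upper-left corner). Returns
    the assumed target end, or None when the line is too close to
    horizontal for the pattern to apply. min_dy is an assumed threshold."""
    (x1, y1), (x2, y2) = line
    if abs(y1 - y2) < min_dy:
        return None                  # fall back to another check
    return (x1, y1) if y1 > y2 else (x2, y2)
```

A near-vertical line is resolved immediately, while a near-horizontal one is left for the neighbor search or the complementary algorithm.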
Our complementary solution is less reliable and is applied rarely. In this new
algorithm, in order to determine which end of each line carries the direction arrow,
we examine a rectangular/square area around each end of each relationship for very
small lines (using the Line Detection algorithm). Whichever end is found to have the
greater number of small lines around it is the one carrying the arrow, and thus
defines the direction.
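A sketch of this complementary check, in Python, with an assumed square window half-size:

```python
def arrow_end(line, small_lines, radius=12):
    """Complementary check: count the small detected segments inside a
    square window around each end of the line; the end with more of them
    is assumed to carry the arrowhead. Illustrative sketch; radius is an
    assumed window half-size."""
    def count_near(end):
        ex, ey = end
        return sum(1 for (x1, y1), (x2, y2) in small_lines
                   if abs(x1 - ex) <= radius and abs(y1 - ey) <= radius)
    p1, p2 = line
    return p1 if count_near(p1) > count_near(p2) else p2
```

For a horizontal line with two short arrowhead strokes detected near its right end, the function returns that end as the direction target.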
The result of the two direction-detection algorithms is visualized by drawing
colored dots at the correct end of each line. This new image is given as output to the
user. An example is shown in Figure 8 (the unidirectional transitions are omitted); the
black dots come from the primary algorithm and the pink ones from the complementary.
5. OCR Emulation:
Every Activity Diagram contains text, and this text is crucial information for the
complete deconstruction of the diagram. After an in-depth search on the internet for
an open-source OCR (optical character recognition) API, we found only one suitable
for our problem, Asprise OCR. This tool is available to the public in a free trial version.
We tested it on many different diagrams, but its performance was disappointing; the
results were not worth showcasing.
The diagrams under study contain characters in only two places: the names of
actions/activities and the guard conditions of decision nodes. This is all the textual
information we want to extract from the image and insert into our program. We
therefore decided to construct an alternative way of retrieving it. The new path
involves user intervention: we created two classes which guide the user to enter the
required information from the keyboard. The interactivity is achieved through
messages on the (Eclipse) console combined with explanatory colored images.
Figure 8 – Direction Detection Algorithms combined
• Sorting Entities
This is a transitional, intermediate level that allows the complete extraction of the
information depicted in the input image via the classification of entities.
After its extraction from the input image, the largest part of the available
information consists of (entity) coordinates. Coordinates are a kind of data that it
neither makes sense, nor is practical, to store directly. Apart from the actions and the
guard conditions, we had to find a way to also give the remaining entities some
characteristics, in order to ultimately enable the overall storage of every Activity
Diagram. Those characteristics are a way of distinguishing one entity from another.
The devised method was the serial naming of entities, combining their names with
serial numbers (e.g., Fork1, Fork2, etc.). In order to accomplish that in a consistent
way, the entities have to be sorted. The OpenCV detection algorithms automatically
sort all entities (by their upper-left point) except lines, so all we had to do was sort
those "manually".
At this level of processing we also performed the conceptual separation of
Diamonds into Decision and Merge Nodes, and of Synchronization Boxes into Fork
and Join Nodes, respectively.
6. Connection with Ontology
• "An Ontology is a formal, explicit and detailed definition of a domain" (general)
• "An Ontology is a common model for representing the basic concepts of Activity
Diagrams" (specific)
For the development of the Ontology we used the Protégé OWL tool. Our
methodology is based on design rules found in the literature and on the model
ontology OWL-S. The concepts ultimately selected to form the semantic model
Workflow_RDF are depicted in Figure 9.
The last step before inserting the aggregate data into the Ontology is a grouping
stage, whose purpose is the complete conceptual reconstruction of the diagram and
the representation of the Workflow. We developed an algorithm that traverses all the
existing relationships of every diagram, taking into account the direction of the
arrows, in order to record the flow of information.
In the context of our Thesis, eight kinds of relationships between entities are
recognized. Four of them are unidirectional and the other four bidirectional. In
addition, three of them (all the relationships of Decision Nodes) may contain binary
conditions known as Guard Conditions. In short, we created eight functions for one
direction of all relationships, plus another three for the other direction of the
bidirectional ones (Action-Action can be used for both), plus another three for those
with Guard Conditions. In total, we created fourteen functions, separating the original
information into groups of two entities with a specified flow direction. These groups
are called Sub-Activities, as they constitute parts of the overall Activity which each
diagram represents.
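The grouping of one directed relationship into a Sub-Activity string of the form printed by the tool can be sketched as a single function (a Python illustration; the four-tuple input format is an assumption):

```python
def to_sub_activities(relationships):
    """Group each directed relationship (source, line, target, guard) into
    a Sub-Activity string like those printed on the console, inserting the
    guard condition when present (Decision-node relationships).
    Hypothetical reconstruction of the output format."""
    subs = []
    for source, line, target, guard in relationships:
        if guard:
            subs.append(f"{source}->{line}->{guard}->{target}")
        else:
            subs.append(f"{source}->{line}->{target}")
    return subs
```

A Decision-node relationship with the guard "Strike" thus yields a string of the form Decision0->Line4->Strike->Show_Animation, matching the console listing in the Implementation section.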
To connect our main program with the constructed Ontology and transfer all the
available information, we used an open-source Java framework, the Jena API.
Implementation
It is now time to verify the proper operation of UADxTractor through experiments.
A user of this tool will probably need to change only a limited number of variables:
the name of the input image and the arguments of the detection algorithms. The
values of those variables are defined externally, in a text (.txt) file, so the user does
not need to come in contact with the main code.
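A minimal sketch of such an external parameter file reader (Python; the key=value layout shown is an assumption, as the tool's actual file format is not specified here):

```python
import os
import tempfile

def load_params(path):
    """Read externally defined parameters (input image name, detection
    thresholds) from a simple key=value .txt file, so the user never
    touches the main code. The file format is an assumption."""
    params = {}
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue                  # skip blanks and comments
            key, _, value = line.partition("=")
            params[key.strip()] = value.strip()
    return params

# Hypothetical parameter file with an image name and a Hough threshold.
cfg = tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False)
cfg.write("input_image=diagram.jpg\nhough_threshold=40\n")
cfg.close()
params = load_params(cfg.name)
os.unlink(cfg.name)
```

Changing an experiment then amounts to editing this text file rather than recompiling the program.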
Figure 9 - Hierarchical Structure of Ontology
We use a simple yet representative diagram, found online, for our experiments.
This diagram depicts the activity of displaying a bowling score.
The pre-processing level comes next; its results are shown in Figure 11.
Figure 10 - Display Bowling Score Diagram
Figure 11 – Example of pre-processing Level
Now the image is ready to be used by the OpenCV functions, and the detection of
the entities and their relationships follows. In the big diagram on the left of Figure 12
we can see the overall results of the detection algorithms, while the four diagrams on
the right depict the detection of the relationships between the entities.
During the whole procedure, explanatory messages are printed on the Eclipse
console; examples are shown in Figure 13.
Figure 12 – Detection of Entities and Relationships
Figure 13 – Samples of Console Messages
The most complex level of processing is direction detection; the outcome of our
algorithm is presented in Figure 14.
The last step of our proposed solution is the connection of the main tool with the
Ontology, the storage construct we chose. There are two ways to verify the
unimpeded transfer of all the available information: through console messages, and
using Protégé. Both are shown below.
Method of Finding Sub - Activities has started...
InitialNode->Line1->Calculate_numbers_of_knocked_out_pins
Do you agree with this activity? Press y(Yes) or n(No)
y
So, the [Sub - Activities] of 2 entities are:
---------------------------------------------
Sub-Act[0]:InitialNode->Line1->Calculate_numbers_of_knocked_out_pins
Sub-Act[1]:Display_Score->Line7->FinalNode
Sub-Act[2]:Decision0->Line6->Non_Strike->Display_Score
Sub-Act[3]:Decision0->Line4->Strike->Show_Animation
Sub-Act[4]:Calculate_the_score->Line3->Decision0
Sub-Act[5]:Show_Animation->Line5->Display_Score
Sub-Act[6]:Calculate_numbers_of_knocked_out_pins->Line2->Calculate_the_score
--------------------
Number of Sub - Activities = 7
THE END!
Figure 14 – Example of the Direction Detection Algorithm
Conclusions
Through this work I discovered some of my strengths and put myself into a test to
see how I deal with adversity. The lessons I took from this process are invaluable. My
supervisor Dr. Andreas Symeonidis stood by me and shared his insight when it was
needed. We kept a close contact dividing my work into deliverables and discussing every
issue.
One of the main goals of the current paper is to continue the effort started with the
Diploma Thesis of Anastasia Mourka (Use Case Diagrams), automating the conceptual
reconstruction of every UML diagram and thus preparing the ground for the future
creation of a semantic (UML) search engine. Anastasia was an excellent advisor at the
beginning of this effort, providing valuable guidance as a former student of the same
department.
This implementation was based on the most common design patterns, but due to
the vast number of existing alternative models it was impossible to cover all
eventualities. For example, the analysis of images depicting Activity Diagrams with
horizontal development exhibits some deficiencies. Nevertheless, the overall result of
the analysis is quite satisfactory, as the vertical development of these diagrams is the
usual practice.
The images used in the experiments were chosen so as to cover the widest possible
range of the diagrams available online. They have average complexity, and most of
them came from the well-known tools StarUML and ArgoUML.
In conclusion, the main success of UADxTractor, in addition to saving time for
Software Engineers, is that it provides direct access to the Workflow of similar types
of systems. This is something that can be useful both to humans (engineers) and to
other programs.
Figure 15 – Display Bowling Score Instance in Protégé
USING AND EVALUATING INSTRUCTIONAL MATERIALSjanehbasto
 
instructional matertials authored by Mr. Ranie M. Esponilla
instructional matertials authored by Mr. Ranie M. Esponillainstructional matertials authored by Mr. Ranie M. Esponilla
instructional matertials authored by Mr. Ranie M. EsponillaRanie Esponilla
 
Thesis Identifying Activity
Thesis Identifying ActivityThesis Identifying Activity
Thesis Identifying Activitymr_rodriguez23
 
Basic Instructional Design Principles - A Primer
Basic Instructional Design Principles - A PrimerBasic Instructional Design Principles - A Primer
Basic Instructional Design Principles - A PrimerMike Kunkle
 
Thesis questionnaire
Thesis questionnaireThesis questionnaire
Thesis questionnaireMaliha Ahmed
 
Final thesis presented december 2009 march 2010
Final thesis presented december 2009 march 2010Final thesis presented december 2009 march 2010
Final thesis presented december 2009 march 2010Lumbad 1989
 
Instructional Design
Instructional DesignInstructional Design
Instructional Designjamalharun
 

Viewers also liked (20)

Multimedia Presentation - PV (secure)
Multimedia Presentation - PV (secure)Multimedia Presentation - PV (secure)
Multimedia Presentation - PV (secure)
 
Sumary of thesis 1
Sumary of thesis  1Sumary of thesis  1
Sumary of thesis 1
 
CTDLC Multimedia Design Presentation
CTDLC Multimedia Design PresentationCTDLC Multimedia Design Presentation
CTDLC Multimedia Design Presentation
 
Instructional Multimedia Development
Instructional Multimedia DevelopmentInstructional Multimedia Development
Instructional Multimedia Development
 
Lesson 6 Using and Evaluating Instructional Material (EDTECH)
Lesson 6 Using and Evaluating Instructional Material (EDTECH)Lesson 6 Using and Evaluating Instructional Material (EDTECH)
Lesson 6 Using and Evaluating Instructional Material (EDTECH)
 
Using and Evaluating Instructional Materials
Using and Evaluating Instructional MaterialsUsing and Evaluating Instructional Materials
Using and Evaluating Instructional Materials
 
Ed. Tech ; Lesson VI - Using and evaluating instructional materials
Ed. Tech ; Lesson VI - Using and evaluating instructional materialsEd. Tech ; Lesson VI - Using and evaluating instructional materials
Ed. Tech ; Lesson VI - Using and evaluating instructional materials
 
Abstract Art sgp power point
Abstract Art sgp power pointAbstract Art sgp power point
Abstract Art sgp power point
 
Instructional Design for Multimedia
Instructional Design for MultimediaInstructional Design for Multimedia
Instructional Design for Multimedia
 
Ed. Tech. I Lesson 6:Using and Evaluating Instructional Materials
Ed. Tech. I Lesson 6:Using and Evaluating Instructional MaterialsEd. Tech. I Lesson 6:Using and Evaluating Instructional Materials
Ed. Tech. I Lesson 6:Using and Evaluating Instructional Materials
 
Multimedia thesis
Multimedia thesisMultimedia thesis
Multimedia thesis
 
USING AND EVALUATING INSTRUCTIONAL MATERIALS
USING AND EVALUATING INSTRUCTIONAL MATERIALSUSING AND EVALUATING INSTRUCTIONAL MATERIALS
USING AND EVALUATING INSTRUCTIONAL MATERIALS
 
A research proposal
A research proposalA research proposal
A research proposal
 
instructional matertials authored by Mr. Ranie M. Esponilla
instructional matertials authored by Mr. Ranie M. Esponillainstructional matertials authored by Mr. Ranie M. Esponilla
instructional matertials authored by Mr. Ranie M. Esponilla
 
Thesis Identifying Activity
Thesis Identifying ActivityThesis Identifying Activity
Thesis Identifying Activity
 
Basic Instructional Design Principles - A Primer
Basic Instructional Design Principles - A PrimerBasic Instructional Design Principles - A Primer
Basic Instructional Design Principles - A Primer
 
Thesis questionnaire
Thesis questionnaireThesis questionnaire
Thesis questionnaire
 
Final thesis presented december 2009 march 2010
Final thesis presented december 2009 march 2010Final thesis presented december 2009 march 2010
Final thesis presented december 2009 march 2010
 
Instructional Design
Instructional DesignInstructional Design
Instructional Design
 
Multimedia
MultimediaMultimedia
Multimedia
 

Similar to [LinkedIn]_Thesis Sum in English_New

Presentation1.2.pptx
Presentation1.2.pptxPresentation1.2.pptx
Presentation1.2.pptxpranaykusuma
 
Object Oriented Database
Object Oriented DatabaseObject Oriented Database
Object Oriented DatabaseMegan Espinoza
 
502 Object Oriented Analysis and Design.pdf
502 Object Oriented Analysis and Design.pdf502 Object Oriented Analysis and Design.pdf
502 Object Oriented Analysis and Design.pdfPradeepPandey506579
 
Distributed Graphical User Interfaces to Class Diagram: Reverse Engineering A...
Distributed Graphical User Interfaces to Class Diagram: Reverse Engineering A...Distributed Graphical User Interfaces to Class Diagram: Reverse Engineering A...
Distributed Graphical User Interfaces to Class Diagram: Reverse Engineering A...ijseajournal
 
Object-oriented modeling and design.pdf
Object-oriented modeling and  design.pdfObject-oriented modeling and  design.pdf
Object-oriented modeling and design.pdfSHIVAM691605
 
BEST IMAGE PROCESSING TOOLS TO EXPECT in 2023 – Tutors India
BEST IMAGE PROCESSING TOOLS TO EXPECT in 2023 – Tutors IndiaBEST IMAGE PROCESSING TOOLS TO EXPECT in 2023 – Tutors India
BEST IMAGE PROCESSING TOOLS TO EXPECT in 2023 – Tutors IndiaTutors India
 
Automatically translating a general purpose C image processing library for ...
Automatically translating a general purpose C   image processing library for ...Automatically translating a general purpose C   image processing library for ...
Automatically translating a general purpose C image processing library for ...Carrie Cox
 
Various Approaches Of System Analysis
Various Approaches Of System AnalysisVarious Approaches Of System Analysis
Various Approaches Of System AnalysisLaura Torres
 
MANAGING AND ANALYSING SOFTWARE PRODUCT LINE REQUIREMENTS
MANAGING AND ANALYSING SOFTWARE PRODUCT LINE REQUIREMENTSMANAGING AND ANALYSING SOFTWARE PRODUCT LINE REQUIREMENTS
MANAGING AND ANALYSING SOFTWARE PRODUCT LINE REQUIREMENTSijseajournal
 
Accelerating A C Image Processing Library With A GPU
Accelerating A C   Image Processing Library With A GPUAccelerating A C   Image Processing Library With A GPU
Accelerating A C Image Processing Library With A GPUDaniel Wachtel
 
Component based models and technology
Component based models and technologyComponent based models and technology
Component based models and technologyMayukh Maitra
 
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.pptUNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.pptVGaneshKarthikeyan
 
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.pptUNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.pptVGaneshKarthikeyan
 
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.pptUNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.pptVGaneshKarthikeyan
 

Similar to [LinkedIn]_Thesis Sum in English_New (20)

Sdlc
SdlcSdlc
Sdlc
 
Sdlc
SdlcSdlc
Sdlc
 
Chapter1
Chapter1Chapter1
Chapter1
 
Presentation1.2.pptx
Presentation1.2.pptxPresentation1.2.pptx
Presentation1.2.pptx
 
Object Oriented Database
Object Oriented DatabaseObject Oriented Database
Object Oriented Database
 
ThesisProposal
ThesisProposalThesisProposal
ThesisProposal
 
502 Object Oriented Analysis and Design.pdf
502 Object Oriented Analysis and Design.pdf502 Object Oriented Analysis and Design.pdf
502 Object Oriented Analysis and Design.pdf
 
Sample SRS format
Sample SRS formatSample SRS format
Sample SRS format
 
Ooad unit 1
Ooad unit 1Ooad unit 1
Ooad unit 1
 
Distributed Graphical User Interfaces to Class Diagram: Reverse Engineering A...
Distributed Graphical User Interfaces to Class Diagram: Reverse Engineering A...Distributed Graphical User Interfaces to Class Diagram: Reverse Engineering A...
Distributed Graphical User Interfaces to Class Diagram: Reverse Engineering A...
 
Object-oriented modeling and design.pdf
Object-oriented modeling and  design.pdfObject-oriented modeling and  design.pdf
Object-oriented modeling and design.pdf
 
BEST IMAGE PROCESSING TOOLS TO EXPECT in 2023 – Tutors India
BEST IMAGE PROCESSING TOOLS TO EXPECT in 2023 – Tutors IndiaBEST IMAGE PROCESSING TOOLS TO EXPECT in 2023 – Tutors India
BEST IMAGE PROCESSING TOOLS TO EXPECT in 2023 – Tutors India
 
Automatically translating a general purpose C image processing library for ...
Automatically translating a general purpose C   image processing library for ...Automatically translating a general purpose C   image processing library for ...
Automatically translating a general purpose C image processing library for ...
 
Various Approaches Of System Analysis
Various Approaches Of System AnalysisVarious Approaches Of System Analysis
Various Approaches Of System Analysis
 
MANAGING AND ANALYSING SOFTWARE PRODUCT LINE REQUIREMENTS
MANAGING AND ANALYSING SOFTWARE PRODUCT LINE REQUIREMENTSMANAGING AND ANALYSING SOFTWARE PRODUCT LINE REQUIREMENTS
MANAGING AND ANALYSING SOFTWARE PRODUCT LINE REQUIREMENTS
 
Accelerating A C Image Processing Library With A GPU
Accelerating A C   Image Processing Library With A GPUAccelerating A C   Image Processing Library With A GPU
Accelerating A C Image Processing Library With A GPU
 
Component based models and technology
Component based models and technologyComponent based models and technology
Component based models and technology
 
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.pptUNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
 
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.pptUNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
 
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.pptUNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
UNIT-I(Unified_Process_and_Use Case_Diagrams)_OOAD.ppt
 

[LinkedIn]_Thesis Sum in English_New

solution proposed in this work hopefully leads towards the establishment of the Semantic Web. In the past, the exchange of information between two people (software engineers or not) regarding the Workflow of their respective projects required, if not a hand-to-hand exchange, at least direct contact between the two parties. Today, any interested party can search the internet for information concerning the Workflow of similar projects, but all they can find are images of UML Activity Diagrams. The ultimate goal of the thesis is the creation of a practical engineering tool, not only for better deconstruction and understanding of the dynamic part of each system but also for facilitating the sharing of the information acquired.

The Unified Modeling Language (UML) is a general-purpose modeling language in the field of software engineering, designed to provide a standard way to visualize the design of a system. Activity Diagrams are a specific category of UML Diagrams and represent the Dynamic view of a system model. This Dynamic view emphasizes the dynamic behavior of the system by showing collaborations among objects and changes to their internal states. Activity Diagrams are graphical representations of Workflows of stepwise activities and actions. Unsurprisingly, we focus on the study of this particular category of UML diagrams.

Figure 1 – Definition of Problem and Goal
Figure 2 – Chain of causality
Approach

It is more than clear by now that any proposed solution will have to deal with Image Processing. The only data every engineer has concerning the problem stated above are images (.jpg being the most common format); consequently, those images will be the inputs to our tool.

After analyzing the problem and settling on the input, the next subject to tackle is the development environment. In order to decide which programming tools and languages to use, we need to clarify our field of interest. The field that includes methods for acquiring, processing, analyzing and understanding images (in general, high-dimensional data from the real world) in order to produce numerical or symbolic information is Computer Vision.

With the field identified, we can proceed to choosing the programming tools. After an in-depth search of the existing alternatives we chose the OpenCV library and its Java interface. Two main reasons justify this choice: i) the extensive capabilities of the library combined with the platform independence of Java, and ii) the compatibility of the API with an existing Java tool for Use Case Diagrams. The code was developed in the Eclipse IDE, the author's personal preference.

In order to verify the proper function of our tool we had to obtain concrete instances of Activity Diagrams. Some of them were common, while others were less so and could not be found online. We used two well-known modeling tools (StarUML and ArgoUML) to draw these instances and export them in .jpg format; therefore some input images came from the Internet and some from the two modeling tools.

The main objective of our software-based tool is the decomposition of Activity Diagrams into a form that can be processed by both humans and machines. The steps of the decomposition are presented in detail later on.
Another main concern of our proposed solution is enabling the easy sharing and reuse of all the information acquired from the decomposition. In this context, the last important choice we had to make in our stepwise approach was the type of storage construct. There were many alternatives, but for reasoning (semantic awareness) and compatibility purposes an Ontology prevailed. The platform used to create, edit and view Ontologies was Protégé OWL, and the API responsible for transferring the decomposed data to the storage construct was Jena. A general scheme of the solution, along with the exact version of each program used, is depicted in Figure 3.
Architecture & Outline of System

This section presents a brief outline of the system, step by step. Our software-based tool was named UADxTractor, from the phrase Uml Activity Diagram eXtractor, which reflects its function. The system receives images as input and subjects them to various levels of processing. These levels are shown in Figure 4.

Figure 3 – Tools and Languages used
Figure 4 – Levels of Processing
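As a rough illustration (not the thesis code; all names here are assumptions), the levels of Figure 4 can be thought of as a pipeline of stages applied to the diagram in order:

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of the UADxTractor processing pipeline: each level is
// modeled as a stage transforming shared state. Stage names mirror the
// levels described in the text; they are illustrative only.
public class PipelineSketch {

    // Shared state passed from level to level (kept trivial here).
    static class Diagram {
        StringBuilder log = new StringBuilder();
    }

    static final List<Function<Diagram, Diagram>> LEVELS = List.of(
            d -> { d.log.append("pre-process;");   return d; },
            d -> { d.log.append("detect-shapes;"); return d; },
            d -> { d.log.append("relations;");     return d; },
            d -> { d.log.append("directions;");    return d; },
            d -> { d.log.append("text;");          return d; },
            d -> { d.log.append("ontology;");      return d; });

    static Diagram run(Diagram d) {
        for (Function<Diagram, Diagram> level : LEVELS) d = level.apply(d);
        return d;
    }

    public static void main(String[] args) {
        System.out.println(run(new Diagram()).log);
        // pre-process;detect-shapes;relations;directions;text;ontology;
    }
}
```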
1. Image pre-processing: This level takes place once the user has located the input image, and precedes the main code. The functions of the OpenCV library (via its Java interface, known as JavaCV) used in our project impose some restrictions on the format of the input. The ultimate goal of our tool is the decomposition of any diagram, which we accomplish mainly with OpenCV shape detection functions. These functions have strict specifications about the images they can process. For example, most of the images that can be found online are colored (RGB) with three channels, whereas one of the most common requirements of OpenCV is that the image be greyscale and single-channel. The role of this level is to transform the input image into the proper format for the OpenCV functions of the next levels. The procedure of pre-processing is depicted in Figure 5.

Figure 5 – Level of pre-processing explained

2. Shape & Entities Detection: In UML, each shape of each diagram means something. Consequently, after the preparation of the input image, the next task is to detect every meaningful object depicted in it. OpenCV has ready-to-use functions for some of the known (geometric) shapes, but there are also many objects that cannot be detected by the simple use of these functions. As a solution, we combined these functions with some customized algorithms created ad hoc. The visualization of the (arithmetic) outcome of each detection function is performed using the drawing functions of the library. The basic entities in the context of Activity Diagrams are:
• Initial Node (& Final Node)
After numerous experiments we concluded that the Contour and Circle Detection functions of OpenCV fail to detect this entity systematically, so we applied the Template Matching algorithm instead. Template Matching is a ready-to-use method of the library which searches the input image using template images from a user-defined pool as references. Template images are usually small icons depicting entities. If the algorithm finds in the input image an entity similar (nearly identical) to the one depicted in the template image, it returns its coordinates, and a colored rectangle is drawn around it.

• Line/Transition
The largest and perhaps most important part of the Entities Detection level is Line Detection. Its importance rests in its complexity, but also in the fact that practically all other entities of an Activity Diagram are related through lines, and through them the essence of each diagram, the workflow, is depicted. In terms of programming, for the detection of all lines in an Activity Diagram we used the cvHoughLines2() function. This ready-to-use function of OpenCV is based on the Progressive Probabilistic Hough Transform. A usage example is shown below.

    LinesSeq = cvHoughLines2(img_canny, store, CV_HOUGH_PROBABILISTIC,
                             1, Math.PI/360, 40, 40, 7);
    // (1) image:       the input 8-bit single-channel binary image; with the
    //                  probabilistic method the image is modified by the function
    // (2) lineStorage: the storage for the detected lines
    // (3) method:      the Hough transform variant
    // (4) rho:         distance resolution in pixel-related units
    // (5) theta:       angle resolution of the accumulator in radians
    // (6) threshold:   accumulator threshold; only lines that get enough
    //                  votes (> threshold) are returned
    // (7) param1:      the minimum line length
    // (8) param2:      the maximum gap between line segments lying on the same
    //                  line for them to be treated (joined) as a single segment

No matter how we tune the arguments of this method, the results will not be perfect. Many imperfections appeared during the testing period, such as the failure to detect some lines, the double detection of others, oblique lines being detected as two or more separate lines, lines being incorrectly detected out of parts of other entities, and several other problems. To address them we created some customized checks, mostly based on arithmetic boundaries and if-statements.
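The thesis does not list these customized checks in detail. As a hypothetical illustration of one of them, the sketch below merges two detected segments that lie on roughly the same horizontal line and are separated by a small gap, as happens when the Hough transform splits a broken line; all thresholds and names are assumptions:

```java
// Hypothetical sketch of one "customized check": merging two segments that
// lie on the same (roughly horizontal) line and are separated by a small
// gap. Not the thesis code; thresholds are illustrative.
public class LineMergeSketch {

    // A segment is {x1, y1, x2, y2} with x1 <= x2.
    // Returns the merged segment, or null if the two should stay separate.
    static int[] merge(int[] a, int[] b, int maxGap, int maxDy) {
        // Only merge if the segments sit at nearly the same height...
        if (Math.abs(a[1] - b[1]) > maxDy) return null;
        // ...and the horizontal gap between them is small enough.
        int gap = Math.max(a[0], b[0]) - Math.min(a[2], b[2]);
        if (gap > maxGap) return null;
        // The merged segment spans both, keeping the first segment's height.
        return new int[]{Math.min(a[0], b[0]), a[1],
                         Math.max(a[2], b[2]), a[3]};
    }

    public static void main(String[] args) {
        int[] left  = {10, 50, 60, 50};
        int[] right = {65, 51, 120, 51};   // 5-pixel gap, 1-pixel height drift
        int[] merged = merge(left, right, 7, 3);
        System.out.println(merged[0] + "," + merged[2]); // 10,120
    }
}
```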
• Action
For the detection of Actions we used the cvFindContours() method combined with the cvBoundingRect() function. The first finds all the contours of the input image within some specified boundaries, and the latter bounds each detected contour with a rectangle. We then used the cvRectangle() method to draw and visualize the result. Corrective customized checks are needed again, mostly because many objects in the image can be perceived as contours by the primary algorithm.

    CvMemStorage mem = CvMemStorage.create();
    CvSeq contours = new CvSeq();
    cvFindContours(img_canny, mem, contours, Loader.sizeof(CvContour.class),
                   CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE, cvPoint(0, 0));
    CvRect boundbox = cvBoundingRect(contours, 0);
    cvRectangle(out_image,
                cvPoint(boundbox.x(), boundbox.y()),
                cvPoint(boundbox.x() + boundbox.width(),
                        boundbox.y() + boundbox.height()),
                CvScalar.MAGENTA, 1, 0, 0);

• Diamond (Decision & Merge Node)
The same methodology is applied here too, except for the visualization: in order to draw the diamond we used four cvLine() calls, because OpenCV lacks a diamond drawing method. Since Decision and Merge nodes are graphically identical, we detect them as one entity and make an explicit conceptual separation between them later on.

• Synchronization Box (Fork & Join Node)
We constructed two equivalent solutions for the detection of Synchronization Boxes. The first uses the Multiple Template Matching method already described, and the second uses the line detection algorithm with altered arguments, taking advantage of the graphical similarity between the two entities (a Synchronization Box is essentially a very small rectangle, i.e. four lines). A successful detection example is shown in Figure 7. The pink lines visualize the customized check which tackles the problem of broken lines.
Because they have a break point, the two long lines are detected as two or more separate lines by the cvHoughLines2() method, and we have to merge them into one.
Figure 6 – Input Image
Figure 7 – Exported Image
3. Associations/Relationships of Entities Recognition: Having detected all entities successfully, we move to the next level of processing, the recognition of all the relationships between them. We examine the associations between two entities at a time. According to the rules of UML and the assumptions we have made, there are eight cases:

1) Initial Node - Action
2) Final Node - Action
3) Final Node - Diamond
4) Final Node - Synchronization
5) Diamond - Synchronization
6) Action - Synchronization
7) Action - Diamond
8) Action - Action

Any relationship essentially consists of three entities, if we take into account the line joining the two basic ones. The algorithm we created is quite similar to the customized checks mentioned before. We search a rectangular area around each entity for line ends. If a line end is detected next to an entity, we search similarly around all the other entities for the second end of the line in question. Once the second line end is detected we have identified a relationship, which we store initially in a table and later on in a vector (the coordinates of the two entities and the line). Simultaneously, we perform checks to reject double records.

4. Transitions/Lines Direction Detection: Every line in any Activity Diagram is part of an association between two entities and has a specific direction towards one of them. The detection of this direction is the next processing level. It is very important because it carries the information about the workflow depicted in each diagram, and it was the most demanding task of the Thesis. Many solutions can be found, but the complexity of the problem increases exponentially with the number of entities and the rules governing the relationships. After many tests we created two solutions, a primary and a complementary one. Our goal was to find a solution as general as possible.
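The relationship search of the previous level (line ends matched to nearby entities) can be sketched in plain Java as follows. This is an illustrative reconstruction, not the thesis code: the real tool works on OpenCV structures, and all names and the search margin here are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of relationship recognition: a line end that falls
// inside a small search window around an entity's bounding box "touches"
// that entity; a line touching two different entities yields a relationship.
public class RelationSketch {

    // Is point (px,py) within `margin` pixels of the box {x, y, w, h}?
    static boolean near(int px, int py, int[] box, int margin) {
        return px >= box[0] - margin && px <= box[0] + box[2] + margin
            && py >= box[1] - margin && py <= box[1] + box[3] + margin;
    }

    // For each line {x1,y1,x2,y2}, find the pair of entities its ends touch.
    static List<int[]> relations(int[][] entities, int[][] lines, int margin) {
        List<int[]> out = new ArrayList<>();
        for (int[] l : lines)
            for (int i = 0; i < entities.length; i++)
                for (int j = 0; j < entities.length; j++)
                    if (i != j
                            && near(l[0], l[1], entities[i], margin)
                            && near(l[2], l[3], entities[j], margin))
                        out.add(new int[]{i, j});
        return out;
    }

    public static void main(String[] args) {
        int[][] entities = {{0, 0, 20, 10}, {0, 100, 20, 10}}; // two boxes
        int[][] lines = {{10, 12, 10, 98}};                    // joins them
        for (int[] r : relations(entities, lines, 5))
            System.out.println(r[0] + " - " + r[1]); // 0 - 1
    }
}
```

A real implementation would also reject double records, as the text describes; that check is omitted here for brevity.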
This solution has to obey the following rules, which are derived from UML theory and from some assumptions made by the author due to the scope limitations of the Thesis.

• The Initial Node is connected via only one line, and only to an action/activity. This line is necessarily outgoing from the Initial Node.
• The Final Node can be connected to an unlimited number of entities, but in all cases the direction of the lines is incoming to the Final Node.
• Every Action can have only one outgoing line and unlimited incoming ones.
• Every Diamond is connected to at least three lines:
  o 1 incoming, 2 outgoing = Decision Node
  o 2 incoming, 1 outgoing = Merge Node
• Every Synchronization Box is also connected to at least three lines:
  o 1 incoming, 2 outgoing = Fork Node
  o 2 incoming, 1 outgoing = Join Node

A very useful consequence of these rules is that the first four associations are unidirectional, so we know the direction of the arrow a priori. The other four associations are bidirectional, so we need to determine the direction of the arrow each time.

Our primary solution is an algorithm that searches the neighbors of the two entities which comprise each relationship. In most cases a one-step neighbor search (a search of the neighbors immediately adjacent to each of the two entities of the association) is sufficient to extract all the information needed to determine the right direction.

During our internet research on candidate input images, some design patterns began to emerge. The main pattern concerned the layout of the diagram: the vast majority of the depicted diagrams followed a top-down (vertical) development. Some other patterns that we came across were:

• The first entity of the diagram is the Initial Node, which is also located higher than any other entity (the (0,0) point for OpenCV is the upper left corner of every image).
• The last and lowest entity of each diagram is the Final Node.
• When a line is far from horizontal (y1 >> y2), the arrow is directed to the entity located lower in the diagram (the one with the greater y). We coined the term "Verticality" for these cases.

All of the above are closely related to the top-down development pattern. Taking advantage of the frequency of occurrence of these patterns, we simplified our algorithm.
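The "Verticality" heuristic can be sketched as follows. This is a hypothetical illustration rather than the thesis code, and the cut-off value below which a line counts as "nearly horizontal" is an assumption:

```java
// Hypothetical sketch of the "Verticality" heuristic: when a line is far
// from horizontal, the arrow is assumed to point to the entity that sits
// lower in the diagram (larger y, since OpenCV's origin is the top-left).
public class VerticalitySketch {

    // Returns 1 if the arrow points to end 1, 2 for end 2, or 0 if the
    // heuristic does not apply (the line is too close to horizontal).
    static int arrowEnd(int y1, int y2, int minDy) {
        if (Math.abs(y1 - y2) < minDy) return 0; // nearly horizontal: give up
        return (y1 > y2) ? 1 : 2;                // lower end gets the arrow
    }

    public static void main(String[] args) {
        System.out.println(arrowEnd(20, 180, 30));  // 2: end 2 is lower
        System.out.println(arrowEnd(100, 105, 30)); // 0: heuristic undecided
    }
}
```

When the heuristic returns 0, a fallback such as the complementary arrowhead search described next would have to decide.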
Our complementary solution is less reliable and is applied rarely. In this second algorithm, in order to determine which end of each line carries the direction arrow, we examine a rectangular/square area around each end of each relationship for very small lines (using the Line Detection algorithm). Based on how many small lines we find next to each end, we infer the direction of the arrow: the end with the greater number of small lines around it is the one that carries the arrowhead and thus defines the direction.
The result of the two direction detection algorithms is visualized by drawing colored dots at the appropriate end of each line. This new image is given as output to the user. An example is shown in Figure 8 (the unidirectional transitions are omitted). The black dots come from the primary algorithm and the pink ones from the complementary.

Figure 8 – Direction Detection Algorithms combined

5. OCR Emulation: Every Activity Diagram contains text, and this text is crucial information for the complete deconstruction of the diagram. After an in-depth search on the internet for an open-source OCR (optical character recognition) API, we found only one suitable for our problem, Asprise OCR, which is available to the public as a free trial version. We tested this tool on many different diagrams, but its performance was disappointing and the results were not worth showcasing. The diagrams under study contain characters in only two places: the names of actions/activities and the guard conditions of decision nodes. This is all the text information we want to extract from the image and insert into our program, so we decided to construct an alternative way of retrieving it. The new path involves user intervention: we created two classes which guide the user to enter the required information from the keyboard. The interactivity is achieved through messages on the (Eclipse) console combined with explanatory colored images.
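The keyboard-entry fallback can be sketched roughly as follows; the class name and prompts are illustrative assumptions, not the thesis's actual classes:

```java
import java.util.Scanner;

// Hypothetical sketch of the manual OCR fallback: prompting the user on the
// console for each action name, in the spirit of the thesis's interactive
// classes. Names and prompt texts are illustrative only.
public class TextEntrySketch {

    // Read `count` action names from the given input source.
    static String[] readActionNames(Scanner in, int count) {
        String[] names = new String[count];
        for (int i = 0; i < count; i++) {
            System.out.print("Name of Action" + (i + 1) + ": ");
            names[i] = in.nextLine().trim();
        }
        return names;
    }

    public static void main(String[] args) {
        // A Scanner over a fixed string stands in for the real keyboard input.
        Scanner in = new Scanner("Take Input\nDisplay Score\n");
        for (String n : readActionNames(in, 2)) System.out.println(n);
    }
}
```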
• Sorting Entities
This is a transitional, intermediate level that enables the complete extraction of the information depicted in the input image via classification of the entities. The largest part of the extracted information consists of (entity) coordinates, a kind of data that cannot, and does not make sense to, be stored as-is. Apart from the actions and the guard conditions, we had to find a way to give the remaining entities some identifying characteristics in order to ultimately enable the storage of every Activity Diagram as a whole. Those characteristics are a way of distinguishing one entity from another. The devised method was the serial naming of entities, combining their type names with natural numbers (e.g. Fork1, Fork2, etc.). To accomplish this consistently, the entities have to be sorted. The OpenCV detection algorithms automatically sort all entities (by their upper left point) except lines, so all we had to do was sort the lines "manually". At this level of processing we also performed the conceptual separation of Diamonds and Synchronization Boxes into Decision and Merge Nodes and into Fork and Join Nodes respectively.

6. Connection with Ontology
• "An Ontology is a formal, explicit and detailed definition of a domain" (general)
• "An Ontology is a common model for representing the basic concepts of Activity Diagrams" (specific)

For the development of the Ontology we used the tool Protégé OWL. Our methodology is based on design rules found in the literature and on the model Ontology OWL-S. The concepts ultimately selected to form the semantic model Workflow_RDF are depicted in Figure 9. The last step before the insertion of the aggregate data into the Ontology is the grouping stage, whose purpose is the complete conceptual reconstruction of the diagram and the representation of the Workflow.
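The sorting-and-serial-naming stage described above can be sketched as follows; this is an illustrative reconstruction under the assumption of a top-to-bottom, then left-to-right ordering by upper-left point, and all names are hypothetical:

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch of the sorting-and-naming stage: entities are ordered
// by their upper-left corner (top-to-bottom, then left-to-right) and given
// serial names such as Fork1, Fork2. Not the thesis code.
public class NamingSketch {

    // upperLeft[i] = {x, y} of entity i's upper-left point.
    // Returns a name per entity, in the entities' original order.
    static String[] serialNames(int[][] upperLeft, String kind) {
        Integer[] order = new Integer[upperLeft.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        // Sort indices by y first, then x, mimicking the assumed ordering.
        Arrays.sort(order, Comparator
                .<Integer>comparingInt(i -> upperLeft[i][1])
                .thenComparingInt(i -> upperLeft[i][0]));
        String[] names = new String[order.length];
        for (int rank = 0; rank < order.length; rank++)
            names[order[rank]] = kind + (rank + 1);
        return names;
    }

    public static void main(String[] args) {
        int[][] forks = {{40, 200}, {40, 80}};   // second fork is higher up
        System.out.println(Arrays.toString(serialNames(forks, "Fork")));
        // [Fork2, Fork1]
    }
}
```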
We developed an algorithm that searches all existing relationships of every diagram while detecting the direction of the arrows, in order to record the flow of information. In the context of our Thesis, eight relationships between entities are recognized: four unidirectional and four bidirectional. In addition, three of them (all the relationships of Decision Nodes) may carry binary conditions known as Guard Conditions. In short, we created eight functions for one direction of all relationships, plus another three for the other direction of the bidirectional ones (the Action-Action function can be used for both directions), plus another three for those with Guard Conditions. In total, we created fourteen functions, separating the original information into groups of two entities with a specified flow direction. These groups are called Sub-Activities, as they constitute parts of the overall Activity which each Diagram represents.

For the connection of our main program with the constructed Ontology, in order to transfer all the available information, we used an Open Source Java framework called the Jena API.

Implementation

Now it is time to verify the proper operation of UADxTractor through experiments. Every user of this tool will probably need to change only a limited number of variables: the name of the input image and the arguments of the detection algorithms. The values of those variables are defined externally in a text (.txt) file, so the user never needs to touch the main code.

Figure 9 - Hierarchical Structure of Ontology
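The Sub-Activity groups described above can be illustrated with a minimal sketch. The record and its fields are hypothetical stand-ins for the thesis's fourteen functions; only the printed format follows the tool's actual console output.

```java
import java.util.List;

public class SubActivityDemo {
    // One Sub-Activity: two entities joined by a line, with a fixed flow
    // direction and an optional guard condition (null when absent).
    public record SubActivity(String source, String line, String guard, String target) {
        @Override public String toString() {
            return guard == null
                    ? source + "->" + line + "->" + target
                    : source + "->" + line + "->" + guard + "->" + target;
        }
    }

    public static void main(String[] args) {
        List<SubActivity> subActs = List.of(
                new SubActivity("InitialNode", "Line1", null,
                        "Calculate_numbers_of_knocked_out_pins"),
                new SubActivity("Decision0", "Line6", "Non_Strike", "Display_Score"));
        for (int i = 0; i < subActs.size(); i++)
            System.out.println("Sub-Act[" + i + "]:" + subActs.get(i));
    }
}
```

Each Sub-Activity is what ultimately gets asserted into the Workflow_RDF ontology through the Jena API, one group at a time.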
We use a simple, yet representative, diagram found online for our experiments. It depicts the activity of displaying a bowling score (Figure 10). The pre-processing level comes next; its results are shown in Figure 11.

Figure 10 - Display Bowling Score Diagram
Figure 11 – Example of the pre-processing Level
Now the image is ready to be used by the OpenCV functions, and the detection of the entities and their relationships follows. In the large diagram on the left of Figure 12 we can see the overall results of the detection algorithms, while the four diagrams on the right depict the detection of the relationships between the entities. Throughout the whole procedure, explanatory messages are printed on the Eclipse console; examples of those messages are shown in Figure 13.

Figure 12 – Detection of Entities and Relationships
Figure 13 – Samples of Console Messages
The most complex level of processing is the direction detection; the outcome of our algorithm is presented in Figure 14. The last step of our proposed solution is the connection of the main tool with the Ontology, the storage construct that we chose. There are two ways to verify the unimpeded transfer of all the available information: the first is through console messages and the second is through Protégé. Both are shown below.

Method of Finding Sub - Activities has started...
InitialNode->Line1->Calculate_numbers_of_knocked_out_pins
Do you agree with this activity? Press y(Yes) or n(No)
y
So, the [Sub - Activities] of 2 entities are:
---------------------------------------------
Sub-Act[0]:InitialNode->Line1->Calculate_numbers_of_knocked_out_pins
Sub-Act[1]:Display_Score->Line7->FinalNode
Sub-Act[2]:Decision0->Line6->Non_Strike->Display_Score
Sub-Act[3]:Decision0->Line4->Strike->Show_Animation
Sub-Act[4]:Calculate_the_score->Line3->Decision0
Sub-Act[5]:Show_Animation->Line5->Display_Score
Sub-Act[6]:Calculate_numbers_of_knocked_out_pins->Line2->Calculate_the_score
--------------------
Number of Sub - Activities = 7
THE END!

Figure 14 – Example of the Direction Detection Algorithm
Conclusions

Through this work I discovered some of my strengths and put myself to the test to see how I deal with adversity. The lessons I took from this process are invaluable. My supervisor, Dr. Andreas Symeonidis, stood by me and shared his insight whenever it was needed; we kept in close contact, dividing my work into deliverables and discussing every issue. One of the main goals of the current paper is to continue the effort started with the Diploma Thesis of Anastasia Mourka (Use Case Diagrams), automating the conceptual reconstruction of every UML diagram and thus preparing the ground for the future creation of a semantic (UML) search engine. Anastasia was an excellent advisor at the beginning of this effort, providing valuable guidance as a former student of the same department.

This implementation was constructed based on the most common design patterns, but due to the vast number of existing alternative models it was impossible to cover all eventualities. For example, the analysis of images depicting Activity Diagrams with horizontal development exhibits some deficiencies. Nevertheless, the overall result, in the context of the analysis, is quite satisfactory, as the vertical development of these diagrams is the usual practice. The images used in the experiments were chosen so as to cover the widest possible range of the diagrams available online; they are of average complexity and most of them came from the well-known tools StarUML and ArgoUML.

In conclusion, the main success of UADxTractor, in addition to saving time for Software Engineers, is that it provides direct access to the Workflow of similar types of systems. This is something that can be useful both for humans (engineers) and for other programs.

Figure 15 – Display Bowling Score Instance in Protégé