Generating Requirements Analysis Models from Textual Requirements
João Santos, Ana Moreira, João Araújo, Vasco Amaral, Mauricio Alférez, Uirá Kulesza
Dept. Informática, FCT, Universidade Nova de Lisboa, Portugal
{jps, amm, ja, vasco.amaral, mauricio.alferez, uira}@di.fct.unl.pt
Abstract
Use case modeling is a commonly used technique to
describe functional requirements in Requirements
Engineering. Typically, use cases are captured from
textual requirements documents describing the
functionalities the system should meet. Requirements
elicitation, analysis and modeling are time-consuming
and error-prone activities, which are not usually
supported by automated tools. This paper tackles this
problem by taking free-form textual requirements and
offering a semi-automatic process for generation of
domain models, such as use cases. Our goal is
twofold: (i) reduce the time spent to produce
requirements artifacts; and (ii) enable future
application of model-driven engineering techniques to
maintain traceability information and consistency
between textual requirements and visual requirements
model artifacts.
1. Introduction
Requirements Engineering (RE) is the process of
discovering, documenting and managing requirements
for a computer-based system with the goal to identify
the purpose of a software system, and the contexts in
which it will be used [9]. Collecting and structuring
information associated with a certain domain problem
are some of the several activities present in a RE
process, and are usually known as requirements
elicitation and requirements analysis, respectively.
Requirements elicitation deals with the production of
(usually textual) documents from interviews,
ethnographic studies, etc. Requirements analysis
examines the contents of these documents to produce a
set of RE models. Therefore, individual requirements
are identified and structured into artefacts (e.g., use
cases [1], viewpoints [10], goals [5]).
In requirements engineering, and in particular in the
requirements analysis stage, use cases are a commonly
used technique to capture the functional requirements
of a system [7]. Use cases describe the interaction
between actors – external entities (users, systems) –
and the system functionalities. The system itself offers
desirable functionalities that help the user to achieve a
specific goal.
Creating requirements analysis models from textual
descriptions is a time consuming and error-prone
activity. Therefore, it would be useful to have
automatic or at least semi-automatic processes to
generate use case models from a document containing
a set of informal textual requirements descriptions.
This paper focuses on describing a process for
extracting use cases using concern information and
other useful data mined automatically. A Concern is an
area of interest or focus in a system. Our approach uses
output data generated by the EA-Miner tool. EA-Miner
[2, 15, 16] is an Eclipse plug-in that supports the
concern mining activities based on natural language
processing tools. To maintain traceability and
consistency between textual requirements and use case
abstractions, we use Model-Driven Engineering (MDE) [19]
techniques, more specifically metamodeling
strategies.
The remainder of this paper is organized as follows.
Section 2 gives background on the EA-Miner tool and
presents an overview of RDL, the requirements
language adopted by EA-Miner. Section 3 gives an
overview of our approach by describing the activities
involved in its realization. Section 4 describes how
traceability and consistency between different
abstractions are maintained. Section 5 illustrates the
application of our approach to an existing case study.
Section 6 presents current limitations found in the
EA-Miner tool and discusses possible solutions to deal with
them. Section 7 presents lessons learned.
Finally, Section 8 concludes the paper and provides
directions for future work.
2. Background
2008 First International Workshop on Managing Requirements Knowledge(MARK'08)
978-0-7695-3627-9/08 $25.00 © 2008 IEEE

Natural Language Processing (NLP) studies the
problems of automated generation and understanding
of natural human languages [8]. NLP is a very
attractive method of human-computer interaction. This
section gives background on: (i) EA-Miner, a
software tool that uses NLP techniques (Section 2.1);
and (ii) the Requirements Description Language
(RDL), the language adopted by EA-Miner to specify
requirements (Section 2.2).
2.1 EA-Miner
EA-Miner is an Eclipse plug-in that uses the natural
language processing features of WMatrix [14] to
identify base concerns, early aspects and crosscutting
relationships between them by mining different sources
of requirements (e.g., interviews, natural language
descriptions of the system). WMatrix is a web tool that
provides features such as syntactic and semantic
tagging, frequency analysis and concordances from
textual documents. In EA-Miner, a base concern is
any matter of interest at the requirements level that can
be decomposed into a unit of the chosen decomposition
model [3]. For example, if the chosen decomposition
model is viewpoint-oriented [12, 13], a base concern
is a viewpoint, whereas if the model is RDL-based [3],
a concern is identified and expressed in the RDL
language.
The main goal of EA-Miner is to help to achieve
separation of concerns at the requirements level by
modularizing crosscutting concerns as “early aspects”
[6]. Requirements elements (e.g., viewpoints, early
aspects) are presented to the user using different
visualization mechanisms in Eclipse. These
visualization mechanisms allow the user to see the
requirements allocated to the concerns. Moreover,
EA-Miner can generate initial specification
documents (such as a Word file) containing an initial
AORE [6] description of the requirements, or an
RDL-based description of the system in XML format.
EA-Miner receives semantic and syntactic analysis
results from WMatrix, processes them, and
finally extracts concerns, which are
categorized into three types:
• Noun-based concerns – nouns found
in the textual description of the system;
• Functional-based concerns – concerns
related to an activity or service. EA-Miner treats
words that denote an action as functional-based
concerns;
• Non-functional-based concerns – concerns
extracted from words related to non-functional
requirements (NFRs) [4]. If such a word is found in
the original textual document, it
is considered an NFR concern.
2.2 Requirements Description Language
RDL (Requirements Description Language) [3] is an
XML-based language, implemented in EA-Miner,
that enriches existing natural language
requirements specifications with semantic information
derived from the semantics of the natural language
itself [3]. It uses natural language to support
expressive semantic-based compositions within a
requirements description. It is based
on the symmetric view of AOSD (a Concern is used to
represent both crosscutting and non-crosscutting
elements). The main elements constituting the RDL
metamodel are presented next [3]:
• Requirement – is a description of a service the
stakeholders expect of the system. One or more
requirement elements are encapsulated within a
concern. A requirement is composed of: (i) Subject – is
the entity that undertakes actions described in the
sentence clause. It corresponds to the grammatical
subject in the clause; (ii) Object – is the entity that is
affected by the actions undertaken by the subject of the
sentence. Objects in RDL correspond to grammatical
objects in the clause; and (iii) Relationship – reflects
the interaction between the subject and the object.
• Concern – is a container for localizing
semantically related requirements. It can be simple
(containing only requirements), or composite
(containing other concerns as well as requirements),
thus allowing hierarchical structuring of related
requirements.
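The RDL elements described above can be pictured as a small data model. The following is only a minimal sketch: the class and field names are our own illustrative choices, not the actual RDL XML schema or the EA-Miner API, and the concern names in the example are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Requirement:
    """One RDL requirement, split into its three semantic parts."""
    text: str          # original sentence
    subject: str       # entity that undertakes the action
    relationship: str  # interaction between subject and object
    obj: str           # entity affected by the action


@dataclass
class Concern:
    """Container for semantically related requirements.

    A concern holding only requirements is simple; a concern that also
    holds nested concerns is composite.
    """
    name: str
    children: List[Union["Concern", Requirement]] = field(default_factory=list)

    def is_composite(self) -> bool:
        return any(isinstance(child, Concern) for child in self.children)

    def requirements(self) -> List[Requirement]:
        """Flatten the concern hierarchy into a list of requirements."""
        found: List[Requirement] = []
        for child in self.children:
            if isinstance(child, Requirement):
                found.append(child)
            else:
                found.extend(child.requirements())
        return found


# Hypothetical example: one requirement inside a nested concern.
req = Requirement("The user can make a reservation",
                  subject="user", relationship="make", obj="reservation")
booking = Concern("Reservation", [req])   # simple concern
system = Concern("System", [booking])     # composite concern
```

Here `system.requirements()` walks the hierarchy and returns the single requirement held by the nested concern.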
3. Extracting Analysis Models from
Textual Requirements
Our approach consists of a process to extract
candidate use cases from the output information mined
by the EA-Miner plug-in. The approach (presented in
Figure 1) is an iterative process that reflects both
requirements elicitation and analysis activities.
The pre-condition for applying the activities proposed
in this approach is that the functionalities that the
system should meet are described as textual
requirements in a document, and that this
document is given as input to EA-Miner (Package
Pre-Conditions in Figure 1).
Figure 1 - Requirements Elicitation and Analysis
Activities.
After satisfying the pre-conditions, our approach is
decomposed into two main phases, each
containing several activities:
Phase 1. Extraction of Candidate Use Cases – In
the first phase of our approach, candidate use cases are
extracted from output information gathered by the EA-
Miner plug-in. The extracted information includes use
cases, actors, relationships between use cases and
actors and information about identified related use
cases. Relationship information between use cases
helps the modeler to identify possible includes and
extends relationships in the refinement phase.
Phase 2. Candidate Use Case Model Refinement
– In the second phase of our approach, the user refines
the candidate use cases extracted in the previous phase.
The refinement includes the following sub-activities:
• Delete undesirable use cases and actors;
• Complete abstraction names;
• Add new use case abstractions and relate them
to textual requirement sentences;
• Analyze related use cases and create relationships
(includes, extends, and inherits) between use cases;
• Model the detailed behavior of use cases using
activity diagrams.
These activities do not follow any particular order, so
they can be carried out at any time during the refinement
phase. The next sections present all these activities in detail.
3.1 Extracting Candidate Use Cases from EA-
Miner Output Data
This section describes in detail the process for
extracting candidate use cases from output data
generated by EA-Miner. Figure 2 depicts the EA-
Miner output elements used in the proposed process
for extracting use case abstractions.
[Figure 2 diagram elements: WMatrix (NLP Processor) with
Semantic Analyzer and Syntactic Analyzer; EA-Miner;
Functional Concerns; RDL Sentences; Tagged Document;
Process for extracting Use Cases; Use Case Model]
Figure 2 - Process to extract use case abstractions
from EA-Miner.
EA-Miner decomposes textual requirements
documents into RDL sentences, extracts functional
concerns and also returns the document tagged by
WMatrix with syntactic information. All these
elements are used to derive candidate use case
model abstractions.
The use case elements extracted in this process are:
use case names, actors, relationships between use cases
and actors and information about related use cases.
In use case models, use case names usually have the
following format:
Verb + Noun
where the verb (in infinitive form) denotes the
action performed by the actor and the noun denotes the
object affected by the action. For example: Register
Vehicle.
Functional-based concerns detected by EA-Miner
denote actions, so they are good candidates for
the verbs in use case names. Since we
want the infinitive form of the verb, we obtain it
by searching the provided tagged document
for the lemma of that verb. The lemma
of a word is its base (dictionary) form. For
example, the lemma of the word allowing is allow.
On the other hand, the Object abstraction present in
RDL sentences (see RDL metamodel in [3]) is used to
derive the noun part of the use case names. Candidate
actors are also created based on RDL sentences. The
abstraction Subject present in RDL sentences is used to
derive candidate actors, since subjects in requirements
sentences typically denote the person or thing that
performs an action.
Figure 3 synthesizes how the EA-Miner output
information is used to extract candidate use cases,
actors and relationships between use cases and actors.
[Figure 3 diagram elements: Requirement Sentence; RDL
Sentence (Subject, Relationship, Object); Functional
Concern; Functional Concern Lemma; use case name as
Verb (Infinitive Form) + Noun]
Figure 3 – Generating use case abstractions from
RDL and functional concerns.
A relationship between a candidate actor and a use
case is created if both abstractions were derived from
the same requirement sentence. Figure 4 illustrates an
example of a sentence given as input to EA-Miner.
The user can make a reservation
[Figure 4 annotations: user = RDL Subject; make =
Functional-based Concern; reservation = RDL Object]
Figure 4 - Sentence decomposed into Subject,
Functional-based Concern and Object.
EA-Miner detected the word make as a functional-
based concern. As expected, the lemma of the word
make is also make, since it is already in its most basic
form. The decomposition into RDL abstractions resulted
in the word user as a Subject and reservation as an
Object.
Figure 5 shows the resulting use case and actor,
and respective associations between these two
abstractions using the strategy defined previously.
Figure 5 - Example of an extracted use case, actor
and relationship between them.
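The derivation strategy above can be sketched in a few lines of code. This is an illustrative sketch only: the function, the class, the sentence identifier "S1" and the dictionary standing in for the lemma information of the tagged document are our own assumptions, not part of EA-Miner or WMatrix.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CandidateUseCase:
    name: str         # "Verb + Noun" use case name
    actor: str        # candidate actor, taken from the RDL Subject
    sentence_id: str  # requirement sentence both abstractions came from


def derive_use_case(sentence_id: str, subject: str, concern_word: str,
                    obj: str, lemmas: Dict[str, str]) -> CandidateUseCase:
    """Build a candidate use case and its associated actor from one sentence."""
    word = concern_word.lower()
    # The infinitive form is the lemma recorded for the concern word;
    # fall back to the word itself when no lemma is recorded.
    verb = lemmas.get(word, word)
    name = f"{verb.capitalize()} {obj.capitalize()}"
    # Actor and use case come from the same sentence, so they are associated.
    return CandidateUseCase(name=name, actor=subject.capitalize(),
                            sentence_id=sentence_id)


# Lemma table standing in for the tagged document (illustrative values).
lemmas = {"make": "make", "allowing": "allow"}
uc = derive_use_case("S1", "user", "make", "reservation", lemmas)
print(uc.name, "<->", uc.actor)  # prints: Make Reservation <-> User
```

Keeping the sentence identifier on the result is what later allows a use case to be traced back to its source sentence.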
For each new iteration of the process, the document
containing textual descriptions of requirements
changes (e.g. resulting from new interviews with
stakeholders). Some textual requirements are removed
from the document, whereas others are added, and
finally others remain unchanged.
It is convenient for the modeler to see these
changes in the generated use case models. To address
this need, we annotate each use case using
UML stereotypes. When a sentence related to
a use case has been removed from the source document
in a new iteration, the generation process annotates this
use case with an <<obsolete>> stereotype.
When a new use case is created based on a new textual
requirement, this use case is annotated with a
<<new>> stereotype. These annotations help the
modeler to delete obsolete use cases and see the new
functionalities added in the new iteration.
To identify functional crosscutting
concerns, EA-Miner detects repeated occurrences
of action words, which may suggest the presence of a
functional aspect. EA-Miner also identifies
relationships between concerns. Two concerns
(regardless of concern type) are related if they
appear in the same requirement sentence.
In our approach, we consider a crosscutting
functional concern to be a good candidate for a use case
that is included by, or extends, another one. As an
example, consider the next two textual requirements:
The user can delete a file, if he is logged.
The user can add a file, if he is logged.
The functional concern logged is repeated in both
sentences and is consequently detected as a
functional crosscutting concern. Moreover,
this crosscutting functional concern is related to both
the functional concerns delete and add. So, the
functional concern logged is a good candidate to be
included by, or to extend, both the delete and add
functional concerns.
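The heuristic just described can be sketched as follows. This is not EA-Miner's implementation, whose source code was not available to us; the function name, the sentence identifiers and the reading of "repeated" as "appearing in at least two sentences" are our own assumptions. Concerns are given by their lemmas (log for logged).

```python
from collections import defaultdict
from typing import Dict, List, Set


def find_crosscutting(concerns_by_sentence: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each crosscutting functional concern to its related concerns.

    A concern is treated as crosscutting when it appears in at least two
    sentences; two concerns are related when they share a sentence.
    """
    sentences_of: Dict[str, Set[str]] = defaultdict(set)
    for sentence_id, concerns in concerns_by_sentence.items():
        for concern in concerns:
            sentences_of[concern].add(sentence_id)

    result: Dict[str, Set[str]] = {}
    for concern, sentence_ids in sentences_of.items():
        if len(sentence_ids) < 2:
            continue  # not repeated, so not a crosscutting candidate
        related: Set[str] = set()
        for sentence_id in sentence_ids:
            related.update(concerns_by_sentence[sentence_id])
        related.discard(concern)
        result[concern] = related
    return result


# The two example requirements above, reduced to functional concern lemmas.
example = {
    "S1": ["delete", "log"],  # The user can delete a file, if he is logged.
    "S2": ["add", "log"],     # The user can add a file, if he is logged.
}
crosscutting = find_crosscutting(example)
print(sorted(crosscutting["log"]))  # prints: ['add', 'delete']
```

Only log is reported, because delete and add each occur in a single sentence; its related concerns are the candidates for includes or extends relationships.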
3.2 Refining Use Case Models
The activity of refining the generated use case
model includes a set of non-ordered sub-activities
which are presented below in detail:
1. Remove undesirable use case abstractions –
if the original textual document describes not only
the functionalities that the system should meet but also
other kinds of content (an introduction to the
company involved in the system, for example), this
will generate additional use cases which may be
undesirable in the generated use case model. The
modeler then has to manually delete these undesirable use
cases. In subsequent iterations of the process, these
abstractions remain deleted. So, if an undesirable use
case was deleted in a previous iteration, it
will not appear when the use case model is generated
for a new iteration.
2. Complete abstraction names – there are
situations where extracted use case names are
incomplete or hard to understand. Using the traceability
information between generated use cases and RDL
sentences (described in Section 4), the sentence related
to a use case can be highlighted within the
document, so that the user can analyze the context
in which the use case was created through the related
sentence and, if necessary, the neighboring text.
Knowing the context makes it easier to
complete the use case name. The same
applies to extracted actors.
3. Add new use cases and actors – there are two
situations that could result in a desirable use case or
actor not appearing in the candidate use case model:
• The functionality was present in the original
textual document; however, EA-Miner did not
detect it. In fact, we verified that EA-Miner has
some limitations in detecting functional concerns
and abstractions in RDL sentences, the elements
used to extract candidate use case models. Some
of these limitations are described in Section 6.
• The functionality was not present in the
original textual document, and the requirements engineer
identified a new functionality by means of an
interview with the system users. In this situation, it
is necessary to document the new requirement
textually and regenerate the candidate use case
model. Another option is to manually create a new
use case directly in the use case model, without the
need to update and re-process the textual
document.
After creating a new use case or actor, the user must
relate it to the corresponding RDL
sentence in the textual document. This relation
is then saved in the metamodel presented in Section 4.
4. Analyze and create relationships (includes,
extends, and inherits) between use cases –
information about relationships between use
cases resulting from the extraction process is
presented to the user. This information helps
the user to manually create includes and
extends relationships between use cases where
desirable.
5. Refactor use cases as steps – some documents
already contain use case descriptions, whereas
others contain more refined descriptions (steps
or even operations). In fact, requirements can
be described at different levels of abstraction
and consequently generate functional concerns
at different levels. To address this granularity
problem, the user can detail the behavior of
each use case, by creating an activity diagram
and relating it with the use case, similarly to
several existing UML-based methods, such as
RUP [11]. As a result, a modeler could consider
an extracted functional concern in a more
refined level, i.e., as a step of a use case
(represented as an activity in the related activity
diagram).
A use case that already has an includes or
extends relationship with another use case can
also be specified as a step (an activity of the
activity diagram). If a use case
UCy is considered a step of use case UCx, and
UCy has a relationship (includes or
extends) with use case UCz, then UCy becomes a
step of UCx and a relationship with the same
direction is created between UCx and UCz.
Figure 6 illustrates this example.
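[Figure 6 diagram: before the refinement, use cases UCx,
UCy and UCz, and an activity diagram for UCx with
Activity A1 and Activity A2; after the refinement, use
cases UCx and UCz, and an activity diagram for UCx with
Activity A1, Activity A2 and UCy]
Figure 6 - Refining use cases as steps.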
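The rule for refining a use case as a step can be sketched as a small model transformation. The class below is a hypothetical in-memory structure written for illustration, not the tool's actual model; dropping a relationship that would end up linking a use case to itself is our own assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# (source use case, kind, target use case), e.g. ("UCy", "includes", "UCz")
Relationship = Tuple[str, str, str]


@dataclass
class UseCaseModel:
    use_cases: Set[str] = field(default_factory=set)
    relationships: Set[Relationship] = field(default_factory=set)
    # Steps (activities) recorded per owning use case.
    steps: Dict[str, List[str]] = field(default_factory=dict)

    def refactor_as_step(self, step_uc: str, owner_uc: str) -> None:
        """Turn step_uc into a step of owner_uc and re-attach its relationships."""
        self.use_cases.discard(step_uc)
        self.steps.setdefault(owner_uc, []).append(step_uc)
        rewired: Set[Relationship] = set()
        for source, kind, target in self.relationships:
            # Replace the refactored use case by its owner, keeping direction.
            source = owner_uc if source == step_uc else source
            target = owner_uc if target == step_uc else target
            if source != target:  # assumption: self-relationships are dropped
                rewired.add((source, kind, target))
        self.relationships = rewired


model = UseCaseModel(
    use_cases={"UCx", "UCy", "UCz"},
    relationships={("UCy", "includes", "UCz")},
)
model.refactor_as_step("UCy", "UCx")
print(sorted(model.use_cases), sorted(model.relationships), model.steps)
```

After the call, UCy is listed as a step of UCx and the includes relationship runs from UCx to UCz.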
4. Maintaining Traceability and
Consistency between Textual
Requirements and Use Cases
To preserve the traceability information between
RDL sentences and generated use cases, our approach
uses MDE techniques. In particular, a traceability
metamodel (Figure 7) is used to link abstractions from
the RDL language and the requirements models. This
offers the engineer the possibility to see how
requirements abstractions are associated with RDL
sentences, by navigating across abstractions of the
different models.
Figure 7 depicts the traceability metamodel. The
Trace Link abstraction maintains the traceability
information between RDL sentences and use case
abstractions (Actor and Use Case). Traceability
between RDL sentences and the Activity abstraction is
also maintained since generated use cases can be
considered steps (Activities). Each time the user
identifies a use case as a step, this new step becomes
an Activity that is linked to the corresponding RDL
identifier. The OtherElems abstraction represents other
nodes in activity diagrams (fork, join, decision, etc.).
The boolean attribute deleted on the Actor, Use Case and
Activity abstractions indicates whether the user has
deleted the generated abstraction. This is needed for
consistency and regeneration purposes, as explained
later.
[Figure 7 diagram. (a) General Strategy Adopted: the
Traceability Metamodel traces to the RDL Metamodel and to
the Requirement Metamodel. Packages and elements shown:
RDLModel (RDL Sentence, Subject, Relationship, Object);
TracingModel (Trace Link); UseCaseModel (UCElement with
Id : String and name : String; Use Case and Actor, each
with deleted: boolean); ActivityModel (ADNode with
Id : String; Activity with name : String and
deleted: boolean; OtherElems)]
Figure 7 – Traceability supporting strategy between
RDL sentences and requirements abstractions.
Requirements elicitation and analysis are part of
an iterative process in requirements engineering. In
each iteration, a new interview with the stakeholders is
performed and, consequently, the document containing
textual requirements is modified, adding new
desirable functionalities for the system and possibly
removing undesirable ones.
Since, in our approach, candidate use cases are
generated from changing textual documents, it is
necessary to maintain consistency between the
textual requirements and the visual models. This
consistency is addressed by saving relevant information
in the metamodel. To preserve consistency between the
document containing textual requirements and the use
cases, we defined a set of rules that must be
followed when generated abstractions are
deleted or the model is regenerated from the source
document:
1. When the user deletes a generated use case,
actor or activity from the candidate elements, the
traceability information is preserved in the metamodel
and the attribute deleted of the use case, actor or
activity is set to true. This avoids the deleted
abstractions appearing again in the regenerated model.
2. When the user regenerates the use case model
from the textual document, three situations can occur
for a given requirement sentence:
a. The sentence has been removed
from the source document – in this situation, the related
use case, actor or activity is annotated with an
<<obsolete>> stereotype, indicating that it no longer has
a related RDL sentence. The user should delete the use
case (if it does not contain any steps) or relate
it to a sentence of the source document.
b. The sentence has not been modified in the
source document – in this case, all actors, use cases or
activities related to this textual sentence are
maintained. If the abstraction has the attribute deleted
set to true, this value should be maintained.
c. The sentence is new in the source
document – in this case, a new use case with stereotype
<<new>> is created by applying the process for
extracting use cases defined previously.
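The regeneration rules above can be sketched as a single function. This is a simplified illustration under our own assumptions: sentences are matched by their exact text rather than by RDL identifier, stereotypes from earlier iterations are cleared, and the `extract` callback stands in for the extraction process of Section 3.1. The example sentences are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class TracedElement:
    name: str
    sentence: str                     # text of the traced RDL sentence
    deleted: bool = False             # set when the user deleted the element
    stereotype: Optional[str] = None  # "new", "obsolete" or None


def regenerate(previous: List[TracedElement],
               current_sentences: List[str],
               extract: Callable[[str], Optional[str]]) -> List[TracedElement]:
    """Apply rules (a)-(c) to obtain the model for the new iteration."""
    current = set(current_sentences)
    known = {element.sentence for element in previous}
    result: List[TracedElement] = []
    for element in previous:
        if element.sentence in current:
            # (b) unchanged sentence: keep the element and its deleted flag.
            result.append(replace(element, stereotype=None))
        else:
            # (a) removed sentence: keep the element but flag it as obsolete.
            result.append(replace(element, stereotype="obsolete"))
    for sentence in current_sentences:
        if sentence in known:
            continue
        known.add(sentence)
        name = extract(sentence)
        if name is not None:
            # (c) new sentence: create a use case flagged as new.
            result.append(TracedElement(name, sentence, stereotype="new"))
    return result


previous = [
    TracedElement("Make Reservation", "The user can make a reservation."),
    TracedElement("Cancel Reservation", "The user can cancel a reservation.", deleted=True),
    TracedElement("Print Ticket", "The user can print a ticket."),
]
sentences = [
    "The user can make a reservation.",
    "The user can cancel a reservation.",
    "The user can pay a reservation.",
]
model = regenerate(previous, sentences, lambda s: "Pay Reservation")
for element in model:
    print(element.name, element.deleted, element.stereotype)
```

Because the deleted flag travels with the element, Cancel Reservation stays hidden after regeneration, while Print Ticket is marked obsolete and Pay Reservation is marked new.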
The use of the traceability model of Figure 7 also
helps the modeler to complete use case names as, in
some cases, the verb and noun are not enough to
completely understand what the functionality provided
by the use case is. Therefore, for each use case, the
user can navigate to the related RDL sentence, which is
highlighted within the source document,
understand the context in which the use case name
appears, and complete the use case name if necessary.
5. Applying the Approach to a Case Study
This section illustrates the application of our
approach to the Mobile Media system [20]. Mobile
Media is an application that manipulates photos, music
and videos on mobile devices, such as mobile phones.
To mine abstractions with EA-Miner, it is
necessary to list textually the requirements that
Mobile Media should meet. These textual requirements
are listed next. Each sentence is identified with Ri,
where i follows the EA-Miner numbering of each
sentence.
R1. The prototype application is a media album appropriate for
cellular phones, personal digital organizers or Blackberry devices
although it is not limited to these.
R2. Application offers a set of functionalities which are listed next.
R3. A user can create new Media, including photos, videos or
sounds.
R4. User can then add created media to an album.
R5. System must store media in RMS format.
R6. The user can also create a new media album, allowing the
storage of related categories of pictures, sounds or videos on the
device.
R7. The user can only create media if logged.
R8. The system should persist the album information in RMS
format.
R9. User can also delete a media file or an Album.
R10. The system must verify if the album to delete is already empty.
R11. To delete media files and albums, the user must be logged.
R12. Users should also be able to list albums, allowing the view of
all albums contained in the device.
R13. The system also allows a user to list media included in the
device.
R14. User can label media files.
R15. Labels appear in the display list, and could be used for future
search functionality.
R16. User can play media which is in the device file system.
R17. The user can send a media via SMS or via Email, including
photo, video and audio.
R18. User can link a Media with an Address Book Entry, allowing a
user to associate an entry in their contact list with a media in their
album.
R19. The system also allows to configure the media of incoming
caller, which intercepts incoming phone calls and show photo or
plays audio of the associated caller.
R20. The user can hear the ringtone of an Incomming Call, which
intercepts incoming phone calls and plays a custom ring-tone of the
associated caller while ringing.
R21. The user can configure favorite media, to include a media in
the set of favorites.
R22. After configuring favorite media, the user can get the list of
favorite media.
The output data generated by EA-Miner for Mobile
Media can be found in [17].
5.1 Extracting Candidate Use Case Model from
EA-Miner
Table 1 shows the list of abstractions extracted by
EA-Miner. For each requirement sentence, the table
includes the functional concern lemma, the object and the
subject. These are the elements used to generate the
candidate use case model, according to the strategy
presented in Section 3.1.
Table 1 - Extraction of Use Cases abstractions from
EA-Miner.
Figure 8 shows the resulting use case model,
created using the information provided in Table 1.
Figure 8 - Extracted Use Case Model.
After the candidate model is generated, information
about related use cases is also provided. Table 2 lists
the detected crosscutting functional concerns and
their related functional concerns.
Table 2 - Crosscutting and related functional concerns
Crosscutting Functional Concern    Related functional concerns
Create (R3, R6, R7)                Log (R7)
List (R2, R12, R13)                Configure (R22)
Configure (R19, R21, R22)          Get (R22)
Log (R7, R11)                      Create (R7), Delete (R11)
This table supports the creation of includes or extends
relationships between use cases in the refinement
phase of our process.
5.2 Refining Candidate Use Case Model
In the process of refining generated use cases, the
modeler is responsible for adapting the
generated candidate use case models. This includes
deleting undesirable use cases, completing use case
names if necessary, adding relationships between use
cases (<<includes>>, <<extends>> and
<<inherits>>) based on relationship information
between generated use cases, adding other desirable
abstractions and, finally, considering use cases as steps
(represented as activities in the corresponding activity
diagram). Figure 9 shows the resulting use case model
after the refinement.
Figure 9 – Refined use case model for Mobile
Media Case Study.
Each use case name has been completed based on
the corresponding RDL sentence and an analysis of the
context in which it appears. The parts of the use case
names in bold correspond to parts manually added by the
user. Use cases created manually have both their name and
their reference to the sentence in bold. For example, the
use cases Send Media via Email and Send Media via SMS
were manually created and associated with the
requirement sentence R17.
By analyzing relationships between generated use
cases, includes relationships between Log in and
Create Media and also between Log in and Delete
Media File were created.
We considered the use case Persist information in
RMS format as a step of the use case Create new
Media. Figure 10 illustrates this refinement, performed
as a drag-and-drop-like operation.
Figure 10 - Considering Persist using RMS as a step
of Create New Media use case.
Steps included in a use case are modelled as
activities in the corresponding activity diagram. The
activity diagram for the use case Create new Media is
represented in Figure 11.
[Figure 11 diagram, activities: Get name for new media
file; Persist information in RMS format (R8); Show
Message ("Media created with success!")]
Figure 11 - Activity model for use case Create new
Media.
As Figure 11 shows, the Persist information in RMS
format functionality (shown in a different colour)
became a step in the Create new Media use case
as a result of this refinement. The remaining
activities and flows
between activities were created manually.
6. EA-Miner Limitations and
Recommendations
After applying EA-Miner to Mobile Media and other system
descriptions, we verified that some
functional concerns and RDL abstractions (Subjects
and Objects) needed in the proposed extraction
process were not detected. In fact, on average,
about 50% of functional concerns and 81% of
subjects are detected [18]. Consequently,
the proposed process for extracting use cases did not
derive all desirable use cases. In fact, we
verified that these abstractions are only detected if
requirements sentences are written in a certain way.
We will not list all the ways to describe a requirement.
However, we can exemplify using two ways of
describing the same requirement:
1. The user can make a reservation.
2. The user makes a reservation.
We verified that in the first sentence, EA-Miner
detected the word make as a functional concern.
However, in the second sentence, the functional
concern makes was not detected.
Since the source code of the framework
implementation was not available, we are only able to
recommend improving the implementation of
this plug-in so that it better detects functional concerns
and better decomposes textual requirements into RDL
sentences. We also verified that subject and
object RDL abstractions were sometimes not detected.
The non-detection of a functional concern by EA-Miner
has two consequences in our approach: (i) a use case
is not created; and (ii) a relationship between two
use cases may go undetected. In the latter case, the
user is unable to see that a functional
concern is crosscutting, and consequently an includes
or extends relationship may not be created.
Another recommendation for the EA-Miner
implementation is to change the way a crosscutting
functional concern is detected. We verified that
considering only repeated occurrences of action words
as a heuristic to generate candidate crosscutting
functional concerns is not enough to filter the set of
results. We believe that the context should be better
analyzed in these situations, for example, by counting
the occurrences in which a certain functional concern
is related to an object.
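The difference between the two heuristics can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not EA-Miner's implementation (whose source code was not available): the input format of (action word, object) pairs, the function names and the threshold are hypothetical, and we read "related with an object" as "recurring together with the same object".

```python
from collections import Counter

# Hypothetical mined data: one (action word, object) pair per requirement.
PAIRS = [
    ("make", "reservation"),
    ("make", "payment"),
    ("make", "copy"),
    ("send", "sms"),
    ("send", "sms"),
]


def by_repetition(pairs, min_count=2):
    """Baseline heuristic: flag any action word that occurs repeatedly."""
    counts = Counter(action for action, _ in pairs)
    return {action for action, n in counts.items() if n >= min_count}


def by_object_context(pairs, min_count=2):
    """Context-aware variant (assumed reading): flag an action word only
    when it recurs together with the same object."""
    counts = Counter(pairs)
    return {pair for pair, n in counts.items() if n >= min_count}


print(sorted(by_repetition(PAIRS)))      # ['make', 'send']
print(sorted(by_object_context(PAIRS)))  # [('send', 'sms')]
```

In this toy data the action word "make" repeats three times but always with a different object, so the baseline flags it while the context-aware count does not.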
7. Lessons Learned
We defined a process that uses output information
mined by EA-Miner to automate the creation of
candidate visual domain models (use cases) from
requirements described textually. We verified that
important abstractions needed in our approach
(functional concerns and RDL abstractions) were
detected by EA-Miner. However, a significant part of
these abstractions was not detected, resulting in an
incomplete candidate use case model.
Although limitations were found in EA-Miner, we
believe that improvements to the implementation of
this framework would decrease the time spent on the
following activities:
• Making association links between use cases
and actors – the links between use cases and actors
are automatically created.
• Creating use case names – use case names were
automatically derived. When use case names were not
complete or well understood, the traceability links
between textual requirements and candidate use cases
helped us to complete the use case names, since textual
requirements gave the context where the candidate use
case arises.
• Creating relationships between use cases – the
user can see detected relationships between generated
use cases. This facilitates the process of creating
includes and extends relationships between use cases.
• Refining use cases with activity diagrams – the
time spent to refine use cases with activity diagrams
potentially decreases since some activities are already
derived from extracted functionalities present in the
textual descriptions of functionalities. More systematic
empirical studies still need to be carried out in order to
better understand the benefits that EA-Miner can bring
to this activity.
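The first two activities above (actor links and use case names, with traceability back to the text) can be illustrated with a minimal Python sketch. It is not our tool's implementation: the RdlSentence and UseCase classes, their fields, and the naming rule "functional concern + object" are assumptions made for this example only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class RdlSentence:
    """Hypothetical stand-in for one mined RDL sentence."""
    sentence_id: int
    text: str
    subject: Optional[str]  # candidate actor
    verb: Optional[str]     # functional concern; None if not detected
    obj: Optional[str]


@dataclass
class UseCase:
    name: str
    actors: set = field(default_factory=set)
    traces: list = field(default_factory=list)  # ids of source sentences


def derive_use_cases(sentences):
    """Build candidate use cases, actor links and traceability links."""
    use_cases = {}
    for s in sentences:
        if s.verb is None:
            continue  # no functional concern detected: no use case
        name = s.verb if s.obj is None else f"{s.verb} {s.obj}"
        use_case = use_cases.setdefault(name, UseCase(name))
        if s.subject is not None:
            use_case.actors.add(s.subject)  # association link to the actor
        use_case.traces.append(s.sentence_id)
    return use_cases


SENTENCES = [
    RdlSentence(1, "The user can make a reservation.", "user", "make", "reservation"),
    RdlSentence(2, "The user makes a reservation.", "user", None, "reservation"),
    RdlSentence(3, "The clerk can make a reservation.", "clerk", "make", "reservation"),
    RdlSentence(4, "The system can send.", "system", "send", None),
]

MODEL = derive_use_cases(SENTENCES)
for uc in MODEL.values():
    print(uc.name, sorted(uc.actors), uc.traces)
```

Sentence 2 yields no use case because its functional concern is missing, and sentence 4 yields the incomplete name "send"; in that situation the stored trace (sentence id 4) lets the modeller go back to the text to complete the name.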
We also observed that if the document describes
only well-formed functionalities that the system must
meet, the chance of getting undesirable abstractions
(use cases or actors) from the extraction process
decreases. Another important lesson learned is that
MDE techniques, i.e., the use of a metamodel
integrating textual and use case abstractions together
with adequate supporting tools, allow us to address
consistency and traceability between elements from the
two metamodels (RDL and requirements metamodel).
Our approach has shown that the requirements
analysis process for requirements engineering can be
improved by adopting a mining tool (such as
EA-Miner) in conjunction with a set of heuristics that
allows the creation of initial versions of RE models
based on existing textual requirements. However, in
order to allow the use of this solution strategy in an
industrial context, it is important to define
functionalities capable of discarding or filtering
information that is unnecessary for a given domain. In
our approach, if the source document is too large, it
will generate many use cases that are probably not
desirable. When a use case model is generated for the
first time (in the first iteration of requirements
analysis), significant time is spent on correcting,
refining and relating the created use cases to textual
sentences. However, we believe that this additional
time can be compensated in subsequent iterations
since, when the model is regenerated, the modeller can
see explicit indications of modifications to the
document: (i) annotations in use cases denoting that
the requirements sentence has been deleted from the
document; and (ii) new use cases denoting new
requirements in the source document.
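The two kinds of indication shown on regeneration can be sketched as a comparison between two generations of the model. The Python fragment below is a simplified illustration, not our implementation: representing each generation as a mapping from requirements sentence to use case name, the function name, and the choice to keep (rather than remove) use cases whose sentence was deleted are assumptions of this example.

```python
def compare_generations(previous, regenerated):
    """Compare two generations of a candidate use case model.

    Both arguments map a requirements sentence to the name of the use
    case derived from it (hypothetical representation).
    """
    # (i) Use cases whose source sentence is gone are kept but annotated,
    # so that refinements made by the modeller are not silently lost.
    annotations = {
        use_case: "source sentence deleted from document"
        for sentence, use_case in previous.items()
        if sentence not in regenerated
    }
    # (ii) Sentences found only in the new version yield new use cases.
    new_use_cases = sorted(
        use_case
        for sentence, use_case in regenerated.items()
        if sentence not in previous
    )
    return annotations, new_use_cases


V1 = {
    "The user can make a reservation.": "make reservation",
    "The user can cancel a reservation.": "cancel reservation",
}
V2 = {
    "The user can make a reservation.": "make reservation",
    "The user can pay a reservation.": "pay reservation",
}

ANNOTATIONS, NEW_USE_CASES = compare_generations(V1, V2)
print(ANNOTATIONS)
print(NEW_USE_CASES)
```

Here "cancel reservation" receives a deletion annotation and "pay reservation" is reported as new, while "make reservation" is left untouched.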
8. Conclusions and Future Work
This paper presented an approach to automatically
extract candidate use cases from documents containing
textual requirements described informally. For use
case extraction, we described how functional
concerns and RDL output information from the EA-Miner
plug-in could be used to generate candidate use case
models.
Traceability between RDL sentences and the
abstractions of the generated use case models is
supported in our approach by a metamodeling strategy.
It enables navigation across the abstractions of the
RDL sentences and the requirements models (use cases
and activity diagrams), helping the user to refine use
case models.
The problems associated with different levels of
abstraction present in textual requirements
descriptions, and consequently in candidate use cases,
were addressed by considering them as steps, a more
fine-grained element.
Our work is currently being extended to address
additional perspectives, such as: (i) providing more
explicit guidance for non-functional requirements,
since EA-Miner detects and extracts non-functional
concerns from textual documents; (ii) implementing a
complete visual tool that supports the functionalities
mentioned in the paper; (iii) integrating the proposed
approach with the history of the different versions of
requirements documents and generated models, in order
to allow better management of the evolution of the
system functionalities; and (iv) evaluating through
experiments whether the time spent on correcting the
generated use case diagrams is compensated in later
iterations.
Acknowledgements. The authors are partially
supported by EU Grant IST-33710: Aspect-Oriented,
Model-Driven Product Line Engineering (AMPLE).
References
[1] I. Alexander and N. Maiden, Scenarios, Stories, Use
Cases: Through the Systems Development Life-Cycle:
John Wiley, 2004.
[2] A. Sampaio, A. Rashid, and P. Rayson, "Early-AIM: An
Approach for Identifying Aspects in Requirements",
presented at Proceedings of the 13th IEEE
International Conference on Requirements
Engineering, 2005.
[3] R. Chitchyan, A. Rashid, P. Rayson, and R. Waters,
"Semantics-Based Composition for Aspect-Oriented
Requirements Engineering", in Proceedings of the 6th
International Conference on Aspect-oriented Software
Development, Vancouver, British Columbia, Canada,
ACM Press, 2007, pp. 36-48.
[4] L. Chung, B. A. Nixon, E. Yu, and J. Mylopoulos,
Non-Functional Requirements in Software
Engineering: Kluwer Academic Publishers, 1999.
[5] R. Darimont, E. Delor, P. Massonet, and A. van
Lamsweerde, "GRAIL/KAOS: An Environment for
Goal-Driven Requirements Engineering", in
Proceedings of the 19th International Conference on
Software Engineering, Boston, Massachusetts, USA,
ACM Press, 1997, pp. 612-613.
[6] Early Aspects.net, "Early Aspects",
http://www.early-aspects.net, 12.03.2007.
[7] I. Jacobson and P.-W. Ng, Aspect-Oriented Software
Development with Use Cases. Stoughton, MA, USA:
Addison-Wesley, 2004.
[8] D. Jurafsky and J. Martin, Speech and Language
Processing: An Introduction to Natural Language
Processing, Computational Linguistics and Speech
Recognition, second ed: Prentice Hall, 2008.
[9] G. Kotonya and I. Sommerville, Requirements
Engineering: Processes and Techniques: John Wiley,
1998.
[10] M. Mannion, B. Keepence, and D. Harper, "Using
Viewpoints to Define Domain Requirements", IEEE
Software, vol. 15, pp. 95-102, 1998.
[11] P. Kruchten, The Rational Unified Process: An
Introduction: Addison-Wesley Longman Publishing
Co., Inc., 2003.
[12] A. Rashid, A. Moreira, and J. Araújo, "Modularisation
and Composition of Aspectual Requirements", in
Proceedings of the 2nd International Conference on
Aspect-oriented Software Development, Boston,
Massachusetts, ACM Press, 2003, pp. 11-20.
[13] A. Rashid, P. Sawyer, A. Moreira, and J. Araújo,
"Early Aspects: A Model for Aspect-Oriented
Requirements Engineering", presented at Proceedings
of the 10th Anniversary IEEE Joint International
Conference on Requirements Engineering, 2002.
[14] P. Rayson, "Wmatrix Corpus Analysis and Comparison
Tool", http://www.comp.lancs.ac.uk/ucrel/wmatrix/,
15.03.2007.
[15] R. Chitchyan, A. Sampaio, A. Rashid, and P. Rayson,
"A Tool Suite for Aspect-Oriented Requirements
Engineering", presented at Proceedings of the 2006
International Workshop on Early Aspects at ICSE,
Shanghai, China, 2006.
[16] A. Sampaio, R. Chitchyan, A. Rashid, and P. Rayson,
"EA-Miner: A Tool for Automating Aspect-Oriented
Requirements Identification", in Proceedings of the
20th IEEE/ACM International Conference on
Automated Software Engineering, Long Beach, CA,
USA, ACM Press, 2005, pp. 352-355.
[17] J. Santos, "EA-Miner output data for Mobile Media
Case Study",
http://ample.di.fct.unl.pt/public/joao/EA-MinerData/MobileMedia/,
2008.
[18] J. Santos, "An Evaluation of EA-miner for Use Cases
extraction",
http://ample.di.fct.unl.pt/public/joao/EA-MinerData/AnEvaluationofEA-Minertoextractusecases.doc.
[19] M. Volter and T. Stahl, Model-Driven Software
Development. Glasgow, UK: Wiley, 2006.
[20] T. Young, "Using AspectJ to Build a Software Product
Line for Mobile Devices", University of Waterloo,
www.cs.ubc.ca/grads/resources/thesis/Nov05/Trevor_Young.pdf.
978-0-7695-3627-9/08 $25.00 © 2008 IEEE
