Hello everybody. This week, we are moving on to the third part of the Analysis phase. Just as a quick recap: we started with the planning phase of the Systems Development Life Cycle. During the planning phase, we looked at how to identify and initiate a project, how a system request is prepared based on business needs, how feasibility analyses are performed, and how projects are selected, planned, managed, and controlled. Once a project is approved for further analysis, we discussed how the functional and non-functional requirements of the system can be initially determined. Then, last week, we examined a way to describe in more detail the functional requirements that are directly related to the system's users, namely the use cases.
Let us pause a moment and think about what we are trying to accomplish by performing these activities: our main goal here is to create an information system. In other words, we have a task at hand and we want computers to help us perform that task. Computers are deterministic machines, and they cannot understand generic concepts and deduce information, unless (on rare occasions) they run software specifically written for this purpose, but that is outside the scope of our discussion. Because computers are deterministic machines, they need very specific instructions that tell them what we need them to do for us. So, the activities that make up the scope of this course aim to explain the tasks that we need computers' help with in a very specific way, so that computer programmers can easily translate those instructions into a language that computers can execute. That translation is the job of the computer programmer; however, as systems analysts or project managers we have a similar job: we need to be able to clarify the task to a level that is easy to explain to the programmers. To me, that clarification process is one of the learning objectives of this course.
So, this week, we are trying to move the definition and scope of the project, which started as a system request (just a description of the business need), one step closer to what the programmers can easily understand and implement. One of the tools that designers are familiar with is process models. Although there are different kinds of process models, we will follow the textbook and limit ourselves to learning the most common one, the Data Flow Diagram.
So, on our quest to keep clarifying and further specifying the requirements, a process model moves us a step beyond the level of specificity we achieved with use cases. If you recall, use cases were pretty flexible in terms of how detailed you wanted to make them; you had the option of writing casual use cases without inputs/outputs and input sources/output destinations. With process models, however, that is no longer optional.
Process models describe the business activities with graphics by breaking down use case sequences further into actionable steps.
And just like use cases, process models can be used to describe either the as-is system or the to-be system. It is also worth noting that process models do not necessarily have to describe information systems; they can also describe business workflows that do not involve any use of computers.
As we said before, we will use data flow diagrams to study process models. The data flow diagramming technique diagrams the business processes and the data that pass among them. Contrary to what the name implies, Data Flow Diagrams do not focus on data; their focus is on the business processes and how data flows among them. How the data is created and used by these processes is the topic of next week, when we will learn about data models.
It is also important to note that this lecture focuses on logical process models: that means we will focus on describing the processes, but we will not yet be interested in how they are conducted. For example, when we describe a process, we will not know whether a computer or just paper records are used to conduct it.
We will use the logical process models that we create here in the analysis phase later during the design phase, where we are interested in how to implement these processes.
Here is a sample Data Flow Diagram from the textbook. It is helpful to see how these diagrams are used to describe the flow of business processes. We will go into the details of the elements of Data Flow Diagrams and what each symbol means in the coming slides, but in a general sense, actions and entities are represented by symbols, and inputs and outputs are represented by arrows. They are typically read from left to right and top to bottom. And as you can imagine, they can be created easily if well-formed use cases exist. They are just more formalized than use cases and closer to how computers understand and process information.
Data Flow Diagrams consist of four main elements: process, data flow, data store and external entity.
A process is an activity or a function performed for some specific business reason, and processes are shown in Data Flow Diagrams either as a rounded rectangle or an oval, depending on the style used. Each process has a unique identifying number, a descriptive name, a description, and at least one input data flow and one output data flow. Descriptions can be simple text or use more formal techniques, and they usually grow as more information is gathered about the system. Based on its complexity, a process may be split into multiple processes.
A data flow is a single piece of data, or a logical collection of several pieces of information. Each data flow has a name and a description; the description should list every data item it includes. Each data flow should have at least one connection to a process, either incoming or outgoing. Data flows hold the processes together, and the arrowhead indicates the direction of data from one process, data store, or external entity to another. It is also important to note that a data flow is always one-way; if for some reason a two-way data flow is needed, it should be modeled as two separate data flows.
A data store is a collection of data that is stored in some way. It could be a spreadsheet, a paper file, or a table in an enterprise-level database. At this point in the analysis phase, we are not interested in how the data is stored. Each data store has a unique number, a descriptive name, and a description. A data flow into a data store means addition or change of data in the data store, and an output data flow from a data store indicates data retrieval. Data stores in process modeling are the first step of data modeling, and they are the main link between process modeling and data modeling.
And finally, an external entity is a person, organization, organizational unit, or system that is external to the system but interacts with it. Usually the primary actor of a use case is an external entity. People who use information from the system to perform other processes, or who provide information into the system, are also considered external entities.
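To make the four elements above concrete, here is a minimal sketch of how they could be represented as data structures. This is purely an illustration, not something from the textbook, and all names and values are made up:

```python
from dataclasses import dataclass, field

@dataclass
class ExternalEntity:
    name: str           # a person, organization, or outside system

@dataclass
class DataStore:
    number: str         # unique identifier, e.g. "D1"
    name: str
    description: str = ""

@dataclass
class Process:
    number: str         # unique identifier, e.g. "2.2"
    name: str           # should start with a verb and include a noun
    description: str = ""

@dataclass
class DataFlow:
    name: str
    source: object      # Process, DataStore, or ExternalEntity
    target: object      # a data flow is always one-way
    items: list = field(default_factory=list)  # data items it carries

# Hypothetical example: a salesperson sends offer details to a process.
salesperson = ExternalEntity("Salesperson")
record_offer = Process("1", "Record Offer")
offer_details = DataFlow("Offer details", salesperson, record_offer,
                         ["vehicle ID", "offer ID", "offer type"])
```

Note how the one-way nature of a data flow falls out of the structure: each `DataFlow` has exactly one source and one target, so a two-way exchange requires two separate objects.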
As we saw with use cases, process models can also be abstract or detailed. Most business processes are too complex to be explained in one Data Flow Diagram. So, one important principle in process modeling with Data Flow Diagrams is the decomposition of the business process into a series of Data Flow Diagrams, each representing less scope but more detail.
The first Data Flow Diagram in every business process is the context diagram, and there is only one context diagram per process model. It shows the entire system in context with its environment. The context diagram shows the overall business process as just one process and shows the data flows to and from external entities. The process, which here represents the entire system, has process number "0" and represents the most abstract view of the process model; it can also be called the collapsed view. Data stores are usually represented as external entities on the context diagram, unless they are part of the system, in which case they are contained within Process 0.
The next level of Data Flow Diagrams is the level 0 diagram, which details the single process of the context diagram.
The level 0 diagram shows all the major high-level processes of the system and how they are interrelated. It shows all the processes at the first level of numbering (1, 2, 3, and so on), along with the data stores, external entities, and data flows among them. Although it shows more detail than the context diagram, it is still a high-level representation of the system. And there is only one level 0 diagram for a particular process model.
As we go from more abstract to more detailed, we should pay attention to a key concept called balancing. This means we have to make sure that all information presented in a Data Flow Diagram at one level is accurately represented in the next-level Data Flow Diagram. In other words, when a parent process is decomposed into its children, the children must completely perform all of its functions.
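The balancing rule can be checked mechanically: whatever flows into and out of the parent process must appear, unchanged, at the boundary of the child diagram. Here is a minimal sketch of such a check in Python; the flow names are illustrative, not from the textbook:

```python
def is_balanced(parent_flows, child_boundary_flows):
    """True if the child diagram's boundary flows match the parent's exactly."""
    return set(parent_flows) == set(child_boundary_flows)

# Parent: Process 2 on the level 0 diagram, with two inputs and one output.
parent = {"Data Flow B (in)", "Data Flow C (in)", "Data Flow D (out)"}

# Child: the level 1 diagram for Process 2 must show the same boundary
# flows, no matter how many internal processes and flows it adds.
child = {"Data Flow B (in)", "Data Flow C (in)", "Data Flow D (out)"}

print(is_balanced(parent, child))  # True: the decomposition is balanced
```

If the child diagram dropped an output or invented a new input, the sets would differ and the decomposition would be unbalanced.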
Each process on the level 0 Data Flow Diagram can be further decomposed into a more explicit Data Flow Diagram called level 1 diagram.
This example shows the decomposition of Process 2 from level 0 as a level 1 Data Flow Diagram. The set of children and their parent are essentially identical; the children are simply a more detailed version of their parent process. Just as we did with the decomposition from the context diagram to the level 0 diagram, it is important to ensure that the level 0 and level 1 Data Flow Diagrams are balanced as well. In general, a process model has as many level 1 diagrams as there are processes on the level 0 diagram, because it is assumed that every process of level 0 should be defined in more detail, at least as a level 1 Data Flow Diagram or in even lower-level Data Flow Diagrams. The parent process and the children processes are numbered consistently: each child's number is the parent's number, followed by a dot and a consecutive number for each child. Traditional Data Flow Diagrams do not show external entities or data sources at level 1 and below. For example, on the level 1 diagram for Process 2, it is not obvious that Data Flow B comes from Process 1. Even though traditional Data Flow Diagrams do not show them, we will include them in this course, following our textbook.
The next level of decomposition is a level 2 diagram, which shows all processes, data flows, and data stores that make up a single process on the level 1 diagram.
This example shows the decomposition of Process 2.2 from level 1 as a level 2 Data Flow Diagram. Just like before, it is important to ensure that the level 1 and level 2 Data Flow Diagrams are balanced.
A process can produce different data flows under different circumstances. For example, an approval process may output approved or unapproved data flows. In those cases, we show each potential data flow separately and use the process description to explain why there are alternatives.
Some processes may be complex enough that a simple text description cannot completely describe how the process works internally. In such cases, along with a plain-English text description, the process description may also include a decision tree or a decision table. A decision tree displays the logic as IF statements, with nodes (questions) and branches (potential answers), and a decision table displays complex policy decisions by linking combinations of conditions with certain actions.
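To illustrate the difference between the two techniques, here is a small sketch in Python of the same (entirely made-up) approval policy expressed once as a decision table and once as a decision tree. The conditions, thresholds, and actions are hypothetical, not from the textbook:

```python
def approve_offer_table(trade_in, offer_ratio):
    """Decision table: each combination of conditions maps to an action."""
    table = {
        (True,  "high"): "approve",
        (True,  "low"):  "manager review",
        (False, "high"): "approve",
        (False, "low"):  "reject",
    }
    ratio_band = "high" if offer_ratio >= 0.95 else "low"
    return table[(trade_in, ratio_band)]

def approve_offer_tree(trade_in, offer_ratio):
    """Same policy as a decision tree: nodes are questions, branches answers."""
    if offer_ratio >= 0.95:        # Is the offer close to the asking price?
        return "approve"
    if trade_in:                   # Is there a trade-in vehicle?
        return "manager review"
    return "reject"

print(approve_offer_table(False, 0.90))  # reject
```

The table enumerates every condition combination explicitly, which makes gaps and contradictions easy to spot; the tree reads more naturally as nested IF statements. Both describe the same processing logic.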
Now that we have an understanding of what Data Flow Diagrams represent, what their elements and levels are, let us now study the process of creating them.
Data Flow Diagrams start with the information captured in the use cases and the requirements definition statement and, since they are more technical documents, they are created by the project team. Generally, the set of Data Flow Diagrams is formed by integrating the individual use cases, joined with any processes in the requirements statement for which no use cases were created. The project team takes the use cases and rewrites them as Data Flow Diagrams, following the formal Data Flow Diagram rules about symbols and syntax, using simple or more comprehensive CASE tools.
Some analysts start building Data Flow Diagrams by creating the level 0 diagram first; however, we will follow the method described by our textbook, which starts with the context diagram. That way, we are aware of all external entities and how they provide data to or receive data from the system at hand.
Then, we go to the use cases and create a Data Flow Diagram fragment for each use case. The more complete the use cases are, the easier it is to build Data Flow Diagram fragments from them.
Now that we have a set of Data Flow Diagram fragments based on the major use cases of the system, we organize them into a level 0 diagram that shows the high-level processes of the system and their interrelations.
We have combined all of our use cases into the level 0 Data Flow Diagram; now we can get into the details of each process in the level 0 diagram. Since each process of the level 0 diagram represents a use case, the steps in each use case can be used as the basis for level 1 diagrams, which can be further decomposed into level 2 diagrams, level 3 diagrams, and so on. If the use cases are detailed enough, lower-level diagrams can be created based on the sub-steps of the use cases.
Finally, the Data Flow Diagrams are validated for completeness and correctness.
Let us now briefly go over what happens at each of these Data Flow Diagram creation steps with the help of the Holiday Travel Vehicles sales system example from the textbook.
As we mentioned before, the context diagram defines how the business process or computer system interacts with its environment.
To create the context diagram, we first draw a process symbol for the business process or system being modeled, number it 0, and name it for that process or system. In our example, we are creating the Holiday Travel Vehicle Sales System, so we have a Process 0 with that name in the context diagram.
Next, we add all inputs and outputs listed on the use case forms as data flows, along with external entities as the sources or destinations of those data flows. As we transfer data flows from the use case forms to the context diagram, if there are too many of them, we may have to simplify them so that the context diagram is not too cluttered. For example, data inputs from the salesperson, such as vehicle ID, offer ID, offer type, etc., are grouped into "offer details" in the context diagram. These details will become apparent as the diagram is decomposed into lower levels.
It is best practice not to include data stores in the context diagram. Data stores that are part of external entities are described as external entities and data stores that are part of this business process or system are represented in the Process 0.
Next, we create Data Flow Diagram fragments for each use case. As the name suggests, a Data Flow Diagram fragment is one part of a Data Flow Diagram that will eventually be combined with other fragments to form a complete Data Flow Diagram.
At this step, each use case is converted into one Data Flow Diagram fragment using the name, the ID number, and the major inputs and outputs of the use case. The information about the major steps of the use case is ignored at this step but will be used later.
The most common changes made while converting use cases into Data Flow Diagram fragments are modifications of process names and additions of data flows. Use case names are not formal, but Data Flow Diagram names have to start with a verb and include a noun, so this may require a name change. It is also important to note that Data Flow Diagram names are usually written from the viewpoint of the organization that the system is being developed for, not from the viewpoint of customers or external entities. Since use cases are created from the viewpoint of the user, they may omit how the system obtains data from data stores, so we might have to create additional data flows to show how the system receives data from them.
As far as the layout of Data Flow Diagram fragments is concerned, there are no formal rules; however, typically the process is placed in the middle, the data stores are placed below the process, inputs enter from the left or top, and outputs leave from the right or bottom.
Once we have all the major Data Flow Diagram fragments created from the major use cases of the system, we combine them into the level 0 Data Flow Diagram for the system. Although there are no formal layout rules for combining Data Flow Diagram fragments into Data Flow Diagrams, generally we place the process that comes first chronologically in the upper-left corner and work our way from top to bottom and left to right, while trying to minimize the number of crossed data flow lines.
The main goal of creating Data Flow Diagrams is to better identify processes, data stores, and external entities and to understand how data flows among them. The initial Data Flow Diagram we come up with will help us with this goal, but it probably will not be a perfect diagram. Many iterations, with improved understanding of the system, may be necessary to arrive at a good Data Flow Diagram design.
Again following the use cases, at this point we have used all the information they provided except the major steps listed in each one. Each major step of a use case included in the level 0 diagram becomes a process on a level 1 Data Flow Diagram, with its inputs and outputs becoming the input and output data flows. The considerations for naming and additional data flows that we discussed in creating level 0 diagrams also apply here. Additionally, although traditional approaches choose not to include external entities in level 1 and lower diagrams, I agree with the textbook that including them can improve the readability of Data Flow Diagrams significantly.
An obvious question at this point is what the ideal level of decomposition should be. There is no simple answer to it, because it depends on the complexity of the system or business process being modeled. In general, we decompose a process into a lower-level Data Flow Diagram whenever the process is sufficiently complex that additional decomposition can help explain the process.
There are a couple of general rules of thumb that can help us decide whether we should decompose further. First, there should be at least 3, and no more than 7-9, processes in every Data Flow Diagram. So if we decompose a process into a Data Flow Diagram and it includes only a couple of processes, the new Data Flow Diagram is probably unnecessary. On the other end of the spectrum, if we have more than 9 processes in a Data Flow Diagram, it is probably time to decompose one or more of its processes into new Data Flow Diagrams. Usually, processes with many incoming and outgoing data flows are the better candidates for further decomposition. The second criterion we can use is the length of the detailed process description: if a detailed process description is longer than one page, that process is probably a good candidate for further decomposition into another Data Flow Diagram.
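The two rules of thumb above are simple enough to sketch as code. The function names and thresholds below just restate the lecture's heuristics; they are an illustration, not a formal rule:

```python
def needs_further_decomposition(process_count, description_pages):
    # More than 9 processes on one diagram, or a detailed process
    # description longer than one page, suggests decomposing into a
    # lower-level DFD.
    return process_count > 9 or description_pages > 1

def child_diagram_unnecessary(process_count):
    # Fewer than 3 processes suggests the child diagram was not needed.
    return process_count < 3

print(needs_further_decomposition(process_count=11, description_pages=1))  # True
print(child_diagram_unnecessary(process_count=2))                          # True
```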
The last step of building Data Flow Diagrams is to check their quality by identifying the most common errors.
The textbook has a very nice summary of Data Flow Diagram Quality Checklist in Figure 5-3.
Syntax errors are easier to find and fix than semantics errors because there are clear rules that can be used to identify them and most tools have syntax checkers that will detect syntax errors.
The most common syntax errors violate the law of conservation of data, which basically means that data needs a process to be moved; it cannot go to or come from a data store or an external entity without a process pushing or pulling it. Errors 1 and 8 in this example show data flowing between external entities or data stores without a process. The second rule means that a process cannot destroy its input and have no output; this is also called a black-hole process, like Error 4 in the example. Similarly, a process cannot make up its output without an input, which is called a miracle process, like Error 5 in the example.
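These conservation-of-data rules are exactly the kind of checks that CASE tools automate. As an illustration (not from the textbook, with made-up element names), here is a minimal sketch of such a checker in Python:

```python
def find_syntax_errors(flows, processes):
    """Return conservation-of-data violations in a list of (source, target) flows."""
    errors = []
    for source, target in flows:
        # Data must be moved by a process: it cannot flow directly
        # between data stores and/or external entities.
        if source not in processes and target not in processes:
            errors.append(f"no process on flow {source} -> {target}")
    for p in processes:
        has_input = any(target == p for _, target in flows)
        has_output = any(source == p for source, _ in flows)
        if has_input and not has_output:
            errors.append(f"{p} is a black hole (input but no output)")
        if has_output and not has_input:
            errors.append(f"{p} is a miracle (output but no input)")
    return errors

flows = [("Customer", "D1: Orders"),       # entity -> store, no process!
         ("Customer", "Record Order"),
         ("Record Order", "D1: Orders")]
print(find_syntax_errors(flows, {"Record Order"}))
# ['no process on flow Customer -> D1: Orders']
```

The first flow is flagged because neither end is a process; "Record Order" passes because it has both an input and an output.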
Semantics errors cause the most problems in system development and are more difficult to identify, because they indicate a misunderstanding of the system on the part of the development team.
Here are three useful checks to help ensure that models are semantically correct. First, ensure that the model is an appropriate representation by asking the users to validate it in a walk-through, or by having the users role-play, similar to the technique used for use cases. It is especially important to check that sufficient inputs exist for each process to produce its outputs.
The second check is to ensure consistent decomposition at the lowest-level processes of the Data Flow Diagrams. This makes sure we have a consistent level of detail across the board, which is similar to ensuring consistent step sizes in use cases.
And the third check is to ensure that the terminology is consistent throughout the model, which means agreeing upon naming and viewpoints so that the same name does not mean different things to different users based on their role in the organization.
So, that is all for this week.
We have taken the use cases from last week and increased the level of detail in describing business processes by describing them with Data Flow Diagrams.
And next week, we will take the data stores as a starting point in creating data models, namely entity relationship diagrams. Next week, we will have a live lecture.
See you next week!
HI-600: Analysis and Design of Health Information Systems
Analysis: Part III
Process Modeling: Data Flow Diagrams
• A process model can be used to further clarify
the requirements definition and use cases.
• A process model is a graphical way of
representing how a business system should operate.
• A process model can be used to document the
as-is system or the to-be system, whether
computerized or not.
• Data flow diagramming is a process modeling
technique that diagrams the business processes
and the data that pass among them.
• Logical process models describe processes
without suggesting how they are conducted.
• Physical process models provide information
that is needed to build the system.
Elements of Data Flow Diagrams
• Process – A process is an activity or a
function performed for some specific business reason.
• Data Flow – A data flow is a single piece of
data, or a logical collection of several pieces of information.
• Data Store – A data store is a collection of
data that is stored in some way.
• External Entity – An external entity is a
person, organization, organization unit, or
system that is external to the system, but
interacts with it.
• The purpose of the process descriptions is to
explain what the process does and provide
additional information that the DFD does not provide.
• Three techniques are commonly used to
describe more complex processing logic:
• Structured English
• Decision trees
• Decision tables
Data Flow Diagram Creation Steps
1. Build the context diagram.
2. Create DFD fragments for each use case.
3. Organize the DFD fragments into a level 0 DFD.
4. Develop level 1 DFDs based on the steps within
each use case. In some cases, these level 1
DFDs are further decomposed into level 2
DFDs, level 3 DFDs, and so on.
5. Validate the set of DFDs to make sure that they
are complete and correct.
Step 5: Validating the DFD
• There are two fundamental types of errors in DFDs:
• Syntax errors – can be thought of as grammatical
errors that violate the rules of the DFD language.
• Semantics errors – can be thought of as
misunderstandings by the analyst in collecting,
analyzing, and reporting information about the system.
Common Syntax Errors
Law of conservation of data
• Data at rest stays at rest
until moved by a process
• Processes cannot
consume or create data
How to Check for Semantic Errors
• To ensure that the model is an appropriate
representation by asking the users to validate
the model in a walk-through
• To ensure consistent decomposition
• To ensure that the terminology is consistent
throughout the model
• Data Flow Diagram Syntax – four symbols are used on
data flow diagrams (processes, data flows, data stores,
and external entities).
• Creating Data Flow Diagrams
• The DFDs are created from use cases.
• Every set of DFDs starts with a context diagram.
• DFD fragments are created for each use case, and are then
organized into a level 0 DFD.
• Level 1 DFDs are developed on the basis of the steps within
each use case.
• The set of DFDs are validated to make sure that they are
complete and correct and contain no syntax or semantics errors.