CHAPTER 3: DATABASE MODELING
Chapter Objectives
At the end of this chapter, you should be able to:
distinguish between unary, binary and ternary relationships;
draw an entity-relationship (E-R) diagram to represent common business situations;
model ISA relationships in an E-R diagram;
draw an object-oriented data model (OODM) to represent common business situations;
describe the limitations of and concerns regarding OODBMS;
transform an E-R model into a relational model.
Essential Readings
Modern Database Management (4th Edition), Fred R. McFadden and Jeffrey A.
Hoffer (1994), Benjamin/Cummings. [Chapters 4, 5, 14 and 15]
Summary of the database development process (stage: output)
Planning: Enterprise data model
Analysis: Conceptual data model
Logical Database Design: Logical data model
Physical Database Design: Technology model
Implementation: Databases and repositories
Prof. Erwin M. Globio, MSIT 3-1
DB212 CHAPTER 3: DATABASE MODELING
3.1 Entity-relationship Model
The E-R model is used to construct a conceptual data model, which is a representation of
the structure of a database that is independent of the software that will be used to
implement the database.
3.1.1 Entities
An entity can be a person, place, object, event or concept in the user environment
about which the organization wishes to maintain data.
Examples:
Person: EMPLOYEE, STUDENT, PATIENT
Place: STATE, REGION, COUNTRY
Object: MACHINE, BUILDING
Event: SALE, REGISTRATION
Concept: ACCOUNT, COURSE
There is a difference between entity type and entity instance.
Entity type is a collection of entities that share common properties or characteristics.
Entity instance is a single occurrence of an entity type.
Examples:
Entity type: EMPLOYEE
Attributes    Instance 1                 Instance 2
EMP NO.       100                        101
NAME          Roy Lim                    Mary Wong
ADDRESS       100 Chalet Lane S(0211)    Blk 321 Toa Payoh Lor 1 S(1231)
YR HIRED      1989                       1990
3.1.2 Attributes
An attribute is a property or characteristic of an entity that is of interest to the
organization. Both entities and relationships may have attributes.
STUDENT: STUDENT NO., NAME, ADDRESS
EMPLOYEE: EMPLOYEE NO., NAME, SKILL
Every entity type must have an attribute or set of attributes that uniquely identifies
each instance and distinguishes that instance from the other instances of the same
entity type.
Candidate key is an attribute (or combination of attributes) that uniquely identifies
each instance of an entity type.
A primary key is a candidate key that has been selected as the identifier for an entity
type.
Think about:
What are the criteria for selecting a primary key?
Multivalued attribute
A multivalued attribute is an attribute that can have more than one value for
each entity instance. For example, an EMPLOYEE may possess
a number of skills, so SKILL is a multivalued attribute.
[Figure: EMPLOYEE entity with the attributes Employee No. and Name, and the multivalued attribute Skill name drawn as a double-line ellipse]
Note: During the first pass of conceptual design, it is common to use a
double-line ellipse to highlight multivalued attributes. Subsequently,
we normalize the entity data by removing the multivalued attribute, as
shown:
[Figure: EMPLOYEE (Employee No., Name) HAS SKILL (Skill name)]
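The normalization of a multivalued attribute can be sketched in code. The following is a minimal illustration using Python's built-in sqlite3 module. The table names, column names and skill values (employee, skill, emp_no, skill_name, "Programming", "Accounting") are assumptions chosen for this example and do not come from the chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_no INTEGER PRIMARY KEY,
        name   TEXT NOT NULL
    );
    -- SKILL is no longer a repeating attribute of EMPLOYEE:
    -- each skill of an employee becomes one row in its own table.
    CREATE TABLE skill (
        emp_no     INTEGER NOT NULL REFERENCES employee(emp_no),
        skill_name TEXT NOT NULL,
        PRIMARY KEY (emp_no, skill_name)
    );
""")
conn.execute("INSERT INTO employee VALUES (100, 'Roy Lim')")
conn.executemany("INSERT INTO skill VALUES (?, ?)",
                 [(100, "Programming"), (100, "Accounting")])  # hypothetical skills


def skills_of(emp_no):
    """Return the skill names recorded for one employee, sorted by name."""
    rows = conn.execute(
        "SELECT skill_name FROM skill WHERE emp_no = ? ORDER BY skill_name",
        (emp_no,))
    return [row[0] for row in rows]


print(skills_of(100))
```

An employee with no skills simply has no rows in the skill table, so no empty or repeated columns are needed in employee.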
3.1.3 Relationship
Relationships are what hold together the various entities.
Degree of a relationship
The degree of a relationship is the number of entity types that participate in that
relationship.
Unary relationship (recursive relationship)
It is a relationship between the instances of one entity type.
Examples:
One-to-one (1:1): PERSON Is married to PERSON
One-to-many (1:M): EMPLOYEE Manages EMPLOYEE
Many-to-many (M:N): ITEM Has components ITEM
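A one-to-many unary relationship such as Manages can be sketched as a table that refers to itself. This is a minimal illustration with Python's sqlite3 module; the column manager_no and the employee names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_no     INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        -- Refers back to the same table; NULL for an employee with no manager.
        manager_no INTEGER REFERENCES employee(emp_no)
    )
""")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ann", None), (2, "Ben", 1), (3, "Cho", 1)])  # hypothetical data


def managed_by(manager_no):
    """Return the names of the employees managed by the given employee."""
    rows = conn.execute(
        "SELECT name FROM employee WHERE manager_no = ? ORDER BY name",
        (manager_no,))
    return [row[0] for row in rows]


print(managed_by(1))
```

Both ends of the relationship are instances of the same entity type, which is what makes the relationship unary.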
Binary relationship
It is a relationship between instances of two entity types.
Examples:
[Figure: three binary relationship diagrams: Is assigned (one-to-one), Contains (one-to-many) and Registers for (many-to-many); the entity boxes at both ends of each relationship are labeled PRODUCT LINE]
Ternary relationship
It is a simultaneous relationship among instances of three entity types.
Example:
VENDOR Ships PART to WAREHOUSE (the Ships relationship connects the three entity types PART, VENDOR and WAREHOUSE)
Existence Dependency
An instance of one entity cannot exist without the existence of an instance of
some other entity.
Weak Entity
An entity type that has an existence dependency.
Identifying Relationship
A relationship in which the primary key of the parent entity is used as part of the
primary key of the dependent entity.
Weak entities usually do not have a natural identifier. Instead, the primary key of
the parent entity is often used as part of the primary key of the dependent (child) entity.
Example:
[Figure: STUDENT (Student No., Student Name) Has PARENT (Student No., Parent Name)]
Therefore, data integrity is enforced, as a weak entity cannot exist unless the parent
entity exists.
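The STUDENT and PARENT example can be sketched as two tables, with Student No. forming part of the primary key of the dependent entity. This is a minimal illustration using Python's sqlite3 module; the table and column names and the helper add_parent are assumptions made for the example. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE student (
        student_no   INTEGER PRIMARY KEY,
        student_name TEXT NOT NULL
    );
    -- Weak entity: its key borrows student_no from the parent entity.
    CREATE TABLE parent (
        student_no  INTEGER NOT NULL REFERENCES student(student_no),
        parent_name TEXT NOT NULL,
        PRIMARY KEY (student_no, parent_name)
    );
""")
conn.execute("INSERT INTO student VALUES (1, 'Mary Wong')")


def add_parent(student_no, parent_name):
    """Try to add a PARENT row; return False if the STUDENT does not exist."""
    try:
        conn.execute("INSERT INTO parent VALUES (?, ?)",
                     (student_no, parent_name))
        return True
    except sqlite3.IntegrityError:
        return False


print(add_parent(1, "Mr Wong"))   # student 1 exists
print(add_parent(99, "Nobody"))   # no student 99, so the row is rejected
```

The rejected insert is the existence dependency at work: a PARENT row cannot be stored without its STUDENT.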
3.1.4 Generalization
Business entities are often best modeled using the concepts of generalization and
categorization.
Generalization is the concept that some things (entities) are subtypes of other, more
general things.
For example: Accountant, Programmer Analyst are subtypes of the more general type
called STAFF.
Categorization is when an entity comes in various subtypes.
For example: there are different subtypes of employee, namely hourly employee,
salaried employee and part-time employee.
Supertypes
A generic entity type that is subdivided into subtypes. For example, in Figure 3-1,
EMPLOYEE is a supertype.
Subtypes
A subset of a supertype that shares common attributes or relationships distinct from
other subsets.
The relationship between each subtype and supertype is called an ISA relationship.
Usually, the subtypes are mutually exclusive and one is required for each instance
of the supertype.
For example, in Figure 3-1, each employee must be an hourly employee, a salaried
employee or a part-time employee. Attributes that are peculiar to each subtype are
included with that subtype only (e.g. HOURLY RATE is peculiar to HOURLY
EMPLOYEE).
This exclusive relationship is represented with a curved line (as shown in Figure 3-1).
Inheritance
Inheritance is the property that, when entity types or object classes are arranged in a
hierarchy, each entity type or object class assumes the attributes and methods of its
ancestors.
For example, in Figure 3-1, the attributes NAME, ADDRESS and DATE HIRED
are inherited by the three employee subtypes. Only attributes that are unique to a subtype are
associated with that subtype.
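[Figure: EMPLOYEE supertype with the attributes Employee Name, Address and Date Hired, linked by three ISA relationships to its subtypes (labeled HOURLY EMPLOYEE, EMPLOYEE and CONSULTANT); each subtype is identified by Employee no. and carries subtype-specific attributes such as Daily Rate]
Figure 3-1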
3.2 Logical Database Design
There is a process of transforming the conceptual data model into a logical database
model. There are four types of logical database models in use today: object-oriented,
hierarchical, network and relational.
3.2.1 Object-Oriented Data Model
Most future database management systems will be based on objects, or will incorporate
object-oriented functionality. This enables users to create generic, all-purpose components
that can be reused in multiple applications.
Core concepts
Objects
Objects are abstractions of real-world entities that exhibit states and behaviours.
The state of an object is expressed as the values of its attributes. The
behaviour of an object is expressed by a set of methods that operate on its attributes.
Attributes
These are the properties of objects that are of interest to the organization.
Methods
Methods define the behaviour of an object. Methods can only process data within the
object class in which they are defined (concept of encapsulation). However, they can
receive requests from methods in another object class. There are a number of method
categories:
Occur methods - instance add, instance change, instance delete.
Calculate methods - perform calculations on the data values encapsulated in the
same object class.
Monitor methods - produce signals when predetermined limits are exceeded in a
system.
Encapsulation
This is the property that the attributes and the methods of an object are hidden from
the outside world and do not have to be known to access its data values or use its
methods. Each object has an interface that is known to the outside world. An outside
agent (can be another object) may request that a method be performed by sending a
message to the object.
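The method categories and the idea of encapsulation can be illustrated with a small class. This is only a sketch: the class StockItem, its attributes and its reorder rule are hypothetical, and Python hides attributes by convention (a leading underscore) rather than by enforcement.

```python
class StockItem:
    """An object whose data is reached only through its methods."""

    def __init__(self, item_no, qty_on_hand, unit_price, reorder_level):
        # Leading underscores mark the attributes as internal to the object.
        self._item_no = item_no
        self._qty_on_hand = qty_on_hand
        self._unit_price = unit_price
        self._reorder_level = reorder_level

    def change_quantity(self, delta):
        """Occur method (instance change): adjust the quantity on hand."""
        if self._qty_on_hand + delta < 0:
            raise ValueError("quantity on hand cannot become negative")
        self._qty_on_hand += delta

    def stock_value(self):
        """Calculate method: works only on data held inside the object."""
        return self._qty_on_hand * self._unit_price

    def needs_reorder(self):
        """Monitor method: signal when a predetermined limit is crossed."""
        return self._qty_on_hand < self._reorder_level


item = StockItem(item_no=100, qty_on_hand=10, unit_price=2.5, reorder_level=5)
item.change_quantity(-7)          # "message" sent to the object
print(item.stock_value(), item.needs_reorder())
```

The caller never touches the stored quantity directly; it sends a request and the object decides how to carry it out.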
Object classes and instances
An object class is a logical grouping of objects that have the same attributes and behaviour. An object
instance is one occurrence of an object class. When we use the term object by itself,
we are referring to an object class. For example, VEHICLE is an example of an object
class, and LORRY, CAR and MOTORCYCLE are examples of object instances.
VEHICLE
Attributes: Number, Year, Model
Methods: Add Vehicle
Identifying and describing objects
Top-down approach
This begins with a high level description of the environment and proceeds from
the general to specific.
In the object-oriented data model, methods are activated by sending messages
from a sending object to a receiving object.
For example, studying written material and talking with users are activities that
help locate nouns, which provide information about potential
objects.
Bottom-up approach
The bottom-up approach begins with system details: example reports, video forms
and other detailed documents and displays. The analyst identifies the candidate
objects and their properties.
In practice, both the top-down and bottom-up approaches should be used to identify
and describe candidate objects.
Generalization
The object-oriented data model shows the generalization and specialization of
real-world entities. To express generalization relationships, objects are arranged into a
hierarchy. For example, an organization has three basic types of employees: hourly
employees, salaried employees and contract consultants. The following notation is
used to represent generalization.
Inheritance
Inheritance is an important principle of the object-oriented model. Inheritance means
that all properties of an object class become the properties of its subclasses. For
example, the attributes of EMPLOYEE apply to all three subclasses. The method
CalculateAge applies to all employees.
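EMPLOYEE
    Attributes: Employee No., Name, Address, Date hired, Birthdate
    Methods: CalculateAge
Subclasses of EMPLOYEE:
HOURLY
    Attributes: Hourly rate
    Methods: CalculateMthlyWage
SALARIED
    Attributes: Annual salary, Stock option
    Methods: CalculateStockBenefit
CONSULTANT
    Attributes: Contact no., Date hired
    Methods: AllocateToContact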
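The EMPLOYEE hierarchy can be sketched with Python classes to show inheritance in action. The class and method names follow the diagram loosely; the constructor parameters, the wage formula (hourly rate times hours worked) and the sample values are assumptions made for this illustration.

```python
from datetime import date


class Employee:
    def __init__(self, employee_no, name, birthdate):
        self.employee_no = employee_no
        self.name = name
        self.birthdate = birthdate

    def calculate_age(self, today=None):
        """Age in whole years; defined once, inherited by every subclass."""
        today = today or date.today()
        years = today.year - self.birthdate.year
        # Subtract one if this year's birthday has not happened yet.
        if (today.month, today.day) < (self.birthdate.month, self.birthdate.day):
            years -= 1
        return years


class Hourly(Employee):
    def __init__(self, employee_no, name, birthdate, hourly_rate):
        super().__init__(employee_no, name, birthdate)
        self.hourly_rate = hourly_rate      # attribute specific to this subclass

    def calculate_monthly_wage(self, hours_worked):
        # Assumed formula for illustration only.
        return self.hourly_rate * hours_worked


class Salaried(Employee):
    def __init__(self, employee_no, name, birthdate, annual_salary):
        super().__init__(employee_no, name, birthdate)
        self.annual_salary = annual_salary


worker = Hourly(100, "Roy Lim", date(1960, 6, 15), hourly_rate=12.0)
print(worker.calculate_age(today=date(1994, 6, 14)))   # inherited method
print(worker.calculate_monthly_wage(160))              # subclass method
```

Hourly never defines calculate_age, yet every Hourly object can answer it, because the method is inherited from Employee.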
The object-oriented approach can be a basis for several database management
systems.
Advantages
Reusability
Objects can be defined for a variety of functions and then reused in numerous
applications.
Complex data types
An object-oriented database can store and manage complex data such as
documents, graphics, images, voice message and video sequences.
Distributed databases
Because objects communicate by sending messages that activate methods in
other objects, object-oriented databases can support the
distribution of data across a network more easily than other data models.
Illustration
Build an E-R diagram and an object-oriented diagram for the following scenario:
Customer orders arrive daily. If the customer order is from a new customer, the
salesperson enters information to add that customer to the database. Customer
information includes customer no., name, address, city, credit limit and total owed.
The system also recomputes the total owed on each update.
The user then updates the order file accordingly. For each order, there is an order
no. and an order date. For each line item on the order, the system prompts the user
to enter the product number and quantity ordered. As each product is added to the
order, the system consults the quantity on hand for that item and computes the
quantity that can be shipped. The product number, description and quantity shipped
are then added to a shipping notice for the order (the quantity shipped is calculated
by the system).
The system also computes the extended amount (quantity shipped times unit price)
for each item on the order and adds it to the total owed for
that customer. If the total owed exceeds the customer's credit limit, a message is
sent to the user.
[Figure: CUSTOMER (Customer No., Name, Address, City, Credit limit, Total owed) Places ORDER (Order no., Order date); ORDER Consists of PRODUCT (Product no., Description, Unit price, Qty on hand), with Qty-ordered and Qty-shipped shown at the Consists relationship]
Figure 3-2: E-R diagram
ORDER
    Attributes: Order no., Order date, Product no., Quantity ordered, Quantity shipped
    Methods: CalculateQtyShipped
CUSTOMER
    Attributes: Customer no., Name, Address, City, Credit limit, Total owed
    Methods: Calculate MthlyWage
PRODUCT
    Attributes: Product no., Description, Unit price, Quantity on hand
    Methods: CalculateQtyOnHand
Figure 3-3: Object-oriented diagram
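The processing described in the scenario can be sketched as methods on simple objects. This is an illustrative sketch only: the scenario does not say how the quantity that can be shipped is computed, so the rule used here (the smaller of quantity ordered and quantity on hand, with stock reduced by the amount shipped) is an assumption, as are all names and sample values.

```python
class Product:
    def __init__(self, product_no, description, unit_price, qty_on_hand):
        self.product_no = product_no
        self.description = description
        self.unit_price = unit_price
        self.qty_on_hand = qty_on_hand


class Customer:
    def __init__(self, customer_no, name, credit_limit, total_owed=0.0):
        self.customer_no = customer_no
        self.name = name
        self.credit_limit = credit_limit
        self.total_owed = total_owed

    def add_to_total_owed(self, amount):
        """Update total owed; return True when the credit limit is exceeded."""
        self.total_owed += amount
        return self.total_owed > self.credit_limit


def add_line_item(customer, product, qty_ordered):
    """Process one order line and return (qty shipped, extended amount, over limit)."""
    # Assumed rule: ship as much of the order as stock allows.
    qty_shipped = min(qty_ordered, product.qty_on_hand)
    product.qty_on_hand -= qty_shipped
    extended_amount = qty_shipped * product.unit_price
    over_limit = customer.add_to_total_owed(extended_amount)
    return qty_shipped, extended_amount, over_limit


customer = Customer(1, "HUP", credit_limit=100.0)
widget = Product(100, "XXX", unit_price=10.0, qty_on_hand=5)
print(add_line_item(customer, widget, qty_ordered=8))
```

The third value returned stands in for the message sent to the user when the total owed exceeds the credit limit.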
3.2.2 Hierarchical Data Model
The hierarchical database model was the first important logical database model. Today it
is implemented primarily on mainframes.
In this model, records are arranged in a top-down structure that resembles an upside-down
tree. The terms parent and child are often used in describing the hierarchical model. An important
characteristic is that a child is related to only one parent.
The leading hierarchical DBMS in use today is IBM's Information Management System
(IMS).
IMS Physical Databases
The physical database record is a basic building block in IMS. A physical database
record (PDBR) consists of a set of related fields. A PDBR consists of a root segment
and its subordinate segments called child segments.
Example:
IMS physical database record
DEPARTMENT (DEPTNO, DNAME, LOCATION)
    EQUIPMENT (IDENT, COST, NUMBER)
    EMPLOYEE (EMPNO, ENAME, YEARS)
        DEPENDENT (DNAME, AGE)
        SKILL (CODE, SNAME, NOYEARS)
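PDBR Occurrences
DEPARTMENT (DEPTNO, DNAME, LOCATION):
    0001 Accounting A
    0002 Personnel B
    0003 Engineering C
EQUIPMENT (IDENT, COST, NUMBER):
    IBM PC 3500 3
    Apple II 2500 2
EMPLOYEE (EMPNO, ENAME, YEARS):
    100 Mary 3
    102 Thomas 2
DEPENDENT (DNAME, AGE) and SKILL (CODE, SNAME, NOYEARS), one group under each EMPLOYEE occurrence:
    DEPENDENT: Theresa 34, Paul 8, Mike 5; SKILL: 02 Programming 2
    DEPENDENT: John 10, David 14; SKILL: 01 Accounting 3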
In this case, the Engineering DEPARTMENT consists of a set of EQUIPMENT
occurrences and a set of EMPLOYEE occurrences.
Under each EMPLOYEE occurrence, there is a set of DEPENDENT occurrences and a
set of SKILL occurrences.
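One way to picture a PDBR occurrence is as a nested structure in which every child segment hangs under exactly one parent. The sketch below uses plain Python dictionaries and lists; it is not how IMS stores data. The source figure does not make clear which DEPENDENT and SKILL occurrences belong to which EMPLOYEE, so the assignment used here is an assumption.

```python
# One PDBR occurrence: the Engineering DEPARTMENT root and its child segments.
# Lists hold the occurrences of a child segment type.
pdbr = {
    "DEPTNO": "0003", "DNAME": "Engineering", "LOCATION": "C",
    "EQUIPMENT": [
        {"IDENT": "IBM PC", "COST": 3500, "NUMBER": 3},
        {"IDENT": "Apple II", "COST": 2500, "NUMBER": 2},
    ],
    "EMPLOYEE": [
        {"EMPNO": 100, "ENAME": "Mary", "YEARS": 3,   # child assignment assumed
         "DEPENDENT": [{"DNAME": "John", "AGE": 10},
                       {"DNAME": "David", "AGE": 14}],
         "SKILL": [{"CODE": "01", "SNAME": "Accounting", "NOYEARS": 3}]},
        {"EMPNO": 102, "ENAME": "Thomas", "YEARS": 2,  # child assignment assumed
         "DEPENDENT": [{"DNAME": "Theresa", "AGE": 34},
                       {"DNAME": "Paul", "AGE": 8},
                       {"DNAME": "Mike", "AGE": 5}],
         "SKILL": [{"CODE": "02", "SNAME": "Programming", "NOYEARS": 2}]},
    ],
}


def walk(segment_type, occurrence, depth=0):
    """Yield (depth, segment type) pairs in top-down, left-to-right order."""
    yield depth, segment_type
    for key, value in occurrence.items():
        if isinstance(value, list):            # a list holds child segments
            for child in value:
                yield from walk(key, child, depth + 1)


for depth, segment_type in walk("DEPARTMENT", pdbr):
    print("    " * depth + segment_type)
```

The printed indentation reproduces the upside-down tree: the root segment first, then each child under its single parent.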
IMS Logical Database
External views of individual users in IMS are reflected in logical database records
(LDBRs). Each LDBR type is a subset of a corresponding PDBR type. Any segment
type (except the root segment) of a PDBR may be omitted in the corresponding
LDBR.
Examples of logical database records:
(a) Equipment LDBR
DEPARTMENT (DEPTNO, DNAME, LOCATION)
    EQUIPMENT (IDENT, COST, NUMBER)
(b) Personnel LDBR
DEPARTMENT (DEPTNO, DNAME, LOCATION)
    EMPLOYEE (EMPNO, ENAME, YEARS)
        SKILL (CODE, SNAME, NOYEARS)
(Notice that each LDBR type contains the root segment, DEPARTMENT. In IMS, a
program communication block (PCB) is a series of statements that defines a logical
database record. Each of these LDBR types represents the view of a different user.)
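The idea that an LDBR is a subset of a PDBR, obtained by omitting non-root segment types, can be sketched as a small filtering function. This is a conceptual illustration in Python, not IMS PCB syntax; the function name ldbr, the dictionary layout and the attachment of the sample DEPENDENT and SKILL rows to employee 100 are assumptions.

```python
def ldbr(occurrence, keep):
    """Copy a segment occurrence, omitting child segment types not in `keep`."""
    view = {}
    for key, value in occurrence.items():
        if isinstance(value, list):            # child segment occurrences
            if key in keep:
                view[key] = [ldbr(child, keep) for child in value]
        else:                                  # ordinary field of this segment
            view[key] = value
    return view


department = {
    "DEPTNO": "0003", "DNAME": "Engineering", "LOCATION": "C",
    "EQUIPMENT": [{"IDENT": "IBM PC", "COST": 3500, "NUMBER": 3}],
    "EMPLOYEE": [{"EMPNO": 100, "ENAME": "Mary", "YEARS": 3,
                  "DEPENDENT": [{"DNAME": "John", "AGE": 10}],
                  "SKILL": [{"CODE": "01", "SNAME": "Accounting",
                             "NOYEARS": 3}]}],
}

equipment_view = ldbr(department, keep={"EQUIPMENT"})
personnel_view = ldbr(department, keep={"EMPLOYEE", "SKILL"})
print(sorted(equipment_view))
print(sorted(personnel_view["EMPLOYEE"][0]))
```

Both views keep the root segment's fields, and the stored record itself is left unchanged; each view simply leaves out the segment types its user does not need.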
3.2.3 Network Data Model
In the network database model, there is no distinction between parent and child record
types. Any record type may be associated with any number of other record types; for
example, DEPARTMENT is associated with both the EMPLOYEE and PROJECT
record types.
Note: The Conference on Data Systems Languages (CODASYL), through its Data Base
Task Group (DBTG), is a standards organization that has developed and issued descriptions
of languages for defining and processing data in network DBMSs. IDMS (Integrated
Database Management System) is the leading DBTG DBMS on IBM computers.
In the network data model, a set is the usual means employed in a DBTG database to
represent a relationship. A set is the definition of a directed relationship from an owner
record type to one or more member record types. In Figure 3-4, DEPT-EMP, DEPT-PROJ
and PROJ-EMP are examples of sets.
A set defines a 1:M or 1:1 relationship.
Generally, we can assume that a set is implemented as a ring data structure, with the
owner at the head of the chain and with the last member pointing back to the owner.
DEPARTMENT (DEPTNO, DNAME, LOCATION)
    DEPT-EMP set: DEPARTMENT to EMPLOYEE (EMPNO, YEARS, ENAME)
    DEPT-PROJ set: DEPARTMENT to PROJECT (PROJNO, DESCRIPTION)
PROJ-EMP set: between PROJECT and EMPLOYEE
Figure 3-4
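The ring structure assumed for a set can be sketched as a circular linked list: the owner record heads the chain and the last member points back to the owner. This is a simplified Python illustration with hypothetical names. Each record here carries a single next pointer, so it can take part in only one set; a real DBTG record would need one pointer per set type it participates in.

```python
class Record:
    def __init__(self, name):
        self.name = name
        self.next = self          # an owner with no members points to itself


def insert_member(owner, member):
    """Append a member at the end of the ring; it then points to the owner."""
    last = owner
    while last.next is not owner:     # follow the chain to the last record
        last = last.next
    last.next = member
    member.next = owner


def members(owner):
    """Follow the ring from the owner until it returns to the owner."""
    names = []
    node = owner.next
    while node is not owner:
        names.append(node.name)
        node = node.next
    return names


# One occurrence of a DEPT-EMP set: a department owning two employees.
dept = Record("Engineering")
insert_member(dept, Record("Mary"))
insert_member(dept, Record("Thomas"))
print(members(dept))
```

Starting from any member and following the pointers eventually reaches the owner, which is how a member record finds the record that owns it.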
3.2.4 Relational Data Model
The relational data model represents data in the form of tables, or relations.
The relational database model consists of the following three components:
1. Data structure
Data are organized in the form of tables, or relations.
2. Data manipulation
Powerful operations, expressed in languages such as SQL or Query-by-Example, are used to
manipulate data stored in the database.
3. Data integrity
Business rules are specified to maintain the integrity of data when they are
manipulated.
Physical Properties
A relation consists of 1 or more columns and 0 or more rows. In the relational model,
a row is called a tuple. Each relation is given a unique name, and each column has a
name unique within the relation. Each row contains an instance of the data associated
with the relation. A relation with no rows is empty (contains no data), but still exists.
COLUMN NAMES
        a    b    c    d
x1
x2
x3
...
xn
Figure 3-5: Diagrammatic representation of a relation
Logical Properties
Ordering of columns
Columns are unordered, left to right. This property is designed to preserve the
independence of each column.
Ordering of rows
Rows are unordered, top to bottom. This is designed to preserve the independence of
each row.
Uniqueness
No row may be duplicated in a given relation. Uniqueness in a relation is guaranteed
by the designation of a primary key for each relation. A candidate key in a relation is
an attribute (or combination of attributes) that uniquely identifies a row in that relation. A primary key is a
candidate key that has been selected to be the unique identifier for each row. Primary
key values cannot be null, since they would then not identify a row.
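The uniqueness and non-null rules for primary keys can be observed directly in a relational DBMS. The sketch below uses Python's sqlite3 module; the table supplier and the helper try_insert are assumptions for this example. NOT NULL is declared explicitly because SQLite, unlike the relational model, tolerates NULL in a non-integer primary key column unless told otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE supplier (
        supplier_no   TEXT PRIMARY KEY NOT NULL,
        supplier_name TEXT
    )
""")


def try_insert(supplier_no, supplier_name):
    """Attempt to insert a row; report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO supplier VALUES (?, ?)",
                     (supplier_no, supplier_name))
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


print(try_insert("A123", "HUP"))     # first row with this key
print(try_insert("A123", "HUP"))     # duplicate primary key value
print(try_insert(None, "CHONG"))     # null primary key value
```

Only the first insert succeeds; the duplicate and the null key are both refused, so every stored row remains identifiable by its primary key.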
The sequence of columns (left to right) is insignificant
The columns of a relation can be interchanged without changing the meaning or use
of the relation. There is no hidden meaning implied by the ordering.
The sequence of rows (top to bottom) is insignificant
The rows of a relation may be interchanged or stored in any sequence. Thus it makes
no difference whether a new row is inserted at the front or at the end of the table.
3.3 Comparison of Data Representation Concepts
3.3.1 Relational vs Network Models
In the relational model, connections between two relations are represented by including two
attributes with the same domain, one in each of the relations.
Example:
ITEM
Item No    Description    Supplier-no.
100        XXX            A123
200        YYY            A124
SUPPLIER
Supplier-no.    Supplier name
A123            HUP
A124            CHONG
Individual tuples that have the same value for that attribute are logically related, even
though they are not physically connected together. In this case, supplier-no. is a foreign
key in the ITEM table and a primary key in the SUPPLIER table. Thus a logical link is
established.
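The logical link between ITEM and SUPPLIER can be followed with a join on the shared attribute. The sketch below loads the two example tables into an in-memory SQLite database using Python's sqlite3 module; the lowercase table and column names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (
        supplier_no   TEXT PRIMARY KEY NOT NULL,
        supplier_name TEXT
    );
    CREATE TABLE item (
        item_no     INTEGER PRIMARY KEY,
        description TEXT,
        supplier_no TEXT REFERENCES supplier(supplier_no)  -- foreign key
    );
    INSERT INTO supplier VALUES ('A123', 'HUP'), ('A124', 'CHONG');
    INSERT INTO item VALUES (100, 'XXX', 'A123'), (200, 'YYY', 'A124');
""")

# Rows are related only by matching supplier_no values, not by pointers.
rows = conn.execute("""
    SELECT item.item_no, supplier.supplier_name
    FROM item
    JOIN supplier ON item.supplier_no = supplier.supplier_no
    ORDER BY item.item_no
""").fetchall()
print(rows)
```

Nothing physical connects an ITEM row to its SUPPLIER row; the DBMS matches them at query time by comparing the values of supplier_no.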
In the network model, 1:N connections between two record types are explicitly represented
by the set type construct. The DBMS connects related records together in a set instance
by some physical method. Records are physically connected together when they participate
in the same set instance. Hence a set type physically represents a logical 1:N relationship
type.
Example:
ITEM
Item-no.    Description    Supplier-no.
100         XXX            A123
200         YYY            A124
300         ZZZ            A123
Set type: ITEM-SUPPLIER
SUPPLIER
Supplier-no.    Supplier name
A123            HUP
A124            CHONG
In this case, the records with supplier-no. A123 (supplier HUP and items 100 and 300) are
physically connected together by the DBMS as
participants in a set instance of the ITEM-SUPPLIER set type.
In addition, we can keep a logical connection among the records by duplicating the key
field of the owner record in the member records. These field values can be used for
automatic set selection or for validation checking.
Therefore, the relational model is simpler.
3.3.2 Hierarchical vs Network models
Both represent relationships explicitly. However, a record type in the network model can
be a member in any number of set types.
In the hierarchical model, a record type can have only one real parent. This creates problems when
modeling M:N and n-ary relationship types. Thus, if a schema contains mainly 1:N
relationship types in the same direction, it can be modeled naturally as a hierarchy.
However, if many M:N or n-ary relationship types exist, we will have to duplicate records and pointers
to design a hierarchical representation. Therefore, the hierarchical model is considered
inferior to both the relational and network models as far as modeling capability is
concerned.
3.3.3 Object-oriented Models
The object-oriented data model offers a closer representation of real-world problem domains
and greater productivity in application development. It has the ability to model complex
data types such as images and documents.
However, ODBMS technology is still very young. Currently, its limitations and concerns
are:
Lack of accepted standards
There are no accepted standards at the national or international level yet.
Lack of development tools
Tools such as CASE and 4GL are still under development, and hence not widely
available.
Performance
The performance of ODBMS technology with large numbers of concurrent users and
frequent transactions has yet to be tested or demonstrated.
Data management facilities
Some of the products do not have adequate facilities for concurrency control, backup and
recovery.
Query languages
Users cannot retrieve data about one or more objects based on their own defined
criteria.
3.4 Review Questions
1. Draw an E-R and OO diagram for each of the following situations:
a. A company has a number of employees. The attributes of EMPLOYEE include
NAME, ADDRESS, BIRTHDATE and DATEHIRED. One method that is
required of all employees is CalculateYearsOfService. The company also has
several projects. Attributes of PROJECT include CODE, DESCRIPTION and
START DATE. Each employee may be assigned to one or more projects, or
may not be assigned to a project. A project must have at least
one employee assigned, and may have several employees assigned. One method
required of all projects is CalculateTotalCostToDate.
b. In a vehicle-licensing application, there are three types of vehicle: passenger,
truck, and trailer. Vehicle ID is an attribute of all vehicle types. Truck and
trailer vehicles have an attribute named GROSS CAPACITY. The passenger
and truck vehicle types required of all courses is ChangeCourseDescription.
2. Compare and contrast the following:
a. inheritance vs generalization hierarchy
b. generalization vs specialization
c. candidate key vs primary key
d. subtype vs supertype
e. physical database record vs logical database record
f. encapsulation vs inheritance
g. relational model vs network model
h. hierarchical model vs network model
3. What are the limitations of ODBMS technology?
Useful Websites to learn Database and Programming:
http://erwinglobio.wix.com/ittraining
http://ittrainingsolutions.webs.com/
http://erwinglobio.sulit.com.ph/
http://erwinglobio.multiply.com/