DATABASE DESIGN: DATA MODELING
AND NORMALIZATION
Dr. R. Khanchana
Assistant Professor
Department of Computer Science
Sri Ramakrishna College of Arts and Science for Women
Data Modeling
Data Modeling:
• Data modeling is a tool used to represent the various components of a database and their relationships.
• A popular modeling tool is the Entity-Relationship Model (E-R Model).
• The E-R Model provides
– An excellent communication tool
– A simple graphical representation of data
• The E-R Model uses an E-R Diagram (ERD) for graphical representation of the database components.
• An entity is represented by a rectangle. Its name is written in uppercase letters and should be a singular noun, e.g. EMPLOYEE, DEPARTMENT.
• Entity Representation in an E-R Diagram
EMPLOYEE
Representation of Relationship
• A line represents the relationship between two entities. Its label is written in lowercase letters and should be an active verb.
• A passive verb can be used, but an active verb is preferable.
• The types of relationships (1:1, 1:M, M:N) between entities are called connectivity or multiplicity.
• Representation of relationships in an E-R diagram:
[Diagram: example relationships labeled manages, employs, and contains, with connectivities 1:1, 1:M, and M:N]
Entity, Relationship, Connectivity
• The types of relationships (1:1,1:M,M:N) between entities are
called Connectivity or Multiplicity
• The connectivity is shown with vertical or angled lines next to
each entity
Cardinality
• The relationship between two entities can be given using lower and upper limits.
• Example - (n, m)
• n – lower limit
• m – upper limit
[Diagram: cardinality examples for the entity pairs DEPARTMENT/EMPLOYEE (Supervises), FACULTY/DIVISION (Employs), and ITEM/INVOICE (Contains), annotated with the limits (1, 1) (1, 1), (1, N) (1, 1), and (1, N) (1, N)]
Data Model Basic Building Blocks
Cardinality Types
• It has 2 types
– Mandatory relationship
– Optional relationship
• Rules are set for the minimum and maximum values of cardinality. Decisions of this type are known as business rules.
Minimum Cardinality
Cardinality Representation
Optional Relationship
• Optional relationships are shown with a small circle next to the
optional entity.
• An optional relationship can occur in 1:1, 1:M, or M:N relationships, and on one or both sides of the relationship.
• In relational databases, many-to-many (M:N) relationships are allowed, but they are not easy to implement.
Composite Entity & Relational Schema
• Decomposing an M:N relationship into 1:M relationships involves a third entity, known as a composite entity or associative entity.
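A minimal sketch of this decomposition, using Python's built-in sqlite3 module. INVOICE and ITEM are the entities from the cardinality slide; the invoice_item table, all column names, and the sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY);
    CREATE TABLE item    (item_id    INTEGER PRIMARY KEY, item_name TEXT);
    -- Composite (associative) entity: one row per invoice/item pairing.
    -- Each side is now a 1:M relationship to this table.
    CREATE TABLE invoice_item (
        invoice_id INTEGER REFERENCES invoice(invoice_id),
        item_id    INTEGER REFERENCES item(item_id),
        quantity   INTEGER,
        PRIMARY KEY (invoice_id, item_id)
    );
""")
conn.executemany("INSERT INTO invoice VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO item VALUES (?, ?)", [(10, "pen"), (20, "ink")])
conn.executemany("INSERT INTO invoice_item VALUES (?, ?, ?)",
                 [(1, 10, 3), (1, 20, 1), (2, 10, 5)])

# One invoice contains many items ...
items_on_invoice_1 = [row[0] for row in conn.execute(
    "SELECT item_id FROM invoice_item WHERE invoice_id = 1 ORDER BY item_id")]
# ... and one item appears on many invoices.
invoices_with_item_10 = [row[0] for row in conn.execute(
    "SELECT invoice_id FROM invoice_item WHERE item_id = 10 ORDER BY invoice_id")]
```

The composite primary key of invoice_item is built from the primary keys of the two original entities, which is what lets a single table carry both 1:M relationships.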
Weak Entity
• A weak entity is an entity that cannot exist by itself; its existence depends on another entity.
Other Elements
• Some of the other elements considered in database design are
– Simple attributes: attributes that cannot be subdivided, e.g. city, gender
– Composite attributes: attributes that can be subdivided, e.g. full name (last name, first name, middle initial)
– Single-valued attributes: attributes with a single value, e.g. employee ID, student ID
– Multivalued attributes: attributes with multiple values, e.g. course details
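The four attribute kinds can be sketched with Python dataclasses. The Student and FullName classes, their field names, and the sample values are hypothetical and only mirror the examples listed above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FullName:
    # Composite attribute: can be subdivided into parts.
    last_name: str
    first_name: str
    middle_initial: str


@dataclass
class Student:
    student_id: int                  # single-valued attribute
    city: str                        # simple attribute (not subdivided)
    name: FullName                   # composite attribute
    courses: List[str] = field(default_factory=list)  # multivalued attribute


# Sample values are made up for illustration.
s = Student(1, "Coimbatore", FullName("Rao", "Anita", "K"), ["DBMS", "Networks"])
```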
Quiz
• https://quizizz.com/admin/quiz/5f06a5c1b83319001b5de46f
Dependency
• A dependency is a constraint that applies to or defines the relationship between attributes.
• The primary key uniquely identifies an entity.
• Columns that do not make up the primary key of the table are called nonkey columns.
• The nonkey columns are functionally dependent on the primary key columns.
• There are three types of dependency in tables
– Total or full dependency: a nonkey column is dependent on all primary key columns
– Partial dependency: a nonkey column is dependent on part of the primary key
– Transitive dependency: a nonkey column is dependent on another nonkey column
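The three dependency types can be told apart by comparing the columns a nonkey column depends on (its determinant) with the primary key. The sketch below assumes both are given as sets of column names; the function name and the INVOICE_NO, ITEM_NO, and CUSTOMER_NO columns are hypothetical.

```python
def classify_dependency(determinant, primary_key):
    """Classify how a nonkey column depends on `determinant`.

    Both arguments are collections of column names.
    """
    determinant, primary_key = set(determinant), set(primary_key)
    if not determinant:
        raise ValueError("determinant must name at least one column")
    if determinant == primary_key:
        return "total"        # depends on all primary key columns
    if determinant < primary_key:
        return "partial"      # depends on only part of the primary key
    if determinant.isdisjoint(primary_key):
        return "transitive"   # depends on nonkey column(s)
    return "other"            # mixes key and nonkey columns


key = {"INVOICE_NO", "ITEM_NO"}
total = classify_dependency({"INVOICE_NO", "ITEM_NO"}, key)
partial = classify_dependency({"ITEM_NO"}, key)
transitive = classify_dependency({"CUSTOMER_NO"}, key)
```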
Dependency Types
Database Design
Normal forms
• Redundancy - data repetition
• Anomaly (data inconsistency) - if a piece of data changes, the change has to be made in many places
Anomalies
Deletion Anomaly
Insertion Anomaly
Update Anomaly
Deletion Anomaly
• A deletion anomaly results when the deletion of information about one entity leads to the deletion of information about another entity.
• If someone decides to delete the Botany department, they may end up deleting the data of all students who belong to the Botany department.
Insertion Anomaly
• An insertion anomaly occurs when information about an entity cannot be inserted unless information about another entity is known.
• Jerry is a new student with department ID 6, but there is no department with Dept_ID 6; hence the anomaly. A department with ID 6 should be created first, and only then can a student be assigned to it.
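The Jerry example can be reproduced with a foreign key in SQLite, as a rough sketch. The table layout, column names, and the department name are assumptions; only the student Jerry and Dept_ID 6 come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT,
        dept_id    INTEGER REFERENCES department(dept_id)
    );
""")


def add_student(student_id, name, dept_id):
    """Return True if the row was inserted, False if the foreign key rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO student VALUES (?, ?, ?)",
                         (student_id, name, dept_id))
        return True
    except sqlite3.IntegrityError:
        return False


first_try = add_student(1, "Jerry", 6)      # department 6 does not exist yet
with conn:
    conn.execute("INSERT INTO department VALUES (6, 'Dept 6')")  # hypothetical name
second_try = add_student(1, "Jerry", 6)     # now the department exists
```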
Update Anomaly
• An update anomaly occurs when data is only partially updated in a database.
• The English department now has Dept_ID 8, but the change was not applied to the Student table.
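A small sketch of how such a partial update can be detected, using plain Python lists in place of tables. The old Dept_ID of 5 and the student names are made up; only the English department and its new Dept_ID 8 come from the slide.

```python
# Hypothetical starting data: English has Dept_ID 5 in both tables.
departments = [{"dept_id": 5, "dept_name": "English"}]
students = [{"name": "Asha", "dept_id": 5}, {"name": "Meena", "dept_id": 5}]

# Partial update: only the department table receives the new ID.
departments[0]["dept_id"] = 8


def orphaned_students(students, departments):
    """Return names of students whose dept_id matches no department."""
    valid_ids = {d["dept_id"] for d in departments}
    return [s["name"] for s in students if s["dept_id"] not in valid_ids]


stale = orphaned_students(students, departments)
```

Both students still point at the old ID, so the two tables now disagree about where the English department is.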
Quiz
• https://quizizz.com/admin/quiz/5f07f8f4853803001bc75b8c/anamolies-and-depedency
Feedback
• https://docs.google.com/forms/d/1rk0rjFo78JadkbHRT-SshgN0XavjgYvK5SnCOBqQ8V8/edit
Normalization
• Normalization is the process of organizing the data in the database.
• Normalization is used to minimize redundancy in a relation or set of relations.
• Normalization divides larger tables into smaller tables and links them using relationships.
• Normal forms are used to reduce redundancy in database tables.
Review
• Data Modeling
– Entity
– Attribute
– Connectivity
– Cardinality -Notation
• Dependency
• Database design
Types of Normal Forms
• The higher the normal form, the lower the redundancy
i) First Normal Form(1NF)
ii) Second Normal Form(2NF)
iii) Third Normal Form(3NF)
First Normal Form (1NF)
A table is said to be in first normal form (1NF) if the following conditions hold:
• The primary key is defined.
• The primary key may be a composite key if a single column cannot be used as a primary key.
• There are no multi-valued attributes, composite attributes, or combinations of them.
Example: The relation EMPLOYEE is not in 1NF because of the multi-valued attribute EMP_PHONE.
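One way to bring such a relation into 1NF is to emit one row per phone number. The sketch below assumes rows are Python dicts; EMP_ID, EMP_NAME, and the sample values are assumptions, and only EMPLOYEE and EMP_PHONE are named on the slide.

```python
# Hypothetical unnormalized rows: EMP_PHONE holds several values.
unnormalized = [
    {"EMP_ID": 1, "EMP_NAME": "A", "EMP_PHONE": ["111", "222"]},
    {"EMP_ID": 2, "EMP_NAME": "B", "EMP_PHONE": ["333"]},
]


def to_1nf(rows, multivalued_column):
    """Emit one row per value of the multi-valued column."""
    flat = []
    for row in rows:
        for value in row[multivalued_column]:
            new_row = dict(row)                 # copy the other columns
            new_row[multivalued_column] = value # keep a single value per row
            flat.append(new_row)
    return flat


employee_1nf = to_1nf(unnormalized, "EMP_PHONE")
```

After flattening, EMP_ID alone no longer identifies a row, so in this sketch the pair (EMP_ID, EMP_PHONE) would serve as a composite key.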
Second Normal Form (2NF)
• All 1NF requirements are fulfilled.
• There is no partial dependency.
• A table in 1NF that does not have a composite key is automatically in 2NF.
Example: A school can store the data of teachers and the subjects they teach. In a school, a teacher can teach more than one subject.
Second Normal Form (2NF)
Third Normal Form (3NF)
• A table is said to be in third normal form (3NF) if the following requirements are satisfied.
– All 2NF requirements are fulfilled.
– There is no transitive dependency, i.e. no nonkey column is dependent on another nonkey column.
Third Normal Form (3NF)
• Super keys in the table above:
– {EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}, and so on
• Candidate key: {EMP_ID}
• Non-prime attributes: all attributes except EMP_ID are non-prime.
– EMP_STATE and EMP_CITY are dependent on EMP_ZIP, and EMP_ZIP is dependent on EMP_ID.
– The non-prime attributes (EMP_STATE, EMP_CITY) are therefore transitively dependent on the super key (EMP_ID), which violates the rule of third normal form.
– Hence EMP_CITY and EMP_STATE need to be moved to a new <EMPLOYEE_ZIP> table, with EMP_ZIP as its primary key.
Third Normal Form (3NF)
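The move of EMP_CITY and EMP_STATE into an EMPLOYEE_ZIP table can be sketched as follows. The column names are the ones used in the example; the helper function and all sample values are hypothetical.

```python
# Hypothetical rows; two employees share a ZIP code.
employee = [
    {"EMP_ID": 1, "EMP_NAME": "A", "EMP_ZIP": "Z1", "EMP_STATE": "S1", "EMP_CITY": "C1"},
    {"EMP_ID": 2, "EMP_NAME": "B", "EMP_ZIP": "Z1", "EMP_STATE": "S1", "EMP_CITY": "C1"},
    {"EMP_ID": 3, "EMP_NAME": "C", "EMP_ZIP": "Z2", "EMP_STATE": "S2", "EMP_CITY": "C2"},
]


def split_transitive(rows, determinant, dependents):
    """Move `dependents` into a new table keyed by `determinant`."""
    lookup = {}
    remaining = []
    for row in rows:
        # One entry per distinct determinant value in the new table.
        lookup[row[determinant]] = {c: row[c] for c in [determinant, *dependents]}
        # The original table keeps the determinant as a foreign key.
        remaining.append({c: v for c, v in row.items() if c not in dependents})
    return remaining, list(lookup.values())


employee_3nf, employee_zip = split_transitive(
    employee, "EMP_ZIP", ["EMP_STATE", "EMP_CITY"])
```

The state and city for a shared ZIP code are now stored once in employee_zip instead of once per employee.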
Other Normal forms
• BCNF - Boyce-Codd Normal Form
• 4NF - Fourth Normal Form
• 5NF - Fifth Normal Form
• DKNF - Domain-Key Normal Form
Dependency Diagram
• Arrows for total dependencies are drawn above the boxes.
• Arrows for partial and transitive dependencies are drawn below the boxes.
• i) Conversion from 1NF to 2NF
• ii) Conversion from 2NF to 3NF
Conversion from 1NF to 2NF
• If the table has a composite primary key, remove the partial dependencies.
• First, write each primary key component on a separate line; each becomes the primary key of a new table.
• Write the composite key on a third line; it becomes the composite key of the third table.
1NF to 2NF
Decomposition
Table in 2NF
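The conversion steps can be sketched as a small function, assuming a two-component composite key as in the slides. The function name, the INVOICE_NO and ITEM_NO key, and the nonkey columns are hypothetical.

```python
def decompose_to_2nf(key, dependencies):
    """Split a 1NF table with a composite key into 2NF tables.

    key: tuple of primary key column names.
    dependencies: {nonkey column: set of key columns it depends on}.
    Returns {primary key of new table: list of its columns}.
    """
    # One line per key component: each becomes the key of its own table.
    tables = {(part,): [part] for part in key}
    # A further line for the composite key itself.
    tables[tuple(key)] = list(key)
    # Place every nonkey column in the table whose key determines it.
    for column, determinant in dependencies.items():
        target = tuple(k for k in key if k in determinant)
        tables.setdefault(target, list(target)).append(column)
    return tables


tables_2nf = decompose_to_2nf(
    ("INVOICE_NO", "ITEM_NO"),
    {
        "INVOICE_DATE": {"INVOICE_NO"},          # partial dependency
        "ITEM_NAME": {"ITEM_NO"},                # partial dependency
        "QUANTITY": {"INVOICE_NO", "ITEM_NO"},   # total dependency
    },
)
```

Columns with a partial dependency end up with the key component they depend on, and only fully dependent columns stay with the composite key.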
Conversion from 2NF to 3NF
• The Invoice table still has a transitive dependency but no partial dependency.
• Move the columns involved in the transitive dependency to a new table.
• Keep the primary key of the new table as a foreign key in the existing table.
2NF to 3NF
Decomposition
Table in 3NF
Denormalization
• Normalization process - splits tables into smaller tables
• The smaller tables are joined together on common columns to retrieve information from different tables.
• Denormalization process - the reverse process
• It lowers the normal form and increases data redundancy.
• Because duplicate data is stored, more storage space is required.
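A rough sketch of the trade-off, using plain Python structures in place of tables. The Botany and English departments come from the anomaly slides; the department IDs and student names are made up.

```python
# Normalized form: each department name is stored exactly once.
departments = {1: "Botany", 2: "English"}
students = [("Asha", 1), ("Meena", 1), ("Jerry", 2)]

# Denormalized form: the department name is copied into every student row,
# so no join (lookup) is needed when reading a row.
denormalized = [
    {"student": name, "dept_id": dept_id, "dept_name": departments[dept_id]}
    for name, dept_id in students
]

# Count how many times "Botany" is stored in each form.
botany_copies_normalized = list(departments.values()).count("Botany")
botany_copies_denormalized = sum(
    1 for row in denormalized if row["dept_name"] == "Botany")
```

The denormalized rows can be read without a join, at the cost of storing the same department name once per student.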
Quiz
• https://quizizz.com/admin/quiz/5f071541f2097b001e69591a
Feedback
• https://docs.google.com/forms/d/10qpFTAotHMsCXuhBpCrNKnU3PIso8NtwjktMSc4sru8/edit
Assignment – Case Study
• Assume a video library maintains a database of movies rented out. Without any normalization, all information is stored in one table as shown below.
Assignment Results
1NF
Assignment Results
2NF
Assignment Results
3NF
