Relational database (Unit 2)
Relational Database
• Relations can be represented as two-dimensional data tables with rows and
columns.
• The rows of a relation are called tuples.
• The columns of a relation are called attributes.
• The attributes draw values from a domain (a legal pool of values).
• The number of tuples in a relation is called its cardinality while the number of
attributes in a relation is called its degree.
• A relation also consists of a schema and an instance.
• The schema defines the structure of a relation, which consists of a fixed set of
attribute-domain pairs.
• An instance of a relation is a time-varying set of tuples where each tuple consists
of attribute-value pairs.
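The terms above can be illustrated with a minimal Python sketch. The STUDENT relation, its attribute names and its values are hypothetical and chosen only for illustration; a Python set is used because, like a relation, it holds no duplicate tuples.

```python
# Hypothetical STUDENT relation (names and values are illustrative only).

# Schema: a fixed set of attribute-domain pairs.
schema = {"roll_no": int, "name": str, "dept": str}

# Instance: the set of tuples held at one point in time.
instance = {
    (1, "Asha", "CS"),
    (2, "Ravi", "EE"),
    (3, "Meena", "CS"),
}

# Adding a tuple that is already present changes nothing,
# because a set cannot contain duplicates.
instance.add((1, "Asha", "CS"))

cardinality = len(instance)  # number of tuples (rows)
degree = len(schema)         # number of attributes (columns)

# Every value must be drawn from the domain of its attribute.
domains_respected = all(
    isinstance(value, domain)
    for row in instance
    for value, domain in zip(row, schema.values())
)

print(cardinality, degree, domains_respected)
```

The instance can change over time as tuples are added or removed, while the schema stays fixed.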
Constraints: The set of rules defined for a database is known as its constraints.
• Domain Constraints:
Restrictions can be specified on the set of values that an attribute can take.
• Key Constraints:
A relation is defined to be a set of tuples. Since a set does not contain duplicates, no
two tuples can be identical.
• Entity Integrity Constraints:
These are rules that apply to all instances of a relation. They require that no
primary key value is NULL.
• Referential Integrity Constraint:
This is a special type of integrity constraint that relates two relations and
maintains consistency across the relations.
Data Integrity
Data integrity falls into the following categories:
• Entity integrity
Entity integrity ensures that each row can be uniquely identified by an attribute
called the Primary key. The Primary key cannot have a NULL value.
• Domain integrity
Domain integrity refers to the range of valid entries for a given column. It ensures
that there are only valid entries in the column.
• Referential integrity
Referential integrity ensures that for every value of a Foreign key, there is a
matching value of the Primary key.
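As a rough illustration of these integrity categories, the sketch below uses Python's built-in sqlite3 module. The department and employee tables, their columns and the age range are assumptions made for the example, not part of the original material. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Hypothetical tables used only to demonstrate the constraints.
conn.execute("""
    CREATE TABLE department (
        dept_id TEXT PRIMARY KEY NOT NULL,   -- entity integrity: unique, never NULL
        name    TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,                      -- key constraint
        age     INTEGER CHECK (age BETWEEN 18 AND 65),    -- domain constraint
        dept_id TEXT REFERENCES department(dept_id)       -- referential integrity
    )
""")
conn.execute("INSERT INTO department VALUES ('D1', 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 30, 'D1')")

def violates(sql):
    """Return True if the database rejects the statement with a constraint error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

domain_violation = violates("INSERT INTO employee VALUES (2, 150, 'D1')")    # age outside domain
key_violation = violates("INSERT INTO employee VALUES (1, 40, 'D1')")        # duplicate primary key
entity_violation = violates("INSERT INTO department VALUES (NULL, 'Ghost')") # NULL primary key
referential_violation = violates("INSERT INTO employee VALUES (3, 40, 'D9')")  # no such department

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(domain_violation, key_violation, entity_violation, referential_violation, employee_count)
```

Each rejected statement leaves the tables unchanged, so only the one valid employee row remains.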
Relational Algebra
• Relational Algebra is a procedural language used for manipulating relations.
• The relational model gives the structure for relations so that data can be stored in
that format, while relational algebra enables us to retrieve information from relations.
The operators of Relational Algebra are:
• Select:
This is a unary operator that selects the subset of tuples of a relation that satisfy
a selection condition. This can be represented by σ_c(R), where c is the selection
condition and R is the relation.
• Project:
This is also a unary operator. It chooses a subset of the attributes (columns) of a
relation and restricts all the tuples of the relation to those attributes. This is
represented by π_L(R), where L is the list of attributes to keep.
• Cartesian Product:
This is a binary operator that combines information across two relations.
The Cartesian product of two relations
R = (A1, A2, A3, A4, …) and S = (B1, B2, B3, …) can be represented as:
Q = R × S = (A1, A2, A3, …, B1, B2, B3, …).
• Join:
This is also a binary operator and is widely used. It concatenates only those pairs
of tuples that satisfy a given condition.
This is represented by R ⋈_c S, where c is the join condition.
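The four operators described above (select, project, Cartesian product and join) can be sketched in plain Python. This is a minimal illustration, not an efficient implementation: relations are modelled as lists of dictionaries, the employee and department data are hypothetical, and the two relations are assumed to have no attribute names in common.

```python
from itertools import product

def select(relation, predicate):
    """Select: keep the tuples that satisfy the condition."""
    return [t for t in relation if predicate(t)]

def project(relation, attributes):
    """Project: keep only the named attributes and drop duplicate tuples."""
    result = []
    for t in relation:
        reduced = {a: t[a] for a in attributes}
        if reduced not in result:
            result.append(reduced)
    return result

def cartesian_product(r, s):
    """Cartesian product: pair every tuple of r with every tuple of s.
    Assumes r and s share no attribute names."""
    return [{**t, **u} for t, u in product(r, s)]

def join(r, s, condition):
    """Join: the Cartesian product restricted to pairs meeting the condition."""
    return select(cartesian_product(r, s), condition)

# Hypothetical sample relations.
employee = [
    {"emp_id": 1, "emp_name": "Asha", "emp_dept": "D1"},
    {"emp_id": 2, "emp_name": "Ravi", "emp_dept": "D2"},
    {"emp_id": 3, "emp_name": "Meena", "emp_dept": "D1"},
]
department = [
    {"dept_id": "D1", "dept_name": "Sales"},
    {"dept_id": "D2", "dept_name": "Stores"},
]

in_d1 = select(employee, lambda t: t["emp_dept"] == "D1")
depts_used = project(employee, ["emp_dept"])
pairs = cartesian_product(employee, department)
joined = join(employee, department, lambda t: t["emp_dept"] == t["dept_id"])

print(len(in_d1), depts_used, len(pairs), len(joined))
```

Writing join as a selection over the Cartesian product mirrors its definition. Real database systems use far more efficient join algorithms.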
Union, Intersection and Difference:
• For these operations, the two relations must have the same number of attributes, and
corresponding attributes must have the same domain.
• Also, since the result of each of these operators has to be a relation, duplicates are
removed.
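A small Python sketch of these set operators follows. The two schemas and their tuples are hypothetical, and the compatibility check simply compares the number of attributes and the domain at each position, as described above. Python sets remove duplicates automatically, so each result is again a relation.

```python
def union_compatible(schema_r, schema_s):
    """True if both schemas have the same number of attributes and
    corresponding attributes draw values from the same domain."""
    return len(schema_r) == len(schema_s) and all(
        dom_r == dom_s for (_, dom_r), (_, dom_s) in zip(schema_r, schema_s)
    )

# Hypothetical schemas: attribute names differ, but the domains match by position.
schema_r = [("roll_no", int), ("name", str)]
schema_s = [("member_id", int), ("member_name", str)]

r = {(1, "Asha"), (2, "Ravi")}
s = {(2, "Ravi"), (3, "Meena")}

if union_compatible(schema_r, schema_s):
    union = r | s          # tuples in r or s; (2, "Ravi") appears only once
    intersection = r & s   # tuples in both r and s
    difference = r - s     # tuples in r but not in s

print(sorted(union), sorted(intersection), sorted(difference))
```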
Database Planning
• The database design should be adequate for serving different applications.
• Improper database planning will result in loss of data interrelationships,
redundancy of data, loss of data sharing, loss of control over data, etc.
The needs of an organisation can be classified into three categories, as
below:
• Operational database, which contains data assisting the day-to-day activities of the
organization. For example: sales, product stock, etc.
• Control database, which contains data assisting the needs of middle management to
monitor and control business activities by effective and efficient management of
men, machines and money. For example: monthly performance summaries, sales
statements, etc.
• Strategic planning database, which contains data required by top management for
taking long-term decisions, such as the economic policies of the government, data on
similar types of organisation, etc.
Steps in Database Planning
• Before designing the database, certain decisions have to be taken regarding the approach
and the type of database.
• An organisation may go for a single large global database or for separate databases
serving different groups of applications.
Single global database
• Results from a total systems approach, in which the requirements of the entire organisation
and its business functions are inter-related.
• This results in minimum redundancy and maximum sharing of data.
• This also allows full control over the data design and access.
• This approach results in a more complex database with a large number of data
types and relationships, and requires considerable time and effort in design and
implementation.
Multiple database approach
• Reduces the time and effort required for design and lowers the risk in database
management.
• However, this approach may lead to problems in data sharing and
interoperability.
A database can be designed using a top-down or a bottom-up
approach.
Top-down approach
• Starts with an enterprise-level model, preparing a data model for the entire application
domain.
• This approach does not consider specific details of processing.
• The ER model is prepared by identifying the main entities and relationships.
• With this approach, there is a possibility of not accounting for all user requirements
and of missing certain entities or relationships.
Bottom-up approach
• Begins with individual user requirements.
• A data model for each requirement is prepared by understanding the inputs, outputs and
process.
• These individual models are then merged into a single conceptual schema.
• The main disadvantage of this approach is that it fails to take future
requirements into account.
After deciding on the approach, planning is carried out for
database design. The important points to keep in mind
while planning are listed below:
• Decide on the business steps that will make use of the database.
• Establish the database administration function.
• Perform business system analysis.
• Build the information model.
• Develop a data distribution plan.
• Develop an implementation plan.
• Review.
Database Design
Some guidelines to be followed while designing the database:
• A database design should be unambiguous and easy to understand.
• It should avoid or reduce redundancy.
• Unrelated data should be kept in separate tables so that updating the data is easy.
• The design should have no inconsistencies.
• An entity should have attributes that are generally present for all its instances.
Minimise null values for the attributes in the design.
• Define constraints so that correctness of data will be ensured.
• Design should facilitate information addition in future.
• Maintenance should be easy.
• Some of these can be translated into design principles for good relations.
From the user's point of view, the important criteria for design
are:
• Meaningful grouping of attributes
• No redundancy
• No inapplicable attributes
• Uniformity in naming and definitions of the data items.
Conversion of ER Diagram to Relations
• The ER model represents the conceptual entities and their interrelationships at a
logical level.
• Using some simple rules, one can convert an ER model to a relational database.
• The relational database obtained with this approach will avoid redundancies and
update anomalies.
• This is possible only if the ER diagram is drawn correctly.
The rules to be followed to derive the relational
model from the ER diagram are:
• Define a table for each strong entity where the table has one column for each
simple attribute. The key of the table will be key of the entity itself.
• Define a table for each weak entity. It will contain the primary key of the strong
entity on which it depends, and it will also include attributes of the weak entity.
• Define a table for each relationship where the table consists of the primary keys of
the participating entity sets and the attributes of the relationship itself.
• For one-to-one relationships, it is not necessary to define a separate table.
• Try to avoid defining a separate table for a many-to-one binary relationship.
• Consider the example of student and course.
• Here the relationship is many-to-many.
• From the above rules, we can define three tables for this situation, namely the student
entity, the course entity and the study relationship.
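The student and course example can be sketched with Python's built-in sqlite3 module. The table names follow the text above, but every column name (student_id, course_id, grade and so on) and every sample row is an assumption made for illustration. The study table holds the primary keys of both participating entities plus an attribute of the relationship itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
    -- One table per strong entity, keyed by the entity's own key.
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE course (
        course_id TEXT PRIMARY KEY NOT NULL,
        title     TEXT NOT NULL
    );
    -- One table for the many-to-many relationship: the keys of both
    -- participating entities plus the relationship's own attribute.
    CREATE TABLE study (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course_id  TEXT    NOT NULL REFERENCES course(course_id),
        grade      TEXT,
        PRIMARY KEY (student_id, course_id)
    );
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [("DB", "Databases"), ("OS", "Operating Systems")])
conn.executemany("INSERT INTO study VALUES (?, ?, ?)",
                 [(1, "DB", "A"), (1, "OS", "B"), (2, "DB", "A")])

# Which student studies which course: join the relationship table to both entities.
rows = conn.execute("""
    SELECT s.name, c.title
    FROM study AS st
    JOIN student AS s ON s.student_id = st.student_id
    JOIN course  AS c ON c.course_id  = st.course_id
    ORDER BY s.name, c.title
""").fetchall()

print(rows)
```

A student can appear in several study rows and so can a course, which is how the three tables represent the many-to-many relationship without repeating student or course details.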