Database design


Published in: Technology

  1. 1. Database Design The database design process can be broken down into five phases: • Planning • Analysis • Design • Implementation • Maintenance
  2. 2. Planning Phase In the planning phase, the overall database structure is defined: • The purpose of the database is determined – What information will be used in the database – How the information is to be used – What questions will be answered • Feasibility studies are conducted. • Requirements are gathered.
  3. 3. Analysis phase Databases can be analyzed using different models: • Conceptual model – High-level description of facts – Not system specific • Logical model – Organization of data with some implementation information • Physical model – Actual storage of information (clustering, partitioning, indexing, etc.)
  4. 4. Conceptual model • Provides a framework for developing a database structure. • The three database components (entities, attributes, and relationships) are described in detail.
  5. 5. Entities • An entity is a thing that exists and is distinguishable, e.g. a person, place, object, or concept. • Entities are the basic building blocks of a database design. • A particular occurrence of an entity is known as an entity instance. • A group of similar entities is called an entity set or entity class.
  6. 6. Attributes Attributes describe properties of entities and relationships. • Simple (scalar) - the smallest semantic unit of data; atomic (no internal structure) and single-valued, e.g. city • Composite - a group of attributes, e.g. address (street, city, state, zip) • Multivalued (list) - multiple values, e.g. degrees, courses, skills (not allowed in first normal form) • Domain - the conceptual definition of an attribute: a named set of scalar values, all of the same type, that forms the pool of possible values, e.g. integer
  7. 7. Relationships A relationship is a connection between entity classes. For example, a relationship between PERSONS and AUTOMOBILES could be an "OWNS" relationship. That is to say, people own automobiles. • The degree of a relationship indicates the number of entities involved. • The cardinality of a relationship indicates the number of instances in entity class E1 that can or must be associated with instances in entity class E2
  8. 8. Types of Relationship Based on the cardinality of a relationship, there are three types: • One-One Relationships - For each entity in one class there is at most one associated entity in the other class. For example, for each husband there is at most one current legal wife (in this country at least), and a wife has at most one current legal husband. • Many-One Relationships - One entity in class E2 is associated with zero or more entities in class E1, but each entity in E1 is associated with at most one entity in E2. For example, a woman may have many children but a child has only one birth mother. • Many-Many Relationships - There are no restrictions on how many entities in either class are associated with a single entity in the other. For example, students take classes: each student takes many classes, and each class has many students.
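As a rough illustration of how the one-one and many-one cases might be enforced in a relational schema, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (husband, wife, mother, child) are assumptions taken from the examples on this slide, not a prescribed design.

```python
import sqlite3


def build_family_schema():
    """Create in-memory tables for the one-one and many-one examples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE husband (husband_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- One-one: UNIQUE on the foreign key lets each husband appear at most once.
        CREATE TABLE wife (
            wife_id    INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            husband_id INTEGER UNIQUE REFERENCES husband(husband_id)
        );
        CREATE TABLE mother (mother_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- Many-one: many child rows may share the same mother_id.
        CREATE TABLE child (
            child_id  INTEGER PRIMARY KEY,
            name      TEXT NOT NULL,
            mother_id INTEGER NOT NULL REFERENCES mother(mother_id)
        );
        """
    )
    return conn


def insert_ok(conn, sql, params):
    """Return True if the insert succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this schema, a second wife row pointing at the same husband is rejected, while any number of child rows may point at the same mother.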
  9. 9. Logical model • After validating your conceptual model, you can generate a logical model: – Entity classes are modeled as tables – Attributes are modeled as fields – Each instance of an entity is stored as a record – Domains are modeled as data types – A primary key is defined for each table – Foreign keys are defined for each relationship
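The mapping above can be sketched with the PERSONS / AUTOMOBILES "OWNS" example from the Relationships slide. This is only an illustrative sketch with Python's sqlite3 module; the column names (person_id, vin, model, owner_id) and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (                 -- entity class -> table
        person_id INTEGER PRIMARY KEY,    -- primary key
        name      TEXT NOT NULL           -- attribute -> field, domain -> data type
    );
    CREATE TABLE automobile (
        vin      TEXT PRIMARY KEY,
        model    TEXT NOT NULL,
        owner_id INTEGER REFERENCES person(person_id)  -- foreign key for OWNS
    );
    """
)
# Each entity instance becomes one record (row).
conn.execute("INSERT INTO person VALUES (1, 'Alice')")
conn.execute("INSERT INTO automobile VALUES ('VIN001', 'Sedan', 1)")


def owner_of(vin):
    """Follow the foreign key from an automobile to its owner's name."""
    row = conn.execute(
        "SELECT p.name FROM automobile a JOIN person p ON p.person_id = a.owner_id"
        " WHERE a.vin = ?",
        (vin,),
    ).fetchone()
    return row[0] if row else None
```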
  10. 10. Physical model • How data will be stored and accessed in a computer system. • Where data will be stored. • An estimate of the amount of disk space that the database will require. • How data will be distributed within an organization or across disks. • The types of indexes to be used (for efficient retrieval and manipulation).
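To make the indexing point concrete, the sketch below asks SQLite for its query plan with and without an index. The employee table and the index name idx_employee_last_name are hypothetical, and the exact wording of the plan text varies between SQLite versions, so the example only returns it rather than assuming a particular format.

```python
import sqlite3


def plan_for_lookup(with_index):
    """Return SQLite's query-plan text for a lookup by last_name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, last_name TEXT, dept TEXT)"
    )
    if with_index:
        # A physical-design decision: index the column used for retrieval.
        conn.execute("CREATE INDEX idx_employee_last_name ON employee(last_name)")
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM employee WHERE last_name = ?",
        ("Smith",),
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(str(row[-1]) for row in rows)
```

Calling plan_for_lookup(True) should mention the index by name, whereas plan_for_lookup(False) falls back to scanning the table.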
  11. 11. Design Phase Determine how best to represent the information system identified in the previous phase by mapping the logical and physical models onto a real system: – The database management system (DBMS) to be used – User views (input forms, output reports) – Security mechanisms, etc.
  12. 12. Implementation phase Actual implementation of the database and associated programming. • The database is analyzed for possible errors • Tables are created and populated with a few sample records to see whether the desired results are achieved • Fine adjustments are made as needed
  13. 13. Entity Relationship Model • A conceptual data model that views the real world as entities and relationships. • A basic component of the model is the entity-relationship diagram (ERD). • ERDs provide a convenient method for visualizing the interrelationships among entities in a given application.
  14. 14. The utility of the ER model is: • It maps well to the relational database model. • It is simple and easy to understand with a minimum of training. • The model can be used as a design plan by the database developer to implement a data model in specific database management software.
  15. 15. Basic Elements in E-R Modeling The basic elements in the ER model are: • Entities • Attributes • Relationships
  16. 16. Entities • Data object about which information is to be collected. • Some specific examples of entities are EMPLOYEE, PROJECT, INVOICE. • An entity occurrence (also called an instance) is an individual occurrence of an entity. • Entity set: a collection of similar entities (employees, projects, departments)
  17. 17. Attributes • Attributes describe the entity with which they are associated. • A particular instance of an attribute is a value. • Attributes can be classified as identifiers or descriptors. • Identifiers, more commonly called keys, uniquely identify an instance of an entity. • A descriptor describes a non-unique characteristic of an entity instance.
  18. 18. Relationships • A relationship represents an association between two or more entities. Examples of relationships: – employees are assigned to projects – projects have subtasks – departments manage one or more projects • Relationships are classified in terms of – degree, – connectivity, – cardinality, – and existence.
  19. 19. Classifying Relationships Degree of a Relationship • The number of entities associated with the relationship. – A UNARY RELATIONSHIP exists when an association exists within a single entity. – A BINARY RELATIONSHIP exists when two entities (participants) are in the relationship. – A TERNARY RELATIONSHIP exists when three entities (participants) are in the relationship.
  20. 20. Classifying Relationships Connectivity – Describes the mapping of associated entity instances in the relationship. – The values of connectivity are "one" or "many". – The basic types of connectivity for relationships are one-to-one, one-to-many, and many-to-many. Cardinality – The actual number of related occurrences for each of the two entities.
  21. 21. Classifying Relationships Existence • Denotes whether the existence of an entity instance is dependent upon the existence of another, related, entity instance. • Defined as either mandatory or optional. – For mandatory existence, an instance of the entity must always occur, for example "every project must be managed by a single department". – For optional existence, an instance of the entity may occur but is not required.
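One common way to express existence in a relational schema is through nullability of the foreign key: NOT NULL for mandatory, nullable for optional. The sketch below uses Python's sqlite3 module. The mandatory project-department rule comes from the slide; the optional employee-department link is an assumption added purely for contrast.

```python
import sqlite3


def existence_demo():
    """Try to insert rows that omit the related department and record the outcome."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE project (
            project_id INTEGER PRIMARY KEY,
            title      TEXT NOT NULL,
            dept_id    INTEGER NOT NULL REFERENCES department(dept_id)  -- mandatory
        );
        CREATE TABLE employee (
            emp_id  INTEGER PRIMARY KEY,
            name    TEXT NOT NULL,
            dept_id INTEGER REFERENCES department(dept_id)  -- optional (NULL allowed)
        );
        """
    )
    attempts = {
        "project_without_department": "INSERT INTO project VALUES (1, 'Payroll', NULL)",
        "employee_without_department": "INSERT INTO employee VALUES (1, 'Lee', NULL)",
    }
    outcomes = {}
    for label, sql in attempts.items():
        try:
            conn.execute(sql)
            outcomes[label] = "accepted"
        except sqlite3.IntegrityError:
            outcomes[label] = "rejected"
    return outcomes
```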
  22. 22. ER Notation • There is no universal standard for representing data objects in ER diagrams. • A number of notation styles are in use today; among the more common are Information Engineering, Bachman, Chen, and Martin.
  23. 23. ER Notation Martin Style. • Entities are represented by labeled rectangles. The label is the name of the entity. Entity names should be singular nouns. • Relationships are represented by a solid line connecting two entities. The name of the relationship is written above the line. Relationship names should be verbs. • Attributes, when included, are listed inside the entity rectangle. Identifier Attributes are underlined. Attribute names should be singular nouns. • Cardinality of many is represented by a line ending in a crow's foot. If the crow's foot is omitted, the cardinality is one. • Existence is represented by placing a circle or a perpendicular bar on the line. Placing a bar line next to the entity shows mandatory existence. Placing a circle next to the entity shows optional existence.
  24. 24. Martin Style.
  25. 25. ER Notation Chen Style • Rectangles represent ENTITY CLASSES • Circles represent ATTRIBUTES • Diamonds represent RELATIONSHIPS • Lines - lines connect entities to relationships. Lines are also used to connect attributes to entities. • Underline - Key attributes of entities are underlined. • Number notations represent cardinality. • The name of the entity (class) or attribute or relationship is usually placed inside the symbol used for that object. (Sometimes, the name is placed adjacent.)
  26. 26. Chen Style
  27. 27. Refining The Entity-Relationship Diagram This section discusses four basic rules for modeling relationships 1. Entities Must Participate In Relationships – Entities cannot be modeled unrelated to any other entity. – The exception to this rule is a database with a single table.
  28. 28. Refining The Entity-Relationship Diagram 2. Resolve Many-To-Many Relationships – Many-to-many relationships cannot be implemented directly in the relational model, so they should not remain in the final data model. – They must be resolved early in the modeling process. – To resolve one, replace the relationship with an association entity and then relate the two original entities to the association entity.
  29. 29. This strategy is demonstrated in the figure below. Here, employees may be assigned to many projects. Each project must have more than one employee assigned to it.
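The employee/project example can be sketched in code as follows, using Python's sqlite3 module. The association entity is called assignment here, which is an assumed name, as are the sample rows. Note that the schema itself does not enforce the "more than one employee per project" minimum; that rule would need application logic or a trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE project  (project_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Association entity: one row per (employee, project) pairing.
    CREATE TABLE assignment (
        emp_id     INTEGER NOT NULL REFERENCES employee(emp_id),
        project_id INTEGER NOT NULL REFERENCES project(project_id),
        PRIMARY KEY (emp_id, project_id)
    );
    INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO project  VALUES (10, 'Apollo'), (20, 'Gemini');
    INSERT INTO assignment VALUES (1, 10), (1, 20), (2, 10), (3, 20);
    """
)


def projects_for(emp_id):
    """List the project titles one employee is assigned to."""
    rows = conn.execute(
        "SELECT p.title FROM assignment a JOIN project p USING (project_id)"
        " WHERE a.emp_id = ? ORDER BY p.title",
        (emp_id,),
    ).fetchall()
    return [title for (title,) in rows]


def employees_on(project_id):
    """List the names of employees assigned to one project."""
    rows = conn.execute(
        "SELECT e.name FROM assignment a JOIN employee e USING (emp_id)"
        " WHERE a.project_id = ? ORDER BY e.name",
        (project_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

Each original entity now has a one-to-many relationship with assignment, and the many-to-many question is answered by joining through it in either direction.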
  30. 30. Refining The Entity-Relationship Diagram 3. Transform Complex Relationships into Binary Relationships • Complex relationships are classified as ternary, an association among three entities, or n-ary, an association among more than three, where n is the number of entities involved. • They cannot be directly implemented in the relational model, • so they should be resolved early in the modeling process. • The strategy for resolving complex relationships is similar to resolving many-to-many relationships. • Replace the complex relationship with an association entity and then relate each of the original entities to the association entity.
  31. 31. Here is an example: employees can use different skills on one or more projects. Each project uses many employees with various skills.
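A possible relational rendering of this ternary example is sketched below with Python's sqlite3 module. The association entity skill_use and the sample data are assumptions. Each of the three original entities is related to the association entity through a binary (one-to-many) relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE skill    (skill_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE project  (project_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Association entity replacing the ternary relationship.
    CREATE TABLE skill_use (
        emp_id     INTEGER NOT NULL REFERENCES employee(emp_id),
        skill_id   INTEGER NOT NULL REFERENCES skill(skill_id),
        project_id INTEGER NOT NULL REFERENCES project(project_id),
        PRIMARY KEY (emp_id, skill_id, project_id)
    );
    INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO skill    VALUES (1, 'SQL'), (2, 'Testing');
    INSERT INTO project  VALUES (10, 'Apollo'), (20, 'Gemini');
    INSERT INTO skill_use VALUES (1, 1, 10), (1, 2, 10), (1, 1, 20), (2, 2, 10);
    """
)


def skills_used(emp_id, project_id):
    """List the skills a given employee uses on a given project."""
    rows = conn.execute(
        "SELECT s.name FROM skill_use u JOIN skill s USING (skill_id)"
        " WHERE u.emp_id = ? AND u.project_id = ? ORDER BY s.name",
        (emp_id, project_id),
    ).fetchall()
    return [name for (name,) in rows]
```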
  32. 32. Refining The Entity-Relationship Diagram 4. Eliminate redundant relationships – A redundant relationship is a relationship between two entities that is equivalent in meaning to another relationship between those same two entities.
  33. 33. For example, Figure A shows a redundant relationship between DEPARTMENT and WORKSTATION. This relationship provides the same information as the relationships DEPARTMENT has EMPLOYEES and EMPLOYEE assigned WORKSTATION. Figure B shows the solution, which is to remove the redundant relationship DEPARTMENT assigned WORKSTATIONS.
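To see why the direct DEPARTMENT-WORKSTATION relationship adds nothing, the sketch below derives a department's workstations by joining through EMPLOYEE. It uses Python's sqlite3 module; the column names and sample rows are assumptions, and there is deliberately no foreign key between department and workstation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department  (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE workstation (ws_id INTEGER PRIMARY KEY, label TEXT NOT NULL);
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id),  -- DEPARTMENT has EMPLOYEES
        ws_id   INTEGER REFERENCES workstation(ws_id)             -- EMPLOYEE assigned WORKSTATION
    );
    INSERT INTO department  VALUES (1, 'Sales'), (2, 'IT');
    INSERT INTO workstation VALUES (100, 'WS-A'), (101, 'WS-B'), (102, 'WS-C');
    INSERT INTO employee VALUES (1, 'Ann', 1, 100), (2, 'Bob', 1, 101), (3, 'Cy', 2, 102);
    """
)


def workstations_of(dept_id):
    """Derive a department's workstations through its employees."""
    rows = conn.execute(
        "SELECT w.label FROM employee e JOIN workstation w USING (ws_id)"
        " WHERE e.dept_id = ? ORDER BY w.label",
        (dept_id,),
    ).fetchall()
    return [label for (label,) in rows]
```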
  34. 34. Tips for Effective ER Diagrams • Make sure that each entity only appears once per diagram. • Name every entity, relationship, and attribute on your diagram. • Examine relationships between entities closely. Are they necessary? Are there any relationships missing? Eliminate any redundant relationships. Don't connect relationships to each other. • Use colors to highlight important portions of your diagram.
  35. 35. Normalization • Normalization is the process of refining a database design to produce table schemas in normal form. • A normal form refers to a class of relational schemas that obey some set of rules. • Schemas that obey the rules are said to be in the normal form. • In a non-normalized design, the same data may be stored repeatedly. • Normalization aims to minimize redundancy in the database.
  36. 36. Classifying normal forms • There are six commonly recognized normal forms, with the inspired names: – First normal form (or 1NF) – Second normal form (or 2NF) – Third normal form (or 3NF) – Boyce-Codd normal form (or BCNF) – Fourth normal form (or 4NF) – Fifth normal form (or 5NF) • We will consider the first three of these normal forms
  37. 37. First normal form (or 1NF) A relation is in First Normal Form (1NF) if every attribute value is indivisible (atomic) and every column is unique. • First normal form (1NF) sets the very basic rules for an organized database: – Eliminate duplicative columns from the same table. – Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).
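A small sketch of the atomicity rule, in plain Python: a comma-separated "courses" value is split so that every row holds one indivisible value. The student/course data is hypothetical.

```python
def to_first_normal_form(rows):
    """Split a comma-separated courses value into one atomic row per course."""
    atomic_rows = []
    for student, courses in rows:
        for course in courses.split(","):
            atomic_rows.append((student, course.strip()))
    return atomic_rows


# One multivalued field per student (not in 1NF).
unnormalized = [("Ann", "Math, Physics"), ("Bob", "Chemistry")]
# One (student, course) pair per row (atomic values).
normalized = to_first_normal_form(unnormalized)
```

In the normalized form, the pair (student, course) can serve as the primary key of the resulting table.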
  38. 38. Second normal form (or 2NF) A relation is in Second Normal Form (2NF) if it is in 1NF and if all of its attributes are dependent on the whole key (i.e. none of the non-key attributes are related only to a part of the key). • Second normal form (2NF) further addresses the concept of removing duplicative data: – Remove subsets of data that apply to multiple rows of a table and place them in separate tables. – Create relationships between these new tables and their predecessors through the use of foreign keys.
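As an illustration of a partial dependency, assume a relation keyed by (emp_id, project_id) that also stores emp_name and hours. Here emp_name depends only on emp_id, which is part of the key, while hours depends on the whole key. The plain-Python sketch below splits such rows into two relations; the attribute names and data are assumptions.

```python
def to_second_normal_form(rows):
    """Move emp_name, which depends only on emp_id, into its own relation."""
    employees = {}      # emp_id -> emp_name (depends on part of the key)
    assignments = []    # (emp_id, project_id, hours) (depends on the whole key)
    for emp_id, project_id, emp_name, hours in rows:
        employees[emp_id] = emp_name
        assignments.append((emp_id, project_id, hours))
    return employees, assignments


rows = [(1, 10, "Ann", 12), (1, 20, "Ann", 5), (2, 10, "Bob", 8)]
employees, assignments = to_second_normal_form(rows)
```

After the split, each employee name is stored once, and emp_id in assignments acts as the foreign key back to employees.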
  39. 39. Third normal form (or 3NF) A relation is in Third Normal Form (3NF) if it is in 2NF and there are no transitive dependencies (i.e. none of the non-key attributes are dependent upon another attribute which in turn is dependent on the relation key). • Third normal form (3NF) goes one large step further: remove columns that are not directly dependent upon the primary key.
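A typical transitive dependency is emp_id -> dept_id -> dept_name. The sketch below, using Python's sqlite3 module with assumed table names and data, keeps dept_name in its own table so that renaming a department touches exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- dept_name depends on dept_id, not directly on emp_id, so it lives here.
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL);
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (1, 'Sales');
    INSERT INTO employee VALUES (1, 'Ann', 1), (2, 'Bob', 1);
    """
)


def rename_department(dept_id, new_name):
    """Rename a department and return how many rows were updated."""
    cur = conn.execute(
        "UPDATE department SET dept_name = ? WHERE dept_id = ?", (new_name, dept_id)
    )
    return cur.rowcount


def employee_departments():
    """Return (employee name, department name) pairs via the foreign key."""
    return conn.execute(
        "SELECT e.name, d.dept_name FROM employee e JOIN department d USING (dept_id)"
        " ORDER BY e.name"
    ).fetchall()
```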
  40. 40. Fourth normal form (or 4NF) • A relation is in 4NF if it is in BCNF and has no non-trivial multi-valued dependencies other than those determined by a candidate key.