oracle

basic fundamentals

Published in: Education, Technology

oracle

  1. 1. Database Fundamentals Basic & Intermediate
  2. 2. Module Objective: After completing this Module, you should: • Understand what a Database System is • Explain briefly the different types of Database Systems • Be able to create a Database environment with E-R Modeling • Have a broad overview of Relational Database Management Systems • Have an introduction to Structured Query Language • Understand how the DBMS and its host computer system intercommunicate • Be aware of new trends in Databases
  3. 3. Module Outline 1. What is a Database System 2. Types of Database Systems 3. Creating a Database Environment 4. Structured Query Language 5. Internal Management 6. Database Trends
  4. 4. 1.0 Database System Learning Objective: At the end of this Topic you will be able to – • Understand what is a Database System • Know how files are organized • Appreciate the advantages of using a DBMS over a traditional file system • Be aware of the Database Architecture
  5. 5. What is a Database System • A Database System is essentially a computerized record-keeping system. • A database-management system (DBMS) consists of a collection of interrelated data and a set of programs to access those data. • Database systems are designed to manage large volumes of information.
  6. 6. File Organization : Terms and Concepts • Database: Group of related files • File: Group of records of the same type • Record: Group of related fields • Field: Group of words or a complete number • Byte: Group of bits that represents a single character • Bit: Smallest unit of data; binary digit (0, 1) [Figure: Data Hierarchy in a Computer System]
  7. 7. File Organization : Terms and Concepts • Entity: Person, place, thing, or event about which information is maintained • Attribute: Description of a particular entity • Key Field: Identifier field used to retrieve, update, or sort a record
  8. 8. File Organization : Terms and Concepts Problems with the Traditional File Environment • Data redundancy • Program-data dependence • Lack of flexibility • Poor security • Lack of data sharing and availability • No concurrency control [Figure: Traditional File Processing]
  9. 9. DBMS and its Advantages • A Database Management System is a collection of programs that enables users to create and maintain a database. It is a general-purpose software system that facilitates the processes of defining, constructing and manipulating databases for various applications. • Advantages of the Database approach: • Controlling redundancy • Restricting unauthorized access • Providing persistent storage for program objects and data structures • Permitting inference and actions using deduction rules • Providing multiple user interfaces • Representing complex relationships among data • Enforcing integrity constraints • Providing backup and recovery
  10. 10. Database Management System (DBMS) • Acts as an interface between application programs and physical data files • Separates logical and physical views of data • Reduces redundancy of data • Creates and maintains databases • Enforces security of data (Figure 7-4)
  11. 11. DBMS Architecture • Internal Schema : Describes physical storage structure of database • Conceptual Schema : Describes structure of whole database for a community of users. • External Schema : Each view describes that part of database that a particular user requires, and hides the rest.
  12. 12. DBMS Architecture • Data Independence • Logical data independence: capacity to change the conceptual schema without having to change the external schema. • Physical data independence: capacity to change the internal schema without changing the conceptual schema.
  13. 13. Functions of DBMS • Data definition: Specifies content and structure of database and defines each data element • Data manipulation: Manipulates data in a database • Data security and integrity: Monitors user requests and rejects any unauthorized attempts • Data recovery and concurrency: Enforces certain controls for recovery and concurrency • Data dictionary: Stores definitions of data elements, and data characteristics • Performance: Functions should be performed efficiently
  14. 14. Requirements of a DBMS Key elements in a database environment: • Data Administration • Data Planning and Modeling Methodology • Database Technology and Management • Users
  15. 15. Database System : Recap • Why do businesses have trouble finding the information they need in their information systems? • How does a database management system help businesses improve the organization of their information? • What are the advantages of using a DBMS over a traditional file system? • State the major functions and requirements of a DBMS.
  16. 16. Quiz • If a Customer Database has the following fields: EmpId, EmpName, Salary and DeptName, what would be the ideal Key field, and why? • EmpId • EmpName • DeptName • EmpId+DeptName
  17. 17. 2.0 Types of Databases Learning Objective: At the end of this Topic you will be able to – • Explain briefly the various types of Database Systems • Relational DBMS • Hierarchical DBMS • Network DBMS • Object-Oriented Databases
  18. 18. Relational Database Model • Represents data as two-dimensional tables called relations • Relates data across tables based on a common data element • Examples: DB2, Oracle, MS SQL Server
  19. 19. Three Basic Operations in a Relational Database • Select: Creates subset of rows that meet specific criteria • Join: Combines relational tables to provide users with information • Project: Enables users to create new tables containing only relevant information
  20. 20. Three Basic Operations in a Relational Database [Figure: SELECT, JOIN, and PROJECT operations]
  21. 21. Hierarchical Database Model • It is a pointer-based model • Organizes data in a tree-like structure • Stores data in tables and views relationships as links • Supports one-to-many parent-child relationships • Prevalent in large legacy systems
  22. 22. Network DBMS • Depicts data logically as many-to-many relationships • Organizes data in tables and views relationships as links • It is also a pointer-based model • Organizes data in arbitrary graphs
  23. 23. Hierarchical and Network DBMS Some of the Disadvantages • Outdated • Complex pointer-based organization • Less flexible compared to RDBMS • Lack support for ad-hoc and English-like queries
  24. 24. Object-Oriented Databases • Object-oriented DBMS: Stores data and procedures as objects that can be retrieved and shared automatically • Object-relational DBMS: Provides capabilities of both object-oriented and relational DBMS
  25. 25. Types of Databases : Summary • In a relational database the data is perceived as tables (and nothing but tables) by the user • The relational operators available are used to manipulate the data in the tables
  26. 26. 3.0 Creating a DB environment Learning Objective: At the end of this Topic you will – • Have the ability to model an application system based on the E-R Modeling approach. • Understand the Relational Database concepts like Normalization, Data Integrity, Relational Operations like Union, Intersection etc. • Be able to Design Relational Databases based on E-R Models or System Requirements for an application.
  27. 27. Introduction to Data Modeling • What is Data Modeling? A technique for analyzing requirements and for identifying the information needs of an organization • Why is Data Modeling important? A good system cannot be built without knowing what data needs to be captured and how it needs to be organized
  28. 28. Introduction to Data Modeling • An Overview: • Conceptual representation of the data structures required by a database • Data structures include the data objects, the associations between data objects, and the rules which govern operations on the objects • Focuses on what data is required and how it should be organized • Independent of hardware or software constraints • Data Model and Database Design: • A Data Model is to a Database what a building plan or blueprint is to a Building • A Database Design translates a data model into a database • A Data Model is the conceptual design of a database
  29. 29. E-R Modeling • Originally proposed by Peter Chen (1976) • Views the real world as entities and relationships • Key component is the E-R Diagram • Most common model used for designing relational databases • Entity: An identifiable object or concept of significance • Attribute: Property of an entity or relationship • Relationship: An association between entities • Identifier: One or more attributes identifying an instance (occurrence) of an entity
  30. 30. Entity relationship diagram
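  31. 31. E-R Modeling [Figure: annotated E-R diagram. Entity EMPLOYEE (attributes: Name, Emp Id.) "works for" entity DEPARTMENT (attributes: Dept No., Name); DEPARTMENT "has" EMPLOYEE. Callouts label the Entity, Relationship, Attributes, and Identifier.]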
  32. 32. E-R Modeling • Entity • Any object or thing of significance about which data needs to be collected and maintained • Could be • Concrete or tangible, like a person or a building • Abstract, like a concept or activity • Analogous to a table in a relational database • Examples: EMPLOYEES, PROJECTS, INVOICES
  33. 33. E-R Modeling • Entity Rules • Any thing or object may only be represented by one entity. Entities are mutually exclusive in all cases. • Each entity must be uniquely identifiable. Each instance (occurrence) of an entity must be separate and distinctly identifiable from all other instances of that type of entity. • Entity Classification and Types • Classified as dependent and independent • An independent entity is one that does not rely on another for identification • A dependent entity is one that relies on another for identification • In some methodologies, the terms used are strong and weak, respectively
  34. 34. E-R Modeling • Entity Classification and Types • Fundamental entity: An entity that exists and is of interest in its own right. Generally, most entities in the data model are fundamental entities. Example: Department and Employee are both fundamental entities • Special Entity Types • Associative entity: Used to associate two entities in order to reconcile a many-to-many relationship • Sub-type/super-type: Used in generalization hierarchies to represent a subset of instances of their parent entity
  35. 35. E-R Modeling Example of an associative entity: [Figure: ORDER "has" ORDER LINE; ORDER LINE "belongs to" ORDER; ORDER LINE is "for a" ITEM; ITEM "appears on" ORDER LINE]
  36. 36. E-R Modeling • Generalization Hierarchies • Generalization occurs when two or more entities represent categories of the same real-world object. Example: CAR and TRUCK represent categories of the same entity, VEHICLE. VEHICLE is the super-type; CAR and TRUCK are the sub-types.
  37. 37. E-R Modeling • Generalization Hierarchies • A form of abstraction which specifies that two or more entities that share common attributes can be generalized into a higher-level entity type called a super-type or generic entity. • The lower-level entities become the sub-types, or categories, of the super-type. Sub-types are dependent entities.
  38. 38. E-R Modeling • Generalization Hierarchies • Sub-types can be either mutually exclusive (disjoint) or overlapping (inclusive) • In an overlapping hierarchy an entity instance can be part of multiple subtypes. Example: Entity PERSON represents people at a university. It has three subtypes, FACULTY, STAFF, and STUDENT. A STAFF member could also be registered as a STUDENT. [Figure: PERSON with subtypes STUDENT, STAFF, and FACULTY]
  39. 39. E-R Modeling • Generalization Hierarchies • In a disjoint hierarchy, an entity instance can be in only one subtype. Example: Entity EMPLOYEE, may have two subtypes, CLASSIFIED and WAGES. An employee may be one type or the other but not both
  40. 40. E-R Modeling • Generalization Hierarchies - Nested [Figure: nested hierarchy over PERSON, STUDENT, FACULTY, UNDERGRAD, and GRADUATE]
  41. 41. E-R Modeling • Attribute • Attributes describe a property or a characteristic of an entity • A particular instance of an attribute is a value. For example “John Doe” is one value of the attribute Name. • Simple attribute: Contains only atomic values • Composite attribute: Has component attributes [Figure: entity Student with simple attribute DOB and composite attribute Name (FName, MI, LName)]
  42. 42. E-R Modeling • Attribute Classification • Single-valued attribute • Has exactly one value per instance of an entity • Multi-valued attribute • Contains repeating values per instance of an entity [Figure: entity Student with single-valued attribute Id and multi-valued attribute Module (Math, Physics)]
  43. 43. E-R Modeling • Identifiers and Descriptors • Attributes can be classified as identifiers or descriptors • Identifiers, more commonly called keys, uniquely identify an instance of an entity. • A descriptor describes a non-unique characteristic of an entity instance. An Example : Entity: Employee Unique Identifier: Employee No. Descriptor: Name, DOJ, DOB
  44. 44. E-R Modeling • Relationship • Represents an association between two or more entities Examples - Employees work for Departments - Departments manage one or more projects - Employees are assigned to projects - Projects have sub-tasks - Orders have line items • Defined in terms of: - Degree - Connectivity - Cardinality - Direction - Type - Existence
  45. 45. E-R Modeling • Degree • Number of entities associated with the relationship • Binary relationships, the association between two entities, are the most common type in the real world • N-ary is the general form for degree n • Connectivity • Mapping of associated entity instances in the relationship • The values of connectivity are "one" or "many" • Cardinality • Actual number of related occurrences for each of the two entities • The basic types of connectivity for relations are: one-to-one, one-to-many, and many-to-many
  46. 46. E-R Modeling • Connectivity and Cardinality • A one-to-one (1:1) relationship is when at most one instance of entity A is associated with one instance of entity B. For example: employees in the company are each assigned their own office. For each employee there exists a unique office and for each office there exists a unique employee. • A one-to-many (1:N) relationship is when for one instance of entity A there are zero, one, or many instances of entity B, but for one instance of entity B there is only one instance of entity A. An example: a department has many employees; each employee is assigned to one department.
  47. 47. E-R Modeling • Connectivity and Cardinality • A many-to-many (M:N) relationship is when for one instance of entity A there are zero, one, or many instances of entity B, and for one instance of entity B there are zero, one, or many instances of entity A. An example: an employee can be assigned to no more than two projects at the same time; each project must have at least three employees assigned.
  48. 48. E-R Modeling • Direction • Indicates the originating entity of a binary relationship. The entity from which a relationship originates is the parent entity; the entity where the relationship terminates is the child entity. • The direction of a relationship is determined by its connectivity. • Type • Identifying and Non-identifying • An identifying relationship is one in which one of the child entities is also a dependent entity. • A non-identifying relationship is one in which both entities are independent.
  49. 49. E-R Modeling • Existence • Denotes whether the existence of an entity instance is dependent upon the existence of another, related, entity instance. • Defined as either mandatory or optional. • Mandatory and optional relationship • If an instance of an entity must always occur for an entity to be included in a relationship, then it is mandatory. • If the instance of the entity is not required, it is optional. Example: Mandatory: Every project must be managed by a single department. Optional: Employees may be assigned to work on projects.
  50. 50. E-R Modeling • E-R Notation • No standard notation • Original notation by Chen • Common notations are: Bachman, crow's foot, and IDEF1X • All styles represent entities as rectangular boxes and relationships as lines connecting boxes • Each style uses a special set of symbols to represent the cardinality of a connection
  51. 51. E-R Modeling • Entities • Represented by labeled rectangles • The label is the name of the entity • Entity names should be singular nouns • Relationships • Represented by a solid line connecting two entities • Name written above the line • Relationship names should be verbs [Figure: Employee "Works for" Department]
  52. 52. E-R Modeling • Attributes • Listed inside the entity rectangle • Underlined • Names should be singular nouns • Cardinality • Many is represented by a line ending in a crow's foot. If omitted, cardinality is one • Existence • Represented by placing a circle or a perpendicular bar on the line • Mandatory existence is shown by the bar next to the entity for an instance that is required • Optional existence is shown by placing a circle next to the entity that is optional [Figure: entity Employee with attributes EmpID and EmpName]
  53. 53. E-R Modeling : Assignment How to create an E-R Model from Requirements ? Step 1: Identify Entities • Entities are things people talk about, record information about and do work on – by definition • Any keyword (noun) is a candidate • Identify generic object from reference to instances or occurrences • Combine synonyms to represent a single entity An Example : Purchase Order - System Requirements A buyer creates a purchase order (PO) as and when the need arises. A PO is for a Specific vendor. A PO has one or more line items. A buyer cannot create a PO of Total value more than his approval limit. A PO can be sent to the vendor by mail, fax, EDI. A PO can be canceled before it is submitted. A PO can be linked to a sales order…
  54. 54. E-R Modeling Step 1: Identify Entities • Candidate entities: Purchase Order (PO), Buyer?, Vendor, Line Items, Sales Order, Approval Limit? • Buyer characterizes a PO • Approval Limit characterizes a Buyer • What does it tell us? • Approval Limit is not an entity • Buyer is an entity • Approval Limit is an attribute of the entity Buyer
  55. 55. E-R Modeling Step 2: Identify Relationships • Look for phrases describing a link between two things or objects • Verbs relating two nouns often suggest relationships, e.g. "A buyer creates a purchase order", "A purchase order has one or more lines" • Requirements may or may not contain information regarding the degree, existence, or cardinality of a relationship up front • Further questioning may be needed to determine the above
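  56. 56. E-R Modeling Step 2: Identify Relationships • Grid Technique [Figure: relationship grid with PO, Buyer, Vendor, and Line as both row and column headings; recoverable cell labels: "replaced by", "creates a", "is approver of", "supplies against a", "belongs to a", "created for", "item supplied by"; "-" marks pairs with no relationship]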
  57. 57. E-R Modeling • Step 2: Identify Relationships • Analyzing Existing Systems (Files, Databases) • Look for • Foreign Keys • Pointers • Repeating Groups • Structured Codes • All of the above imply relationships
  58. 58. E-R Modeling • Step 3: Identify Attributes • An attribute is any detail that serves to identify, classify, quantify or express the state of an entity • Ask the following question for each entity: “What information do you need to know or hold about …?” • Potential attributes are easily found by examining paper forms
  59. 59. E-R Modeling • Step 3: Identify Attributes Example: Purchase Order Form
       Purchase Order No ______  Buyer ______  Vendor ______  Date Created ______
       No   Item         Quantity  Value
       ___  ___________  ______    __________
       Shipping Address: Street ______  City ______  Zip ______
       Total Value ______
     Candidate attributes: • Purchase Order No • Vendor • Buyer • Date Created • Item? • Address • City • State • Zip • Total Value?
  60. 60. E-R Modeling E-R Model of the Purchase Order Example [Figure: E-R diagram connecting BUYER, PURCHASE ORDER, VENDOR, LINE, and ITEM; relationship labels: "creates", "created by", "created for a", "supplies against", "has", "belongs to", "created for", "exists on"]
  61. 61. E-R Modeling • Major Modeling Techniques • Peter Chen's original entity/relationship diagrams • Information Engineering • Richard Barker's notation, used by Oracle Corporation • IDEF1X • Object Role Modeling • Unified Modeling Language (UML) • Extensible Markup Language (XML)
  62. 62. E-R Modeling • Major Modeling Techniques • Data modeling has two audiences: • User community: Uses the models to verify that the analysts understand their environment and their requirements. • Systems designers: Use the business rules implied by the models as the basis for their design of computer systems. • Different techniques are better for one audience or the other. • All techniques are fundamentally the same • Differences are mainly syntactic or notational
  63. 63. Relational Model • Objective: • To give an informal introduction to relational concepts, especially as they relate to relational database design issues. • What it is not: • This does not give a complete description of relational theory.
  64. 64. Relational Model • Formally introduced by Dr. E. F. Codd in 1970 • Represents data in the form of two-dimensional tables • A relational database is a collection of two-dimensional tables • A basic understanding of the model is needed to design and use relational databases
  65. 65. Relational Model • Tables, Columns and Rows • Relationships and Keys • Data Integrity • Normalization • What is a table? • Represents some real-world person, place, thing, or event • Two-dimensional • Columns • Rows
       Course No.  Course_Title     C_Hrs.  Dept.
       CIS 120     Intro to CIS     4       CIS
       MKT 333     Intro to Mkting  3       MKT
       ECO 473     Labor Econ.      3       ECO
       BA201       Intro to Stat.   5       ECO
       CIS 345     Intro to Dbase   4       CIS
  66. 66. Relational Model • Table • Columns represent a property of the person, place, thing or event that the table represents • Rows represent an occurrence or instance of what the table represents • A data value is stored in the intersection of a row and column • Each named column has a domain, which is the set of values that may appear in that column
       Empid   Employee Name  Level  DOJ      Manager
       101412  John           M3     4/10/98  101667
       102235  Nancy          M4     1/23/01  101412
       101398  Mike           S1     8/15/95  101667
       101667  Jeff           M2     6/2/96   100351
       103893  Cindy          M3     7/17/95  101284
       101116  Rahul          S2     2/20/00  101412
       102739  Scott          C1     4/13/01  101667
  67. 67. Relational Model Table - Terminology
       In this document  Formal Terms  Many Database Manuals
       Table             Relation      Table
       Column            Attribute     Field
       Row               Tuple         Record
  68. 68. Relational Model • Salient features of a relational table • Values are atomic (1NF) • Column values are of the same kind (Domain) • Each Row is unique (Primary Key) • Sequence of columns is insignificant • Sequence of rows is insignificant • Each column must have a unique name • Relationships and Keys • Keys - Fundamental to the concept of relational databases • Relationship - An association between two or more tables defined by means of keys
  69. 69. Relational Model • Primary Key • Column or a set of columns that uniquely identifies a row in a table • Must be unique and must have a value • Foreign Key • Column or set of columns which references the primary key or a unique key of another table • Rows in two tables are linked by matching the values of the foreign key in one table with the values of the primary key in another • Examples: • EMP_ID in table EMPLOYEE is the primary key • DEPT_NO in table DEPARTMENT is the primary key • DEPT_NO in table EMPLOYEE is a foreign key
  70. 70. Relational Model • Data Integrity • Ensures correct and consistent navigation and manipulation of relational tables • Two types of integrity rules • Entity integrity • Referential integrity • The entity integrity rule states that the value of the primary key can never be a null value • The referential integrity rule states that if a relational table has a foreign key, then every value of the foreign key must either be null or match the values in the relational table in which that foreign key is a primary key
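
The key and integrity rules above can be tried out with a small sketch using Python's built-in `sqlite3` module. The table and column names follow the EMPLOYEE/DEPARTMENT example; the column types and sample rows are assumptions made for illustration, and SQLite only enforces foreign keys once `PRAGMA foreign_keys` is switched on.

```python
import sqlite3

# In-memory database; schema details (types, sample values) are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys unenforced by default
conn.execute("CREATE TABLE department (dept_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept_no INTEGER REFERENCES department(dept_no))"  # foreign key, NULL allowed
)
conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'John', 10)")     # matches a department row
conn.execute("INSERT INTO employee VALUES (2, 'Nancy', NULL)")  # a NULL foreign key is legal


def violates_integrity(sql):
    """Return True if the database rejects the statement with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Referential integrity: department 99 does not exist.
dangling_fk = violates_integrity("INSERT INTO employee VALUES (3, 'Mike', 99)")
# Primary key uniqueness: emp_id 1 is already taken.
duplicate_pk = violates_integrity("INSERT INTO employee VALUES (1, 'Jeff', 10)")
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

Both rejected inserts leave the table unchanged, so only the two valid employee rows remain.
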
  71. 71. Relational Model • Data Manipulation • Relational tables are equivalent to sets • Operations that can be performed on sets can be performed on relational tables • Relational operations such as: • Selection • Projection • Join • Union • Intersection • Difference • Product • Division [Figure: set diagrams for INTERSECTION, UNION, and DIFFERENCE]
  72. 72. Relational Model • Selection • The select operator, sometimes called restrict to prevent confusion with the SQL SELECT command, retrieves subsets of rows from a relational table based on the value(s) in a column or columns
       A  B  C     D  E
       1  A  212   Y  2
       2  C  45    N  84
       3  B  8656  N  4
       4  D  324   N  56
       5  C  5656  Y  34
       6  A  445   N  4
       7  B  546   Y  55
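
As a rough illustration of selection, the sketch below applies a restrict operation to the seven-row table from this slide. The function name `select` and the chosen predicate (column D equals "Y") are assumptions for the example, not part of the original material.

```python
COLUMNS = ("A", "B", "C", "D", "E")
ROWS = [
    (1, "A", 212, "Y", 2),
    (2, "C", 45, "N", 84),
    (3, "B", 8656, "N", 4),
    (4, "D", 324, "N", 56),
    (5, "C", 5656, "Y", 34),
    (6, "A", 445, "N", 4),
    (7, "B", 546, "Y", 55),
]
# Represent each row as a dict keyed by column name.
table = [dict(zip(COLUMNS, row)) for row in ROWS]


def select(rows, predicate):
    """Restrict: keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


# Rows whose column D holds "Y"; every column of a matching row is kept.
d_is_yes = select(table, lambda row: row["D"] == "Y")
```

Selection changes the number of rows but never the set of columns.
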
  73. 73. Relational Model • Projection • The project operator retrieves subsets of columns from a relational table, removing duplicate rows from the result
       A  B  C     D  E
       1  A  212   Y  2
       2  C  45    N  84
       3  B  8656  N  4
       4  D  324   N  56
       5  C  5656  Y  34
       6  A  445   N  4
       7  B  546   Y  55
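
A minimal sketch of projection over the same seven-row table follows. The helper name `project` is an assumption; the point it shows is that duplicate rows disappear once the distinguishing columns are dropped.

```python
COLUMNS = ("A", "B", "C", "D", "E")
ROWS = [
    (1, "A", 212, "Y", 2),
    (2, "C", 45, "N", 84),
    (3, "B", 8656, "N", 4),
    (4, "D", 324, "N", 56),
    (5, "C", 5656, "Y", 34),
    (6, "A", 445, "N", 4),
    (7, "B", 546, "Y", 55),
]
table = [dict(zip(COLUMNS, row)) for row in ROWS]


def project(rows, columns):
    """Keep only the named columns and drop duplicate result rows (first occurrence wins)."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


only_d = project(table, ("D",))        # seven input rows collapse to two distinct values
b_and_d = project(table, ("B", "D"))   # every (B, D) pair in this table is distinct
```
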
  74. 74. Relational Model • Product • The product of two relational tables, also called the Cartesian Product, is the concatenation of every row in one table with every row in the second. • The product of table A (having m rows) and table B (having n rows) is the table C (having m x n rows). The product is denoted as A X B or A TIMES B
       Table A        Table B
       k  x  y        k  x  y
       1  A  2        1  A  2
       2  B  4        4  D  8
       3  C  6        5  E  10
       A TIMES B
       ak  ax  ay  bk  bx  by
       1   A   2   1   A   2
       1   A   2   4   D   8
       1   A   2   5   E   10
       2   B   4   1   A   2
       2   B   4   4   D   8
       2   B   4   5   E   10
       3   C   6   1   A   2
       3   C   6   4   D   8
       3   C   6   5   E   10
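
The m x n row count can be checked with a short sketch built on `itertools.product` from the Python standard library. The tuples mirror Table A and Table B from this slide; the variable names are assumptions.

```python
from itertools import product

# Rows are (k, x, y) tuples taken from the slide's Table A and Table B.
table_a = [(1, "A", 2), (2, "B", 4), (3, "C", 6)]
table_b = [(1, "A", 2), (4, "D", 8), (5, "E", 10)]

# Concatenate every row of A with every row of B: columns (ak, ax, ay, bk, bx, by).
a_times_b = [row_a + row_b for row_a, row_b in product(table_a, table_b)]
```

With three rows in each input, the result has 3 x 3 = 9 rows of six columns each.
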
  75. 75. Relational Model • Join • Combines the product, selection and projection operations • Combines (concatenates) data from one row of a table with rows from another or the same table • Criteria involve a relationship among the columns in the joined relational tables • If the join criterion is based on equality of column values, the result is called an equi-join • A natural join is an equi-join with redundant columns removed • Joins can also be done on criteria other than equality. Such joins are called non-equi joins
       Table A        Table B
       k  a  b        k  c
       1  A  2        1  aa
       2  B  4        3  bb
       3  C  6        5  cc
       Equi-Join             Natural Join
       k  a  b  k  c         k  a  b  c
       1  A  2  1  aa        1  A  2  aa
       3  C  6  3  bb        3  C  6  bb
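
The difference between an equi-join and a natural join can be sketched as below, using Table A (k, a, b) and Table B (k, c) from this slide. The function names and the "A."/"B." column prefixes are assumptions made so that both copies of the join column can coexist in one row.

```python
table_a = [{"k": 1, "a": "A", "b": 2}, {"k": 2, "a": "B", "b": 4}, {"k": 3, "a": "C", "b": 6}]
table_b = [{"k": 1, "c": "aa"}, {"k": 3, "c": "bb"}, {"k": 5, "c": "cc"}]


def equi_join(left, right, column):
    """Product plus selection on equality; both copies of the join column are kept."""
    return [
        {**{f"A.{name}": value for name, value in left_row.items()},
         **{f"B.{name}": value for name, value in right_row.items()}}
        for left_row in left
        for right_row in right
        if left_row[column] == right_row[column]
    ]


def natural_join(left, right, column):
    """Equi-join with the redundant copy of the join column removed."""
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in right
        if left_row[column] == right_row[column]
    ]


equi = equi_join(table_a, table_b, "k")
natural = natural_join(table_a, table_b, "k")
```

Only k = 1 and k = 3 appear in both tables, so each join yields two rows; the equi-join rows carry five columns and the natural-join rows four.
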
  76. 76. Relational Model • Union • The UNION operation of two tables is formed by appending rows from one table to those of a second to produce a third. Duplicate rows are eliminated • Tables in a UNION operation must have the same number of columns, and corresponding columns must come from the same domain
       Table A        Table B        A UNION B
       k  x  y        k  x  y        k  x  y
       1  A  2        1  A  2        1  A  2
       2  B  4        4  D  8        2  B  4
       3  C  6        5  E  10       3  C  6
                                     4  D  8
                                     5  E  10
  78. 78. Relational Model • Intersection • The intersection of two relational tables is a third table that contains the rows common to both. Both tables must be union compatible. The notation for the intersection of A and B is A [intersection] B = C or A INTERSECT B
       Table A        Table B        A INTERSECT B
       k  x  y        k  x  y        k  x  y
       1  A  2        1  A  2        1  A  2
       2  B  4        4  D  8
       3  C  6        5  E  10
  79. 79. Relational Model • Difference • The difference of two relational tables is a third that contains those rows that occur in the first table but not in the second. The Difference operation requires that the tables be union compatible. The notation for difference is A MINUS B or A - B. As with arithmetic, the order of subtraction matters. That is, A - B is not the same as B - A.
       Table A        Table B        A MINUS B      B MINUS A
       k  x  y        k  x  y        k  x  y        k  x  y
       1  A  2        1  A  2        2  B  4        4  D  8
       2  B  4        4  D  8        3  C  6        5  E  10
       3  C  6        5  E  10
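
Because relational tables behave like sets of rows, union, intersection and difference map directly onto Python's built-in set operators. The sketch below uses Table A and Table B from these slides; the `union_compatible` helper is a rough, assumed stand-in for a real domain check (it only compares column counts and Python value types).

```python
# Rows are (k, x, y) tuples; storing them in sets removes duplicates automatically.
table_a = {(1, "A", 2), (2, "B", 4), (3, "C", 6)}
table_b = {(1, "A", 2), (4, "D", 8), (5, "E", 10)}


def union_compatible(left, right):
    """Rough check: both tables share a single row shape (column count and value types)."""
    def shapes(rows):
        return {tuple(type(value) for value in row) for row in rows}
    return shapes(left) == shapes(right) and len(shapes(left)) <= 1


assert union_compatible(table_a, table_b)

a_union_b = sorted(table_a | table_b)      # duplicates such as (1, "A", 2) appear once
a_intersect_b = sorted(table_a & table_b)  # rows present in both tables
a_minus_b = sorted(table_a - table_b)      # rows in A that are not in B
b_minus_a = sorted(table_b - table_a)      # order matters: differs from A MINUS B
```
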
  80. 80. Relational Model • Division • The division operator returns the column values in one table that are paired with every row of another table. In the example, A DIV B yields the k values that occur in Table A together with every (x, y) row of Table B.
       Table A        Table B        A DIV B
       k  x  y        x  y           k
       1  A  2        A  2           1
       1  B  4        B  4           3
       2  A  2
       3  B  4
       4  B  4
       3  A  2
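
Division is the least intuitive of the operators, so a small sketch may help. It uses the Table A and Table B values from this slide; the function name `divide` and the fixed three-column layout are assumptions for the example.

```python
# Table A rows are (k, x, y); Table B rows are (x, y).
table_a = {(1, "A", 2), (1, "B", 4), (2, "A", 2), (3, "B", 4), (4, "B", 4), (3, "A", 2)}
table_b = {("A", 2), ("B", 4)}


def divide(dividend, divisor):
    """Return the k values that appear in the dividend with every (x, y) row of the divisor."""
    candidates = {k for k, _x, _y in dividend}
    return {
        k for k in candidates
        if all((k, x, y) in dividend for x, y in divisor)
    }


a_div_b = divide(table_a, table_b)
```

Here k = 2 is missing the (B, 4) pairing and k = 4 is missing (A, 2), so only 1 and 3 survive.
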
  81. 81. Normalization Normalization theory is based on the concept of normal forms. A relational table is said to be in a particular normal form if it satisfies a certain set of constraints. We shall discuss four normal forms in this Module. What is Functional Dependency? The concept of functional dependency is the basis for the first three normal forms. A column Y of a relational table is said to be functionally dependent upon column X when values of column Y are uniquely identified by values of column X. Full functional dependence applies to tables with composite keys. Column Y in relational table R is fully functionally dependent on X of R, where X is a composite key, if it is functionally dependent on X and not functionally dependent upon any proper subset of X.
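
The two definitions above can be checked mechanically against sample data. The sketch below borrows the supplier/parts rows used later in this module; the function names are assumptions. A data sample can only show that a dependency is not contradicted by the rows at hand; whether it truly holds is a matter of business rules.

```python
from itertools import combinations

COLUMNS = ("s#", "city", "status", "p#", "qty")
FIRST = [dict(zip(COLUMNS, row)) for row in [
    ("s1", "London", 20, "p1", 300), ("s1", "London", 20, "p2", 100),
    ("s1", "London", 20, "p3", 200), ("s1", "London", 20, "p4", 100),
    ("s2", "Paris", 10, "p1", 250), ("s2", "Paris", 10, "p3", 100),
    ("s3", "Tokyo", 30, "p2", 300), ("s3", "Tokyo", 30, "p4", 200),
]]


def determines(rows, x_cols, y_col):
    """True if equal values in x_cols always come with the same value in y_col."""
    seen = {}
    for row in rows:
        x_value = tuple(row[c] for c in x_cols)
        if seen.setdefault(x_value, row[y_col]) != row[y_col]:
            return False
    return True


def fully_dependent(rows, key_cols, y_col):
    """True if y_col depends on the whole composite key and on no proper subset of it."""
    if not determines(rows, key_cols, y_col):
        return False
    proper_subsets = (
        subset
        for size in range(1, len(key_cols))
        for subset in combinations(key_cols, size)
    )
    return not any(determines(rows, subset, y_col) for subset in proper_subsets)


key = ("s#", "p#")
qty_is_full = fully_dependent(FIRST, key, "qty")    # needs both s# and p#
city_is_full = fully_dependent(FIRST, key, "city")  # s# alone already determines city
```
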
  82. 82. Normalization Unnormalized Relation -> (remove repeating groups) -> Normalized Relation (1NF) -> (remove partial dependencies) -> 2NF -> (remove transitive dependencies) -> 3NF -> (remove remaining anomalies resulting from FDs) -> Boyce/Codd NF -> (remove multivalued dependencies)
  83. 83. Normalization An Example: A company obtains parts from a number of suppliers. Each supplier is located in one city. A city can have more than one supplier located there and each city has a status code associated with it. Each supplier may provide many parts. The company creates a simple relational table to store this information: FIRST (s#, status, city, p#, qty) • s#: Supplier identification number • status: Status code assigned to city • city: City where supplier is located • p#: Part number of part supplied • qty: Quantity of parts supplied to date • The composite primary key is (s#, p#)
  84. 84. Normalization • FIRST NORMAL FORM – 1NF A relational table is said to be in first normal form if all values of the columns are atomic. That is, they contain no repeating values.
       FIRST
       s#  city    status  p#  qty
       s1  London  20      p1  300
       s1  London  20      p2  100
       s1  London  20      p3  200
       s1  London  20      p4  100
       s2  Paris   10      p1  250
       s2  Paris   10      p3  100
       s3  Tokyo   30      p2  300
       s3  Tokyo   30      p4  200
  85. 85. Normalization • SECOND NORMAL FORM – 2NF • Table FIRST contains redundant data. Redundancy causes update anomalies. • Update anomalies: problems that arise when information is inserted, deleted, or updated. • INSERT. The fact that a certain supplier (s5) is located in a particular city (Athens) cannot be added until that supplier supplies a part. • DELETE. If a row is deleted, then not only is the information about quantity and part lost but also information about the supplier. • UPDATE. If supplier s1 moved from London to New York, then every row for s1 (four rows in the table shown) would have to be updated with this new information.
  86. 86. Normalization A relational table is in second normal form (2NF) if it is in 1NF and every non-key column is fully dependent upon the primary key. That is, every non-key column must be dependent upon the entire primary key. FIRST is in 1NF but not in 2NF because status and city are functionally dependent only on the column s# of the composite key (s#, p#). The steps for transforming a 1NF table to 2NF are: 1. Identify any determinants other than the composite key, and the columns they determine. 2. Create and name a new table for each determinant and the unique columns it determines. 3. Move the determined columns from the original table to the new table. The determinant becomes the primary key of the new table. 4. Delete the columns you just moved from the original table, except for the determinant, which will serve as a foreign key.
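
The four steps can be sketched on the FIRST table from the earlier slide. The determinant is s#, which determines city and status. The result names SECOND and PARTS follow the tables shown later in this module, and rejoining the two on s# is included as a sanity check that no information was lost.

```python
# FIRST rows as (s#, city, status, p#, qty); composite key is (s#, p#).
FIRST = [
    ("s1", "London", 20, "p1", 300), ("s1", "London", 20, "p2", 100),
    ("s1", "London", 20, "p3", 200), ("s1", "London", 20, "p4", 100),
    ("s2", "Paris", 10, "p1", 250), ("s2", "Paris", 10, "p3", 100),
    ("s3", "Tokyo", 30, "p2", 300), ("s3", "Tokyo", 30, "p4", 200),
]

# Steps 1-3: s# determines city and status, so those columns move to a new table keyed by s#.
second = sorted({(s, city, status) for s, city, status, _p, _qty in FIRST})

# Step 4: the original table keeps s# (now a foreign key) plus the fully dependent column qty.
parts = [(s, p, qty) for s, _city, _status, p, qty in FIRST]

# Sanity check: joining the two tables on s# reproduces FIRST exactly.
rejoined = sorted(
    (s, city, status, p, qty)
    for s, city, status in second
    for s2, p, qty in parts
    if s2 == s
)
```

Each supplier's city and status are now stored once instead of once per part supplied.
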
  87. 87. Normalization • SECOND NORMAL FORM – 2NF • Modification Anomalies • Tables in 2NF but not in 3NF still contain modification anomalies: • INSERT. The fact that a particular city has a certain status (Rome has a status of 50) cannot be inserted until there is a supplier in the city. • DELETE. Deleting any row in SUPPLIER destroys the status information about the city as well as the association between supplier and city.
  88. 88. Normalization
SECOND NORMAL FORM – 2NF
PARTS
s#   p#   qty
s1   p1   300
s1   p2   100
s1   p3   200
s1   p4   100
s2   p1   250
s2   p3   100
s3   p2   300
s3   p4   200
SECOND
s#   city    status
s1   London  20
s2   Paris   10
s3   Tokyo   30
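The decomposition of FIRST into SECOND and PARTS can be sketched with Python's built-in sqlite3 module. This is only an illustration, not part of the original slides: the table names first_1nf and second_2nf and the column names s_no and p_no are assumptions made to get valid SQL identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 1NF table FIRST from the slides; s# and p# are renamed s_no and p_no
# because '#' is not a plain SQL identifier character (naming assumption).
conn.execute(
    "CREATE TABLE first_1nf ("
    " s_no TEXT, city TEXT, status INTEGER, p_no TEXT, qty INTEGER,"
    " PRIMARY KEY (s_no, p_no))"
)
first_rows = [
    ("s1", "London", 20, "p1", 300),
    ("s1", "London", 20, "p2", 100),
    ("s1", "London", 20, "p3", 200),
    ("s1", "London", 20, "p4", 100),
    ("s2", "Paris", 10, "p1", 250),
    ("s2", "Paris", 10, "p3", 100),
    ("s3", "Tokyo", 30, "p2", 300),
    ("s3", "Tokyo", 30, "p4", 200),
]
conn.executemany("INSERT INTO first_1nf VALUES (?, ?, ?, ?, ?)", first_rows)

# Steps 1-3: s_no alone determines city and status, so those columns
# move to a new table in which s_no identifies each row.
conn.execute(
    "CREATE TABLE second_2nf AS "
    "SELECT DISTINCT s_no, city, status FROM first_1nf"
)
# Step 4: the remaining table keeps s_no (as a foreign key), p_no and qty.
conn.execute("CREATE TABLE parts AS SELECT s_no, p_no, qty FROM first_1nf")

second_rows = conn.execute(
    "SELECT s_no, city, status FROM second_2nf ORDER BY s_no"
).fetchall()

# Joining the two tables on s_no gives back the original rows.
rejoined = conn.execute(
    "SELECT s.s_no, s.city, s.status, p.p_no, p.qty "
    "FROM second_2nf s JOIN parts p ON p.s_no = s.s_no "
    "ORDER BY s.s_no, p.p_no"
).fetchall()
```

Each supplier's city and status are now stored once, and the join shows that no information was lost by splitting the table.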
  89. 89. Normalization
• THIRD NORMAL FORM – 3NF
A relational table is in third normal form (3NF) if it is already in 2NF and every non-key column is non-transitively dependent on the primary key. In other words, all non-key attributes are functionally dependent only on the primary key.
SUPPLIER
s#   city    status
s1   London  20
s2   Paris   10
s3   Tokyo   30
s4   Paris   10
The table SUPPLIER is in 2NF but not in 3NF because it contains a transitive dependency:
SUPPLIER.s# -> SUPPLIER.city
SUPPLIER.city -> SUPPLIER.status
and therefore SUPPLIER.s# -> SUPPLIER.status
  90. 90. Normalization
• Steps for transforming a table into 3NF:
1. Identify any determinants, other than the primary key, and the columns they determine.
2. Create and name a new table for each determinant and the unique columns it determines.
3. Move the determined columns from the original table to the new table. The determinant becomes the primary key of the new table.
The transformation of SUPPLIER into 3NF:
SUPPLIER
s#   city
s1   London
s2   Paris
s3   Tokyo
s4   Paris
s5   London
CITY_STATUS
city    status
London  20
Paris   10
Tokyo   30
Rome    50
  91. 91. Normalization
• Advantages of Third Normal Form:
• Eliminates redundant data, which in turn saves space and reduces manipulation anomalies. Example:
  • INSERT: Facts about the status of a city (Rome has a status of 50) can be added even though there is no supplier in that city.
  • DELETE: Information about a supplier can be deleted without destroying information about a city.
  • UPDATE: Changing the location of a supplier or the status of a city requires modifying only one row.
The transformation of SUPPLIER into 3NF:
SUPPLIER
s#   city
s1   London
s2   Paris
s3   Tokyo
s4   Paris
s5   London
CITY_STATUS
city    status
London  20
Paris   10
Tokyo   30
Rome    50
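A minimal sketch of these three advantages, using Python's built-in sqlite3 module. The column name s_no (for s#) and the new status value 15 are assumptions made only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city_status (
    city   TEXT PRIMARY KEY,
    status INTEGER NOT NULL
);
CREATE TABLE supplier (
    s_no TEXT PRIMARY KEY,
    city TEXT NOT NULL REFERENCES city_status(city)
);
""")
conn.executemany(
    "INSERT INTO city_status VALUES (?, ?)",
    [("London", 20), ("Paris", 10), ("Tokyo", 30)],
)
conn.executemany(
    "INSERT INTO supplier VALUES (?, ?)",
    [("s1", "London"), ("s2", "Paris"), ("s3", "Tokyo"),
     ("s4", "Paris"), ("s5", "London")],
)

# INSERT: a city's status can be recorded before any supplier is there.
conn.execute("INSERT INTO city_status VALUES ('Rome', 50)")

# UPDATE: changing a city's status touches exactly one row
# (15 is a made-up value for the example).
cursor = conn.execute("UPDATE city_status SET status = 15 WHERE city = 'Paris'")
rows_changed = cursor.rowcount

# DELETE: removing the only Tokyo supplier keeps Tokyo's status.
conn.execute("DELETE FROM supplier WHERE s_no = 's3'")
tokyo_status = conn.execute(
    "SELECT status FROM city_status WHERE city = 'Tokyo'"
).fetchone()[0]

# Both Paris suppliers see the new status through the join.
paris_suppliers = conn.execute(
    "SELECT s.s_no, c.status FROM supplier s "
    "JOIN city_status c ON c.city = s.city "
    "WHERE s.city = 'Paris' ORDER BY s.s_no"
).fetchall()
```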
  92. 92. Normalization
• Advanced Forms :: BOYCE CODD NORMAL FORM
Many practitioners argue that placing entities in 3NF is generally sufficient, because it is rare for entities that are in 3NF not to be in 4NF and 5NF as well.
The advanced forms of normalization are:
• Boyce-Codd Normal Form
• Fourth Normal Form
• Fifth Normal Form
Boyce-Codd normal form (BCNF) is a more rigorous version of 3NF. BCNF is based on the concept of determinants. A determinant is a column (or set of columns) on which some other column is fully functionally dependent. A relational table is in BCNF if and only if every determinant is a candidate key.
  93. 93. Database Design
• This section presents and discusses:
  • How to translate the E-R (conceptual) model (diagram) to an RDBMS (logical) schema
  • An exercise on E-R modeling and database design
• Some Guidelines
  • Entities: Create one table for each simple (not a sub-type or super-type) entity.
  • Attributes: Map each attribute to a candidate column with a more precise format.
    • Optional attributes become null columns.
    • Mandatory attributes become not null columns.
  • Unique Identifier: Convert the components of the unique identifier to the primary key of the table.
  94. 94. Database Design
• Sub-types: A sub-type entity is an entity with its own attributes or relationships that also inherits the attributes and relationships of its parent entity (super-type).
• 1:1 relationships: Merge the two entities into a single table, keeping all attributes. Identify (add if needed) the primary key.
• 1:Many relationships: Create two tables, one for each entity. Add the primary key from the 1 side to the attributes on the Many side; the posted attributes are a foreign key.
• M:N (Many:Many) relationships: Create a new (bridge) table and post the primary keys from both entities as attributes in the new table. The posted attributes are foreign keys.
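The 1:Many and M:N rules above can be sketched as table definitions, here run through Python's built-in sqlite3 module. The entities are assumptions chosen for illustration: Department and Module for the 1:Many case, and a hypothetical Student entity with an enrollment bridge table for the M:N case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
-- 1 side of the 1:Many relationship
CREATE TABLE department (
    dept_code TEXT PRIMARY KEY,
    dept_name TEXT NOT NULL,          -- mandatory attribute: NOT NULL
    office_no TEXT                    -- optional attribute: nullable
);
-- Many side: the department key is posted here as a foreign key
CREATE TABLE module (
    course_no    TEXT PRIMARY KEY,
    course_title TEXT NOT NULL,
    dept_code    TEXT NOT NULL REFERENCES department(dept_code)
);
-- Hypothetical second entity for the M:N example
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
-- Bridge table: both primary keys are posted as foreign keys
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_no  TEXT    NOT NULL REFERENCES module(course_no),
    PRIMARY KEY (student_id, course_no)
);
""")

conn.execute("INSERT INTO department VALUES ('CIS', 'Comp. Info. Sys.', '302')")
conn.execute("INSERT INTO module VALUES ('CIS 120', 'Intro to CIS', 'CIS')")

# A module that points at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO module VALUES ('XYZ 100', 'Unknown', 'XYZ')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

The foreign key on the Many side is what lets the database refuse rows that reference a missing parent.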
  95. 95. Database Design
A few comments:
• There are more rules that treat exceptions, but these are good enough in most cases.
• There may be reasons to violate the rules.
• Always use common sense and expect iterative development.
• Use CASE tools like ERwin wherever possible. Tools can automatically generate SQL table definitions from drawn E-R diagrams.
  96. 96. Database Design:: Assignment Develop an E-R model and database schema for a system to handle purchase orders.
  97. 97. Creating a DB environment : Summary
• The first step in designing a database application is to understand what information the database needs to store and what integrity constraints or business rules apply to the data.
• A data model is to a database what a building plan or blueprint is to a building. It is the conceptual model of the database.
• Given a relational schema, we need to decide whether it is a good design or whether it should be decomposed into smaller relations. Normalization gives guidance for such decomposition.
  98. 98. 4.0 Structured Query Language Learning Objectives: At the end of this Topic you will be able to – • Write simple SQL queries • Get familiar with the various relational operations such as SELECT, PROJECT and JOIN
  99. 99. An Introduction
• Structured Query Language (SQL) is the most widely used commercial relational database language. SQL has several parts:
  • DML – the Data Manipulation Language
  • DDL – the Data Definition Language
  • Embedded and dynamic SQL
  • Security
  • Transaction management
  • Client-server execution and remote database access
• Basic form of a query: SELECT column-list FROM table-names WHERE condition(s)
  100. 100. Query Processing
• Query in a high-level language (typically a 4GL)
• Parsing: The parser converts a query, submitted by a database user and written in a high-level language, into an expression of algebraic operators.
• Optimization: This is the key component in query processing design. It receives the expression and builds a good execution plan. The plan determines the order of execution of the operators and selects suitable algorithms to implement them.
• Code generation for the query: The code for the plan is built with the aim of retrieving the result of the query with high performance.
• Code execution by the database processor: The query plan is executed by the execution engine, which delivers the result to the user.
• Result of the query
  101. 101. Query Processing
SELECT column-list FROM table-names WHERE condition(s)
Conditional Selection (example result rows):
137  Door latch  22.50
150  Door seal   6.00
  102. 102. Query Processing
• The SQL SELECT statement performs three types of operations:
  1. Projection: SELECT column-list
  2. Join: FROM table-names
  3. Selection: WHERE condition(s)
  103. 103. Performing Projection
SELECT Course_Title, C_Hrs FROM Module
Module
Course_No  Course_Title     C_Hrs  Dept_C
CIS 120    Intro to CIS     4      CIS
MKT 333    Intro to Mkting  3      MKT
ECO 473    Labor Econ.      3      ECO
BA 201     Intro to Stat.   5      ECO
CIS 345    Intro to Dbase   4      CIS
Result Table
Course_Title     C_Hrs
Intro to CIS     4
Intro to Mkting  3
Labor Econ.      3
Intro to Stat.   5
Intro to Dbase   4
  104. 104. Performing a Selection Operation
SELECT * FROM Module WHERE C_Hrs = 4
Module
Course_No  Course_Title     C_Hrs  Dept_C
CIS 120    Intro to CIS     4      CIS
MKT 333    Intro to Mkting  3      MKT
ECO 473    Labor Econ.      3      ECO
BA 201     Intro to Stat.   5      ECO
CIS 345    Intro to Dbase   4      CIS
Result Table
Course_No  Course_Title     C_Hrs  Dept_C
CIS 120    Intro to CIS     4      CIS
CIS 345    Intro to Dbase   4      CIS
  105. 105. Performing both Projection and Selection
SELECT Course_Title, C_Hrs FROM Module WHERE Dept_C = 'CIS'
Module
Course_No  Course_Title     C_Hrs  Dept_C
CIS 120    Intro to CIS     4      CIS
MKT 333    Intro to Mkting  3      MKT
ECO 473    Labor Econ.      3      ECO
BA 201     Intro to Stat.   5      ECO
CIS 345    Intro to Dbase   4      CIS
Result Table
Course_Title     C_Hrs
Intro to CIS     4
Intro to Dbase   4
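The three queries (projection, selection, and both together) can be tried out with Python's built-in sqlite3 module. This sketch assumes the column names shown in the Module table (Course_No, Course_Title, C_Hrs, Dept_C); the ORDER BY clauses are added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Module ("
    " Course_No TEXT PRIMARY KEY, Course_Title TEXT,"
    " C_Hrs INTEGER, Dept_C TEXT)"
)
conn.executemany(
    "INSERT INTO Module VALUES (?, ?, ?, ?)",
    [
        ("CIS 120", "Intro to CIS", 4, "CIS"),
        ("MKT 333", "Intro to Mkting", 3, "MKT"),
        ("ECO 473", "Labor Econ.", 3, "ECO"),
        ("BA 201", "Intro to Stat.", 5, "ECO"),
        ("CIS 345", "Intro to Dbase", 4, "CIS"),
    ],
)

# Projection: keep every row, but only two of the columns.
projection = conn.execute(
    "SELECT Course_Title, C_Hrs FROM Module ORDER BY Course_No"
).fetchall()

# Selection: keep every column, but only the rows that meet the condition.
selection = conn.execute(
    "SELECT * FROM Module WHERE C_Hrs = 4 ORDER BY Course_No"
).fetchall()

# Projection and selection together.
both = conn.execute(
    "SELECT Course_Title, C_Hrs FROM Module "
    "WHERE Dept_C = 'CIS' ORDER BY Course_No"
).fetchall()
```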
  106. 106. Performing both Projection and Selection
• Basic SELECT Statement: WHERE Clause Operators
• =, <, >, <=, >=
• IN (list)
  • WHERE CODE IN ('ABC', 'DEF', 'HIJ') returns only rows with one of those 3 literal values for the code attribute.
• BETWEEN min_val AND max_val
  • WHERE Qty_Ord BETWEEN 5 AND 15 returns rows where Qty_Ord is >= 5 and <= 15.
  • Works on character data using ascending alphabetical order.
• LIKE 'literal with wildcards'
  • % matches any sequence of characters; _ matches a single character.
  • WHERE Name LIKE '_o%son' returns rows where Name has o as the 2nd character and ends with son, for example Torgeson or Johnson.
• NOT
  • WHERE NOT Name = 'Johnson' returns all rows where Name <> 'Johnson'.
• AND and OR
  • Lowest priority in operator order; use parentheses to control order.
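A small sketch of these operators using Python's built-in sqlite3 module. The orders table and its four rows are made up for the illustration; only the column names Code, Qty_Ord and Name come from the examples above. Note that LIKE is case-insensitive for ASCII letters in SQLite by default, which may differ from other database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (Code TEXT, Qty_Ord INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ABC", 5, "Johnson"),
        ("DEF", 15, "Torgeson"),
        ("XYZ", 10, "Smith"),
        ("HIJ", 20, "Jackson"),
    ],
)


def names(where_clause):
    """Return the sorted names of the rows matching a WHERE clause."""
    sql = "SELECT Name FROM orders WHERE " + where_clause + " ORDER BY Name"
    return [row[0] for row in conn.execute(sql)]


in_list = names("Code IN ('ABC', 'DEF', 'HIJ')")
between = names("Qty_Ord BETWEEN 5 AND 15")       # both bounds are included
like = names("Name LIKE '_o%son'")
negated = names("NOT Name = 'Johnson'")

# AND binds more tightly than OR, so parentheses change the result.
without_parens = names("Code = 'XYZ' OR Code = 'ABC' AND Qty_Ord > 12")
with_parens = names("(Code = 'XYZ' OR Code = 'ABC') AND Qty_Ord > 12")
```

Without parentheses the last condition is read as Code = 'XYZ' OR (Code = 'ABC' AND Qty_Ord > 12), which is why the two results differ.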
  107. 107. Joining Tables
• To join tables appropriately, the tables must be related. We apply a WHERE clause that equates the primary key column of the table on the one side of the relationship with the corresponding foreign key column of the table on the many side.
  • This type of join is called an equi-join.
  • Our example joins Modules and Departments, where Dept_Code is the linking "key" column.
• The next slides take you step by step through the process of combining data rows from one table with data rows in another table.
• The first slide introduces the SQL SELECT statement that performs the join and shows the two tables that the join operates on.
  108. 108. Joining Tables
Joining Two Tables - Select and Tables
SELECT * FROM Module C, department D WHERE D.Dept_Code = C.Dept_Code
Module
Course_No  Course_Title     C_Hrs  Dept_Code
CIS 120    Intro to CIS     4      CIS
MKT 333    Intro to Mkting  3      MKT
ECO 473    Labor Econ.      3      ECO
BA 201     Intro to Stat.   5      ECO
CIS 345    Intro to Dbase   4      CIS
Department
Dept_Code  Dept_Name         Office#
MKT        Marketing         244
CIS        Comp. Info. Sys.  302
ECO        Economics         244
Conceptually, SQL compares the first row of the 1st table with every row of the 2nd table, then the second row of the 1st table with every row of the 2nd, and so on. Only row pairs for which the condition is met are placed in the result table.
  109. 109. Joining Tables
Joining Two Tables - Row 1 Module to Row 1 Dept
SELECT * FROM Module C, department D WHERE D.Dept_Code = C.Dept_Code
Module row 1:      CIS 120, Intro to CIS, 4, CIS
Department row 1:  MKT, Marketing, 244
No match, so no row is placed in the results.
RESULT TABLE
Course_No  Course_Title  C_Hrs  Dept_Code  Dept_Name  Office#
(empty)
  110. 110. Joining Tables
Joining Two Tables - Row 1 Module to Row 2 Dept
SELECT * FROM Module C, department D WHERE D.Dept_Code = C.Dept_Code
Module row 1:      CIS 120, Intro to CIS, 4, CIS
Department row 2:  CIS, Comp. Info. Sys., 302
A match on the condition causes a result row to be produced.
RESULT TABLE
Course_No  Course_Title  C_Hrs  Dept_Code  Dept_Name         Office#
CIS 120    Intro to CIS  4      CIS        Comp. Info. Sys.  302
  111. 111. Joining Tables
Joining Two Tables - Row 1 Module to Row 3 Dept
SELECT * FROM Module C, department D WHERE D.Dept_Code = C.Dept_Code
Module row 1:      CIS 120, Intro to CIS, 4, CIS
Department row 3:  ECO, Economics, 244
No match, so the result table is unchanged.
RESULT TABLE
Course_No  Course_Title  C_Hrs  Dept_Code  Dept_Name         Office#
CIS 120    Intro to CIS  4      CIS        Comp. Info. Sys.  302
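The row-by-row comparison walked through above can be written as a nested loop and checked against the SQL equi-join. This is a conceptual sketch in Python with the built-in sqlite3 module; a real DBMS is free to choose a faster join algorithm. The Office column stands in for Office#, which is not a plain SQL identifier.

```python
import sqlite3

modules = [
    ("CIS 120", "Intro to CIS", 4, "CIS"),
    ("MKT 333", "Intro to Mkting", 3, "MKT"),
    ("ECO 473", "Labor Econ.", 3, "ECO"),
    ("BA 201", "Intro to Stat.", 5, "ECO"),
    ("CIS 345", "Intro to Dbase", 4, "CIS"),
]
departments = [
    ("MKT", "Marketing", 244),
    ("CIS", "Comp. Info. Sys.", 302),
    ("ECO", "Economics", 244),
]

# Conceptual join: compare every Module row with every Department row
# and keep the pairs whose Dept_Code values are equal.
nested_loop = []
for module_row in modules:
    for dept_row in departments:
        if module_row[3] == dept_row[0]:
            nested_loop.append(module_row + dept_row)

# The same equi-join expressed in SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Module (Course_No TEXT, Course_Title TEXT,"
    " C_Hrs INTEGER, Dept_Code TEXT)"
)
conn.execute(
    "CREATE TABLE Department (Dept_Code TEXT, Dept_Name TEXT, Office INTEGER)"
)
conn.executemany("INSERT INTO Module VALUES (?, ?, ?, ?)", modules)
conn.executemany("INSERT INTO Department VALUES (?, ?, ?)", departments)
sql_join = conn.execute(
    "SELECT * FROM Module C, Department D WHERE D.Dept_Code = C.Dept_Code"
).fetchall()
```

Because SELECT * returns the columns of both tables, Dept_Code appears twice in each result row.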
  112. 112. 5.0 Internal Management
Learning Objective: After completing this topic you will be able to:
• Describe the various components of the computer system that provide data storage facilities to a DBMS
• Understand how the DBMS communicates with the host system
• Outline some of the database tuning factors
  113. 113. Computer file management and DBMS
• Computer files are stored on external media such as disks and tapes.
  • Direct access
  • Sequential access
• Input/output of data and memory management are handled by the operating system.
  • File manager
  • Disk manager
• DBMS/host intercommunication:
  DBMS -> (file request) -> File Manager -> (logical page request) -> Disk Manager -> (physical page access)
  114. 114. Intercommunication
DBMS/Host communication:
• A file is a collection of pages. A page is a unit of input/output.
• The DBMS sends a file request to the file manager.
• The file manager has no idea where the requested page is physically stored, so it passes a logical page request to the disk manager.
• The disk manager accesses the physical page and returns it to the file manager.
• The file manager provides the database system with the requested page.
• The database system converts the page into a logical form that the user can understand.
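The division of labour between the three layers can be illustrated with a toy model in Python. Everything here is hypothetical (the class names, the 16-byte page size, the comma-separated record format); it is not how any particular DBMS or operating system is implemented.

```python
PAGE_SIZE = 16  # bytes per page (toy value)


class DiskManager:
    """Knows physical locations: turns a page id into a byte offset."""

    def __init__(self, n_pages):
        self.disk = bytearray(PAGE_SIZE * n_pages)

    def write_page(self, page_id, data):
        assert len(data) <= PAGE_SIZE
        start = page_id * PAGE_SIZE
        self.disk[start:start + PAGE_SIZE] = data.ljust(PAGE_SIZE, b"\x00")

    def read_page(self, page_id):
        start = page_id * PAGE_SIZE
        return bytes(self.disk[start:start + PAGE_SIZE])


class FileManager:
    """Sees a file as an ordered list of page ids, not byte offsets."""

    def __init__(self, disk_manager):
        self.disk_manager = disk_manager
        self.files = {}  # file name -> list of page ids

    def read(self, file_name, logical_page_no):
        page_id = self.files[file_name][logical_page_no]
        return self.disk_manager.read_page(page_id)


def dbms_fetch(file_manager, file_name, logical_page_no):
    """DBMS layer: asks for a page and decodes it into a record."""
    raw = file_manager.read(file_name, logical_page_no)
    return raw.rstrip(b"\x00").decode("ascii").split(",")


disk = DiskManager(n_pages=8)
files = FileManager(disk)
files.files["supplier"] = [5, 2]        # logical pages 0 and 1 live on pages 5 and 2
disk.write_page(5, b"s1,London,20")
disk.write_page(2, b"s2,Paris,10")

record = dbms_fetch(files, "supplier", 1)
```

The DBMS asks only for "page 1 of the supplier file"; where that page physically sits is known to the disk manager alone.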
  115. 115. Tuning at the internal level
• Indexes
  • Database indexes are an important means of speeding up access to a set of records, especially in a relational database.
  • An index is very useful in existence tests.
  • Once an index is created, it is transparent to the user.
• Hashing
  • Hashing directly determines a page address for a given record without the overhead of creating indexes.
  • The main problems associated with hashing are overflow and underflow.
• Clusters
  • Intra-file clustering: physically storing related records of one file close together.
  • Inter-file clustering: storing records from different files in the same physical page.
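The effect of an index, and its transparency to the user, can be observed with SQLite's EXPLAIN QUERY PLAN through Python's built-in sqlite3 module. The supplier table contents and the index name idx_supplier_city are assumptions for this sketch, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supplier (s_no TEXT, city TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO supplier VALUES (?, ?, ?)",
    [(f"s{i}", "London" if i % 2 else "Paris", 10) for i in range(1000)],
)

query = "SELECT s_no FROM supplier WHERE city = 'Paris'"


def plan(sql):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the detail text


plan_before = plan(query)                 # a scan of the whole table
result_before = sorted(conn.execute(query).fetchall())

conn.execute("CREATE INDEX idx_supplier_city ON supplier (city)")

plan_after = plan(query)                  # a search that uses the index
result_after = sorted(conn.execute(query).fetchall())
```

The query text is identical before and after the index is created and returns the same rows; only the access path chosen by the DBMS changes.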
  116. 116. Internal Management : Summary
• Database files are stored in logical page sets.
• The underlying physical files that store a database need not map to the logical representation of the DBMS.
• Indexes are a useful means of speeding up data access in large databases, but they incur overheads.
• Hashing speeds up access to individual records, but it has overflow and underflow problems.
• Intra-file and inter-file clustering of the physical records speeds up certain operations at the cost of other types of data manipulation.
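Hashing and the overflow problem can be illustrated with a toy hash file in Python. The hash function, the number of pages, the page capacity and the sample records are all assumptions chosen to keep the example small.

```python
N_PAGES = 4        # number of primary pages (toy value)
PAGE_CAPACITY = 2  # records that fit in one page (toy value)

pages = [[] for _ in range(N_PAGES)]
overflow = []      # records that did not fit in their home page


def page_address(key):
    """Compute the home page directly from the key, with no index lookup."""
    return sum(key.encode("ascii")) % N_PAGES


def insert(key, record):
    page = pages[page_address(key)]
    if len(page) < PAGE_CAPACITY:
        page.append((key, record))
    else:
        overflow.append((key, record))   # home page is full: overflow


def lookup(key):
    for stored_key, record in pages[page_address(key)]:
        if stored_key == key:
            return record
    for stored_key, record in overflow:  # slower path for overflowed records
        if stored_key == key:
            return record
    return None


for number in range(1, 10):
    insert(f"s{number}", f"supplier {number}")

home_page_of_s9 = page_address("s9")
```

Keys s1, s5 and s9 all hash to the same page, so with room for only two records per page the third one ends up in the overflow area and takes longer to find.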
  117. 117. 6.0 Database Trends Learning Objective – At the end of this Topic you will be : • Familiar with various terms like • OLAP • Data warehousing • Data mining • Aware of the business needs that require data to be analyzed in multiple dimensions
  118. 118. Multidimensional Data Analysis • On-line analytical processing (OLAP) • Multidimensional data analysis • Supports manipulation and analysis of large volumes of data from multiple dimensions/perspectives
  119. 119. Types of databases
• Major types of databases:
  • Centralised databases
  • Distributed databases
  • Network databases
  120. 120. Centralized database
• Used by a single central processor or by multiple processors in a client/server network
[Figure: CPU, memory controller with memory, and disk, printer and tape-drive controllers connected by a system bus]
  121. 121. Distributed database
• Stored in more than one physical location
  • Partitioned database
  • Duplicated database
  122. 122. Multidimensional data model
• On-line analytical processing (OLAP)
  • Multidimensional data analysis
  • Supports manipulation and analysis of large volumes of data from multiple dimensions/perspectives
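Looking at the same data from several dimensions can be sketched with plain GROUP BY queries, here run through Python's built-in sqlite3 module. The sales table and all its figures are made up for the illustration; dedicated OLAP tools offer far more than this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (region TEXT, product TEXT, quarter TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("East", "Door latch", "Q1", 100),
        ("East", "Door seal", "Q1", 50),
        ("West", "Door latch", "Q1", 70),
        ("East", "Door latch", "Q2", 120),
        ("West", "Door seal", "Q2", 30),
    ],
)

# One dimension at a time.
by_region = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
by_product = dict(
    conn.execute("SELECT product, SUM(amount) FROM sales GROUP BY product")
)

# Two dimensions together: region by quarter.
by_region_quarter = {
    (region, quarter): total
    for region, quarter, total in conn.execute(
        "SELECT region, quarter, SUM(amount) FROM sales GROUP BY region, quarter"
    )
}
```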
  123. 123. Data warehouse  Supports reporting and query tools  Stores current and historical data  Consolidates data for management analysis and decision making
  124. 124. Data warehouse
• Data mart
  • Subset of a data warehouse
  • Contains a summarized or highly focused portion of data for a specified function or group of users
• Data mining
  • Tools for analyzing large pools of data
  • Find hidden patterns and infer rules to predict trends
  125. 125. Databases and the web  Hypermedia database • Organizes data as network of nodes • Links nodes in pattern specified by user • Supports text, graphic, sound, video and executable programs
  126. 126. Databases and the web
• Database server
  • A computer in a client/server environment that runs a DBMS to process SQL statements and perform database management tasks
• Application server
  • Software that handles all application operations
  127. 127. Database Trends : Summary
• The database forms the back end of any application architecture, whether client/server or a distributed system such as the web.
• Users want to see data in as many dimensions as possible, so it is important to be aware of the concepts of data warehousing, data mining and on-line analytical processing (OLAP).
  128. 128. Database Fundamentals: Next Step
Resource Type  Description
Book           Case*Method: Entity Relationship Modeling - Richard Barker
Book           Data & Databases - Joe Celko
Book           An Introduction to Database Systems - C. J. Date
Book           The Data Modeling Handbook - Reingruber and Gregory
Book           Data Modeling for Information Professionals - Bob Schmidt
Book           Data Model Patterns - David C. Hay, Richard Barker
