DBMS Module I



Published in: Engineering, Technology

  1. 1. DATABASE MANAGEMENT SYSTEM (for both CSE 4th semester and MECH 3rd semester)
     MODULE-1
     Database Management System (DBMS)
     A DBMS contains information about a particular enterprise:
     • A collection of interrelated data
     • A set of programs to access the data
     • An environment that is both convenient and efficient to use
     Database Applications:
     • Banking: all transactions
     • Airlines: reservations, schedules
     • Universities: registration, grades
     • Sales: customers, products, purchases
     • Online retailers: order tracking, customized recommendations
     • Manufacturing: production, inventory, orders, supply chain
     • Human resources: employee records, salaries, tax deductions
     Purpose of Database Systems
     In the early days, database applications were built directly on top of file systems.
  2. 2. Drawbacks of using file systems to store data:
     • Data redundancy and inconsistency
       - Multiple file formats, duplication of information in different files
     • Difficulty in accessing data
       - Need to write a new program to carry out each new task
     • Data isolation: multiple files and formats
     • Integrity problems
       - Integrity constraints (e.g. account balance > 0) become “buried” in program code rather than being stated explicitly
       - Hard to add new constraints or change existing ones
     Databases touch all aspects of our lives.
     DBMS Architecture
     Different abstract levels
     - a widely accepted general architecture for a database
     - the database is described at three abstract levels:
       - internal schema (physical database)
       - conceptual schema (conceptual database)
       - external schema (view)
     Objectives
     - insulation of application programs and data
     - support of multiple user views
     - use of a schema to store the DB description (meta-data)
     The Three Schema Architecture:
     External schema
     - describes the subset of the database that a particular user group is interested in, in the format that group wants, and hides the rest
     - may contain virtual data that is derived from the files but is not explicitly stored
  3. 3. Conceptual schema
     - hides the details of physical storage structures and concentrates on describing entities, data types, relationships, operations, and constraints
     Internal schema
     - describes the physical storage structure of the DB
     - uses a low-level (physical) data model to describe the complete details of data storage and access paths
     THREE-LEVEL ARCHITECTURE
  4. 4. EXTERNAL LEVEL (highest level)
     • The user’s view of the database.
     • Consists of a number of different external views of the DB.
     • Describes part of the DB for a particular group of users.
     • Provides a powerful and flexible security mechanism by hiding parts of the DB from certain users. The user is not aware of the existence of any attributes that are missing from the view.
     • Permits users to access data in a way that is customized to their needs, so that the same data can be seen by different users in different ways at the same time.
     CONCEPTUAL LEVEL
     • The logical structure of the entire database as seen by the DBA.
     • What data is stored in the database.
     • The relationships among the data.
     • A complete view of the data requirements of the organization, independent of any storage consideration.
     • Represents:
       - entities, attributes, relations
       - constraints on data
       - semantic information on data
       - security, integrity information
     • Supports each external view: any data available to a user must be contained in or derivable from the conceptual level.
     INTERNAL LEVEL
     • Physical representation of the DB on the computer.
  5. 5. • How the data is stored in the database.
     • Physical implementation of the DB to achieve optimal run-time performance and storage space utilization:
       - Storage space allocation for data and indexes
       - Record description for storage
       - Record placement
       - Data compression, encryption
     PHYSICAL LEVEL
     Managed by the OS under the direction of the DBMS.
     SCHEMAS, MAPPINGS, INSTANCES
     DB schema: the overall description of the DB. There are three different schemas, one for each level of abstraction.
     DBMS: maintains the mapping between schemas and the consistency of the schemas.
     Conceptual/Internal Mapping: finds the actual record (or combination of records) in physical storage that constitutes a logical record in the conceptual schema.
     External/Conceptual Mapping: maps names in the user’s view onto the relevant part of the conceptual schema.
     Instances and Schemas
     Similar to types and variables in programming languages.
     Schema: the logical structure of the database
     - Example: the database consists of information about a set of customers and accounts and the relationship between them
     - Analogous to the type information of a variable in a program
     - Physical schema: database design at the physical level
     - Logical schema: database design at the logical level
     Instance: the actual content of the database at a particular point in time
  6. 6. - Analogous to the value of a variable
     Database instance: the data in the DB at any particular point in time.
     DATA INDEPENDENCE
     The ability to modify a schema definition at one level without affecting a schema definition at a higher level is called data independence.
     There are two kinds:
     Logical data independence
     • The ability to modify the conceptual schema without causing application programs to be rewritten.
     • Immunity of external schemas to changes in the conceptual schema.
     • Usually needed when the logical structure of the database is altered.
  7. 7. Physical data independence
     • The ability to modify the internal schema without having to change the conceptual or external schemas.
     • Modifications at this level are usually made to improve performance.
     Three Schema Architecture
     Data and meta-data:
     - the three schemas are only meta-data (descriptions of data)
     - data actually exists only at the physical level
     Mapping
     - the DBMS must transform a request specified on an external schema into a request against the conceptual schema, and then into a request against the internal schema
     - this requires information in the meta-data on how to accomplish the mapping among the various levels
     - the mapping is an overhead (time-consuming) and leads to inefficiencies
     - few DBMSs have implemented the full three-schema architecture
     Benefits of Three Schema Architecture:
     Logical data independence
     - the capacity to change the conceptual schema without having to change external schemas or application programs
     - ex: Employee (E#, Name, Address, Salary). A view including only E# and Name is not affected by changes in any other attributes.
     Physical data independence
     - the capacity to change the internal schema without having to change the conceptual (or external) schema
     - the internal schema may change to improve performance (e.g., by creating an additional access structure)
     - easier to achieve than logical data independence, because application programs depend on logical structures
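The Employee example above can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS; the table name, column names (e_no stands for E#), the view name, the index name and the sample row are all assumptions made for illustration, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema: the Employee relation (e_no stands in for E#).
conn.execute(
    "CREATE TABLE employee ("
    "e_no INTEGER PRIMARY KEY, name TEXT, address TEXT, salary REAL)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 'Cuttack', 30000.0)")

# External schema: a view that exposes only E# and Name.
conn.execute("CREATE VIEW employee_names AS SELECT e_no, name FROM employee")
before = conn.execute("SELECT * FROM employee_names").fetchall()

# Logical change: add an attribute to the conceptual schema.
conn.execute("ALTER TABLE employee ADD COLUMN phone TEXT")

# Physical change: add an access structure (an index) to the internal schema.
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")

# The view, and any program written against it, sees exactly the same data.
after = conn.execute("SELECT * FROM employee_names").fetchall()
print(before == after)
```

The program that reads employee_names is untouched by both changes, which is the point of logical and physical data independence.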
  8. 8. Data Models
     Data abstraction
     - one fundamental characteristic of the database approach
     - hides details of data storage that are not needed by most database users and applications
     Data model
     - a set of data structures and conceptual tools used to describe the structure of a database (data types, relationships, and constraints)
     - used in the definition of the conceptual, external, and internal schemas
     - must provide means for DB designers to represent real-world information completely and naturally
     Data Models
     High-level (conceptual) data models
     - use concepts such as entities, attributes, relationships
     - object-based models: ER model, OO model
     Representational (implementation) data models
     - most frequently used in commercial DBMSs
     - record-based models: relational, hierarchical, network
     Low-level (physical) data models
     - describe the details of how data is stored
     - capture aspects of database system implementation: record structures (fixed/variable length) and ordering, access paths (key indexing), etc.
     Schemas and Instances
     In any data model, it is important to distinguish between the description of the database and the database itself.
     Database schema (meta-data)
  9. 9. - overall description of a database, specified by a set of definitions
     - specified during database design (does not change frequently)
     - similar to the notion of a type definition in programs
     Database instance
     - current contents of the database (actual data): the DB state
     - may change frequently
     Distinction between database schema and database state
     - a database that has just been specified (or defined) is in the empty state
     - the initial state is reached when the data is first loaded
     - the DBMS is responsible for ensuring that every database state is valid
     Data Definition and Manipulation Languages
     Data definition language (DDL)
     - not a procedural language
     - notations for describing the types of entities and the relationships among entities
     - DDL statements -> data dictionary
     Data manipulation language (DML)
     - for accessing and modifying data
     - non-procedural: specifying "what" to access
     - procedural: specifying "what" to access and "how" to get it
     - non-procedural languages can be easier to use but may not be as efficient
     Database Administrator
     Coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise’s information resources and needs.
     Database administrator's duties include:
     • Schema definition
  10. 10. • Storage structure and access method definition
     • Schema and physical organization modification
     • Granting user authority to access the database
     • Specifying integrity constraints
     • Acting as liaison with users
     • Monitoring performance and responding to changes in requirements
     DATA MODELS
     Data models are a collection of conceptual tools for describing data, data relationships, data semantics and data constraints.
     Components:
     - structural part
     - manipulative part
     - integrity rules
     There are three different groups:
     - Object-based Data Models and Record-based Data Models: describe data at the conceptual and external levels
     - Physical Data Models: describe data at the internal level
     Object-based Data Models
     - Entity-relationship model
     - Object-oriented model
     - Semantic data model
     - Functional data model
     Record-based Data Models
     • Named so because the database is structured in fixed-format records of several types.
  11. 11. • Each record type defines a fixed number of fields, or attributes.
     • Each field is usually of a fixed length (this simplifies the implementation).
     • The three most widely accepted models are the relational, network, and hierarchical data models.
     Physical Data Models
     1. Are used to describe data at the lowest level.
     2. Very few models exist, e.g.
        • Unifying model
        • Frame memory
     Entity-relationship model
     - popular high-level conceptual model used in DB design
     - proposed by P. Chen in 1976 (ACM TODS)
     - perceives the real world as a collection of entities and relationships among them
     OO model
     - the DB is defined in terms of objects, their properties, and their operations (methods)
     Relational model
     - represents a DB as a collection of tables
     Network model
     - represents a DB as record types and 1:N relationships
     Hierarchical model
     - represents data as hierarchical tree structures
     An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain.
     E-R Diagram Notation (figure labels): Entity, Key attributes, Weak Entity, Multivalued attribute,
  12. 12. Relationship, Attributes, Composite Attributes, Relationship between two entities
     ER Diagrams
     - Rectangles represent entity sets.
     - Diamonds represent relationship sets.
     - Lines link attributes to entity sets and entity sets to relationship sets.
     - Ellipses represent attributes.
     - Double ellipses represent multivalued attributes.
     - Dashed ellipses denote derived attributes.
     - Underline indicates primary key attributes (will study later).
  13. 13. ER Diagram With Composite, Multivalued, and Derived Attributes
     Relationship Sets with Attributes:
     One-to-Many Relationship:
  14. 14. Many-to-One Relationships: In a many-to-one relationship, a loan is associated with several (including 0) customers via borrower, and a customer is associated with at most one loan via borrower.
     Many-to-Many Relationship:
  15. 15. Specialization Example:
     Entities:
     Entity: a thing (animate or inanimate) of independent physical or conceptual existence that is distinguishable from other things. In the University database context, an individual student, a faculty member, a classroom and a course are entities.
     Entity Set or Entity Type: a collection of entities all having the same properties.
     - Student entity set: the collection of all student entities.
     - Course entity set: the collection of all course entities.
     Attributes
     Each entity is described by a set of attributes/properties. For a Student entity:
     - Stud Name: the name of the student
     - Roll Number: the roll number of the student
     - Sex: the gender of the student, etc.
     All entities in an entity set/type have the same set of attributes.
     Types of Attributes
     • Simple attributes: have atomic or indivisible values.
       Example: Dept (a string), Phone Number (an eight-digit number)
     • Composite attributes: have several components in the value.
       Example: Qualification with components (Degree Name, Year, University Name)
  16. 16. • Derived attributes: the attribute value depends on some other attribute.
       Example: Age depends on Date Of Birth, so Age is a derived attribute.
     • Single-valued: has only one value rather than a set of values.
       For instance, Place Of Birth is a single string value.
     • Multi-valued: has a set of values rather than a single value.
       For instance, the Courses Enrolled, Email Address and Previous Degree attributes for a student.
     • Attributes can be: simple single-valued, simple multi-valued, composite single-valued or composite multi-valued.
     Diagrammatic Notation for Entities
     - entity: rectangle
     - attribute: ellipse connected to the rectangle
     - multi-valued attribute: double ellipse
     - composite attribute: ellipse connected to an ellipse
     - derived attribute: dashed ellipse
     Relationships
     • When two or more entities are associated with each other, we have an instance of a relationship.
  17. 17. • E.g.: student Ramesh enrolls in the Discrete Mathematics course.
     • The relationship enrolls has Student and Course as the participating entity sets.
     • The tuples in enrolls are relationship instances.
     • enrolls is called a relationship type/set.
     ER Diagram for a Banking Enterprise:
  18. 18. Summary of Symbols Used in ER Notation:
  19. 19. Cardinality Ratios/Mappings
     • One-to-one: an E1 entity may be associated with at most one E2 entity, and similarly an E2 entity may be associated with at most one E1 entity.
     • One-to-many: an E1 entity may be associated with many E2 entities, whereas an E2 entity may be associated with at most one E1 entity.
     • Many-to-one: similar to the above, with the roles reversed. An E1 entity may be associated with at most one E2 entity, whereas an E2 entity may be associated with many E1 entities.
     • Many-to-many: many E1 entities may be associated with a single E2 entity, and a single E1 entity may be associated with many E2 entities.
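The four ratios above can be illustrated with a short sketch that looks at a list of (E1, E2) relationship instances and reports the tightest ratio the list is consistent with. The function name cardinality and the sample customer/loan identifiers are hypothetical. Note that a cardinality ratio is really a constraint declared on the schema; a single instance can only show that a tighter ratio has been violated, never prove the declared one.

```python
def cardinality(pairs):
    """Return the tightest cardinality ratio consistent with (e1, e2) pairs."""
    e1_partners = {}  # E1 entity -> set of E2 entities it is associated with
    e2_partners = {}  # E2 entity -> set of E1 entities it is associated with
    for e1, e2 in pairs:
        e1_partners.setdefault(e1, set()).add(e2)
        e2_partners.setdefault(e2, set()).add(e1)

    # Does some E1 entity have several E2 partners, and vice versa?
    e1_has_many = any(len(p) > 1 for p in e1_partners.values())
    e2_has_many = any(len(p) > 1 for p in e2_partners.values())

    if e1_has_many and e2_has_many:
        return "many-to-many"
    if e1_has_many:
        return "one-to-many"
    if e2_has_many:
        return "many-to-one"
    return "one-to-one"


# E1 = customer, E2 = loan: two customers share one loan.
print(cardinality([("c1", "l1"), ("c2", "l1")]))
```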
  20. 20. Key Attributes:
     Primary Key: The primary key of a relational table uniquely identifies each record in the table. It can either be a normal attribute that is guaranteed to be unique (such as Social Security Number in a table with no more than one record per person) or it can be generated by the DBMS.
     Example: Imagine we have a STUDENTS table that contains a record for each student at a university. The student's unique student ID number would be a good choice for a primary key in the STUDENTS table. The student's first and last name would not be a good choice, as there is always the chance that more than one student might have the same name.
     Foreign Key: A foreign key is a field in a relational table that matches the primary key column of another table. The foreign key can be used to cross-reference tables.
     Referential Integrity
     Referential integrity is a database concept that ensures that relationships between tables remain consistent. When one table has a foreign key to another table, referential integrity states that you may not add a record to the table that contains the foreign key unless there is a corresponding record in the linked table. It also includes the techniques known as cascading update and cascading delete, which ensure that changes made to the linked table are reflected in the primary table.
     Referential integrity enforces the following three rules:
     1. We may not add a record to the Employees table unless the Managed By attribute points to a valid record in the Managers table.
  21. 21. 2. If the primary key for a record in the Managers table changes, all corresponding records in the Employees table must be modified using a cascading update.
     3. If a record in the Managers table is deleted, all corresponding records in the Employees table must be deleted using a cascading delete.
     Importance of Referential Integrity
     By specifying which columns of a referencing table are foreign keys for columns in some other, referenced table, referential integrity provides a reliable mechanism that prevents accidental database corruption during inserts, updates, and deletes. It states that a row cannot exist in a table with a non-null value for a referencing column if an equal value does not exist in the referenced column. Once we define employee_number as a foreign key in the employee_phone relation, if we try to insert a row whose employee_number value does not exist in the employee table, the system will not allow the insertion.
     The benefits of referential integrity can be summarized as follows:
     a) It ensures data integrity and consistency based on primary keys and foreign keys.
     b) It increases development productivity, because it is not necessary to code SQL statements to enforce referential constraints; the Teradata RDBMS automatically enforces referential integrity.
     Candidate Key: A candidate key is a combination of attributes that can be used to uniquely identify a database record without any extraneous data. Each table may have one or more candidate keys. One of these candidate keys is selected as the table's primary key.
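The three referential integrity rules for the Employees and Managers tables can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module rather than the Teradata RDBMS mentioned above, and the column names (manager_id, employee_id, managed_by) and sample rows are assumptions for illustration. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE managers (manager_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT,
        managed_by  INTEGER REFERENCES managers (manager_id)
                    ON UPDATE CASCADE ON DELETE CASCADE
    )
""")
conn.execute("INSERT INTO managers VALUES (10, 'Meera')")
conn.execute("INSERT INTO employees VALUES (1, 'Ravi', 10)")

# Rule 1: an employee may not point at a manager that does not exist.
try:
    conn.execute("INSERT INTO employees VALUES (2, 'Sita', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Rule 2: changing the manager's primary key cascades to the employees.
conn.execute("UPDATE managers SET manager_id = 20 WHERE manager_id = 10")
after_update = conn.execute("SELECT managed_by FROM employees").fetchall()

# Rule 3: deleting the manager cascades to the employees.
conn.execute("DELETE FROM managers WHERE manager_id = 20")
after_delete = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]

print(rejected, after_update, after_delete)
```

Whether a delete should cascade, be rejected, or set the foreign key to NULL is a design choice made per relationship; the sketch uses cascading only because that is what the rules above describe.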
  22. 22. Check Constraint: A check constraint (also known as a table check constraint) is a condition that defines valid data when adding or updating an entry in a table of a relational database. A check constraint is applied to each row in the table. The constraint must be a predicate. It can refer to a single column or to multiple columns of the table. The result of the predicate can be TRUE, FALSE, or UNKNOWN, depending on the presence of NULLs. If the predicate evaluates to UNKNOWN, the constraint is not violated and the row can be inserted or updated in the table.
     Each check constraint has to be defined in the CREATE TABLE or ALTER TABLE statement using the syntax:
     1. CREATE TABLE table_name (..., CONSTRAINT constraint_name CHECK ( predicate ), ... )
     2. ALTER TABLE table_name ADD CONSTRAINT constraint_name CHECK ( predicate )
     NOT NULL Constraint: A NOT NULL constraint is functionally equivalent to the following check constraint with an IS NOT NULL predicate: CHECK (column IS NOT NULL)
     Domain Constraints: A domain is defined as the set of all unique values permitted for an attribute. For example, the domain of date is the set of all possible valid dates, the domain of integer is all possible whole numbers, and the domain of day-of-week is Monday, Tuesday ... Sunday. This in effect defines rules for a particular attribute. If it is determined that an attribute is a date, then it should be implemented in the database so as to prevent invalid dates being entered.
  23. 23. A domain of possible values should be associated with every attribute. These domain constraints are the most basic form of integrity constraint. They are easy to test for when data is entered.
     Domain types
     1. Attributes may have the same domain, e.g. cname and employee-name.
     2. It is not as clear whether the bname and cname domains ought to be distinct.
     3. At the implementation level, they are both character strings.
     4. At the conceptual level, we do not expect customers to have the same names as branches, in general.
     5. Strong typing of domains allows us to test the values inserted and to check whether queries make sense. Newer systems, particularly object-oriented database systems, offer a rich set of domain types that can be extended easily.
     Database Users:
     Users are differentiated by the way they expect to interact with the system:
     - Application programmers: interact with the system through DML calls
     - Sophisticated users: form requests in a database query language
     - Specialized users: write specialized database applications that do not fit into the traditional data processing framework
     - Naïve users: invoke one of the permanent application programs that have been written previously
       Examples: people accessing a database over the web, bank tellers, clerical staff
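Returning to the check, NOT NULL and domain constraints described earlier in this module, the following is a small sketch of how they behave in practice. It uses Python's built-in sqlite3 module; the account table, its columns, the constraint names and the helper try_insert are hypothetical, chosen to echo the "account balance > 0" and day-of-week examples. It also shows that a predicate evaluating to UNKNOWN (because of a NULL) does not violate a check constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        account_number INTEGER PRIMARY KEY,
        balance        REAL,
        open_day       TEXT NOT NULL,
        CONSTRAINT positive_balance CHECK (balance > 0),
        CONSTRAINT valid_day CHECK (open_day IN (
            'Monday', 'Tuesday', 'Wednesday', 'Thursday',
            'Friday', 'Saturday', 'Sunday'))
    )
""")


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "valid row": try_insert((1, 100.0, "Monday")),
    "negative balance": try_insert((2, -5.0, "Monday")),   # predicate is FALSE
    "NULL balance": try_insert((3, None, "Tuesday")),      # predicate is UNKNOWN
    "day outside domain": try_insert((4, 50.0, "Funday")),
    "NULL day": try_insert((5, 50.0, None)),               # NOT NULL violated
}
for case, accepted in results.items():
    print(case, accepted)
```

If a NULL balance should also be rejected, the column needs its own NOT NULL constraint; the check constraint alone lets it through.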
  24. 24. Database Administrator
     Coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise’s information resources and needs.
     Database administrator's duties include:
     - Schema definition
     - Storage structure and access method definition
     - Schema and physical organization modification
     - Granting user authority to access the database
     - Specifying integrity constraints
     - Acting as liaison with users
     - Monitoring performance and responding to changes in requirements
     Keys:
     - A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity.
     - A candidate key of an entity set is a minimal super key.
       - Customer_id is a candidate key of customer
       - account_number is a candidate key of account
     - Although several candidate keys may exist, one of the candidate keys is selected to be the primary key.
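The difference between a super key and a candidate key (a minimal super key) can be made concrete with a short sketch. The function names is_super_key and candidate_keys and the sample customer rows are hypothetical. Keys are constraints on the schema, so checking one instance only shows which attribute sets happen to be unique in the current data; it cannot confirm that they are keys by design.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows agree on all the attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Return the minimal attribute sets that are unique in this instance."""
    keys = []
    # Try smaller sets first so that any superset of a key can be skipped.
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_super_key(rows, attrs):
                keys.append(attrs)
    return keys


customers = [
    {"customer_id": 1, "name": "Ravi", "city": "Pune"},
    {"customer_id": 2, "name": "Ravi", "city": "Delhi"},
    {"customer_id": 3, "name": "Sita", "city": "Pune"},
]
attributes = ("customer_id", "name", "city")
print(candidate_keys(customers, attributes))
```

In this sample (name, city) also comes out as unique, although two customers with the same name could easily live in the same city. That is why the designer, not the data, decides which candidate key becomes the primary key.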