  2. 2. Database System Concepts and Architecture
2.1 Data Models, Schemas, and Instances
2.2 DBMS Architecture and Data Independence
2.3 Database Languages and Interfaces
2.4 The Database System Environment
2.5 Classification of Database Management Systems
2.6 Summary
  3. 3. 2.1 Data Models, Schemas, and Instances
Data Model: A set of concepts to describe the structure of a database (data types, relationships) and certain constraints that the database should obey. Provides data abstraction.
Data Model Operations: Operations for specifying database retrievals and updates by referring to the concepts of the data model.
‧generic operations: insert, delete, modify, retrieve
‧user-defined operations
  4. 4. 2.1.1 Categories of Data Models:
- Conceptual (high-level, semantic) data models: Provide concepts that are close to the way many users perceive data. (Also called entity-based or object-based data models.)
  ‧entity ‧attribute ‧relationship
- Physical (low-level, internal) data models: Provide concepts that describe details of how data is stored in the computer.
  ‧record formats ‧record ordering ‧access paths
- Implementation (record-oriented) data models: Provide concepts that fall between the above two, balancing user views with some computer storage details.
  ‧relational ‧network ‧hierarchical
  5. 5. 2.1.2 Schemas, Instances and Database State (cf. database)
Database Schema (meta-data): The description of a database. Includes descriptions of the database structure and the constraints that should hold on the database.
Schema Diagram: A diagrammatic display of (some aspects of) a database schema. (refer to Fig 2.1 2-5)
Database Instance: The actual data stored in a database at a particular moment in time. Also called database state (or occurrence, snapshot). (refer to Fig 1.2 2-6)
Each schema construct has its own current set of instances.
The database schema changes very infrequently. The database state changes every time the database is updated.
Schema is also called intension, whereas state is called extension.
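The schema/state distinction above can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS; the STUDENT columns follow the UNIVERSITY example, while the sample rows and the helper names schema_of and state_of are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema (intension): defined once, expected to change rarely.
conn.execute(
    "CREATE TABLE student ("
    " student_number INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " class INTEGER,"
    " major TEXT)"
)

def schema_of(connection):
    """Return the stored CREATE statements, i.e. the schema description."""
    rows = connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [r[0] for r in rows]

def state_of(connection):
    """Return the current rows, i.e. the database state (extension)."""
    return connection.execute(
        "SELECT student_number, name, class, major"
        " FROM student ORDER BY student_number"
    ).fetchall()

schema_before = schema_of(conn)
empty_state = state_of(conn)      # state right after the schema is defined

# Illustrative rows only (assumed sample data).
conn.execute("INSERT INTO student VALUES (17, 'Smith', 1, 'CS')")
conn.execute("INSERT INTO student VALUES (8, 'Brown', 2, 'CS')")
loaded_state = state_of(conn)     # state after loading initial data

conn.execute("UPDATE student SET class = 2 WHERE student_number = 17")
updated_state = state_of(conn)    # every update yields a new state

schema_after = schema_of(conn)    # the schema itself has not changed
```

Each call to state_of returns a different snapshot, while schema_of returns the same description before and after the updates.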
  6. 6. Figure 2.1 Schema diagram for UNIVERSITY database
schema construct
Known data: names of record types, data items
  7. 7. Figure 1.2 UNIVERSITY Database
  8. 8. define: empty state
load: initial state
update: valid state (a state that satisfies the database schema)
Each further update produces another state.
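The notion of a valid state can be sketched as follows, again assuming sqlite3 as the DBMS. The CHECK constraint on grade, the sample row, and the helper try_update are assumptions added for illustration; the point is only that the DBMS refuses an update that would leave the database in a state violating the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema carries a constraint that every state must satisfy.
conn.execute(
    "CREATE TABLE grade_report ("
    " student_number INTEGER NOT NULL,"
    " section_id INTEGER NOT NULL,"
    " grade TEXT CHECK (grade IN ('A', 'B', 'C', 'D', 'F')),"
    " PRIMARY KEY (student_number, section_id))"
)
conn.execute("INSERT INTO grade_report VALUES (17, 112, 'B')")  # assumed sample row
conn.commit()

def try_update(connection, sql, params=()):
    """Apply an update; return True if accepted, False if the DBMS rejects it."""
    try:
        with connection:  # commits on success, rolls back on error
            connection.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

UPDATE_GRADE = "UPDATE grade_report SET grade = ? WHERE student_number = 17"

valid_update_ok = try_update(conn, UPDATE_GRADE, ("A",))    # leads to a valid state
invalid_update_ok = try_update(conn, UPDATE_GRADE, ("Z",))  # would violate the schema

final_state = conn.execute("SELECT * FROM grade_report").fetchall()
```

The rejected update leaves the previous valid state in place.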
  9. 9. 2.2 DBMS Architecture and Data Independence
2.2.1 Three-Schema Architecture
Proposed to support DBMS characteristics of:
- Insulation of programs and data, and of programs and operations (program-data and program-operation independence)
- Support of multiple views of the data.
- Use of a catalog (database description)
Defines DBMS schemas at three levels: (see 2-9)
- Internal schema at the internal level to describe data storage structures and access paths. Typically uses a physical data model.
- Conceptual schema at the conceptual level to describe the structure and constraints for the whole database. Uses a conceptual or an implementation data model.
- External schemas at the external level to describe the various user views. Usually use the same data model as the conceptual level, or a high-level data model.
Mappings among schema levels are also needed. Programs refer to an external schema, and are mapped by the DBMS to the internal schema for execution.
  10. 10. Figure 2.2 The Three-schema architecture
  11. 11. 2.2.2 Data Independence
Logical Data Independence: The capacity to change the conceptual schema without having to change the external schemas and their application programs.
The conceptual schema may change by adding or removing a record type or data item to (2-11)
  · expand the database
  · reduce the database
Physical Data Independence: The capacity to change the internal schema without having to change the conceptual schema.
  · Reorganize physical files to improve performance, e.g. for the query "List all sections offered in Fall 1998"
When a schema at a lower level is changed, only the mappings between this schema and higher-level schemas need to be changed in a DBMS that fully supports data independence. The higher-level schemas themselves are unchanged. Hence, the application programs need not be changed, since they refer to the external schemas.
Disadvantage of two levels of mappings: overhead during compilation or execution of a query or program.
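Physical data independence can be sketched with the Fall 1998 query mentioned above. This is a minimal illustration assuming sqlite3: an index stands in for "reorganizing physical files", and the sample rows and the index name idx_section_term are invented. The query text is not touched when the access path is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE section ("
    " section_id INTEGER PRIMARY KEY,"
    " course_number TEXT, semester TEXT, year INTEGER, instructor TEXT)"
)
# Assumed sample data for illustration.
conn.executemany(
    "INSERT INTO section VALUES (?, ?, ?, ?, ?)",
    [
        (85, "MATH2410", "Fall", 1998, "King"),
        (92, "CS1310", "Fall", 1998, "Anderson"),
        (102, "CS3320", "Spring", 1999, "Knuth"),
    ],
)

# "List all sections offered in Fall 1998" as the application would issue it.
QUERY = (
    "SELECT section_id FROM section"
    " WHERE semester = 'Fall' AND year = 1998 ORDER BY section_id"
)

before = conn.execute(QUERY).fetchall()

# Internal-level change: add an access path. The table definition and the
# query stay exactly as they were.
conn.execute("CREATE INDEX idx_section_term ON section (semester, year)")

after = conn.execute(QUERY).fetchall()
```

Only the way the DBMS locates the rows may change; the result the application sees is the same.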
  12. 12. UNIVERSITY Conceptual Schema
STUDENT (Name, Student Number, Class, Major)
COURSE (Course Name, Course Number, Credit, Dept)
PREREQUISITE (Course Number, Prerequisite Number)
SECTION (Section Id, Course Number, Semester, Year, Instructor)
GRADE_REPORT (Student Number, Section Id, Grade)

UNIVERSITY External Schema
TRANSCRIPT (Student Name, Course Number, Grade, Semester, Year, Section Id)
  derived from STUDENT, SECTION, GRADE_REPORT
PREREQUISITES (Course Name, Course Number, Prerequisites)
  derived from PREREQUISITE, COURSE

Change GRADE_REPORT Schema Construct
GRADE_REPORT (Student Number, Student Name, Section Id, Course Number, Grade)

Change Mapping (& View Definition)
TRANSCRIPT derived from SECTION, GRADE_REPORT
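The TRANSCRIPT remapping above can be sketched as a view whose definition changes while the application query does not. This assumes sqlite3, with a SQL view playing the role of the external schema; the sample rows and the constant APP_QUERY are illustrative, and the new GRADE_REPORT columns are appended at the end of the table rather than in the order listed on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT,
                      class INTEGER, major TEXT);
CREATE TABLE section (section_id INTEGER PRIMARY KEY, course_number TEXT,
                      semester TEXT, year INTEGER, instructor TEXT);
CREATE TABLE grade_report (student_number INTEGER, section_id INTEGER, grade TEXT);

-- Assumed sample data.
INSERT INTO student VALUES (17, 'Smith', 1, 'CS');
INSERT INTO section VALUES (112, 'MATH2410', 'Fall', 1998, 'Chang');
INSERT INTO grade_report VALUES (17, 112, 'B');

-- External schema: TRANSCRIPT derived from STUDENT, SECTION, GRADE_REPORT.
CREATE VIEW transcript AS
SELECT s.name AS student_name, sec.course_number, g.grade,
       sec.semester, sec.year, sec.section_id
FROM grade_report g
JOIN student s ON s.student_number = g.student_number
JOIN section sec ON sec.section_id = g.section_id;
""")

# The application only ever refers to the external schema.
APP_QUERY = (
    "SELECT student_name, course_number, grade FROM transcript"
    " ORDER BY student_name"
)
before = conn.execute(APP_QUERY).fetchall()

conn.executescript("""
-- Conceptual schema change: GRADE_REPORT gains Student Name and Course Number.
ALTER TABLE grade_report ADD COLUMN student_name TEXT;
ALTER TABLE grade_report ADD COLUMN course_number TEXT;
UPDATE grade_report SET
  student_name = (SELECT s.name FROM student s
                  WHERE s.student_number = grade_report.student_number),
  course_number = (SELECT sec.course_number FROM section sec
                   WHERE sec.section_id = grade_report.section_id);

-- Mapping change: TRANSCRIPT is now derived from SECTION, GRADE_REPORT only.
DROP VIEW transcript;
CREATE VIEW transcript AS
SELECT g.student_name, g.course_number, g.grade,
       sec.semester, sec.year, sec.section_id
FROM grade_report g
JOIN section sec ON sec.section_id = g.section_id;
""")

after = conn.execute(APP_QUERY).fetchall()
```

APP_QUERY is identical before and after the change; only the view definition (the mapping) was rewritten.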
  13. 13. 2.3 Database Languages and Interfaces
Provide appropriate languages and interfaces for each category of users.
2.3.1 DBMS Languages
Data Definition Language (DDL): Used by the DBA and database designers to specify the conceptual schema of a database. In many DBMSs, the DDL is also used to define internal and external schemas (views). In some DBMSs, a separate storage definition language (SDL) and view definition language (VDL) are used to define internal and external schemas. DDL statements are processed by the DDL compiler.
Data Manipulation Language (DML): Used to specify database retrievals and updates (insertion, deletion, modification).
- DML commands (data sublanguage) can be embedded in a general-purpose programming language (host language).
- Alternatively, stand-alone DML commands can be applied directly (query language).
  14. 14. Types of DML
- Procedural DML:
  • Also called record-at-a-time (record-oriented) or low-level DML.
  • Must be embedded in a programming language.
  • Searches for and retrieves individual database records and uses looping and other constructs of the host programming language to retrieve multiple records.
- Declarative or non-procedural DML:
  • Also called set-at-a-time (set-oriented) or high-level DML.
  • Can be used as a stand-alone query language or can be embedded in a programming language.
  • Searches for and retrieves information from multiple related database records in a single command.
- host language: general-purpose language, e.g. C++
- data sublanguage: DML
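The contrast between the two DML styles can be sketched as follows, assuming Python as the host language and sqlite3 as the DBMS. SQLite only offers declarative SQL, so the first function merely imitates the record-at-a-time style by fetching one record per call and doing the filtering in host-language code. The table contents and function names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT, major TEXT)"
)
# Assumed sample data.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(8, "Brown", "CS"), (17, "Smith", "CS"), (23, "Lee", "MATH")],
)

def cs_names_record_at_a_time(connection):
    """Procedural style: retrieve records one by one; the host language loops and filters."""
    cursor = connection.execute(
        "SELECT name, major FROM student ORDER BY student_number"
    )
    names = []
    while True:
        record = cursor.fetchone()   # one record per call
        if record is None:           # no more records
            break
        name, major = record
        if major == "CS":
            names.append(name)
    return names

def cs_names_set_at_a_time(connection):
    """Declarative style: a single command states what is wanted, not how to find it."""
    rows = connection.execute(
        "SELECT name FROM student WHERE major = 'CS' ORDER BY student_number"
    )
    return [r[0] for r in rows]

procedural_result = cs_names_record_at_a_time(conn)
declarative_result = cs_names_set_at_a_time(conn)
```

Both functions return the same names; in the declarative version the DBMS, not the program, decides how the matching records are located.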
  15. 15. 2.3.2 DBMS Interfaces
- Stand-alone query language interfaces. (casual end user)
- Programmer interfaces for embedding DML in programming languages: (programmer)
  - Pre-compiler Approach
  - Procedure (Subroutine) Call Approach
- User-friendly interfaces:
  - Menu-based Interfaces for Browsing.
  - Forms-based Interfaces.
  - Graphical User Interfaces.
  - Natural language Interfaces
  - Combination of the above
- Interfaces for Parametric Users (using function keys)
- Interfaces for the DBA:
  - Creating accounts, granting authorizations
  - Setting system parameters
  - Changing schemas or access paths
  16. 16. 2.4 The Database System Environment
2.4.1 DBMS Component Modules
Figure 2.3
  17. 17. 2.4.2 Database System Utilities
To perform certain functions such as:
- Loading data stored in files into a database (conversion tools).
- Backing up the database periodically on storage.
- File reorganization: reorganizing database file structures.
- Report generation utilities.
- Performance monitoring utilities.
- Other functions, such as sorting, user monitoring, data compression, etc.
  18. 18. 2.4.3 Tools, Application Environments, and Communications Facilities
Data dictionary utility:
- Used to store schema descriptions and other information such as design decisions, application program descriptions, user information, usage standards, etc. (comment)
- Active data dictionary is accessed by DBMS software and users/DBA.
- Passive data dictionary is accessed by users/DBA only.
Communications Facilities
- Allow users at locations remote from the database system site to access the database.
- DB/DC system: the DBMS (DB) integrated with a data communication system (DC).
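As a rough illustration of looking up schema descriptions, the sketch below queries SQLite's built-in catalog (the sqlite_master table and PRAGMA table_info). This catalog is much narrower than the data dictionary described above, since it holds no design decisions or usage standards, so treat it only as an analogy. The helper names list_tables and list_columns are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (course_number TEXT PRIMARY KEY, course_name TEXT,
                     credit INTEGER, dept TEXT);
CREATE TABLE prerequisite (course_number TEXT, prerequisite_number TEXT);
""")

def list_tables(connection):
    """Names of the record types (tables) recorded in the catalog."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [r[0] for r in rows]

def list_columns(connection, table):
    """(name, declared type) of each data item in the given table."""
    if table not in list_tables(connection):
        raise ValueError(f"unknown table: {table}")
    # The table name was validated against the catalog above, so it is safe
    # to interpolate. PRAGMA table_info rows are
    # (cid, name, type, notnull, dflt_value, pk).
    rows = connection.execute(f"PRAGMA table_info({table})")
    return [(r[1], r[2]) for r in rows]

tables = list_tables(conn)
course_columns = list_columns(conn, "course")
```

Here the catalog is read by a user program; the DBMS itself also consults it when processing statements, which is the sense in which a dictionary is "active".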
  19. 19. 2.5 Classification of Database Management Systems
Based on the data model used:
• Data models
  - Traditional: Relational, Network (see 2-19), Hierarchical
  - Emerging: Object-oriented, Semantic, Entity-Relationship, other.
Other classifications:
• Number of users: Single-user (typically used with personal computers) vs. multi-user (most DBMSs)
• Number of sites: Centralized (uses a single computer) vs. distributed (uses multiple computers). Homogeneous vs. heterogeneous.
• Cost of DBMS software: $10,000~100,000 vs. $100~3,000
• Types of access paths used (inverted file structures, …)
• Purpose: general purpose vs. special purpose, e.g. airline reservations, telephone directory, on-line transaction processing system
  20. 20. Figure 2.4 A Network Schema