IKJ SAMUEL
Database System Concepts and Architecture



2.1 Data Models, Schemas, and Instances
2.2 DBMS Architecture and Data Independence
2.3 Database Languages and Interfaces
2.4 The Database System Environment
2.5 Classification of Database Management Systems
2.6 Summary


2.1 Data Models, Schemas, and Instances

Data Model: A set of concepts to describe the structure of a
database (‧data types ‧relationships), and certain constraints that the
database should obey.
      Provides data abstraction



Data Model Operations: Operations for specifying database
retrievals and updates by referring to the concepts of the data
model.
‧generic operations: insert, delete, modify, retrieve
‧user-defined operations
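
As a rough illustration of the two kinds of operations, the sketch below uses Python's built-in sqlite3 module. The table layout, the sample student, and the user-defined operation promote are assumptions made for this example; they are not part of the slides.

```python
import sqlite3

def open_university():
    """Create an in-memory database with a single STUDENT record type."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student (student_number INTEGER PRIMARY KEY, "
        "name TEXT NOT NULL, class INTEGER, major TEXT)"
    )
    return conn

# Generic operations: insert, retrieve, modify, delete.
def insert_student(conn, number, name, class_, major):
    conn.execute("INSERT INTO student VALUES (?, ?, ?, ?)",
                 (number, name, class_, major))

def retrieve_student(conn, number):
    return conn.execute(
        "SELECT name, class, major FROM student WHERE student_number = ?",
        (number,),
    ).fetchone()

def modify_major(conn, number, major):
    conn.execute("UPDATE student SET major = ? WHERE student_number = ?",
                 (major, number))

def delete_student(conn, number):
    conn.execute("DELETE FROM student WHERE student_number = ?", (number,))

# User-defined operation (hypothetical): move a student to the next class year.
def promote(conn, number):
    conn.execute("UPDATE student SET class = class + 1 WHERE student_number = ?",
                 (number,))

conn = open_university()
insert_student(conn, 17, "Smith", 1, "CS")   # made-up sample record
modify_major(conn, 17, "MATH")
promote(conn, 17)
print(retrieve_student(conn, 17))   # ('Smith', 2, 'MATH')
delete_student(conn, 17)
print(retrieve_student(conn, 17))   # None
```

The generic operations apply to any record type, while promote only makes sense for STUDENT records, which is what makes it user-defined.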




2.1.1 Categories of Data Models:
- Conceptual (high-level, semantic) data models: Provide concepts that
  are close to the way many users perceive data.
  (Also called entity-based or object-based data models.)
  ‧entity ‧attribute ‧relationship


- Physical (low-level, internal) data models: Provide concepts that
  describe details of how data is stored in the computer.
 ‧record formats ‧record ordering ‧access paths


- Implementation (record-oriented) data models: Provide concepts that
fall between the above two, balancing user views with some computer
storage details.
 ‧relational    ‧network ‧hierarchical
2.1.2 Schemas, Instances and Database State
        cf database
Database Schema (meta-data): The description of a database. Includes
descriptions of the database structure and the constraints that should hold
on the database.


Schema Diagram: A diagrammatic display of (some aspects of) a
database schema. (See Figure 2.1.)


Database Instance: The actual data stored in a database at a particular
moment in time. Also called database state (or occurrence, snapshot).
(See Figure 1.2.)


          Each schema construct has its own current set of instances.

The database schema changes very infrequently. The database state
changes every time the database is updated. The schema is also called the
intension, whereas the state is called the extension.
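
The difference between schema and state can be sketched with Python's built-in sqlite3 module. The COURSE column names and the sample row are assumptions for this example; sqlite_master is SQLite's own catalog table, used here only as a convenient way to read back the stored schema description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (course_number TEXT PRIMARY KEY, "
    "course_name TEXT, credit INTEGER, dept TEXT)"
)

def schema(conn):
    """The intension: the stored definitions of the schema constructs."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]

def state(conn):
    """The extension: the current set of COURSE instances."""
    return conn.execute("SELECT * FROM course ORDER BY course_number").fetchall()

schema_before = schema(conn)
state_before = state(conn)          # empty state, right after definition

# An update changes the state but leaves the schema untouched.
conn.execute(
    "INSERT INTO course VALUES ('CS1310', 'Intro to Computer Science', 4, 'CS')"
)

schema_after = schema(conn)
state_after = state(conn)

print(schema_before == schema_after)   # True
print(state_before == state_after)     # False
```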
Figure 2.1   Schema diagram for UNIVERSITY database
   ‧schema construct: each object shown in the diagram
   ‧Known data: names of record types and data items
Figure 1.2   UNIVERSITY Database
Database state transitions:
   define  →  empty state  →  load  →  initial state  →  update  →  state  →  update  →  ...
   valid state: a state that satisfies the database schema
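
The define, load, and update sequence can be sketched with Python's built-in sqlite3 module. The constraints chosen here (a foreign key and a grade check) and the sample rows are assumptions for illustration; the point is only that an update leading to an invalid state is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled

# define: the schema is created and the database is in the empty state.
conn.executescript("""
CREATE TABLE student (
    student_number INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE grade_report (
    student_number INTEGER NOT NULL REFERENCES student(student_number),
    section_id INTEGER NOT NULL,
    grade TEXT CHECK (grade IN ('A', 'B', 'C', 'D', 'F'))
);
""")
empty_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

def try_update(conn, sql, params):
    """Apply an update; return False if it would produce an invalid state."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

# load: populate the database to reach the initial state.
conn.executemany("INSERT INTO student VALUES (?, ?)", [(17, "Smith"), (8, "Brown")])

# update: only changes that satisfy the schema constraints are accepted.
INSERT_GRADE = "INSERT INTO grade_report VALUES (?, ?, ?)"
ok = try_update(conn, INSERT_GRADE, (17, 112, "B"))          # valid
bad_grade = try_update(conn, INSERT_GRADE, (8, 112, "Z"))    # violates the CHECK
bad_ref = try_update(conn, INSERT_GRADE, (99, 112, "A"))     # no such student

print(empty_count, ok, bad_grade, bad_ref)
```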
2.2 DBMS Architecture and Data Independence


 2.2.1 Three-Schema Architecture
Proposed to support DBMS characteristics of:
- Insulation of programs and data, and of programs and operations
  (program-data and program-operation independence)
- Support of multiple views of the data.
- Use of catalog (database description)


Defines DBMS schemas at three levels (see Figure 2.2):
- Internal schema at the internal level to describe data storage structures and access
  paths. Typically uses a physical data model.
- Conceptual schema at the conceptual level to describe the structure and constraints
  for the whole database. Uses a conceptual or an implementation data model.
- External schema at the external level to describe the various user views. Usually
  uses the same data model as the conceptual level or a high-level data model.


Mappings among schema levels are also needed. Programs refer to an external
schema, and the DBMS maps their requests to the internal schema for execution.
Figure 2.2   The three-schema architecture
2.2.2 Data Independence

Logical Data Independence: The capacity to change the conceptual schema without
having to change the external schemas and their application programs.

                The conceptual schema may be changed by adding or removing a
                record type or data item to
                 · expand the database (see the UNIVERSITY example below)
                 · reduce the database

Physical Data Independence: The capacity to change the internal schema without
having to change the conceptual schema.

                Reorganize physical files to improve performance
                e.g. a query such as "List all sections offered in Fall 1998"
                need not change
When a schema at a lower level is changed, only the mappings between this
schema and higher-level schemas need to be changed in a DBMS that fully supports
data independence. The higher-level schemas themselves are unchanged. Hence, the
application programs need not be changed since they refer to the external schemas.
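
Physical data independence can be sketched with Python's built-in sqlite3 module: an access path (an index) is added at the internal level while the query text stays the same. The index name section_by_term and the sample rows are assumptions for this example, and the exact wording of SQLite's query plan output varies between versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE section (section_id INTEGER PRIMARY KEY, course_number TEXT, "
    "semester TEXT, year INTEGER, instructor TEXT)"
)
conn.executemany(
    "INSERT INTO section VALUES (?, ?, ?, ?, ?)",
    [   # made-up sample rows
        (85, "MATH2410", "Fall", 1998, "King"),
        (92, "CS1310", "Fall", 1998, "Anderson"),
        (102, "CS3320", "Spring", 1999, "Knuth"),
    ],
)

# "List all sections offered in Fall 1998": the query never mentions storage details.
QUERY = "SELECT section_id FROM section WHERE semester = ? AND year = ?"
ARGS = ("Fall", 1998)

def fall_1998(conn):
    return sorted(row[0] for row in conn.execute(QUERY, ARGS))

def plan(conn):
    """Return the detail text of SQLite's query plan (last column of each row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ARGS).fetchall()
    return " ".join(str(row[-1]) for row in rows)

result_before, plan_before = fall_1998(conn), plan(conn)

# Internal-level change: add an access path to speed up retrieval by term.
conn.execute("CREATE INDEX section_by_term ON section (semester, year)")

result_after, plan_after = fall_1998(conn), plan(conn)

print(result_before == result_after)   # True: same answer, same query text
print(plan_before)
print(plan_after)
```

Only the plan chosen by the DBMS differs between the two runs; the query and its result are the same.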




     Disadvantages of two levels of mappings:
     Overhead during compilation or execution of a query or program

UNIVERSITY Conceptual Schema
STUDENT (Name, Student Number, Class, Major)
COURSE (Course Name, Course Number, Credit, Dept)
PREREQUISITE (Course Number, Prerequisite Number)
SECTION (Section Id, Course Number, Semester, Year, Instructor)
GRADE_REPORT (Student Number, Section Id, Grade)


UNIVERSITY External Schema
TRANSCRIPT(Student Name, Course Number, Grade, Semester, Year, Section Id)
   derived from STUDENT, SECTION, GRADE_REPORT
PREREQUISITES(Course Name, Course Number, Prerequisites)
   derived from PREREQUISITE, COURSE


Change GRADE_REPORT Schema Construct
GRADE_REPORT (Student Number, Student Name, Section Id, Course Number,
Grade)


Change Mapping (& View Definition)
TRANSCRIPT derived from SECTION, GRADE_REPORT
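
The UNIVERSITY example above can be sketched with Python's built-in sqlite3 module, using a view for the TRANSCRIPT external schema. The column types, the sample rows, and the way the data is migrated are assumptions for this example; the application query against TRANSCRIPT is identical before and after the change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual schema (only the record types TRANSCRIPT depends on)
CREATE TABLE student (name TEXT, student_number INTEGER PRIMARY KEY,
                      class INTEGER, major TEXT);
CREATE TABLE section (section_id INTEGER PRIMARY KEY, course_number TEXT,
                      semester TEXT, year INTEGER, instructor TEXT);
CREATE TABLE grade_report (student_number INTEGER, section_id INTEGER, grade TEXT);

-- Made-up sample data
INSERT INTO student VALUES ('Smith', 17, 1, 'CS');
INSERT INTO section VALUES (112, 'MATH2410', 'Fall', 1999, 'Chang');
INSERT INTO grade_report VALUES (17, 112, 'B');

-- External schema: TRANSCRIPT derived from STUDENT, SECTION, GRADE_REPORT
CREATE VIEW transcript AS
SELECT s.name AS student_name, sec.course_number, g.grade,
       sec.semester, sec.year, sec.section_id
FROM grade_report g
JOIN student s ON s.student_number = g.student_number
JOIN section sec ON sec.section_id = g.section_id;
""")

# The application program refers only to the external schema.
APP_QUERY = "SELECT course_number, grade FROM transcript WHERE student_name = ?"

def run_app(conn):
    return conn.execute(APP_QUERY, ("Smith",)).fetchall()

before = run_app(conn)

conn.executescript("""
DROP VIEW transcript;

-- Change the GRADE_REPORT schema construct
ALTER TABLE grade_report RENAME TO grade_report_old;
CREATE TABLE grade_report (student_number INTEGER, student_name TEXT,
                           section_id INTEGER, course_number TEXT, grade TEXT);
INSERT INTO grade_report
SELECT g.student_number, s.name, g.section_id, sec.course_number, g.grade
FROM grade_report_old g
JOIN student s ON s.student_number = g.student_number
JOIN section sec ON sec.section_id = g.section_id;
DROP TABLE grade_report_old;

-- Change the mapping: TRANSCRIPT derived from SECTION, GRADE_REPORT
CREATE VIEW transcript AS
SELECT g.student_name, g.course_number, g.grade,
       sec.semester, sec.year, sec.section_id
FROM grade_report g
JOIN section sec ON sec.section_id = g.section_id;
""")

after = run_app(conn)
print(before, after)   # [('MATH2410', 'B')] [('MATH2410', 'B')]
```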
2.3 Database Languages and Interfaces
      Provide appropriate languages and interfaces for each category of users.

 2.3.1 DBMS Languages
Data Definition Language (DDL): Used by the DBA and database designers to
specify the conceptual schema of a database. In many DBMSs, the DDL is also
used to define internal and external schemas (views). In some DBMSs, separate
storage definition language (SDL) and view definition language (VDL) are
used to define internal and external schemas.
             DDL Compiler

Data Manipulation Language (DML): Used to specify database retrievals and
updates (insertion, deletion, modification).

- DML commands (data sublanguage) can be embedded in a general-purpose
  programming language (host language).

- Alternatively, stand-alone DML commands can be applied directly (query
  language).
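
A small sketch of DDL and DML statements embedded in a host language, here Python with its built-in sqlite3 module and SQL as the data sublanguage. SQLite uses one language for all three schema levels, so the view and index statements only stand in for what a separate VDL or SDL would express; all table, view, and index names and the sample rows are assumptions for this example.

```python
import sqlite3

DDL = [
    # Conceptual schema
    "CREATE TABLE course (course_number TEXT PRIMARY KEY, course_name TEXT, "
    "credit INTEGER, dept TEXT)",
    # External schema (a view): the role a VDL would play
    "CREATE VIEW cs_course AS "
    "SELECT course_number, course_name FROM course WHERE dept = 'CS'",
    # Internal schema detail (an access path): the role an SDL would play
    "CREATE INDEX course_by_dept ON course (dept)",
]

DML = [
    ("INSERT INTO course VALUES (?, ?, ?, ?)",
     ("CS1310", "Intro to Computer Science", 4, "CS")),
    ("INSERT INTO course VALUES (?, ?, ?, ?)",
     ("MATH2410", "Discrete Mathematics", 3, "MATH")),
    ("UPDATE course SET credit = ? WHERE course_number = ?", (3, "CS1310")),
    ("DELETE FROM course WHERE dept = ?", ("MATH",)),
]

conn = sqlite3.connect(":memory:")
for statement in DDL:
    conn.execute(statement)          # definitions are recorded in the catalog
for statement, params in DML:
    conn.execute(statement, params)  # updates change the database state

# The catalog now describes the constructs defined through the DDL.
catalog = sorted(
    name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE name NOT LIKE 'sqlite_%'"
    )
)
rows = conn.execute("SELECT * FROM cs_course").fetchall()   # retrieval

print(catalog)   # ['course', 'course_by_dept', 'cs_course']
print(rows)      # [('CS1310', 'Intro to Computer Science')]
```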
Types of DML

-Procedural DML:
• Also called record-at-a-time (record-oriented) or low-level DML
• Must be embedded in a programming language.
• Searches for and retrieves individual database records and uses looping
  and other constructs of the host programming language to retrieve multiple
  records.

-Declarative or non-procedural DML:
• Also called set-at-a-time (set-oriented) or high-level DML.
• Can be used as a stand-alone query language or can be embedded in a
  programming language.
• Searches for and retrieves information from multiple related database
  records in a single command.

- host language: general-purpose language (e.g. C++)
- data sublanguage: DML
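
The contrast between the two styles can be sketched in Python with the built-in sqlite3 module. SQL itself is a declarative language, so the first function only imitates record-at-a-time access by fetching one record per call and doing the looping and matching in the host language. The tables, sample rows, and function names are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE grade_report (student_number INTEGER, section_id INTEGER, grade TEXT);
INSERT INTO student VALUES (17, 'Smith'), (8, 'Brown');
INSERT INTO grade_report VALUES (17, 112, 'B'), (8, 85, 'A'),
                                (8, 92, 'A'), (17, 119, 'C');
""")

def names_with_a_record_at_a_time(conn):
    """Procedural style: the host language loops over individual records."""
    names = set()
    cursor = conn.execute("SELECT student_number, grade FROM grade_report")
    while True:
        record = cursor.fetchone()          # retrieve one record
        if record is None:
            break
        number, grade = record
        if grade == "A":
            # Look up the related STUDENT record, again one at a time.
            student = conn.execute(
                "SELECT name FROM student WHERE student_number = ?", (number,)
            ).fetchone()
            names.add(student[0])
    return sorted(names)

def names_with_a_set_at_a_time(conn):
    """Declarative style: one command states what is wanted, not how to get it."""
    rows = conn.execute(
        "SELECT DISTINCT s.name FROM student s "
        "JOIN grade_report g ON g.student_number = s.student_number "
        "WHERE g.grade = 'A' ORDER BY s.name"
    )
    return [row[0] for row in rows]

print(names_with_a_record_at_a_time(conn))   # ['Brown']
print(names_with_a_set_at_a_time(conn))      # ['Brown']
```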

2.3.2 DBMS Interfaces
- Stand-alone query language interfaces. (casual end user)

- Programmer interfaces for embedding DML in programming
  languages: (programmer)
      -Pre-compiler Approach
      -Procedure (Subroutine) Call Approach

- User-friendly interfaces:
      -Menu-based Interfaces for Browsing.
      -Forms-based Interfaces.
      -Graphical User Interfaces.
      -Natural language Interfaces
      -Combination of the above

- Interfaces for Parametric Users (using function keys)

- Interfaces for the DBA:
       -Creating accounts, granting authorizations
       -Setting system parameters
       -Changing schemas or access paths
2.4 The Database System Environment
2.4.1 DBMS Component Modules




 Figure 2.3

2.4.2 Database System Utilities


To perform certain functions such as:

- Loading data stored in files into a database (includes conversion tools).
- Backing up the database periodically on backup storage.
- File reorganization: reorganizing database file structures.
- Report generation utilities.
- Performance monitoring utilities.
- Other functions, such as sorting, user monitoring,
  data compression, etc.
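
Two of the utilities listed above, loading and backup, can be sketched with Python's standard csv and sqlite3 modules. The CSV layout, the function names load_courses and back_up, and the sample rows are assumptions for this example; Connection.backup is the backup call provided by Python's sqlite3 module (Python 3.7 or later).

```python
import csv
import io
import sqlite3

# Stand-in for a data file to be loaded (made-up content).
CSV_TEXT = (
    "course_number,course_name,credit,dept\n"
    "CS1310,Intro to Computer Science,4,CS\n"
    "MATH2410,Discrete Mathematics,3,MATH\n"
)

def load_courses(conn, text_file):
    """Loading utility: convert file records and insert them into COURSE."""
    reader = csv.DictReader(text_file)
    rows = [
        (r["course_number"], r["course_name"], int(r["credit"]), r["dept"])
        for r in reader
    ]
    conn.executemany("INSERT INTO course VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)

def back_up(conn):
    """Backup utility: copy the whole database into a second database."""
    copy = sqlite3.connect(":memory:")
    conn.backup(copy)
    return copy

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE course (course_number TEXT PRIMARY KEY, course_name TEXT, "
    "credit INTEGER, dept TEXT)"
)
loaded = load_courses(db, io.StringIO(CSV_TEXT))
backup = back_up(db)
backup_count = backup.execute("SELECT COUNT(*) FROM course").fetchone()[0]

print(loaded, backup_count)   # 2 2
```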


2.4.3 Tools, Application Environments, and Communications Facilities

Data dictionary utility:
- Used to store schema descriptions and other information such as design
  decisions, application program descriptions, user information, usage
  standards, etc. (comment)
-Active data dictionary is accessed by DBMS software and users/DBA.
-Passive data dictionary is accessed by users/DBA only.


Communications Facilities
- Allow users at locations remote from the database system site to access
  the database.
  DB (DBMS)/DC (Data Communication System)


2.5 Classification of Database Management Systems
Based on the data model used:
•Data models
     -Traditional: Relational, Network (see Figure 2.4), Hierarchical
     -Emerging: Object-oriented, Semantic, Entity-Relationship, other.

Other classifications:
•Number of users : Single-user (typically used with personal computers) vs.
 multi-user (most DBMSs)

•Number of sites:
 Centralized (uses a single computer) vs. distributed (uses multiple computers).
 Homogeneous vs. Heterogeneous

•Types of access paths used. (inverted file structures, …)
•Purpose: general purpose vs. special purpose
 (e.g. airline reservations, telephone directory, on-line transaction
 processing systems)

Figure 2.4   A Network Schema



