CHAPTER 1: DATABASE ENVIRONMENT

Chapter Objectives
       At the end of the chapter, you should be able to:

        define the basic terminologies;

        know the differences between a file processing approach and a database approach;

        list the components of the database environment.


Essential Reading
       Modern Database Management (4th Edition), Fred R. McFadden & Jeffrey A. Hoffer (1994),
       Benjamin/Cummings. [Chapter 1, pages 5-30]



Useful Websites to learn Database and Programming:

http://erwinglobio.wix.com/ittraining

http://ittrainingsolutions.webs.com/

http://erwinglobio.sulit.com.ph/

http://erwinglobio.multiply.com/




Prof. Erwin M. Globio, MSIT                                                            1-1
DB212                                              CHAPTER 1: DATABASE ENVIRONMENT


1.1     Basic Terminologies




        [Figure: DATA -> PROCESS -> INFORMATION]




        Data are facts concerning things such as people, objects or events. For example, the items
        recorded on an invoice, such as the order number and customer particulars, are data.
        Information is data that have been processed and presented in a form suitable for human
        interpretation, often with the purpose of revealing trends or patterns.
        To convert data into information, we need to process the data. The process involves
        acquisition, storage, manipulation, retrieval and distribution.
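
        As a minimal sketch of this conversion, the Python snippet below treats a few invoice
        records as data and processes them into total sales per customer, which is information.
        The field names and values are made up for illustration only.

```python
from collections import defaultdict

# Raw data: hypothetical invoice records (order number, customer, amount).
invoices = [
    {"order_no": 1001, "customer": "Cruz", "amount": 250.0},
    {"order_no": 1002, "customer": "Reyes", "amount": 100.0},
    {"order_no": 1003, "customer": "Cruz", "amount": 150.0},
]


def total_sales_by_customer(records):
    """Process raw invoice data into information: total sales per customer."""
    totals = defaultdict(float)
    for record in records:
        totals[record["customer"]] += record["amount"]
    return dict(totals)


# Information: a summary presented in a form suitable for human interpretation.
sales_summary = total_sales_by_customer(invoices)
for customer, total in sorted(sales_summary.items()):
    print(f"{customer}: {total:.2f}")
```

        Here the list plays the role of storage, the function is the manipulation step, and the
        printed summary is the distribution step.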
        A database is a shared collection of logically related data, designed to meet the information
        needs of multiple users in an organization.




        [Figure: a DATABASE is made up of FILES; each file holds records (record 1, record 2,
        record 3, ...), and the records contain the DATA.]



        There are two generic database system designs: centralized and distributed.
        With a centralized database, all data are located at a single site.

           Advantage
               Provides greater control over accessing and updating data
           Disadvantage
               Vulnerable to failure
           Examples
               Personal computer databases, central computer databases, client/server databases






        A distributed database is a single logical database that is spread physically across computers
        in multiple locations.
        A homogeneous distributed database must satisfy the following conditions:
           The operating systems used at each location are the same or highly compatible.
           The data models used at each location are the same.
           The DBMSs used at each location are the same or highly compatible.
           The data at the various locations have common definitions and formats.
        In a heterogeneous distributed database, the locations may use different computers and
        operating systems, different data models, and different DBMSs.

1.2     File Processing Approach
        The file processing approach is a traditional approach to information system design that
        focuses on the data processing needs of individual departments in the organization.


        1.2.1 Disadvantages of File Processing Approach
           Uncontrolled Redundancy
               In a file processing system, each application has its own files, an approach that
                inevitably leads to a high level of data redundancy (that is, duplication of data). There
                are several disadvantages to recording the same data item in multiple files:
               Valuable storage space is wasted
               The same data may have to be input several times to update all occurrences of a data
                item
               Inconsistencies (or various versions) often result
           Inconsistent Data
               When the same data are stored in multiple locations, inconsistencies are inevitable.
                Inconsistencies in stored data are one of the most common sources of errors in
                computer applications. They lead to inconsistent documents and reports and
                undermine the confidence of users in the integrity of the information systems.
           Inflexibility
               A file processing system resembles a mass-production facility. It produces numerous
                documents and reports routinely and efficiently, provided that these outputs were
                anticipated in the original design of the system. Such systems, however, are often
                quite inflexible and cannot easily respond to requests for a new or redesigned
                product. This often leads to considerable frustration on the part of the users, who
                cannot understand why the computer system cannot give them information in a new
                format when they know it exists in the application files.






           Limited Data Sharing
                With the traditional applications approach, each application has its own
                 private files, and users have little opportunity to share data outside of their own
                 applications. The consequences of such limited data sharing may be:
                The same data may have to be entered several times to update files with duplicate
                 data.
                In developing new applications, the designer often cannot (or does not) exploit data
                 contained in existing files; instead, new files are designed that duplicate much of the
                 existing data.
           Poor Enforcement of Standards
            Unfortunately, data standards (i.e. data names, formats and access restrictions) are
            difficult to make known and enforce in a traditional file processing environment, mainly
            because the responsibility for system design and operation has been decentralized. Two
            types of inconsistencies may result from poor enforcement of standards: synonyms and
            homonyms. A synonym results when two different names are used for the same data item,
            for example, student number and matriculation number. A homonym is a single name
            that is used for two different data items. For example, in a bank the term balance might be
            used to designate a checking account balance in one department and a savings account
            balance in a different department.
           Excessive Program Maintenance
                In file processing systems, descriptions of files, records, and data items are embedded
                 within individual application programs. Therefore, any modification to the data (such
                 as a change of data name, data format, or method of access) requires that the program
                 (or programs) also be modified.
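
        The link between uncontrolled redundancy and inconsistent data can be sketched in a few
        lines of Python. The two dictionaries below stand in for the private files of a billing
        application and a shipping application; the customer, addresses and function names are
        hypothetical.

```python
# Two applications, each with its own private "file" (hypothetical data).
# The customer's address is recorded in both: uncontrolled redundancy.
billing_file = {"C001": {"name": "Ana Cruz", "address": "12 Mabini St"}}
shipping_file = {"C001": {"name": "Ana Cruz", "address": "12 Mabini St"}}


def update_billing_address(customer_id, new_address):
    """The billing program knows only about its own file."""
    billing_file[customer_id]["address"] = new_address


def find_inconsistent_customers():
    """List customers whose address differs between the two files."""
    return [
        customer_id
        for customer_id in billing_file
        if customer_id in shipping_file
        and billing_file[customer_id]["address"]
        != shipping_file[customer_id]["address"]
    ]


# The customer moves, but only the billing application is told.
update_billing_address("C001", "45 Rizal Ave")
inconsistent = find_inconsistent_customers()
print(inconsistent)
```

        Keeping the two copies in step would require entering the new address a second time in
        the shipping application, which is exactly the repeated input described above.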

1.3     Database Approach
        The database approach emphasizes the integration and sharing of data across the organization.

        1.3.1 Data-driven vs Process-driven Design
        In a file processing system, a process-driven approach has traditionally been used to design
        information systems.
        With the process-driven approach, organizational processes are first identified and analyzed.
        Processes and the data flows between them are described using tools such as data flow
        diagrams (DFDs). Designers then work backward from the required outputs to the processing
        needed to convert inputs into outputs. The design of data files is a by-product of process
        design.
        With the database approach, information systems professionals have discovered that a
        data-driven approach is often preferable. In the data-driven approach, the entities that the
        organization must manage are focused on first. Attributes and relationships of those entities
        are identified. After
        creating suitable models of the data structures and related business rules, designers develop
        the applications required to manage the data.
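
        The data-driven sequence can be sketched with Python's built-in sqlite3 module. This is
        only an illustration under assumed names: the CUSTOMER and ORDER entities, their
        attributes and the place_order function are hypothetical, and SQLite is used simply
        because it needs no installation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce relationships

# Step 1: entities and their attributes (hypothetical CUSTOMER and ORDER).
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        address     TEXT NOT NULL
    )
""")
# Step 2: the relationship "a customer places orders", as a foreign key.
conn.execute("""
    CREATE TABLE customer_order (
        order_no    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        amount      REAL NOT NULL CHECK (amount >= 0)
    )
""")


# Step 3: only now is an application written against the data model.
def place_order(order_no, customer_id, amount):
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO customer_order VALUES (?, ?, ?)",
            (order_no, customer_id, amount),
        )


with conn:
    conn.execute("INSERT INTO customer VALUES (1, 'Ana Cruz', '12 Mabini St')")
place_order(1001, 1, 250.0)

# The business rule lives in the data model: an order for an unknown
# customer is rejected by the database, not by application code.
try:
    place_order(1002, 99, 80.0)
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

        Because the rules are declared with the data, every application written later inherits
        them without repeating the checks.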






        The best approach is to strike a balance between data-driven design and process-driven design.
        [Figure: the two design sequences]
        Process-driven Design: Requirement Analysis -> Process Design -> Data Design -> Implementation
        Data-driven Design:    Requirement Analysis -> Database Design -> Process Design -> Implementation



        1.3.2 Benefits of the Database Approach
           Minimal Data Redundancy
            With the database approach, data files are integrated into a single, logical structure. We
            are not suggesting that all redundancy is eliminated; rather, redundancy is controlled. It
            is designed into the system to improve performance (or provide some other benefit), and
            the system is (or should be) aware of the redundancy.
           Consistency of Data
            By controlling data redundancy, we greatly reduce the opportunities for inconsistency.
            For example, if each address is stored only once, we cannot have disagreement on the
            stored values. When controlled redundancy is permitted in the database, the database
            system itself should enforce consistency by updating each occurrence of a data item when
            a change occurs.
           Integration of Data
            In a database, data are organized into a single, logical structure, with logical relationships
            defined between associated data entities. This makes it easy for users to relate one item of
            data to another.
           Sharing of Data
            Most database systems today permit multiple users to share a database concurrently,
            although restrictions are usually imposed so that each user is able to view only a
            subset of the conceptual database model.
           Ease of Application Development
            A major advantage of the database approach is that it greatly reduces the cost and time
            of developing new business applications, because the programmer is relieved of the burden
            of designing, building, and maintaining master files. In a database system, data are
            independent of the application programs that use them. Within limits, either data or the
            application programs that use the data can be changed without necessitating a change in
            the other factor.
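
        Two of these benefits, consistency of data and data independence, can be illustrated with
        a small sqlite3 sketch. The table names, column names and the invoice_address function are
        assumptions made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE invoice  (invoice_no INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Ana Cruz', '12 Mabini St');
    INSERT INTO invoice  VALUES (501, 1, 250.0);
""")


def invoice_address(invoice_no):
    """Application query: it names only the columns it needs."""
    row = conn.execute(
        "SELECT c.address FROM invoice AS i "
        "JOIN customer AS c ON c.customer_id = i.customer_id "
        "WHERE i.invoice_no = ?",
        (invoice_no,),
    ).fetchone()
    return row[0]


# Consistency: the address is stored once, so one update is seen everywhere.
conn.execute("UPDATE customer SET address = '45 Rizal Ave' WHERE customer_id = 1")
address_after_move = invoice_address(501)

# Data independence (within limits): adding a column to the data
# does not require any change to the existing program.
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")
address_after_schema_change = invoice_address(501)
```

        The invoice never stores its own copy of the address, so there is no second value that
        could disagree with the first.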







        1.3.3 Costs of the Database Approach
           New, Specialized Personnel
            Frequently, organizations that adopt the database approach or purchase a database
            management system (DBMS) need to hire or train individuals to maintain the new database
            software, develop and enforce new programming standards, design databases to achieve
            the highest possible performance, and manage the staff of new people.
           Need for Explicit Backup
            Minimal data redundancy, with all its associated benefits, may also fail to provide backup
            copies of data. Such backup or independently produced copies are helpful in restoring
            damaged files and in providing validity checks on crucial data. To ensure that data are
            accurate and available whenever needed, either database management software or
            additional procedures have to provide these essential capabilities.
           Interference with Shared Data
            The concurrent access to shared data via several application programs can lead to some
            problems. First, when two concurrent users both want to change the same or related data,
            inaccurate results can occur if access to the data is not properly synchronized. Second,
            when users obtain exclusive control of data for updating, different users can hold different
            segments of the database and block each other's use of the data (a so-called deadlock).
           Organizational Conflict
            A shared database requires a consensus on data definitions and ownership as well as
            responsibilities for accurate data maintenance. Experience has shown that conflicts on
            how to define data, data length and coding, rights to update shared data, and associated
            issues are frequent and difficult managerial issues to resolve.
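
        The first problem under Interference with Shared Data, an inaccurate result from
        unsynchronized updates, can be sketched as follows. The Account class and the amounts are
        hypothetical, and the unsynchronized case is written out step by step so that the
        interleaving is explicit rather than left to chance. Deadlock is not shown here.

```python
import threading


class Account:
    """A hypothetical shared data item with a lock for synchronized updates."""

    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()

    def deposit_synchronized(self, amount):
        # Read, compute and write happen as one unit while the lock is held.
        with self.lock:
            current = self.balance
            self.balance = current + amount


# Unsynchronized access: both users read before either one writes.
shared = Account(100)
read_by_user_a = shared.balance        # user A reads 100
read_by_user_b = shared.balance        # user B reads 100 before A writes
shared.balance = read_by_user_a + 50   # A writes 150
shared.balance = read_by_user_b + 30   # B writes 130, overwriting A's update
lost_update_balance = shared.balance

# Synchronized access: two threads update the same item through the lock.
safe = Account(100)
threads = [
    threading.Thread(target=safe.deposit_synchronized, args=(50,)),
    threading.Thread(target=safe.deposit_synchronized, args=(30,)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
synchronized_balance = safe.balance
```

        In the first case one deposit is silently lost; in the second the lock forces the two
        updates to run one after the other, in either order, so both are kept.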
1.4     Components of the Database Environment

        [Figure: components of the database environment. Top row: data administrators, system
        developers, end users. Middle row: user interface, application programs. Bottom row:
        repository, DBMS, database.]


            Repository
             Centralized knowledge base containing all data definitions, screen and report formats,
             and definitions of other organizational and system components.








            Database management system (DBMS)
             Commercial software system used to create, maintain and provide controlled access to the
             database and repository.
            Database
              A shared collection of logically related data, designed to meet the information needs of
             multiple users in an organization.
            Application programs
              Computer programs that are used to create and maintain the database and provide
             information to users.
            User interface
             Languages, menus and other facilities by which users interact with various system
             components.
            Data administrators
             Persons who are responsible for the overall information resources of an organization.
            System developers
             Persons such as system analysts and programmers who design new application programs.
            End users
             Persons throughout the organization who add, delete and modify data in the database and
             who request or receive information from it.
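
        As a small-scale analogue of the repository, most DBMSs keep the data definitions
        themselves in a catalog that programs can query. The sketch below uses SQLite's
        sqlite_master table and the table_info pragma; the student table is hypothetical, and a
        full repository would hold far more than this, for example screen and report formats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_no INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# The DBMS records the definition itself; any program can look it up.
definitions = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

# Column-level metadata: each row is (cid, name, type, notnull, dflt_value, pk).
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]

for table_name, create_statement in definitions:
    print(table_name, "->", create_statement)
print(columns)
```

        The definition is held once by the DBMS rather than being embedded in each application
        program.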

1.5     Review Questions
        1.        Discuss the characteristics of a traditional file processing system. Why is the
                  system criticized?
        2.        How did the database approach eliminate the problems of the file processing system?
        3.        Describe the components of a database system, with the aid of a diagram.
        4.        Explain why data redundancy is so common in traditional application systems.
        5.        Where are data definitions maintained in each of the following environments?
             a.   Traditional file processing system
             b.   Database system




