Database Lecture Prelims


Published in: Education

  1. 1. Traditional File-Based System
     • A manual file is set up to hold all external and internal correspondence relating to a project, product, employee, or client; each file is labeled and stored.
     • For security, cabinets have locks or may be located in secure areas of the building.

     Limitations of the File-Based Approach
     1. Separation and isolation of data – data inconsistency
     2. Duplication of data – data redundancy
     3. Data dependence – limited data sharing
     4. Incompatible file formats – inflexibility
     5. Fixed queries / proliferation (production or creation) of application programs – inadequate data manipulation capabilities

     Database – a shared collection of logically related data, and a description of this data, designed to meet the information needs of an organization.
     Database Management System (DBMS) – a software system that enables users to define, create, maintain, and control access to the database.

     Characteristics of the Database Approach
     1. Self-describing nature of a database system
        The database contains not only the data itself but also a complete definition or description of the database structure and constraints. The definition is stored in the system catalog (also called meta-data).
     2. Insulation between programs and data, and data abstraction
        The structure of data files is stored in the DBMS catalog separately from the access programs (program-data independence). In a DBMS environment, adding another piece of data only requires changing the description of the data in the meta-data; no programs are changed.
        Data abstraction – the characteristic that allows program-data independence and program-operation independence.
     3. Support of multiple views of the data
        A database typically has many users, each of whom may require a different perspective or view of the database. A view may be a subset of the database.
     4. Sharing of data and multi-user transaction processing
        • The DBMS must allow multiple users to access the database at the same time.
        • Concurrency control – the DBMS must include software to ensure that several users trying to update the same data do so in a controlled manner, so that the result of the updates is correct.
        • On-Line Transaction Processing (OLTP) – the DBMS should ensure that each subset or record can be accessed by only one user at a time.
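As a rough illustration of characteristics 1 and 2 above (the self-describing catalog and program-data independence), the sketch below uses Python's built-in sqlite3 module. SQLite stands in for a DBMS here only as an assumption. The employee table, its columns, and the employee_names helper are hypothetical names invented for the example.

```python
import sqlite3

# In-memory SQLite database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee (emp_id, name) VALUES (1, 'Ana')")


def employee_names(connection):
    """A tiny 'application program' that only knows about the name column."""
    rows = connection.execute("SELECT name FROM employee ORDER BY emp_id")
    return [row[0] for row in rows]


# Self-describing: the table definition itself is stored in the catalog,
# which SQLite exposes as the sqlite_master table.
catalog = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

names_before = employee_names(conn)

# Adding another piece of data changes only the stored description;
# the application program above is left untouched.
conn.execute("ALTER TABLE employee ADD COLUMN phone TEXT")
names_after = employee_names(conn)

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]
```

The same employee_names function returns the same result before and after the structural change, which is the point of keeping the data description in the catalog rather than in the program.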
  2. 2. Intended Uses of a DBMS
     1. Controlling redundancy
     2. Restricting unauthorized access
     3. Persistent storage for program objects and data structures
     4. Database inferencing using deduction rules (permitting inferencing and actions using rules)
     5. Providing multiple user interfaces
     6. Representing complex relationships among data
     7. Enforcing integrity constraints
     8. Providing backup and recovery

     Disadvantages of a DBMS
     1. Complexity
     2. Size
     3. Cost of the DBMS
     4. Additional hardware costs
     5. Cost of conversion
     6. Performance
     7. Higher impact of a failure

     Actors on the Scene
     1. Database Administrator (DBA) – administers the database environment: the database itself, the DBMS, and related software. The DBA is responsible for authorizing access to the database, for coordinating and monitoring its use, and for acquiring software and hardware resources.
        • Accountable for problems such as breaches of security or poor system response time.
     2. Database Designer – responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store this data.
     3. End users – people whose jobs require access to the database for querying, updating, and generating reports.

     Categories of End Users
     1. Casual end users – occasionally access the database, but need different information each time.
        • Typically middle- or high-level managers or other occasional browsers.
     2. Naïve or parametric end users – make up a sizeable portion of database end users.
        • Their main job function revolves around constantly querying and updating the database.
        • Examples: bank tellers; reservation clerks for airlines, hotels, and car rentals.
     3. Stand-alone users – maintain personal databases by using ready-made program packages that provide easy-to-use or graphics-based interfaces.
  3. 3. System Analysts and Application Programmers (Software Engineers)
     1. System analysts – determine the requirements of end users and develop specifications for canned transactions (using standard types of queries and updates).
     2. Application programmers – implement these specifications as programs; then they test, debug, document, and maintain the canned transactions.

     Implications of the Database Approach
     1. Potential for enforcing standards
        • The database approach permits the DBA to define and enforce standards among database users in a large organization.
        • Standards can be defined for names and formats of data elements, display formats, report structures, and terminology.
     2. Reduced application development time
        • A prime selling feature of the database approach is that developing a new application, such as retrieval of certain data from the database for printing a new report, takes very little time.
     3. Flexibility
        • It may be necessary to change the structure of a database as requirements change. Modern DBMSs allow certain types of changes to the structure of the database without affecting the stored data and the existing application programs.
     4. Availability of up-to-date information
        • The DBMS makes the database available to all users. As soon as one user's update is applied to the database, all other users can immediately see this update.
     5. Economies of scale
        • The DBMS approach permits consolidation of data and applications, thus reducing the amount of wasteful overlap between activities of data-processing personnel in different projects or departments.

     1. External level – the user's view of the database. This level describes that part of the database that is relevant to each user.
     2. Conceptual level – the community view of the database. This level describes what data is stored in the database and the relationships among the data.
     3. Internal level – the physical representation of the database on the computer. This level describes how the data is stored in the database.
     4. Physical level – may be managed by the operating system under the direction of the DBMS.
        • The physical level below the DBMS consists of items only the operating system knows, such as exactly how the sequencing is implemented and whether the fields of internal records are stored as contiguous bytes on the disk.

     Database Schema – the overall description of the database.

     Database Languages (2 parts)
  4. 4. 1. Data Definition Language (DDL) – used to specify the database schema
        • A language that allows the DBA or user to describe and name the entities, attributes, and relationships required for the application, together with any associated integrity and security constraints.
        • The result of compiling the DDL statements is a set of tables stored in special files collectively called the system catalog.
        • Meta-data contains definitions of records and data items.
     2. Data Manipulation Language (DML) – used to both read and update the database
        • A language that provides a set of operations to support the basic data manipulation operations on the data held in the database.

     Data Manipulation Operations:
     1. Insertion of new data into the database
     2. Modification of data stored in the database
     3. Retrieval of data contained in the database
     4. Deletion of data from the database

     These languages are also called data sublanguages; they can be embedded in a high-level host language (e.g., COBOL, Fortran, Pascal, Ada, C, C++, Java, or VB).

     Data Model – a collection of concepts that can be used to describe a set of data, the operations to manipulate the data, and a set of integrity rules for the data.

     3 Broad Categories of Data Models
     1. Object-Based Data Models
        • Use concepts such as entities, attributes, and relationships.
        • Entity – a distinct object (a person, place, thing, concept, or event).
        • Attribute – a property that describes some aspect of the object.
        • Relationship – an association between entities.
        • Entity-Relationship model – one common type of object-based data model.
     2. Record-Based Data Models
        • The database consists of a number of fixed-format records, possibly of differing types.
        • Relational data model – data and relationships are represented as tables, each of which has a number of columns with unique names.
        • Network data model – the records are organized as generalized graph structures, with records appearing as nodes and sets as edges in the graph.
        • Hierarchical data model – data is represented as collections of records and relationships are represented by sets. However, it allows a node (record) to have only one parent.
          • Files are arranged in a top-down structure that resembles a tree or genealogy chart. It is a restricted type of network model.
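The split between DDL and the four data manipulation operations listed on this slide can be sketched with SQL issued through Python's built-in sqlite3 module. This is only a minimal illustration under assumptions: SQLite is used as a convenient stand-in, the branch table layout is hypothetical, and the B007 row is invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: describe and name the structure (hypothetical branch relation).
conn.execute(
    "CREATE TABLE branch ("
    " branch_no TEXT PRIMARY KEY,"
    " city TEXT NOT NULL)"
)

list_branches = "SELECT branch_no, city FROM branch ORDER BY branch_no"

# DML 1: insertion of new data.
conn.executemany(
    "INSERT INTO branch (branch_no, city) VALUES (?, ?)",
    [("B005", "London"), ("B007", "Aberdeen")],  # B007 is invented sample data
)

# DML 2: modification of stored data.
conn.execute(
    "UPDATE branch SET city = ? WHERE branch_no = ?", ("Glasgow", "B007")
)

# DML 3: retrieval of data.
after_update = conn.execute(list_branches).fetchall()

# DML 4: deletion of data.
conn.execute("DELETE FROM branch WHERE branch_no = ?", ("B005",))
after_delete = conn.execute(list_branches).fetchall()
```

Here the SQL statements play the role of the data sublanguage and Python plays the role of the host language.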
  5. 5. 3. Physical Data Models
        • Describe how data is stored in the computer, representing information such as record structures, record orderings, and access paths.

     Conceptual modeling – or conceptual database design; the process of constructing a model of the information used in an enterprise that is independent of implementation details such as the target DBMS, application programs, and programming languages.

     Functions of a DBMS
     1. Data storage, retrieval, and update
        A DBMS must furnish users with the ability to store, retrieve, and update data in the database.
     2. A user-accessible catalog
        A DBMS must furnish a catalog in which descriptions of data items are stored and which is accessible to users.
     3. Transaction support
        A DBMS must furnish a mechanism which will ensure either that all the updates corresponding to a given transaction are made or that none of them is made.
     4. Concurrency control services
        A DBMS must furnish a mechanism to ensure that the database is updated correctly when multiple users are updating the database concurrently.
     5. Recovery services
        A DBMS must furnish a mechanism for recovering the database in the event that the database is damaged in any way.
     6. Authorization services
        A DBMS must furnish a mechanism to ensure that only authorized users can access the database.
     7. Support for data communication
        A DBMS must be capable of integrating with communication software.
     8. Integrity services
        A DBMS must furnish a means to ensure that both the data in the database and changes to the data follow certain rules.
     9. Services to promote data independence
        A DBMS must include facilities to support the independence of programs from the actual structure of the database.
     10. Utility services
        A DBMS should provide a set of utility services.

     Components of DBMS
     1. Computer-aided software engineering (CASE) tools – automated tools used to design databases and application programs.
        • The term "computer-aided software engineering" (CASE) can refer to the software used for the automated development of systems software, i.e., computer code. The CASE functions include analysis, design, and programming. CASE tools automate methods for designing, documenting,
  6. 6. and producing structured computer code in the desired programming language.
     2. Repository – centralized storehouse for all data definitions, data relationships, screen and report formats, and other system components. A repository contains an extended set of metadata important for managing databases as well as other components of an information system.
        Metadata – data that describe the properties or characteristics of other data. Some of these properties include data definitions, data structures, and rules or constraints.
     3. Database Management System (DBMS) – commercial software (and occasionally hardware and firmware) system used to define, create, maintain, and provide controlled access to the database and also to the repository.
     4. Database – an organized collection of logically related data, usually designed to meet the information needs of multiple users in an organization.
     5. Application programs – computer programs that are used to create and maintain the database and provide information to users.
     6. User interface – languages, menus, and other facilities by which users interact with various system components, such as CASE tools, application programs, the DBMS, and the repository.
     7. Data administrators – persons who are responsible for the overall information resources of an organization. Data administrators use CASE tools to improve the productivity of database planning and design.
     8. System developers – persons such as systems analysts and programmers who design new application programs. System developers often use CASE tools for system requirements analysis and program design.
     9. End users – persons throughout the organization who add, delete, and modify data in the database and who request or receive information from it.

     Brief History of the Relational Model
     • The relational model was first proposed by E. F. Codd.
     • Although interest in the relational model came from several directions, the most significant research may be attributed to three projects:
       1. The prototype relational DBMS System R
          • Developed at IBM's San Jose Research Laboratory in California.
          • Designed to prove the practicality of the relational model.
       2. INGRES (Interactive Graphics Retrieval System)
          • A project at the University of California at Berkeley.
          • Involved the development of a prototype RDBMS.
       3. The Peterlee Relational Test Vehicle at the IBM UK Scientific Centre
          • Principally for research into such issues as query processing and optimization, and functional extension.

     Relational Data Structure and Basic Terminologies
     Relation
  7. 7. • A relation is a table with columns and rows.
     • Relations are used to hold information about the objects to be represented in the database.

     Relational Model – based on the mathematical concept of a relation, which is physically represented as a table.
     Attribute – a named column of a relation.
     Tuple – a row of a relation.
     Degree – the number of attributes a relation contains. Example: the Branch relation has degree four.
     Cardinality – the number of tuples a relation contains.

     Relational database
     • A collection of normalized relations with distinct relation names.
     • Consists of relations that are appropriately structured.

     Database Relations
     Relation scheme – a named relation defined by a set of attribute and domain name pairs.
     Domains (example) – Branch Number, Street Name, City Name, Post Code
     Example tuple – {(B005, 22 Deer Rd., London, SW14EH)}

     Properties of Relations
     • The relation has a name that is distinct from all other relation names.
     • Each cell of the relation contains exactly one atomic (single) value.
     • Each attribute has a distinct name.
     • The values of an attribute are all from the same domain.
     • Each tuple is distinct; there are no duplicate tuples.
     • The order of attributes has no significance.
     • The order of tuples has no significance.

     Candidate key – a superkey such that no proper subset is a superkey within the relation.
     A candidate key for a relation has two properties:
     1. Uniqueness – in each tuple of the relation, the values of the candidate key uniquely identify that tuple.
     2. Irreducibility – no proper subset of the candidate key has the uniqueness property.
     Primary key – the candidate key that is selected to identify tuples uniquely within the relation.
     Alternate keys – candidate keys that are not selected to be the primary key.
     Foreign key – an attribute or set of attributes within one relation that matches the candidate key of some (possibly the same) relation.

     Relational Integrity
     Null
     • Represents a value for an attribute that is currently unknown or is not applicable for this tuple.
     • Deals with incomplete or exceptional data.
  8. 8. • Null represents the absence of a value and is not the same as zero or spaces, which are values.

     2 principal rules for the relational model:
     1. Entity Integrity
        • In a base relation, no attribute of a primary key can be null.
     2. Referential Integrity
        • If a foreign key exists in a relation, either the foreign key value must match a candidate key value of some tuple in its home relation or the foreign key value must be wholly null.

     Enterprise Constraints
     • Additional rules specified by users or database administrators.
     • E.g., the maximum number of staff in a branch.

     Views
     • Base relation – a named relation, corresponding to an entity in the conceptual schema, whose tuples are physically stored in the database.
     • View – the dynamic result of one or more relational operations operating on the base relations to produce another relation.
       o A view is a virtual relation that does not actually exist in the database but is produced upon request, at the time of the request. The contents of a view are defined as a query on one or more base relations.
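The entity integrity rule, the referential integrity rule, and the definition of a view above can be tried out in a small sketch using Python's built-in sqlite3 module. Several things here are assumptions rather than part of the lecture: SQLite as the DBMS, the branch and staff table layouts, the sample rows, and the rejected helper. Note that SQLite only checks foreign keys after PRAGMA foreign_keys = ON, and the primary keys are declared NOT NULL explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Base relations (hypothetical layouts); primary keys may not be null.
conn.execute("CREATE TABLE branch (branch_no TEXT PRIMARY KEY NOT NULL, city TEXT)")
conn.execute(
    "CREATE TABLE staff ("
    " staff_no TEXT PRIMARY KEY NOT NULL,"
    " name TEXT,"
    " branch_no TEXT REFERENCES branch (branch_no))"
)
conn.execute("INSERT INTO branch VALUES ('B005', 'London')")
conn.execute("INSERT INTO staff VALUES ('S1', 'Ann', 'B005')")
conn.execute("INSERT INTO staff VALUES ('S2', 'Bob', NULL)")  # wholly null foreign key is allowed


def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Entity integrity: a null primary key value is refused.
null_key_rejected = rejected("INSERT INTO branch VALUES (NULL, 'Paris')")
# Referential integrity: a foreign key with no matching branch is refused.
dangling_fk_rejected = rejected("INSERT INTO staff VALUES ('S3', 'Cy', 'B999')")

# View: a virtual relation defined as a query on the base relations.
conn.execute(
    "CREATE VIEW london_staff AS"
    " SELECT s.staff_no, s.name FROM staff s"
    " JOIN branch b ON b.branch_no = s.branch_no"
    " WHERE b.city = 'London'"
)
query = "SELECT name FROM london_staff ORDER BY staff_no"
view_before = conn.execute(query).fetchall()
conn.execute("INSERT INTO staff VALUES ('S4', 'Dee', 'B005')")
view_after = conn.execute(query).fetchall()  # recomputed at the time of the request
```

The view stores no tuples of its own: the row added to the base relation staff shows up the next time the view is queried.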