By: Engineer Muhammad
     Suleman Memon
M.E(Information Technology)
   B.E(Computer System)
A  database is a simple, yet flexible and
powerful tool for storing and retrieving data.
Every company, every website, has lots of data.
The more of your data that you keep in your
database - the better.
Far from being a tool only useful to big
businesses, even if you just want a simple guest
book or page hit counter, a database is perfect.
Whichever database you use - it'll be a
relational database.
   This is the industry standard design these
    days.
   Relational databases use the principles of
    set theory.
   Set theory is a field of mathematics that
    describes how to deal with sets of data.
   Relational databases are quite intuitive and
    easy to understand.
   All data is held in tables.
   A table has columns (along the top) and rows.
    You create the tables you need. You define
    the table names.
   You define what the column names are in
    each table.
   You define what type of data the columns
    are...
   There are a number of different data types
    available which represent the different types
    of data you find in real life.
   There are analogous types in all databases
    and programming languages. Each has
    variations, but they're all fundamentally the
    same.
They are:
•   Numerical Types, i.e. numbers. There are
    fundamentally two types: integer and float.
    Integers are whole numbers (e.g. 1, 2, 100,
    999999). Floats are numbers with decimal
    places (e.g. 1.1, 22.5, 3.1415927).
•   String Types, i.e. text. There are two kinds
    here: fixed length and variable length. 'char'
    is the fixed length character type in MySQL,
    holding up to 255 characters.
•   'varchar' is a variable length field. Older
    MySQL versions limit it to 255 characters;
    from MySQL 5.0.3 it can hold up to 65,535.
•   There are several 'text' types of varying
    lengths in MySQL.
   Date and Time Types For storing dates &
    times.
   Binary Data This is arbitrary data, could be
    images, programs absolutely anything.
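As a rough illustration of the type families above, the sketch below creates one table with a column of each kind. It uses Python's built-in sqlite3 module rather than MySQL, and the `guestbook` table and its column names are assumptions made up for this example. SQLite is also more relaxed about column types than MySQL, so treat this as a sketch only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    CREATE TABLE guestbook (
        entry_id   INTEGER PRIMARY KEY,   -- numerical: integer
        rating     REAL,                  -- numerical: float
        author     VARCHAR(50),           -- string: variable length
        posted_at  TEXT,                  -- date and time, stored as ISO-8601 text
        avatar     BLOB                   -- binary data
    )
""")
conn.execute(
    "INSERT INTO guestbook (rating, author, posted_at, avatar) VALUES (?, ?, ?, ?)",
    (4.5, "Ann", "2012-01-31 09:30:00", b"\x89PNG"),
)
row = conn.execute(
    "SELECT entry_id, rating, author, posted_at, avatar FROM guestbook"
).fetchone()
# Show which Python type each column value comes back as.
type_names = [type(value).__name__ for value in row]
print(type_names)  # ['int', 'float', 'str', 'str', 'bytes']
```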
   All Relational Databases use indexes.
    Similar to the index in a book, indexes provide
    a quick way to find the exact data item you
    want.
   Imagine you have a database of 100,000
    customers, and you want to find just one.
   If you just read the 'customers' table from
    start to finish until you find the one you're
    searching for, you could end up having to
    read all 100,000 records.
   This would be very slow.
   Most relational databases use a b-tree
    index structure.
   This is a clever structure that lets the
    database find a data item by reading only a
    handful of index nodes (typically 3 or 4, even
    for very large tables) instead of every row.
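To get a feel for how few reads a b-tree lookup needs, the sketch below estimates the number of index levels for a given row count. The fanout of 100 entries per index node is an assumption chosen for illustration; real values depend on the database and the key size.

```python
def index_levels(row_count: int, fanout: int) -> int:
    """Return how many index levels are needed to address row_count entries."""
    levels = 0
    capacity = 1
    # Each extra level multiplies the number of reachable entries by the fanout.
    while capacity < row_count:
        capacity *= fanout
        levels += 1
    return levels

for rows in (100_000, 1_000_000, 1_000_000_000):
    print(rows, index_levels(rows, fanout=100))
```

With these assumed numbers, 100,000 rows need 3 levels and a billion rows need 5, so the cost grows very slowly as the table grows.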
   Databases commonly have millions of rows,
    so you can see the necessity for indexes!
   Indexes are a large part of databases and
    their design.
   Defining a column as the primary key
    implicitly creates an index.
   If you have a primary key on a table - it has
    an index.
   You can add a number of indexes to each
    table you have.
   You'd use the create index command -
    more later...
   Indexes are used automatically by the
    database itself when you issue a query (ask
    for data).
   It uses the index to find the data in the
    table .
    For example, we want to get a customer's
   details from the example 'customers' table
    above...
   If we submit the following SQL query, the
    database will use the index it created for
    primary key column 'customer_id', and get
    everything for customer 1:
      select * from customers where customer_id = 1;
   The database uses the index because it can
    use it.
   The query contains the 'customer_id' so it
   can look in the index and find the location
    of customer '1'.
   If there's no index on the column in the
    query, the database will have to go through
    the whole table! This is called a full table
    scan .
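The difference between an indexed lookup and a full table scan can be observed directly. The sketch below is an illustration using Python's built-in sqlite3 module (not MySQL); the `city` column and the index name `idx_customers_city` are assumptions added for the example. It asks SQLite how it plans to run each query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (customer_id, name, city) VALUES (?, ?, ?)",
    [(i, f"customer {i}", f"city {i % 50}") for i in range(1, 1001)],
)

def plan(sql: str) -> str:
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the description

by_key = plan("SELECT * FROM customers WHERE customer_id = 1")
no_index = plan("SELECT * FROM customers WHERE city = 'city 7'")

# Add an index on the column used in the query, then ask again.
conn.execute("CREATE INDEX idx_customers_city ON customers (city)")
with_index = plan("SELECT * FROM customers WHERE city = 'city 7'")

print(by_key)      # a SEARCH using the primary key
print(no_index)    # a SCAN of the whole table
print(with_index)  # a SEARCH using idx_customers_city
```

The exact wording of the plan text varies between SQLite versions, so only the SEARCH versus SCAN distinction should be relied on.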
 These days, when you talk about databases
  in the wild, you are primarily talking about
  two types: analytical databases and
  operational databases.
Analytic Databases
 Analytic databases (a.k.a. OLAP, On-Line
  Analytical Processing) are primarily static,
  read-only databases which store archived,
  historical data used for analysis.
   For example, a company might store sales
    records over the last ten years in an analytic
    database and use that database to analyze
    marketing strategies in relationship to
    demographics.
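An analytic query of this kind usually reads many historical rows and summarises them. The sketch below is a minimal illustration with Python's sqlite3 module; the `sales` table, its columns and the figures in it are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_year INTEGER, age_group TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (2010, "18-30", 120.0),
        (2010, "31-50", 300.0),
        (2011, "18-30", 180.0),
        (2011, "31-50", 250.0),
        (2011, "18-30", 20.0),
    ],
)
# Read-only analysis: total sales per year and demographic group.
totals = conn.execute(
    "SELECT sale_year, age_group, SUM(amount) FROM sales "
    "GROUP BY sale_year, age_group ORDER BY sale_year, age_group"
).fetchall()
for row in totals:
    print(row)
```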
   On the web, you will often see analytic
    databases in the form of inventory catalogs
    such as the one shown previously from
    Amazon.com.
   An inventory catalog analytical database
    usually holds descriptive information about all
    available products in the inventory.
   Web pages are generated dynamically by
    querying the list of available products in the
    inventory against some search parameters.
   The dynamically-generated page will
    display the information about each item
    (such as title, author, ISBN) which is stored
    in the database.
   Operational databases (a.k.a. OLTP, On-Line
    Transaction Processing), on the other hand,
    are used to manage more dynamic bits of
    data.
   These types of databases allow you to do
    more than simply view archived data.
   Operational databases allow you to modify
    that data (add, change or delete data).
   These types of databases are usually used
    to track real-time information.
   For example, a company might have an
    operational database used to track
    warehouse/stock quantities.
   As customers order products from an online
    web store, an operational database can be
    used to keep track of how many items have
    been sold and when the company will need
    to reorder stock.
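The stock-tracking case can be sketched as a small transactional update. This is only an illustration using Python's sqlite3 module; the `stock` table, the `reorder_level` column and the `record_sale` helper are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (product_code TEXT PRIMARY KEY, quantity INTEGER, reorder_level INTEGER)"
)
conn.execute("INSERT INTO stock VALUES ('WIDGET', 12, 10)")

def record_sale(product_code: str, units: int) -> bool:
    """Reduce the stock level in one transaction; return True if a reorder is due."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE stock SET quantity = quantity - ? WHERE product_code = ?",
            (units, product_code),
        )
    quantity, reorder_level = conn.execute(
        "SELECT quantity, reorder_level FROM stock WHERE product_code = ?",
        (product_code,),
    ).fetchone()
    return quantity <= reorder_level

print(record_sale("WIDGET", 1))  # False: 11 left, above the reorder level
print(record_sale("WIDGET", 3))  # True: 8 left, time to reorder
```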
 Besides differentiating databases according
  to function, databases can also be
  differentiated according to how they model
  the data.
What is a data model?
 Well, essentially a data model is a
  "description" of both a container for data
  and a methodology for storing and
  retrieving data from that container.
 Actually, there isn't really a data model
  "thing".
   Data models are abstractions, oftentimes
    mathematical algorithms and concepts.
   You cannot really touch a data model.
   But nevertheless, they are very useful.
   The analysis and design of data models has
    been the cornerstone of the evolution of
    databases.
   As models have advanced so has database
    efficiency.
   Before the 1980's, the two most commonly
    used Database Models were the hierarchical
    and network systems.
   As its name implies, the Hierarchical Database
    Model defines hierarchically-arranged data.
   Perhaps the most intuitive way to visualize this
    type of relationship is by visualizing an upside
    down tree of data.
   In this tree, a single table acts as the "root" of
    the database from which other tables "branch"
    out.
   You will be instantly familiar with this
    relationship because that is how all windows-
    based directory management systems (like
    Windows Explorer) work these days.
   Relationships in such a system are thought
    of in terms of children and parents such
    that a child may only have one parent but a
    parent can have multiple children.
   Parents and children are tied together by
    links called "pointers" (perhaps physical
    addresses inside the file system).
   A parent will have a list of pointers to each
    of their children.
   This child/parent rule assures that data is
    systematically accessible.
   To get to a low-level table, you start at the root
    and work your way down through the tree until
    you reach your target.
   Of course, as you might imagine, one problem
    with this system is that the user must know
    how the tree is structured in order to find
    anything!
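The root-to-target navigation described above can be sketched in a few lines. The tree contents and the `path_to` helper below are hypothetical; each parent holds references to its children, playing the role of the pointers.

```python
from typing import Optional

# Each parent holds "pointers" (here, list entries) to its children.
tree = {
    "name": "university",
    "children": [
        {"name": "engineering", "children": [
            {"name": "students", "children": []},
            {"name": "courses", "children": []},
        ]},
        {"name": "arts", "children": []},
    ],
}

def path_to(node: dict, target: str) -> Optional[list]:
    """Walk down from node and return the list of names leading to target."""
    if node["name"] == target:
        return [node["name"]]
    for child in node["children"]:
        sub_path = path_to(child, target)
        if sub_path is not None:
            return [node["name"]] + sub_path
    return None  # target is not below this node

print(path_to(tree, "courses"))  # ['university', 'engineering', 'courses']
```

Note that the search always starts at the root, which is exactly the limitation mentioned above: nothing can be reached without going through the tree structure.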
   The hierarchical model however, is much more
    efficient than the flat-file model we discussed
    earlier because there is not as much need for
    redundant data.
   If a change in the data is necessary, the
    change might only need to be processed
    once. Consider the student flatfile database
    example from our discussion of what
    databases are:
Examples of hierarchical data represented
    as relational tables
   An organization could store employee
    information in a table that contains
    attributes/columns such as employee
    number, first name, last name, and
    Department number.
   The organization provides each employee
    with computer hardware as needed, but
    computer equipment may only be used by
    the employee to which it is assigned.
   The organization could store the computer
    hardware information in a separate table
    that includes each part's serial number,
    type, and the employee that uses it.
   In many ways, the Network Database model
    was designed to solve some of the more
    serious problems with the Hierarchical
    Database Model.
   Specifically, the Network model solves the
    problem of data redundancy by representing
    relationships in terms of sets rather than
    hierarchy.
   The model had its origins in the Conference on
    Data Systems Languages (CODASYL) which had
    created the Data Base Task Group to explore
    and design a method to replace the hierarchical
    model.
   The network model is very similar to the
    hierarchical model actually.
   In fact, the hierarchical model is a subset of
    the network model.
   However, instead of using a single-parent
    tree hierarchy, the network model uses set
    theory to provide a tree-like hierarchy with
    the exception that child tables were allowed
    to have more than one parent.
   This allowed the network model to support
    many-to-many relationships.
   Visually, a Network Database looks like a
    hierarchical Database in that you can see it
    as a type of tree.
   However, in the case of a Network
    Database, the look is more like several trees
    which share branches.
   Thus, children can have multiple parents
    and parents can have multiple children.
   RDBMS (relational database management
    system): a database based on the relational
    model developed by E.F. Codd.
   A relational database allows the definition
    of data structures, storage and retrieval
    operations and integrity constraints.
   In such a database the data and relations
    between them are organised in tables. A
    table is a collection of records and each
    record in a table contains the same fields.
Properties of Relational Tables:
 Values Are Atomic
 Each Row is Unique
 Column Values Are of the Same Kind
 The Sequence of Columns is Insignificant
 The Sequence of Rows is Insignificant
 Each Column Has a Unique Name
   Certain fields may be designated as keys,
    which means that searches for specific
    values of that field will use indexing to
    speed them up.
   Where fields in two different tables take
    values from the same set, a join operation
    can be performed to select related records
    in the two tables by matching values in
    those fields.
   Often, but not always, the fields will have
    the same name in both tables.
   For example, an "orders" table might
    contain (customer-ID, product-code) pairs
    and a "products" table might contain
    (product-code, price) pairs so to calculate a
    given customer's bill you would sum the
    prices of all products ordered by that
    customer by joining on the product-code
    fields of the two tables.
   This can be extended to joining multiple
    tables on multiple fields.
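The customer bill example can be written as a single join. The sketch below uses Python's sqlite3 module; the table contents are invented, and the column names use underscores (`customer_id`, `product_code`) as an assumption, since hyphens are awkward in SQL identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, product_code TEXT)")
conn.execute("CREATE TABLE products (product_code TEXT PRIMARY KEY, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)", [("P1", 2.5), ("P2", 10.0), ("P3", 4.0)]
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, "P1"), (1, "P2"), (2, "P3"), (1, "P1")]
)

# Match each order row to its product row on product_code, then sum the prices.
bill = conn.execute(
    """
    SELECT SUM(products.price)
    FROM orders
    JOIN products ON products.product_code = orders.product_code
    WHERE orders.customer_id = ?
    """,
    (1,),
).fetchone()[0]
print(bill)  # 15.0
```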
   Because these relationships are only
    specified at retrieval time, relational
    databases are classed as dynamic database
    management systems.
   The RELATIONAL database model is based
    on the Relational Algebra.
   Object/relational database management
    systems (ORDBMSs) add new object storage
    capabilities to the relational systems at the
    core of modern information systems.
   These new facilities integrate management
    of traditional fielded data, complex objects
    such as time-series and geospatial data and
    diverse binary media such as audio, video,
    images, and applets.
   By encapsulating methods with data
    structures, an ORDBMS server can execute
    complex analytical and data manipulation
    operations to search and transform
    multimedia and other complex objects.
   As an evolutionary technology, the
    object/relational (OR) approach has
    inherited the robust transaction- and
    performance-management features of its
    relational ancestor and the flexibility of its
    object-oriented cousin.
   Database designers can work with familiar
    tabular structures and data definition
    languages (DDLs) while assimilating new
    object-management possibilities.
   Query and procedural languages and call
    interfaces in ORDBMSs are familiar: SQL3,
    vendor procedural languages, and ODBC,
    JDBC, and proprietary call interfaces are all
    extensions of RDBMS languages and
    interfaces.
   And the leading vendors are, of course,
    quite well known: IBM, Informix, and
    Oracle.
   Object DBMSs add database functionality to
    object programming languages.
   They bring much more than persistent
    storage of programming language objects.
   Object DBMSs extend the semantics of the
    C++, Smalltalk and Java object
    programming languages to provide full-
    featured database programming capability,
    while retaining native language
    compatibility.
   A major benefit of this approach is the
    unification of the application and database
    development into a seamless data model
    and language environment.
   As a result, applications require less code,
    use more natural data modeling, and code
    bases are easier to maintain.
   Object developers can write complete
    database applications with a modest
    amount of additional effort.
   According to Rao (1994), "The object-
    oriented database (OODB) paradigm is the
    combination of object-oriented
    programming language (OOPL) systems and
    persistent systems.
   The power of the OODB comes from the
    seamless treatment of both persistent data,
    as found in databases, and transient data,
    as found in executing programs."
   In contrast to a relational DBMS where a
    complex data structure must be flattened
    out to fit into tables or joined together from
    those tables to form the in-memory
    structure, object DBMSs have no
    performance overhead to store or retrieve a
    web or hierarchy of interrelated objects.
   This one-to-one mapping of object
    programming language objects to database
    objects has two benefits over other storage
    approaches:
   It provides higher performance management of
    objects, and it enables better management of
    the complex interrelationships between
    objects.
   This makes object DBMSs better suited to
    support applications such as financial portfolio
    risk analysis systems, telecommunications
    service applications, world wide web document
    structures, design and manufacturing systems,
    and hospital patient record systems, which
    have complex relationships between data.
   In the semistructured data model, the information
    that is normally associated with a schema is
    contained within the data, which is sometimes
    called ``self-describing''.
   In such a database there is no clear separation
    between the data and the schema, and the
    degree to which it is structured depends on the
    application.
   In some forms of semistructured data there is
    no separate schema, in others it exists but only
    places loose constraints on the data.
   Semi-structured data is naturally modelled in
    terms of graphs which contain labels which
    give semantics to its underlying structure.
   Such databases subsume the modelling
    power of recent extensions of flat relational
    databases, to nested databases which allow
    the nesting (or encapsulation) of entities, and
    to object databases which, in addition, allow
    cyclic references between objects.
   The associative model divides the real-world
    things about which data is to be recorded into
    two sorts:
   Entities are things that have discrete,
    independent existence.
   An entity’s existence does not depend on any
    other thing.
   Associations are things whose existence
    depends on one or more other things, such
    that if any of those things ceases to exist, then
    the thing itself ceases to exist or becomes
    meaningless.
An associative database comprises two data
     structures:
1.    A set of items, each of which has a unique
      identifier, a name and a type.
2.    A set of links, each of which has a unique
      identifier, together with the unique identifiers
      of three other things, that represent the
      source, verb and target of a fact that is
      recorded about the source in the database.
      Each of the three things identified by the
      source, verb and target may be either a link or
      an item.
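The two structures above can be sketched with plain Python data classes. The names `Item`, `Link` and `describe`, and the sample facts, are assumptions for illustration and do not come from any particular associative database product.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    id: int
    name: str
    type: str

@dataclass(frozen=True)
class Link:
    id: int
    source: int  # identifier of an item or of another link
    verb: int
    target: int

items = {
    1: Item(1, "Mary", "person"),
    2: Item(2, "works for", "verb"),
    3: Item(3, "Acme", "company"),
    4: Item(4, "since", "verb"),
    5: Item(5, "2010", "year"),
}
links = {
    10: Link(10, source=1, verb=2, target=3),   # Mary works for Acme
    11: Link(11, source=10, verb=4, target=5),  # (that fact) since 2010
}

def describe(identifier: int) -> str:
    """Render an item by name, or a link as '(source verb target)' recursively."""
    if identifier in items:
        return items[identifier].name
    link = links[identifier]
    return f"({describe(link.source)} {describe(link.verb)} {describe(link.target)})"

print(describe(11))  # ((Mary works for Acme) since 2010)
```

Link 11 uses link 10 as its source, which shows how a link can refer to another link as well as to an item.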
   The best way to understand the rationale of
    EAV (entity-attribute-value) design is to
    understand row modeling (of which EAV is a
    generalized form).
   Consider a supermarket database that must
    manage thousands of products and brands,
    many of which have a transitory existence.
   Here, it is intuitively obvious that product
    names should not be hard-coded as names of
    columns in tables. Instead, one stores product
    descriptions in a Products table:
    purchases/sales of individual items are
    recorded in other tables as separate rows with
    a product ID referencing this table.
   Conceptually an EAV design involves a
    single table with three columns, an entity
    (such as an olfactory receptor ID), an
    attribute (such as species, which is actually
    a pointer into the metadata table) and a
    value for the attribute (e.g., rat). In EAV
    design, one row stores a single fact.
   In a conventional table that has one column
    per attribute, by contrast, one row stores a
    set of facts. EAV design is appropriate when
    the number of parameters that potentially
    apply to an entity is vastly more than those
    that actually apply to an individual entity.
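A minimal sketch of the three-column idea, using Python's sqlite3 module. The table name `eav`, the entity identifiers and the `tissue` attribute are assumptions for illustration; for simplicity the attribute is stored as plain text rather than as a pointer into a metadata table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eav (entity TEXT, attribute TEXT, value TEXT)")
# One row stores a single fact about one entity.
conn.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [
        ("receptor-1", "species", "rat"),
        ("receptor-1", "tissue", "olfactory epithelium"),
        ("receptor-2", "species", "mouse"),
    ],
)

def facts_about(entity: str) -> dict:
    """Collect the one-fact-per-row entries for an entity into a single dictionary."""
    rows = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ? ORDER BY attribute",
        (entity,),
    )
    return dict(rows.fetchall())

print(facts_about("receptor-1"))  # {'species': 'rat', 'tissue': 'olfactory epithelium'}
print(facts_about("receptor-2"))  # {'species': 'mouse'}
```

Attributes that do not apply to an entity simply have no row, which is why the design suits sparse data.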
   The context data model combines features of
    all the above models.
   It can be considered as a collection of object-
    oriented, network and semistructured models
    or as some kind of object database.
   In other words this is a flexible model, you can
    use any type of database structure depending
    on task. Such data model has been
    implemented in DBMS ConteXt.
   The fundamental unit of information storage of
    ConteXt is a CLASS.
   Class contains METHODS and describes
    OBJECT.
   The Object contains FIELDS and PROPERTY. The
    field may be composite, in this case the field
    contains SubFields etc.
   The property is a set of fields that belongs to
    particular Object. (similar to AVL database). In
    other words, fields are permanent part of
    Object but Property is its variable part.
   The header of Class contains the definition of
    the internal structure of the Object, which
    includes the description of each field, such as
    their type, length, attributes and name.
   Context data model has a set of predefined
    types as well as user defined types.
   The predefined types include not only
    character strings, texts and digits but also
    pointers (references) and aggregate types
    (structures).
   A context model comprises three main data
    types: REGULAR, VIRTUAL and REFERENCE.
   Database design is the process of
    producing a detailed data model of a
    database.
   This logical data model contains all the
    needed logical and physical design choices
    and physical storage parameters needed to
    generate a design in a Data Definition
    Language, which can then be used to create
    a database.
   A fully attributed data model contains
    detailed attributes for each entity.
   The term database design can be used to
    describe many different parts of the design of
    an overall database system.
   Principally, and most correctly, it can be
    thought of as the logical design of the base
    data structures used to store the data.
   In the relational model these are the tables
    and views.
Conceptual schema:
 A conceptual schema or conceptual data model
  is a map of concepts and their relationships.
 This describes the semantics of an
  organization and represents a series of
  assertions about its nature.
 Specifically, it describes the things of
  significance to an organization (entity classes),
  about which it is inclined to collect information,
  and characteristics of (attributes) and
  associations between pairs of those things of
  significance (relationships).
   Because a conceptual schema represents
    the semantics of an organization, and not a
    database design, it may exist on various
    levels of abstraction.
   Conceptual data models take a more
    abstract perspective, identifying the
    fundamental things, of which the things an
    individual deals with are just examples.
   The model does allow for what is called
    inheritance in object oriented terms.
   A data structure diagram (DSD) is a data
    model or diagram used to describe
    conceptual data models by providing
    graphical notations which document entities
    and their relationships, and the constraints
    that bind them.
   Once the relationships and dependencies
    amongst the various pieces of information have
    been determined, it is possible to arrange the
    data into a logical structure which can then be
    mapped into the storage objects supported by
    the database management system.
   Normalisation procedures and the definition
    of integrity rules ensure that the stored
    database will be non-redundant and properly
    connected.
   Logical data structuring is based on the
    identification of: the entities, their attributes,
    and the relationships between the entities.
Entity:
 Something about which an enterprise needs
  to keep data.
Attributes:
 The properties of an entity.
Relationships
 The connections between entities.
 An Entity may be physical
Example:
           an Employee; a Part; a Machine
 Or conceptual
Example:
           a Project; an Order; a Course.
 Each instance of an entity is different from
  all others - one or more attributes will
  typically form a 'primary key' attribute -
  unique to a particular instance.
   Attributes are the properties of an entity .
   Data which describes or is 'owned' by an
    entity.
    Attributes (data) equate to facts - specific
    details about entities - details of interest.
   In the real world, objects do not exist in
    isolation.
   Our understanding of real world objects is in
    terms of their relationships with other objects;
    for example, 'the earth circles the sun'; 'he is a
    carpenter' ; etc.
   Any real world object which we are going to
    include in a data model as an entity type must
    have some relationship with at least one other
    entity within the model (even if we are not
    going to implement that relationship within our
    database system).
One-to-one:
 Both tables can have only one record on
  either side of the relationship.
 Each primary key value relates to only one
  (or no) record in the related table.
 Most one-to-one relationships are forced
  by business rules and don't flow naturally
  from the data.
 In the absence of such a rule, you can
  usually combine both tables into one table
  without breaking any normalization rules.
One-to-One Relationships
 Contd:
 For example: a Factory may have many Managers
  during its lifetime; a Manager might be in
  charge of different Factories during his career.
One-to-many:
 The primary key table contains only one
  record that relates to none, one, or many
  records in the related table.
 This relationship is similar to the one
  between you and a parent.
 You have only one mother, but your mother
  may have several children.
One-to-many                             Contd:
A formal description of the relationship shown
  in the diagram above is:
 One Factory may make zero or more
  Components.
 One Component is made in one (and only
  one) Factory.
One-to-many:                            Contd:
What this means in a database system is that:
 one record in a table called Factory may be
  related to a number of records in a
  Component table;
but
 a record in the Component table can only
  be related to one record in the Factory
  table.
One-to-Many Relationships summarised:
 For any occurrence of A, there may be 0, 1,
  or many occurrences of B.
 For any occurrence of B, there can only be
  one occurrence of A.
From another perspective:
 If an 'A' record exists there may be zero or
  more related 'B' records.
  Any 'B' record can only be related to a single
  'A' record.
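The Factory/Component rule can be expressed with a foreign key. The sketch below uses Python's sqlite3 module; the column names and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.execute("CREATE TABLE factory (factory_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE component (
        component_id INTEGER PRIMARY KEY,
        name         TEXT,
        factory_id   INTEGER NOT NULL REFERENCES factory (factory_id)
    )
""")
conn.execute("INSERT INTO factory VALUES (1, 'North'), (2, 'South')")
conn.execute("INSERT INTO component VALUES (1, 'Gear', 1), (2, 'Axle', 1)")

# One factory relates to zero or more components.
counts = conn.execute("""
    SELECT factory.name, COUNT(component.component_id)
    FROM factory LEFT JOIN component ON component.factory_id = factory.factory_id
    GROUP BY factory.factory_id ORDER BY factory.factory_id
""").fetchall()
print(counts)  # [('North', 2), ('South', 0)]

# A component must point at exactly one existing factory.
try:
    conn.execute("INSERT INTO component VALUES (3, 'Bolt', 99)")  # no such factory
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```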
Many-to-many:
 Each record in both tables can relate to any
  number of records (or no records) in the
  other table.
 For instance, if you have several siblings, so
  do your siblings (have many siblings).
 Many-to-many relationships require a third
  table, known as an associative or linking
  table, because relational systems can't
  directly accommodate the relationship.
Many-to-many:                          Contd:




   Minimally, a many-many relationship will
    require insertion of a 'link entity'.
   Further analysis may show that the link
    entity has attributes of its own - often
    qualifiers in respect of quantity or time.
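A linking table with an attribute of its own might look like the sketch below (Python's sqlite3 module). The student/course scenario, the `enrolment` table and the `enrolled_on` column are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);
    -- Link entity: one row per (student, course) pair, with its own attribute.
    CREATE TABLE enrolment (
        student_id  INTEGER REFERENCES student (student_id),
        course_id   INTEGER REFERENCES course (course_id),
        enrolled_on TEXT,
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO course  VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO enrolment VALUES
        (1, 10, '2012-09-01'), (1, 20, '2012-09-01'), (2, 10, '2012-09-02');
""")
# Resolve the many-to-many relationship by joining through the link table.
pairs = conn.execute("""
    SELECT student.name, course.title
    FROM enrolment
    JOIN student ON student.student_id = enrolment.student_id
    JOIN course  ON course.course_id  = enrolment.course_id
    ORDER BY student.name, course.title
""").fetchall()
print(pairs)  # [('Ann', 'Databases'), ('Ann', 'Networks'), ('Bob', 'Databases')]
```

The many-to-many relationship becomes two one-to-many relationships, each ending at the link table.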
   The physical design of the database specifies
    the physical configuration of the database on
    the storage media.
   This includes detailed specification of data
    elements, data types, indexing options and
    other parameters residing in the DBMS data
    dictionary.
   It is the detailed design of a system, covering
    its modules and the database's hardware and
    software specifications.
   In the case of relational databases the storage
    objects are tables which store data in rows and
    columns.
• The purpose of normalization
• Data redundancy and Update
  Anomalies
• Functional Dependencies
• The Process of Normalization
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
Normalization is a technique for producing a
set of relations with desirable properties, given
the data requirements of an enterprise.

The process of normalization is a formal method
that identifies relations based on their primary or
candidate keys and the functional dependencies
among their attributes.
Relations that have redundant data may have
problems called update anomalies, which are
classified as:
      Insertion anomalies
      Deletion anomalies
      Modification anomalies
To insert a new member of staff with branchNo B007 into the
StaffBranch relation;
To delete a tuple that represents the last member of staff
located at branch B007;
To change the address of branch B003.
StaffBranch
 staffNo   sName         position     salary   branchNo   bAddress
 SL21      John White    Manager      30000    B005       22 Deer Rd, London
 SG37      Ann Beech     Assistant    12000    B003       163 Main St, Glasgow
 SG14      David Ford    Supervisor   18000    B003       163 Main St, Glasgow
 SA9       Mary Howe     Assistant    9000     B007       16 Argyll St, Aberdeen
 SG5       Susan Brand   Manager      24000    B003       163 Main St, Glasgow
 SL41      Julie Lee     Assistant    9000     B005       22 Deer Rd, London

Figure 1 StaffBranch relation
Staff
   staffNo   sName         position     salary   branchNo

   SL21      John White    Manager      30000    B005
   SG37      Ann Beech     Assistant    12000    B003
   SG14      David Ford    Supervisor   18000    B003
   SA9       Mary Howe     Assistant    9000     B007
   SG5       Susan Brand   Manager      24000    B003
   SL41      Julie Lee     Assistant    9000     B005

Branch
 branchNo    bAddress
 B005        22 Deer Rd, London
 B007        16 Argyll St, Aberdeen
 B003        163 Main St, Glasgow


Figure 2 Staff and Branch relations
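The modification and deletion anomalies can be reproduced with the data from the two figures. The sketch below uses Python's sqlite3 module, leaves out the position and salary columns to stay short, and uses a made-up new address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StaffBranch (staffNo TEXT, sName TEXT, branchNo TEXT, bAddress TEXT)")
conn.executemany("INSERT INTO StaffBranch VALUES (?, ?, ?, ?)", [
    ("SL21", "John White", "B005", "22 Deer Rd, London"),
    ("SG37", "Ann Beech", "B003", "163 Main St, Glasgow"),
    ("SG14", "David Ford", "B003", "163 Main St, Glasgow"),
    ("SA9", "Mary Howe", "B007", "16 Argyll St, Aberdeen"),
    ("SG5", "Susan Brand", "B003", "163 Main St, Glasgow"),
    ("SL41", "Julie Lee", "B005", "22 Deer Rd, London"),
])
# The separate Branch relation stores each address exactly once.
conn.execute("CREATE TABLE Branch (branchNo TEXT PRIMARY KEY, bAddress TEXT)")
conn.execute("INSERT INTO Branch SELECT DISTINCT branchNo, bAddress FROM StaffBranch")

new_address = "1 New St, Glasgow"  # hypothetical new address for branch B003
redundant = conn.execute(
    "UPDATE StaffBranch SET bAddress = ? WHERE branchNo = 'B003'", (new_address,)
).rowcount
normalised = conn.execute(
    "UPDATE Branch SET bAddress = ? WHERE branchNo = 'B003'", (new_address,)
).rowcount
print(redundant, normalised)  # 3 1: three rows to keep in step versus one

# Deletion anomaly: removing the last member of staff at B007 also removes
# the branch address from StaffBranch, while Branch still holds it.
conn.execute("DELETE FROM StaffBranch WHERE staffNo = 'SA9'")
left_in_staffbranch = conn.execute(
    "SELECT COUNT(*) FROM StaffBranch WHERE branchNo = 'B007'").fetchone()[0]
left_in_branch = conn.execute(
    "SELECT COUNT(*) FROM Branch WHERE branchNo = 'B007'").fetchone()[0]
print(left_in_staffbranch, left_in_branch)  # 0 1
```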
Functional dependency describes the relationship between
 attributes in a relation.
 For example, if A and B are attributes of relation R, then B is
 functionally dependent on A (denoted A → B) if each value of
 A is associated with exactly one value of B. (A and B may each
 consist of one or more attributes.)



           A  ---- B is functionally dependent on A ---->  B

Determinant      Refers to the attribute or group of attributes on
                 the left-hand side of the arrow of a functional
                 dependency
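The definition can be tested against sample rows. The sketch below checks a dependency on some columns of the Figure 1 data; the `holds` helper is a hypothetical name. A check like this can only show that a dependency is violated by the sample. It cannot prove that the dependency holds for all time, which depends on the meaning of the attributes.

```python
COLUMNS = ("staffNo", "position", "branchNo", "bAddress")
DATA = [
    ("SL21", "Manager", "B005", "22 Deer Rd, London"),
    ("SG37", "Assistant", "B003", "163 Main St, Glasgow"),
    ("SG14", "Supervisor", "B003", "163 Main St, Glasgow"),
    ("SA9", "Assistant", "B007", "16 Argyll St, Aberdeen"),
    ("SG5", "Manager", "B003", "163 Main St, Glasgow"),
    ("SL41", "Assistant", "B005", "22 Deer Rd, London"),
]
staff_branch = [dict(zip(COLUMNS, row)) for row in DATA]

def holds(rows: list, lhs: tuple, rhs: tuple) -> bool:
    """Return True if each lhs value is paired with exactly one rhs value in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[name] for name in lhs)
        value = tuple(row[name] for name in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

print(holds(staff_branch, ("branchNo",), ("bAddress",)))  # True
print(holds(staff_branch, ("position",), ("branchNo",)))  # False
```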
Trivial functional dependency means that the right-hand
side is a subset (not necessarily a proper subset) of the left-
hand side.
For example: (See Figure 1)
                 staffNo, sName → sName
                 staffNo, sName → staffNo

They do not provide any additional information about possible integrity
constraints on the values held by these attributes.

We are normally more interested in nontrivial dependencies because they
represent integrity constraints for the relation.
Main characteristics of functional dependencies in normalization

• Have a one-to-one relationship between attribute(s) on
  the left- and right- hand side of a dependency;

• hold for all time;

• are nontrivial.
Identifying the primary key
 Functional dependency is a property of the meaning or
 semantics of the attributes in a relation. When a functional
 dependency is present, the dependency is specified as a
 constraint between the attributes.
An important integrity constraint to consider first is the
identification of candidate keys, one of which is selected to
be the primary key for the relation using functional
dependency.
Inference Rules
The set of all functional dependencies that are implied by a given
set of functional dependencies X is called the closure of X, written
X+. A set of inference rules is needed to compute X+ from X.

Armstrong’s axioms

1.   Reflexivity:        If B is a subset of A, then A → B
2.   Augmentation:       If A → B, then A,C → B,C
3.   Transitivity:       If A → B and B → C, then A → C
4.   Self-determination: A → A
5.   Decomposition:      If A → B,C then A → B and A → C
6.   Union:              If A → B and A → C, then A → B,C
7.   Composition:        If A → B and C → D, then A,C → B,D

Rules 1 to 3 are Armstrong’s axioms proper; rules 4 to 7 can be
derived from them.
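In practice the inference rules are often applied through an attribute closure: the set of all attributes determined by a given set of attributes. This is a related, standard technique rather than the closure X+ of the dependency set itself. The sketch below is a minimal version; the function name and the way dependencies are written as pairs of sets are assumptions for illustration.

```python
def attribute_closure(attributes: set, dependencies: list) -> set:
    """Return every attribute determined by `attributes` under the dependencies."""
    closure = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in dependencies:
            # If we already have the whole left-hand side, we gain the right-hand side.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

# Two dependencies taken from the StaffBranch relation.
fds = [
    ({"staffNo"}, {"sName", "position", "salary", "branchNo"}),
    ({"branchNo"}, {"bAddress"}),
]
print(sorted(attribute_closure({"staffNo"}, fds)))
print(sorted(attribute_closure({"branchNo"}, fds)))  # ['bAddress', 'branchNo']
```

Here staffNo determines every attribute (via transitivity through branchNo), which is what makes it a candidate key, while branchNo determines only bAddress.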
Minimal Sets of Functional Dependencies
A set of functional dependencies X is minimal if it satisfies
the following conditions:
  • Every dependency in X has a single attribute on its
  right-hand side

  • We cannot replace any dependency A → B in X with
  dependency C → B, where C is a proper subset of A, and
  still have a set of dependencies that is equivalent to X.

  •   We cannot remove any dependency from X and still have a
      set of dependencies that is equivalent to X.
Example of a Minimal Set of Functional
  Dependencies
 A set of functional dependencies for the StaffBranch relation
 satisfies the three conditions for producing a minimal set.

         staffNo → sName
         staffNo → position
         staffNo → salary
         staffNo → branchNo
         staffNo → bAddress
         branchNo → bAddress
         branchNo, position → salary
         bAddress, position → salary
• Multivalued Attributes (or repeating groups):
  non-key attributes or groups of non-key
  attributes whose values are not uniquely
  identified, directly or indirectly, by the value
  of the Primary Key (or part of it); that is, they
  are not functionally dependent on it.
   STUDENT

    Stud_ID   Name      Course_ID    Units
     101      Lennon     MSI 250     3.00
     101      Lennon     MSI 415     3.00
     125      Johnson    MSI 331     3.00
• Partial Dependency – when a non-key
  attribute is determined by a part, but not
  the whole, of a COMPOSITE primary key.
       CUSTOMER

        Cust_ID   Name     Order_ID
         101      AT&T       1234
         101      AT&T        156
         125      Cisco      1250
• Transitive Dependency – when a non-key
  attribute determines another non-key
  attribute.

EMPLOYEE

 Emp_ID    F_Name   L_Name   Dept_ID   Dept_Name
   111      Mary     Jones      1         Acct
   122      Sarah   Smith       2        Mktg
•   Normalization is often executed as a series of steps.
    Each step corresponds to a specific normal form that has
    known properties.

•   As normalization proceeds, the relations become
    progressively more restricted in format, and also less
    vulnerable to update anomalies.

•   For the relational data model, it is important to recognize
    that it is only first normal form (1NF) that is critical in
    creating relations. All the subsequent normal forms are
    optional.
•   Unnormalized – There are
    multivalued attributes or repeating
    groups
•   1 NF – No multivalued attributes or
    repeating groups.
•   2 NF – 1 NF plus no partial
    dependencies
•   3 NF – 2 NF plus no transitive
    dependencies
• ISBN → Title
• ISBN → Publisher
• Publisher → Address

All attributes are directly or indirectly determined by the primary key;
therefore, the relation is at least in 1 NF.



 BOOK

   ISBN       Title     Publisher          Address
• ISBN → Title
• ISBN → Publisher
• Publisher → Address

The relation is at least in 1NF. There is no COMPOSITE primary key,
therefore there can't be partial dependencies. Therefore, the relation
is at least in 2NF.


 BOOK

   ISBN      Title      Publisher           Address
• ISBN → Title
• ISBN → Publisher
• Publisher → Address

Publisher is a non-key attribute, and it determines Address, another
non-key attribute. Therefore, there is a transitive dependency, which
means that the relation is NOT in 3 NF.



 BOOK

   ISBN       Title     Publisher          Address
• ISBN → Title
• ISBN → Publisher
• Publisher → Address

We know that the relation is at least in 2NF, and it is not in 3 NF.
Therefore, we conclude that the relation is in 2NF.



 BOOK

   ISBN       Title      Publisher            Address
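
The reasoning applied to BOOK can be written as a small check. This is a sketch that follows the simplified slide definitions only (a partial dependency has a proper part of the primary key as determinant, and a transitive dependency has a non-key determinant for a non-key attribute); the function name is an assumption for illustration.

```python
def find_violations(primary_key, fds):
    """Split dependencies into partial and transitive ones (slide definitions).

    `fds` is a list of (determinant, dependents) pairs of attribute sets.
    """
    key = set(primary_key)
    partial, transitive = [], []
    for lhs, rhs in fds:
        if not (set(rhs) - key):
            continue  # only key attributes on the right: nothing to flag
        if set(lhs) < key:
            partial.append((lhs, rhs))      # determined by part of a composite key
        elif not (set(lhs) & key):
            transitive.append((lhs, rhs))   # non-key attribute determines non-key
    return partial, transitive


book_fds = [
    ({"ISBN"}, {"Title"}),
    ({"ISBN"}, {"Publisher"}),
    ({"Publisher"}, {"Address"}),
]
partial, transitive = find_violations({"ISBN"}, book_fds)
print(partial)     # []
print(transitive)  # [({'Publisher'}, {'Address'})]
```

No partial dependencies and one transitive dependency matches the conclusion that BOOK is in 2NF but not in 3NF.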
• Option 2: Remove the entire repeating group
 from the relation. Create another relation which
 would contain all the attributes of the repeating
 group, plus the primary key from the first
 relation. In this new relation, the primary key
 from the original relation and the determinant
 of the repeating group will comprise a primary
 key.
 STUDENT

  Stud_ID   Name      Course_ID    Units
    101     Lennon     MSI 250     3.00
    101     Lennon     MSI 415     3.00
    125     Johnson    MSI 331     3.00
STUDENT

        Stud_ID      Name
          101     Lennon
          125     Johnson


STUDENT_COURSE

Stud_ID     Course            Units
  101      MSI 250             3
  101      MSI 415             3
  125      MSI 331             3
Composite
                 Primary Key


STUDENT

Stud_ID   Name       Course_ID   Units
  101     Lennon       MSI 250   3.00
  101     Lennon       MSI 415   3.00
  125     Johnson      MSI 331   3.00
• Goal: Remove Partial Dependencies
                                 Partial
   Composite                  Dependencies
   Primary Key


 STUDENT

  Stud_ID        Name   Course_ID      Units
    101      Lennon      MSI 250       3.00
    101      Lennon      MSI 415       3.00
    125      Johnson     MSI 331       3.00
• Remove from the original relation the attributes
  that depend on a part, but not the whole, of the
  primary key. For each partial dependency, create
  a new relation, with the corresponding part of
  the primary key from the original as the primary
  key.
  STUDENT

   Stud_ID   Name      Course_ID   Units
    101      Lennon     MSI 250    3.00
    101      Lennon     MSI 415    3.00
    125      Johnson    MSI 331    3.00
STUDENT (original)

Stud_ID    Name      Course_ID     Units
  101      Lennon     MSI 250       3.00
  101      Lennon     MSI 415       3.00
  125      Johnson    MSI 331       3.00

STUDENT_COURSE

Stud_ID   Course_ID
  101      MSI 250
  101      MSI 415
  125      MSI 331

STUDENT

Stud_ID   Name
  101     Lennon
  125     Johnson

COURSE

Course_ID   Units
 MSI 250    3.00
 MSI 415    3.00
 MSI 331    3.00
• Goal: Get rid of transitive
  dependencies.
                      Transitive
                     Dependency
EMPLOYEE

 Emp_ID    F_Name   L_Name    Dept_ID Dept_Name
   111      Mary     Jones         1     Acct
   122      Sarah   Smith          2    Mktg
• Remove the attributes, which are dependent on
  a non-key attribute, from the original relation.
  For each transitive dependency, create a new
  relation with the non-key attribute which is a
  determinant in the transitive dependency as a
  primary key, and the dependent non-key
  attribute as a dependent.
EMPLOYEE

 Emp_ID     F_Name    L_Name   Dept_ID Dept_Name
   111       Mary      Jones      1        Acct
   122       Sarah     Smith      2        Mktg

               EMPLOYEE

                    Emp_ID             F_Name       L_Name    Dept_ID
                      111                  Mary       Jones     1
                      122                  Sarah     Smith      2

                                DEPARTMENT

                                   Dept_ID Dept_Name
                                       1           Acct
                                       2           Mktg
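
The split into EMPLOYEE and DEPARTMENT can be reproduced with two projections, and joining them back on Dept_ID recovers the original rows. The sketch below uses plain Python dictionaries; the `project` helper is an illustrative assumption, not something defined in the slides.

```python
employee = [
    {"Emp_ID": 111, "F_Name": "Mary", "L_Name": "Jones", "Dept_ID": 1, "Dept_Name": "Acct"},
    {"Emp_ID": 122, "F_Name": "Sarah", "L_Name": "Smith", "Dept_ID": 2, "Dept_Name": "Mktg"},
]


def project(rows, columns):
    """Keep only `columns`, dropping duplicate rows (relational projection)."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


# Decompose: the transitively dependent Dept_Name moves to its own relation
employee_3nf = project(employee, ["Emp_ID", "F_Name", "L_Name", "Dept_ID"])
department = project(employee, ["Dept_ID", "Dept_Name"])

# Rejoin on Dept_ID to confirm that nothing was lost
dept_names = {d["Dept_ID"]: d["Dept_Name"] for d in department}
rejoined = [{**e, "Dept_Name": dept_names[e["Dept_ID"]]} for e in employee_3nf]
print(rejoined == employee)  # True
```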
 Unnormalized form (UNF)
 A table that contains one or more repeating groups.

Repeating group = (propertyNo, pAddress, rentStart, rentFinish, rent, ownerNo, oName)

ClientNo   cName           propertyNo   pAddress                rentStart   rentFinish   rent   ownerNo   oName
CR76       John Kay        PG4          6 lawrence St,Glasgow   1-Jul-00    31-Aug-01    350    CO40      Tina Murphy
                           PG16         5 Novar Dr, Glasgow     1-Sep-02    1-Sep-02     450    CO93      Tony Shaw
CR56       Aline Stewart   PG4          6 lawrence St,Glasgow   1-Sep-99    10-Jun-00    350    CO40      Tina Murphy
                           PG36         2 Manor Rd, Glasgow     10-Oct-00   1-Dec-01     370    CO93      Tony Shaw
                           PG16         5 Novar Dr, Glasgow     1-Nov-02    1-Aug-03     450    CO93      Tony Shaw

Figure 3 ClientRental unnormalized table
First Normal Form is a relation in which the intersection of each
row and column contains one and only one value.
There are two approaches to removing repeating groups from
unnormalized tables:

   1. Removes the repeating groups by entering appropriate
      data in the empty columns of rows containing the
      repeating data.

   2. Removes the repeating group by placing the repeating
      data, along with a copy of the original key attribute(s), in
      a separate relation. A primary key is identified for the
      new relation.
With the first approach, we remove the repeating group (property rented
details) by entering the appropriate client data into each row.
The ClientRental relation is defined as follows:
ClientRental (clientNo, propertyNo, cName, pAddress, rentStart,
              rentFinish, rent, ownerNo, oName)
ClientNo   propertyNo   cName           pAddress                rentStart   rentFinish   rent   ownerNo   oName
CR76       PG4          John Kay        6 lawrence St,Glasgow   1-Jul-00    31-Aug-01    350    CO40      Tina Murphy
CR76       PG16         John Kay        5 Novar Dr, Glasgow     1-Sep-02    1-Sep-02     450    CO93      Tony Shaw
CR56       PG4          Aline Stewart   6 lawrence St,Glasgow   1-Sep-99    10-Jun-00    350    CO40      Tina Murphy
CR56       PG36         Aline Stewart   2 Manor Rd, Glasgow     10-Oct-00   1-Dec-01     370    CO93      Tony Shaw
CR56       PG16         Aline Stewart   5 Novar Dr, Glasgow     1-Nov-02    1-Aug-03     450    CO93      Tony Shaw


Figure 4 1NF ClientRental relation with the first approach
With the second approach, we remove the repeating group (property rented
details) by placing the repeating data along with a copy of the original
key attribute (clientNo) in a separate relation.

Client               (clientNo, cName)
PropertyRentalOwner  (clientNo, propertyNo, pAddress, rentStart,
                     rentFinish, rent, ownerNo, oName)
 Client
 ClientNo   cName
 CR76       John Kay
 CR56       Aline Stewart

 PropertyRentalOwner
 ClientNo   propertyNo   pAddress                rentStart   rentFinish   rent   ownerNo   oName
 CR76       PG4          6 lawrence St,Glasgow   1-Jul-00    31-Aug-01    350    CO40      Tina Murphy
 CR76       PG16         5 Novar Dr, Glasgow     1-Sep-02    1-Sep-02     450    CO93      Tony Shaw
 CR56       PG4          6 lawrence St,Glasgow   1-Sep-99    10-Jun-00    350    CO40      Tina Murphy
 CR56       PG36         2 Manor Rd, Glasgow     10-Oct-00   1-Dec-01     370    CO93      Tony Shaw
 CR56       PG16         5 Novar Dr, Glasgow     1-Nov-02    1-Aug-03     450    CO93      Tony Shaw

 Figure 5 1NF ClientRental relation with the second approach
Full functional dependency indicates that if A and B are attributes
of a relation, B is fully functionally dependent on A if B is
functionally dependent on A, but not on any proper subset of A.

A functional dependency A → B is a partial dependency if there is
some attribute that can be removed from A and the dependency still
holds.
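
The difference between full and partial dependency can be tried out against sample rows. The sketch below is illustrative: it uses four columns of the 1NF ClientRental data, and the helper names `holds` and `is_full` are assumptions. Sample data can only refute a dependency, never prove it, since a dependency is a statement about every legal state of the relation.

```python
from itertools import combinations

# Four columns taken from the 1NF ClientRental rows
rows = [
    {"clientNo": "CR76", "propertyNo": "PG4", "cName": "John Kay", "rentStart": "1-Jul-00"},
    {"clientNo": "CR76", "propertyNo": "PG16", "cName": "John Kay", "rentStart": "1-Sep-02"},
    {"clientNo": "CR56", "propertyNo": "PG4", "cName": "Aline Stewart", "rentStart": "1-Sep-99"},
    {"clientNo": "CR56", "propertyNo": "PG36", "cName": "Aline Stewart", "rentStart": "10-Oct-00"},
    {"clientNo": "CR56", "propertyNo": "PG16", "cName": "Aline Stewart", "rentStart": "1-Nov-02"},
]


def holds(rows, lhs, rhs):
    """True if rows that agree on `lhs` always agree on `rhs`."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True


def is_full(rows, lhs, rhs):
    """True if lhs -> rhs holds and no proper subset of lhs determines rhs."""
    if not holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, subset, rhs):
                return False
    return True


key = ("clientNo", "propertyNo")
print(is_full(rows, key, ("rentStart",)))  # True: needs the whole key
print(is_full(rows, key, ("cName",)))      # False: clientNo alone is enough
```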
Second normal form (2NF) is a relation that is in first normal form
and in which every non-primary-key attribute is fully functionally
dependent on the primary key.

The normalization of 1NF relations to 2NF involves the removal of
partial dependencies. If a partial dependency exists, we remove the
functionally dependent attributes from the relation by placing them
in a new relation along with a copy of their determinant.
The ClientRental relation has the following functional
dependencies:

fd1    clientNo, propertyNo → rentStart, rentFinish           (Primary key)
fd2    clientNo → cName                                       (Partial dependency)
fd3    propertyNo → pAddress, rent, ownerNo, oName            (Partial dependency)
fd4    ownerNo → oName                                        (Transitive dependency)
fd5    clientNo, rentStart → propertyNo, pAddress,
       rentFinish, rent, ownerNo, oName                       (Candidate key)
fd6    propertyNo, rentStart → clientNo, cName, rentFinish    (Candidate key)
After removing the partial dependencies, we obtain three new relations
called Client, Rental, and PropertyOwner:

 Client          (clientNo, cName)
 Rental          (clientNo, propertyNo, rentStart, rentFinish)
 PropertyOwner   (propertyNo, pAddress, rent, ownerNo, oName)

 Client
 ClientNo   cName
 CR76       John Kay
 CR56       Aline Stewart

 Rental
 ClientNo    propertyNo   rentStart   rentFinish
 CR76        PG4          1-Jul-00    31-Aug-01
 CR76        PG16         1-Sep-02    1-Sep-02
 CR56        PG4          1-Sep-99    10-Jun-00
 CR56        PG36         10-Oct-00   1-Dec-01
 CR56        PG16         1-Nov-02    1-Aug-03

 PropertyOwner
 propertyNo    pAddress                rent     ownerNo   oName
 PG4           6 lawrence St,Glasgow   350      CO40      Tina Murphy

 PG16          5 Novar Dr, Glasgow     450      CO93      Tony Shaw

 PG36          2 Manor Rd, Glasgow     370      CO93      Tony Shaw


 Figure 6 2NF ClientRental relation
Transitive dependency
 A condition where A, B, and C are attributes of a relation such that
 if A → B and B → C, then C is transitively dependent on A via B
 (provided that A is not functionally dependent on B or C).
Third normal form (3NF)
A relation that is in first and second normal form, and in which no
non-primary-key attribute is transitively dependent on the primary key.

The normalization of 2NF relations to 3NF involves the removal of
transitive dependencies by placing the attribute(s) in a new relation
along with a copy of the determinant.
The functional dependencies for the Client, Rental and
PropertyOwner relations are as follows:

Client
fd2      clientNo → cName                                  (Primary key)

Rental
fd1      clientNo, propertyNo → rentStart, rentFinish      (Primary key)
fd5      clientNo, rentStart → propertyNo, rentFinish      (Candidate key)
fd6      propertyNo, rentStart → clientNo, rentFinish      (Candidate key)

PropertyOwner
fd3      propertyNo → pAddress, rent, ownerNo, oName       (Primary key)
fd4      ownerNo → oName                                   (Transitive dependency)
The resulting 3NF relations have the forms:

Client       (clientNo, cName)
Rental       (clientNo, propertyNo, rentStart, rentFinish)
PropertyOwner (propertyNo, pAddress, rent, ownerNo)
Owner        (ownerNo, oName)
Client                                Rental

ClientNo   cName                      ClientNo        propertyNo    rentStart     rentFinish
CR76       John Kay                   CR76            PG4           1-Jul-00      31-Aug-01
CR56       Aline Stewart              CR76            PG16          1-Sep-02      1-Sep-02
                                      CR56            PG4           1-Sep-99      10-Jun-00
                                      CR56            PG36          10-Oct-00     1-Dec-01
                                      CR56            PG16          1-Nov-02      1-Aug-03



PropertyOwner                                                  Owner

propertyNo    pAddress                rent     ownerNo             ownerNo      oName

PG4           6 lawrence St,Glasgow   350      CO40                CO40         Tina Murphy

PG16          5 Novar Dr, Glasgow     450      CO93                CO93         Tony Shaw

PG36          2 Manor Rd, Glasgow     370      CO93



Figure 7 3NF ClientRental relations
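
As an illustration, the four relations can be declared in a real database and joined back together. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the column types and the REFERENCES clauses are assumptions, since the slides give only the attribute lists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Client (clientNo TEXT PRIMARY KEY, cName TEXT);
CREATE TABLE Owner (ownerNo TEXT PRIMARY KEY, oName TEXT);
CREATE TABLE PropertyOwner (
    propertyNo TEXT PRIMARY KEY,
    pAddress   TEXT,
    rent       INTEGER,
    ownerNo    TEXT REFERENCES Owner(ownerNo)
);
CREATE TABLE Rental (
    clientNo   TEXT REFERENCES Client(clientNo),
    propertyNo TEXT REFERENCES PropertyOwner(propertyNo),
    rentStart  TEXT,
    rentFinish TEXT,
    PRIMARY KEY (clientNo, propertyNo)
);
""")
conn.executemany("INSERT INTO Client VALUES (?, ?)",
                 [("CR76", "John Kay"), ("CR56", "Aline Stewart")])
conn.executemany("INSERT INTO Owner VALUES (?, ?)",
                 [("CO40", "Tina Murphy"), ("CO93", "Tony Shaw")])
conn.executemany("INSERT INTO PropertyOwner VALUES (?, ?, ?, ?)", [
    ("PG4", "6 lawrence St,Glasgow", 350, "CO40"),
    ("PG16", "5 Novar Dr, Glasgow", 450, "CO93"),
    ("PG36", "2 Manor Rd, Glasgow", 370, "CO93"),
])
conn.executemany("INSERT INTO Rental VALUES (?, ?, ?, ?)", [
    ("CR76", "PG4", "1-Jul-00", "31-Aug-01"),
    ("CR76", "PG16", "1-Sep-02", "1-Sep-02"),
    ("CR56", "PG4", "1-Sep-99", "10-Jun-00"),
    ("CR56", "PG36", "10-Oct-00", "1-Dec-01"),
    ("CR56", "PG16", "1-Nov-02", "1-Aug-03"),
])

# Joining the four relations rebuilds one row per rental
rows = conn.execute("""
    SELECT r.clientNo, c.cName, r.propertyNo, p.rent, o.oName
    FROM Rental r
    JOIN Client c        ON c.clientNo = r.clientNo
    JOIN PropertyOwner p ON p.propertyNo = r.propertyNo
    JOIN Owner o         ON o.ownerNo = p.ownerNo
    ORDER BY r.clientNo, r.propertyNo
""").fetchall()
print(len(rows))  # 5
```

Each owner name is now stored once, so renaming an owner touches a single row in Owner instead of every property that owner holds.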
Boyce-Codd normal form (BCNF)
A relation is in BCNF if, and only if, every determinant is a
candidate key.
The difference between 3NF and BCNF is that for a functional
dependency A → B, 3NF allows this dependency in a relation if B is a
primary-key attribute and A is not a candidate key, whereas BCNF
insists that for this dependency to remain in a relation, A must be a
candidate key.
The ClientInterview relation has the following functional dependencies:

fd1 clientNo, interviewDate → interviewTime, staffNo, roomNo       (Primary key)
fd2 staffNo, interviewDate, interviewTime → clientNo               (Candidate key)
fd3 roomNo, interviewDate, interviewTime → clientNo, staffNo       (Candidate key)
fd4 staffNo, interviewDate → roomNo                                (not a candidate key)

Because the determinant of fd4 is not a candidate key, the ClientInterview
relation may suffer from update anomalies. For example, two tuples have to be
updated if the roomNo needs to be changed for staffNo SG5 on 13-May-02.
  ClientInterview
  ClientNo   interviewDate   interviewTime   staffNo   roomNo
  CR76       13-May-02       10.30           SG5       G101
  CR76       13-May-02       12.00           SG5       G101
  CR74       13-May-02       12.00           SG37      G102
  CR56       1-Jul-02        10.30           SG5       G102




  Figure 8 ClientInterview relation
To transform the ClientInterview relation to BCNF, we must remove the violating
functional dependency by creating two new relations called Interview and StaffRoom
as shown below,

Interview (clientNo, interviewDate, interviewTime, staffNo)
StaffRoom(staffNo, interviewDate, roomNo)

 Interview
 ClientNo    interviewDate   interviewTime   staffNo
 CR76        13-May-02       10.30           SG5
 CR76        13-May-02       12.00           SG5
 CR74        13-May-02       12.00           SG37
 CR56        1-Jul-02        10.30           SG5

StaffRoom
 staffNo     interviewDate   roomNo
 SG5         13-May-02       G101
 SG37        13-May-02       G102
 SG5         1-Jul-02        G102

 Figure 9 BCNF Interview and StaffRoom relations
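
The BCNF test for ClientInterview can be sketched with an attribute closure: a determinant is acceptable when its closure covers every attribute of the relation. Strictly this checks that the determinant is a superkey, which is what the BCNF test needs in practice; the helper name `closure` is an assumption for illustration.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


attributes = {"clientNo", "interviewDate", "interviewTime", "staffNo", "roomNo"}
fd1 = ({"clientNo", "interviewDate"}, {"interviewTime", "staffNo", "roomNo"})
fd2 = ({"staffNo", "interviewDate", "interviewTime"}, {"clientNo"})
fd3 = ({"roomNo", "interviewDate", "interviewTime"}, {"clientNo", "staffNo"})
fd4 = ({"staffNo", "interviewDate"}, {"roomNo"})
fds = [fd1, fd2, fd3, fd4]

# A dependency violates BCNF when its determinant does not determine everything
violations = [fd for fd in fds if closure(fd[0], fds) != attributes]
print(violations == [fd4])  # True
```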
Multi-valued dependency (MVD)
Represents a dependency between attributes (for example, A,
B and C) in a relation, such that for each value of A there is a
set of values for B and a set of values for C. However, the sets of
values for B and C are independent of each other.
A multi-valued dependency can be further defined as being trivial or
nontrivial. An MVD A →→ B in relation R is defined as being trivial if
    • B is a subset of A
    or
    • A ∪ B = R
An MVD is defined as being nontrivial if neither of the above
two conditions is satisfied.
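
The two triviality conditions translate directly into set operations. A minimal sketch, with a hypothetical relation R(A, B, C) and an illustrative function name:

```python
def is_trivial_mvd(a, b, relation_attrs):
    """True if the MVD a ->> b is trivial in a relation with the given attributes."""
    a, b, r = set(a), set(b), set(relation_attrs)
    # Trivial when B is a subset of A, or when A and B together make up R
    return b <= a or (a | b) == r


R = {"A", "B", "C"}
print(is_trivial_mvd({"A", "B"}, {"B"}, R))  # True: B is a subset of A
print(is_trivial_mvd({"A"}, {"B", "C"}, R))  # True: A and B together equal R
print(is_trivial_mvd({"A"}, {"B"}, R))       # False: nontrivial
```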
Fourth normal form (4NF)
A relation that is in Boyce-Codd normal form and contains no
nontrivial multi-valued dependencies.
Fifth normal form (5NF)
 A relation that has no join dependency.
Lossless-join dependency
A property of decomposition, which ensures that no spurious
tuples are generated when relations are reunited through a
natural join operation.

Join dependency
Describes a type of dependency. For example, for a relation R
with subsets of the attributes of R denoted as A, B, …, Z, a
relation R satisfies a join dependency if, and only if, every legal
value of R is equal to the join of its projections on A, B, …, Z.
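
Spurious tuples are easiest to see by joining projections back together. The sketch below is an illustration only: it takes three columns from the ClientRental data, decomposes them once on propertyNo (lossless, because propertyNo determines ownerNo) and once on ownerNo (lossy), and compares each join with the original. The helper names are assumptions.

```python
rows = [
    {"clientNo": "CR76", "propertyNo": "PG4", "ownerNo": "CO40"},
    {"clientNo": "CR76", "propertyNo": "PG16", "ownerNo": "CO93"},
    {"clientNo": "CR56", "propertyNo": "PG4", "ownerNo": "CO40"},
    {"clientNo": "CR56", "propertyNo": "PG36", "ownerNo": "CO93"},
    {"clientNo": "CR56", "propertyNo": "PG16", "ownerNo": "CO93"},
]


def project(rows, columns):
    """Relational projection: keep `columns` and drop duplicate rows."""
    unique = {tuple(row[c] for c in columns) for row in rows}
    return [dict(zip(columns, values)) for values in unique]


def natural_join(left, right):
    """Join two lists of dict rows on all the column names they share."""
    joined = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[c] == r[c] for c in shared):
                joined.append({**l, **r})
    return joined


def as_set(rows):
    """Order-independent form of a relation, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


lossless = natural_join(project(rows, ["clientNo", "propertyNo"]),
                        project(rows, ["propertyNo", "ownerNo"]))
lossy = natural_join(project(rows, ["clientNo", "ownerNo"]),
                     project(rows, ["propertyNo", "ownerNo"]))

print(as_set(lossless) == as_set(rows))   # True
spurious = as_set(lossy) - as_set(rows)
print(len(spurious))                      # 1
```

The one extra tuple pairs client CR76 with property PG36, a rental that never existed in the original rows.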
   Atomicity requires that database
    modifications must follow an "all or nothing"
    rule.
   Each transaction is said to be atomic. If one
    part of the transaction fails, the entire
    transaction fails and the database state is left
    unchanged.
   To be compliant with the 'A', a system must
    guarantee the atomicity in each and every
    situation, including power failures / errors /
    crashes.
   This guarantees that 'an incomplete
    transaction' cannot exist.
   The consistency property ensures that any
    transaction the database performs will take it
    from one consistent state to another.
   Consistency states that only consistent (valid
    according to all the rules defined) data will be
    written to the database.
   Quite simply, whatever rows will be affected
    by the transaction will remain consistent with
    each and every rule that is applied to them
    (including but not limited to: constraints,
    cascades, triggers).
   While this is extremely simple and clear, it's
    worth noting that this consistency
    requirement applies to everything changed by
    the transaction, without any limit (including
    triggers firing other triggers launching
    cascades that eventually fire other triggers
    etc.) at all.
   Isolation refers to the requirement that no
    transaction should be able to interfere with
    another transaction.
   In other words, it should not be possible for
    two transactions that affect the same rows
    to run concurrently, as the outcome would be
    unpredictable and the system would become
    unreliable.
   In effect the only strict way to respect the
    isolation property is to use a serial model
    where no two transactions can occur on the
    same data at the same time and where the
    result is predictable (i.e. transaction B will
    happen after transaction A in every single
    possible case).
   Durability means that once a transaction has
    been committed, it will remain so.
   In other words, every committed transaction
    is protected against power loss/crash/errors
    and cannot be lost by the system and can
    thus be guaranteed to be completed.
   In a relational database, for instance, once a
    group of SQL statements execute, the results
    need to be stored permanently.
   If the database crashes right after a group of
    SQL statements execute, it should be possible
    to restore the database state to the point
    after the last transaction committed.
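   The following examples assume two values, A
    and B, and an integrity constraint requiring
    that A + B = 100.
   The transaction subtracts 10 from A and adds
    10 to B.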
   If it succeeds, it would be valid, because the
    data continues to satisfy the constraint.
   However, assume that after removing 10 from
    A, the transaction is unable to modify B.
   If the database retains A's new value,
    atomicity and the constraint would both be
    violated. Atomicity requires that both parts of
    this transaction complete or neither.
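
The failure case can be simulated with Python's built-in sqlite3 module. In this sketch the table name, the starting balances of 60 and 40, and the `fail_midway` switch are assumptions made for illustration; the connection is used as a context manager, which commits if the block succeeds and rolls back if an exception escapes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 60), ("B", 40)])
conn.commit()


def transfer(conn, amount, fail_midway=False):
    """Move `amount` from A to B as one all-or-nothing transaction."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,))
            if fail_midway:
                raise RuntimeError("simulated failure before B is updated")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,))
    except RuntimeError:
        pass  # the partial update to A has already been rolled back


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


transfer(conn, 10, fail_midway=True)
after_failure = balances(conn)   # A and B are unchanged
transfer(conn, 10)
after_success = balances(conn)   # both updates applied together
print(after_failure, after_success)
```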
   Consistency is a very general term that
    demands the data meets all validation rules.
   Also, it may be implied that both A and B
    must be integers.
   A valid range for A and B may also be
    implied. All validation rules must be checked
    to ensure consistency.
   Assume that a transaction attempts to
    subtract 10 from A without altering B.
   Because consistency is checked after each
    transaction, it is known that A + B = 100
    before the transaction begins.
   If the transaction removes 10 from A
    successfully, atomicity will be achieved.
   However, a validation check will show that A
    + B = 90.
   That is not consistent according to the rules
    of the database.
   The entire transaction must be cancelled and
    the affected rows rolled back to their pre-
    transaction state.
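
A rule such as A + B = 100 can be handed to the database itself. The sketch below is one possible arrangement, not the only one: it stores A and B as two columns of a single row so that a CHECK constraint can express the rule, and it uses SQLite through Python's sqlite3 module, where the check runs on every statement rather than at the end of the transaction. The table name and starting values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE balances ("
    " a INTEGER NOT NULL,"
    " b INTEGER NOT NULL,"
    " CHECK (a + b = 100))"
)
conn.execute("INSERT INTO balances VALUES (60, 40)")
conn.commit()

# Subtracting 10 from A without altering B would leave A + B = 90
try:
    with conn:
        conn.execute("UPDATE balances SET a = a - 10")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row = conn.execute("SELECT a, b FROM balances").fetchone()
print(rejected, row)  # True (60, 40)

# Changing both values in one statement keeps the rule satisfied
with conn:
    conn.execute("UPDATE balances SET a = a - 10, b = b + 10")
valid_row = conn.execute("SELECT a, b FROM balances").fetchone()
```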

The Most Excellent Way | 1 Corinthians 13
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdf
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 

Introduction to database

  • 1. By: Engineer Muhammad Suleman Memon M.E(Information Technology) B.E(Computer System)
  • 2. A database is a simple, yet flexible and powerful tool for storing and retrieving data. Every company, every website, has lots of data. The more of your data that you keep in your database - the better. Far from being a tool only useful to big businesses, even if you just want a simple guest book or page hit counter, a database is perfect. Whichever database you use - it will most likely be a relational database.
  • 3. This is the industry standard design these days.  Relational databases use the principles of set theory.  Set theory is a field of mathematics that describes how to deal with sets of data.  Relational databases are quite intuitive and easy to understand.
  • 4. All data is held in tables.  A table has columns (along the top) and rows.  You create the tables you need. You define the table names.  You define what the column names are in each table.  You define what type of data the columns are...
  • 5. There are a number of different data types available which represent the different types of data you find in real life.  There are analogous types in all databases and programming languages. Each has variations, but they're all fundamentally the same.
  • 6. They are: • Numerical types, i.e. numbers. There are fundamentally two kinds: integer and float. Integers are whole numbers (e.g. 1, 2, 100, 999999). Floats are numbers with decimal places (e.g. 1.1, 22.5, 3.1415927). • String types, i.e. text. There are two kinds here: fixed length and variable length. 'char' is the fixed-length string type in MySQL, holding up to 255 characters. • 'varchar' is a variable-length field; older MySQL versions limit it to 255 characters, while versions from 5.0.3 onward allow up to 65,535 bytes, subject to the maximum row size. • There are also several 'text' types of varying lengths in MySQL.
  • 7. Date and time types: for storing dates and times.  Binary data: arbitrary data such as images or programs; absolutely anything.
  • 8. All relational databases use indexes.  Similar to the index in a book, indexes provide a quick way to find the exact data item you want.  Imagine you have a database of 100,000 customers, and you want to find just one.  If you just read the 'customers' table from start to finish until you find the one you're searching for, you could end up having to read all 100,000 records.
  • 9. This would be very slow.  Most relational databases use a b-tree index structure.  This is a balanced tree in which the number of index pages read to find a data item grows only logarithmically with the number of rows, so a lookup typically touches just a few pages even in a very large table.  Databases commonly have millions of rows - so you can see the necessity for indexes!
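
The cost of a b-tree lookup depends on the height of the tree rather than on the number of rows. The sketch below estimates that height under a simplifying assumption that is not in the slides: every index page holds the same number of keys (`fanout`, here a hypothetical 100). Real fan-out depends on the database, page size and key size.

```python
def btree_levels(num_rows: int, fanout: int) -> int:
    """Estimate how many index levels are needed to address num_rows keys
    when every index page holds `fanout` keys (an idealised, full tree)."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels = 1
    capacity = fanout  # keys addressable with a single level
    while capacity < num_rows:
        capacity *= fanout  # each extra level multiplies the reach by fanout
        levels += 1
    return levels


# Hypothetical fan-out of 100 keys per page.
levels_100k = btree_levels(100_000, 100)
levels_100m = btree_levels(100_000_000, 100)
```

Under this assumption the 100,000-customer table needs three index levels and a table a thousand times larger needs only four, which is why index lookups stay fast as tables grow.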
  • 10. Indexes are a large part of databases and their design.  Defining a column as the primary key implicitly creates an index.  If you have a primary key on a table - it has an index.  You can add a number of indexes to each table you have.
  • 11. You'd use the create index command - more later...  Indexes are used automatically by the database itself when you issue a query (ask for data).  It uses the index to find the data in the table.  For example, we want to get a customer's details from the example 'customers' table above...
  • 12. If we submit the following SQL query, the database will use the index it created for primary key column 'customer_id', and get everything for customer 1: select * from customers where customer_id = 1;  The database uses the index because it can use it.  The query contains the 'customer_id' so it  can look in the index and find the location of customer '1'.
  • 13. If there's no index on the column in the query, the database will have to go through the whole table! This is called a full table scan.
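
The difference between an index lookup and a full table scan can be observed by asking the database for its query plan. The sketch below uses SQLite through Python's standard `sqlite3` module as a stand-in for whatever database you run; the `name` and `city` columns, the sample rows and the index name `idx_customers_city` are assumptions made for illustration, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
# Hypothetical sample data: 1,000 customers spread over 50 cities.
conn.executemany(
    "INSERT INTO customers (customer_id, name, city) VALUES (?, ?, ?)",
    [(i, f"Customer {i}", f"City {i % 50}") for i in range(1, 1001)],
)


def plan(sql: str) -> str:
    """Return the human-readable query plan SQLite chooses for `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the description


# Lookup on the primary key: the database can search instead of scanning.
by_key = plan("SELECT * FROM customers WHERE customer_id = 1")

# Lookup on a column with no index: the whole table has to be scanned.
by_city = plan("SELECT * FROM customers WHERE city = 'City 7'")

# After adding an index on that column, the same query can use it.
conn.execute("CREATE INDEX idx_customers_city ON customers (city)")
by_city_indexed = plan("SELECT * FROM customers WHERE city = 'City 7'")

for label, text in [("primary key", by_key), ("no index", by_city), ("indexed", by_city_indexed)]:
    print(f"{label}: {text}")
```

The first and third plans report a search, while the second reports a scan of the table, which is the full table scan described above.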
  • 14.  These days, when you talk about databases in the wild, you are primarily talking about two types: analytical databases and operational databases. Analytic Databases  Analytic databases (a.k.a. OLAP- On Line Analytical Processing) are primarily static, read-only databases which store archived, historical data used for analysis.
  • 15. For example, a company might store sales records over the last ten years in an analytic database and use that database to analyze marketing strategies in relationship to demographics.  On the web, you will often see analytic databases in the form of inventory catalogs such as the one shown previously from Amazon.com.  An inventory catalog analytical database usually holds descriptive information about all available products in the inventory.
  • 16. Web pages are generated dynamically by querying the list of available products in the inventory against some search parameters.  The dynamically-generated page will display the information about each item (such as title, author, ISBN) which is stored in the database.
  • 17. Operational databases (a.k.a. OLTP, On Line Transaction Processing), on the other hand, are used to manage more dynamic bits of data.  These types of databases allow you to do more than simply view archived data.  Operational databases allow you to modify that data (add, change or delete data).  These types of databases are usually used to track real-time information.
  • 18. For example, a company might have an operational database used to track warehouse/stock quantities.  As customers order products from an online web store, an operational database can be used to keep track of how many items have been sold and when the company will need to reorder stock.
  • 19.  Besides differentiating databases according to function, databases can also be differentiated according to how they model the data. What is a data model?  Well, essentially a data model is a "description" of both a container for data and a methodology for storing and retrieving data from that container.  Actually, there isn't really a data model "thing".
  • 20. Data models are abstractions, oftentimes mathematical algorithms and concepts.  You cannot really touch a data model.  But nevertheless, they are very useful.  The analysis and design of data models has been the cornerstone of the evolution of databases.  As models have advanced so has database efficiency.  Before the 1980's, the two most commonly used Database Models were the hierarchical and network systems.
  • 21. As its name implies, the Hierarchical Database Model defines hierarchically-arranged data.  Perhaps the most intuitive way to visualize this type of relationship is by visualizing an upside down tree of data.  In this tree, a single table acts as the "root" of the database from which other tables "branch" out.  You will be instantly familiar with this relationship because that is how all Windows-based directory management systems (like Windows Explorer) work these days.
  • 22. Relationships in such a system are thought of in terms of children and parents such that a child may only have one parent but a parent can have multiple children.  Parents and children are tied together by links called "pointers" (perhaps physical addresses inside the file system).  A parent will have a list of pointers to each of their children.
  • 23.
  • 24. This child/parent rule assures that data is systematically accessible.  To get to a low-level table, you start at the root and work your way down through the tree until you reach your target.  Of course, as you might imagine, one problem with this system is that the user must know how the tree is structured in order to find anything!  The hierarchical model, however, is much more efficient than the flat-file model we discussed earlier because there is not as much need for redundant data.
  • 25. If a change in the data is necessary, the change might only need to be processed once. Consider the student flatfile database example from our discussion of what databases are:
  • 26. Examples of hierarchical data represented as relational tables  An organization could store employee information in a table that contains attributes/columns such as employee number, first name, last name, and Department number.  The organization provides each employee with computer hardware as needed, but computer equipment may only be used by the employee to which it is assigned.  The organization could store the computer hardware information in a separate table that includes each part's serial number, type, and the employee that uses it.
  • 27. In many ways, the Network Database model was designed to solve some of the more serious problems with the Hierarchical Database Model.  Specifically, the Network model solves the problem of data redundancy by representing relationships in terms of sets rather than hierarchy.  The model had its origins in the Conference on Data Systems Languages (CODASYL) which had created the Data Base Task Group to explore and design a method to replace the hierarchical model.
  • 28. The network model is actually very similar to the hierarchical model.  In fact, the hierarchical model is a subset of the network model.  However, instead of using a single-parent tree hierarchy, the network model uses set theory to provide a tree-like hierarchy with the exception that child tables were allowed to have more than one parent.  This allowed the network model to support many-to-many relationships.
  • 29. Visually, a Network Database looks like a hierarchical Database in that you can see it as a type of tree.  However, in the case of a Network Database, the look is more like several trees which share branches.  Thus, children can have multiple parents and parents can have multiple children.
  • 30. (RDBMS - relational database management system) A database based on the relational model developed by E.F. Codd.  A relational database allows the definition of data structures, storage and retrieval operations and integrity constraints.  In such a database the data and relations between them are organised in tables. A table is a collection of records and each record in a table contains the same fields.
  • 31. Properties of Relational Tables:  Values Are Atomic  Each Row is Unique  Column Values Are of the Same Kind  The Sequence of Columns is Insignificant  The Sequence of Rows is Insignificant  Each Column Has a Unique Name
  • 32. Certain fields may be designated as keys, which means that searches for specific values of that field will use indexing to speed them up.  Where fields in two different tables take values from the same set, a join operation can be performed to select related records in the two tables by matching values in those fields.  Often, but not always, the fields will have the same name in both tables.
  • 33. For example, an "orders" table might contain (customer-ID, product-code) pairs and a "products" table might contain (product-code, price) pairs so to calculate a given customer's bill you would sum the prices of all products ordered by that customer by joining on the product-code fields of the two tables.  This can be extended to joining multiple tables on multiple fields.
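
The bill calculation described above can be sketched as a join followed by a sum. This is a minimal illustration using SQLite through Python's `sqlite3` module; the table layouts follow the (customer-ID, product-code) and (product-code, price) pairs from the text, while the column names, product codes and prices are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    product_code TEXT PRIMARY KEY,
    price        REAL NOT NULL
);
CREATE TABLE orders (
    customer_id  INTEGER NOT NULL,
    product_code TEXT NOT NULL REFERENCES products(product_code)
);
-- Hypothetical sample data.
INSERT INTO products VALUES ('P1', 10.0), ('P2', 2.5), ('P3', 4.0);
INSERT INTO orders VALUES (1, 'P1'), (1, 'P2'), (1, 'P2'), (2, 'P3');
""")


def customer_bill(customer_id: int) -> float:
    """Sum the prices of everything the customer ordered by joining
    orders to products on the shared product_code field."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(p.price), 0)
        FROM orders AS o
        JOIN products AS p ON p.product_code = o.product_code
        WHERE o.customer_id = ?
        """,
        (customer_id,),
    ).fetchone()
    return row[0]


print(customer_bill(1))  # one P1 and two P2 items
print(customer_bill(2))  # one P3 item
```

The join is expressed only in the query, not in the stored data, which is the sense in which the relationship is specified at retrieval time.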
  • 34. Because these relationships are only specified at retrieval time, relational databases are classed as dynamic database management systems.  The RELATIONAL database model is based on the Relational Algebra.
  • 35. Object/relational database management systems (ORDBMSs) add new object storage capabilities to the relational systems at the core of modern information systems.  These new facilities integrate management of traditional fielded data, complex objects such as time-series and geospatial data and diverse binary media such as audio, video, images, and applets.
  • 36. By encapsulating methods with data structures, an ORDBMS server can execute complex analytical and data manipulation operations to search and transform multimedia and other complex objects.  As an evolutionary technology, the object/relational (OR) approach has inherited the robust transaction- and performance-management features of its relational ancestor and the flexibility of its object-oriented cousin.
  • 37. Database designers can work with familiar tabular structures and data definition languages (DDLs) while assimilating new object-management possibilities.  Query and procedural languages and call interfaces in ORDBMSs are familiar: SQL3, vendor procedural languages, and ODBC, JDBC, and proprietary call interfaces are all extensions of RDBMS languages and interfaces.
  • 38. And the leading vendors are, of course, quite well known: IBM, Informix, and Oracle.
  • 39. Object DBMSs add database functionality to object programming languages.  They bring much more than persistent storage of programming language objects.  Object DBMSs extend the semantics of the C++, Smalltalk and Java object programming languages to provide full-featured database programming capability, while retaining native language compatibility.
  • 40. A major benefit of this approach is the unification of the application and database development into a seamless data model and language environment.  As a result, applications require less code, use more natural data modeling, and code bases are easier to maintain.  Object developers can write complete database applications with a modest amount of additional effort.
  • 41. According to Rao (1994), "The object-oriented database (OODB) paradigm is the combination of object-oriented programming language (OOPL) systems and persistent systems.  The power of the OODB comes from the seamless treatment of both persistent data, as found in databases, and transient data, as found in executing programs."
  • 42. In contrast to a relational DBMS where a complex data structure must be flattened out to fit into tables or joined together from those tables to form the in-memory structure, object DBMSs have no performance overhead to store or retrieve a web or hierarchy of interrelated objects.  This one-to-one mapping of object programming language objects to database objects has two benefits over other storage approaches:
  • 43. It provides higher performance management of objects, and it enables better management of the complex interrelationships between objects.  This makes object DBMSs better suited to support applications such as financial portfolio risk analysis systems, telecommunications service applications, world wide web document structures, design and manufacturing systems, and hospital patient record systems, which have complex relationships between data.
  • 44. In the semistructured data model, the information that is normally associated with a schema is contained within the data, which is sometimes called "self-describing".  In such a database there is no clear separation between the data and the schema, and the degree to which it is structured depends on the application.  In some forms of semistructured data there is no separate schema; in others it exists but only places loose constraints on the data.
  • 45. Semi-structured data is naturally modelled in terms of graphs which contain labels which give semantics to its underlying structure.  Such databases subsume the modelling power of recent extensions of flat relational databases, to nested databases which allow the nesting (or encapsulation) of entities, and to object databases which, in addition, allow cyclic references between objects.
  • 46. The associative model divides the real-world things about which data is to be recorded into two sorts:  Entities are things that have discrete, independent existence.  An entity’s existence does not depend on any other thing.  Associations are things whose existence depends on one or more other things, such that if any of those things ceases to exist, then the thing itself ceases to exist or becomes meaningless.
  • 47. An associative database comprises two data structures: 1. A set of items, each of which has a unique identifier, a name and a type. 2. A set of links, each of which has a unique identifier, together with the unique identifiers of three other things, that represent the source, verb and target of a fact that is recorded about the source in the database. Each of the three things identified by the source, verb and target may be either a link or an item.
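
The two structures above can be sketched with plain Python data classes. Everything concrete here is an assumption made for illustration: the item names, the type labels, the identifier values, and the choice to let items and links share one identifier space so that a link can point at either.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    id: int
    name: str
    type: str


@dataclass(frozen=True)
class Link:
    id: int
    source: int  # identifier of an item or of another link
    verb: int
    target: int


# Hypothetical data: identifiers below 100 are items, the rest are links.
items = {
    1: Item(1, "Flight 101", "entity"),
    2: Item(2, "arrived at", "verb"),
    3: Item(3, "Airport X", "entity"),
    4: Item(4, "on", "verb"),
    5: Item(5, "Monday", "entity"),
}
links = {
    100: Link(100, source=1, verb=2, target=3),
    # The source of this link is itself a link, qualifying the first fact.
    101: Link(101, source=100, verb=4, target=5),
}


def describe(ref: int) -> str:
    """Render an item by name, or a link as 'source verb target', recursively."""
    if ref in items:
        return items[ref].name
    link = links[ref]
    return " ".join(describe(part) for part in (link.source, link.verb, link.target))


print(describe(101))
```

Because a link may use another link as its source, a fact such as the arrival can itself be qualified by a further fact without changing any table structure.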
  • 48. The best way to understand the rationale of EAV design is to understand row modeling (of which EAV is a generalized form).  Consider a supermarket database that must manage thousands of products and brands, many of which have a transitory existence.  Here, it is intuitively obvious that product names should not be hard-coded as names of columns in tables. Instead, one stores product descriptions in a Products table: purchases/sales of individual items are recorded in other tables as separate rows with a product ID referencing this table.
  • 49. Conceptually an EAV design involves a single table with three columns, an entity (such as an olfactory receptor ID), an attribute (such as species, which is actually a pointer into the metadata table) and a value for the attribute (e.g., rat). In EAV design, one row stores a single fact.  In a conventional table that has one column per attribute, by contrast, one row stores a set of facts. EAV design is appropriate when the number of parameters that potentially apply to an entity is vastly more than those that actually apply to an individual entity.
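
A minimal sketch of the three-column design described above, using SQLite through Python's `sqlite3` module. The entity and attribute values are hypothetical, and for brevity the attribute is stored as plain text rather than as a pointer into a separate metadata table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE eav (
    entity    TEXT NOT NULL,
    attribute TEXT NOT NULL,
    value     TEXT,
    PRIMARY KEY (entity, attribute)
);
-- One row stores one fact. Hypothetical sample data.
INSERT INTO eav VALUES
    ('receptor_1', 'species', 'rat'),
    ('receptor_1', 'tissue',  'olfactory epithelium'),
    ('receptor_2', 'species', 'mouse');
""")


def facts_for(entity: str) -> dict:
    """Collect every attribute/value pair recorded for one entity."""
    rows = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ? ORDER BY attribute",
        (entity,),
    )
    return dict(rows.fetchall())


print(facts_for("receptor_1"))
print(facts_for("receptor_2"))
```

Note that `receptor_2` simply has no `tissue` row; in a conventional one-column-per-attribute table the same entity would carry an empty column for every attribute that does not apply to it.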
  • 50. The context data model combines features of all the above models.  It can be considered as a collection of object-oriented, network and semistructured models or as some kind of object database.  In other words, this is a flexible model: you can use any type of database structure depending on the task. Such a data model has been implemented in DBMS ConteXt.  The fundamental unit of information storage of ConteXt is a CLASS.
  • 51. Class contains METHODS and describes OBJECT.  The Object contains FIELDS and PROPERTY. The field may be composite, in this case the field contains SubFields etc.  The property is a set of fields that belongs to particular Object. (similar to AVL database). In other words, fields are permanent part of Object but Property is its variable part.  The header of Class contains the definition of the internal structure of the Object, which includes the description of each field, such as their type, length, attributes and name.
  • 52. Context data model has a set of predefined types as well as user defined types.  The predefined types include not only character strings, texts and digits but also pointers (references) and aggregate types (structures).  A context model comprises three main data types: REGULAR, VIRTUAL and REFERENCE.
  • 53. Database design is the process of producing a detailed data model of a database.  This logical data model contains all the needed logical and physical design choices and physical storage parameters needed to generate a design in a Data Definition Language, which can then be used to create a database.  A fully attributed data model contains detailed attributes for each entity.
  • 54. The term database design can be used to describe many different parts of the design of an overall database system.  Principally, and most correctly, it can be thought of as the logical design of the base data structures used to store the data.  In the relational model these are the tables and views.
  • 55. Conceptual schema:  A conceptual schema or conceptual data model is a map of concepts and their relationships.  This describes the semantics of an organization and represents a series of assertions about its nature.  Specifically, it describes the things of significance to an organization (entity classes), about which it is inclined to collect information, and characteristics of (attributes) and associations between pairs of those things of significance (relationships).
  • 56. Because a conceptual schema represents the semantics of an organization, and not a database design, it may exist on various levels of abstraction.  Conceptual data models take a more abstract perspective, identifying the fundamental things, of which the things an individual deals with are just examples.  The model does allow for what is called inheritance in object oriented terms.
  • 57. A data structure diagram (DSD) is a data model or diagram used to describe conceptual data models by providing graphical notations which document entities and their relationships, and the constraints that bind them.
  • 58. Once the relationships and dependencies amongst the various pieces of information have been determined, it is possible to arrange the data into a logical structure which can then be mapped into the storage objects supported by the database management system.  Normalisation procedures and the definition of integrity rules ensure that the stored database will be non-redundant and properly connected.  Logical data structuring is based on the identification of: the entities, their attributes, and the relationships between the entities.
  • 59. Entity:  Something about which an enterprise needs to keep data. Attributes:  The properties of an entity. Relationships  The connections between entities.
  • 60.  An Entity may be physical Example: an Employee; a Part; a Machine  Or conceptual Example: a Project; an Order; a Course.  Each instance of an entity is different from all others - one or more attributes will typically form a 'primary key' attribute - unique to a particular instance.
  • 61. Attributes are the properties of an entity.  Data which describes or is 'owned' by an entity. Attributes (data) equate to facts - specific details about entities - details of interest.
  • 62. In the real world, objects do not exist in isolation.  Our understanding of real world objects is in terms of their relationships with other objects; for example, 'the earth circles the sun'; 'he is a carpenter'; etc.  Any real world object which we are going to include in a data model as an entity type must have some relationship with at least one other entity within the model (even if we are not going to implement that relationship within our database system).
  • 63. One-to-one:  Both tables can have only one record on either side of the relationship.  Each primary key value relates to only one (or no) record in the related table.  Most one-to-one relationships are forced by business rules and don't flow naturally from the data.  In the absence of such a rule, you can usually combine both tables into one table without breaking any normalization rules.
  • 64. One-to-One Relationships Contd: For example: a Factory may have many Managers during its lifetime; a Manager might be in charge of different Factories during his career.
  • 65. One-to-many:  The primary key table contains only one record that relates to none, one, or many records in the related table.  This relationship is similar to the one between you and a parent.  You have only one mother, but your mother may have several children.
  • 66.
  • 67. One-to-many Contd: A formal description of the relationship shown in the diagram above is:  One Factory may make zero or more Components.  One Component is made in one (and only one) Factory.
  • 68.
  • 69. One-to-many: Contd: What this means in a database system is that:  one record in a table called Factory may be related to a number of records in a Component table; but  a record in the Component table can only be related to one record in the Factory table.
  • 70. One-to-Many Relationships summarised: For any occurrence of A, there may be 0, 1, or many, occurrences of B. For any occurrence of B, there can only be one occurrence of A. From another perspective:  If an 'A' record exists there may be zero or more related 'B' records. Any 'B' record can only be related to a single 'A' record.
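
A one-to-many relationship is usually implemented by putting a foreign key on the "many" side. The sketch below models the Factory and Component example with SQLite through Python's `sqlite3` module; the column names and sample rows are assumptions, and foreign-key checking has to be switched on explicitly in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE factory (
    factory_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE component (
    component_id INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    -- Exactly one factory per component: NOT NULL plus a foreign key.
    factory_id   INTEGER NOT NULL REFERENCES factory(factory_id)
);
-- Hypothetical sample data.
INSERT INTO factory VALUES (1, 'Factory A'), (2, 'Factory B');
INSERT INTO component VALUES (10, 'Gear', 1), (11, 'Axle', 1);
""")


def components_per_factory() -> dict:
    """Count components for every factory, including factories with none."""
    rows = conn.execute("""
        SELECT f.name, COUNT(c.component_id)
        FROM factory AS f
        LEFT JOIN component AS c ON c.factory_id = f.factory_id
        GROUP BY f.factory_id
        ORDER BY f.factory_id
    """)
    return dict(rows.fetchall())


def add_component(component_id: int, name: str, factory_id: int) -> bool:
    """Insert a component; return False if its factory does not exist."""
    try:
        conn.execute(
            "INSERT INTO component VALUES (?, ?, ?)",
            (component_id, name, factory_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(components_per_factory())
print(add_component(12, "Bolt", 99))  # factory 99 does not exist
```

Factory B appears with a count of zero, matching "zero or more" on the A side, while the rejected insert shows that every B record must relate to exactly one existing A record.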
  • 71. Many-to-many:  Each record in both tables can relate to any number of records (or no records) in the other table.  For instance, if you have several siblings, each of your siblings also has several siblings.  Many-to-many relationships require a third table, known as an associative or linking table, because relational systems can't directly accommodate the relationship.
  • 72. Many-to-many: Contd:  Minimally, a many-many relationship will require insertion of a 'link entity'.  Further analysis may show that the link entity has attributes of its own - often qualifiers in respect of quantity or time.
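
A linking table can be sketched as follows, using SQLite through Python's `sqlite3` module. The example assumes factories and managers related over time, with `start_year` as the link entity's own time qualifier; all table names, column names and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE factory (factory_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE manager (manager_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- The link entity: one row per (factory, manager) assignment.
CREATE TABLE factory_manager (
    factory_id INTEGER NOT NULL REFERENCES factory(factory_id),
    manager_id INTEGER NOT NULL REFERENCES manager(manager_id),
    start_year INTEGER NOT NULL,  -- attribute that belongs to the link itself
    PRIMARY KEY (factory_id, manager_id, start_year)
);
-- Hypothetical sample data.
INSERT INTO factory VALUES (1, 'Factory A'), (2, 'Factory B');
INSERT INTO manager VALUES (1, 'Manager X'), (2, 'Manager Y');
INSERT INTO factory_manager VALUES (1, 1, 2001), (1, 2, 2005), (2, 1, 2005);
""")


def managers_of(factory_name: str) -> list:
    """All managers a factory has had, oldest assignment first."""
    rows = conn.execute("""
        SELECT m.name, fm.start_year
        FROM factory AS f
        JOIN factory_manager AS fm ON fm.factory_id = f.factory_id
        JOIN manager AS m ON m.manager_id = fm.manager_id
        WHERE f.name = ?
        ORDER BY fm.start_year
    """, (factory_name,))
    return rows.fetchall()


def factories_of(manager_name: str) -> list:
    """All factories a manager has run, oldest assignment first."""
    rows = conn.execute("""
        SELECT f.name, fm.start_year
        FROM manager AS m
        JOIN factory_manager AS fm ON fm.manager_id = m.manager_id
        JOIN factory AS f ON f.factory_id = fm.factory_id
        WHERE m.name = ?
        ORDER BY fm.start_year
    """, (manager_name,))
    return rows.fetchall()


print(managers_of("Factory A"))
print(factories_of("Manager X"))
```

Each side of the many-to-many relationship becomes an ordinary one-to-many relationship with the linking table.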
  • 74. The physical design of the database specifies the physical configuration of the database on the storage media.  This includes detailed specification of data elements, data types, indexing options and other parameters residing in the DBMS data dictionary.  It is the detailed design of a system that includes modules & the database's hardware & software specifications of the system.  In the case of relational databases the storage objects are tables which store data in rows and columns.
  • 75. • The purpose of normalization • Data redundancy and Update Anomalies • Functional Dependencies • The Process of Normalization • First Normal Form (1NF) • Second Normal Form (2NF) • Third Normal Form (3NF)
  • 76. Normalization is a technique for producing a set of relations with desirable properties, given the data requirements of an enterprise. The process of normalization is a formal method that identifies relations based on their primary or candidate keys and the functional dependencies among their attributes.
  • 77. Relations that have redundant data may have problems called update anomalies, which are classified as: • Insertion anomalies • Deletion anomalies • Modification anomalies
  • 78. Operations on the StaffBranch relation that can cause update anomalies: • inserting a new member of staff with branchNo B007 into the StaffBranch relation; • deleting the tuple that represents the last member of staff located at branch B007; • changing the address of branch B003.
    StaffBranch
    staffNo  sName        position    salary  branchNo  bAddress
    SL21     John White   Manager     30000   B005      22 Deer Rd, London
    SG37     Ann Beech    Assistant   12000   B003      163 Main St, Glasgow
    SG14     David Ford   Supervisor  18000   B003      163 Main St, Glasgow
    SA9      Mary Howe    Assistant   9000    B007      16 Argyll St, Aberdeen
    SG5      Susan Brand  Manager     24000   B003      163 Main St, Glasgow
    SL41     Julie Lee    Assistant   9000    B005      22 Deer Rd, London
    Figure 1 StaffBranch relation
  • 79. Staff
    staffNo  sName        position    salary  branchNo
    SL21     John White   Manager     30000   B005
    SG37     Ann Beech    Assistant   12000   B003
    SG14     David Ford   Supervisor  18000   B003
    SA9      Mary Howe    Assistant   9000    B007
    SG5      Susan Brand  Manager     24000   B003
    SL41     Julie Lee    Assistant   9000    B005
    Branch
    branchNo  bAddress
    B005      22 Deer Rd, London
    B007      16 Argyll St, Aberdeen
    B003      163 Main St, Glasgow
    Figure 2 Staff and Branch relations
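
The effect of splitting StaffBranch into Staff and Branch can be checked with a short experiment. The sketch below loads the Figure 1 rows into SQLite through Python's `sqlite3` module, derives the two smaller relations by projection, confirms that joining them gives back the original rows, and then counts how many rows a change of address for branch B003 touches in each design. The table names in lower case and the new address are assumptions made for the example.

```python
import sqlite3

# Rows of the StaffBranch relation from Figure 1.
ROWS = [
    ("SL21", "John White", "Manager", 30000, "B005", "22 Deer Rd, London"),
    ("SG37", "Ann Beech", "Assistant", 12000, "B003", "163 Main St, Glasgow"),
    ("SG14", "David Ford", "Supervisor", 18000, "B003", "163 Main St, Glasgow"),
    ("SA9", "Mary Howe", "Assistant", 9000, "B007", "16 Argyll St, Aberdeen"),
    ("SG5", "Susan Brand", "Manager", 24000, "B003", "163 Main St, Glasgow"),
    ("SL41", "Julie Lee", "Assistant", 9000, "B005", "22 Deer Rd, London"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff_branch (staffNo TEXT PRIMARY KEY, sName TEXT, "
    "position TEXT, salary INTEGER, branchNo TEXT, bAddress TEXT)"
)
conn.executemany("INSERT INTO staff_branch VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# Decompose by projection: each branch address is now stored once.
conn.execute("CREATE TABLE branch AS SELECT DISTINCT branchNo, bAddress FROM staff_branch")
conn.execute(
    "CREATE TABLE staff AS SELECT staffNo, sName, position, salary, branchNo FROM staff_branch"
)

# Joining the two relations reproduces the original rows.
rejoined = set(conn.execute("""
    SELECT s.staffNo, s.sName, s.position, s.salary, s.branchNo, b.bAddress
    FROM staff AS s JOIN branch AS b ON b.branchNo = s.branchNo
"""))
lossless = rejoined == set(ROWS)

# Modification: change the address of branch B003 in both designs.
new_address = "1 New St, Glasgow"  # hypothetical value
rows_touched_single = conn.execute(
    "UPDATE staff_branch SET bAddress = ? WHERE branchNo = 'B003'", (new_address,)
).rowcount
rows_touched_split = conn.execute(
    "UPDATE branch SET bAddress = ? WHERE branchNo = 'B003'", (new_address,)
).rowcount

print(lossless, rows_touched_single, rows_touched_split)
```

In the single relation every B003 staff row must be updated consistently, whereas the decomposed design changes one Branch row, which is how the decomposition avoids the modification anomaly.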
  • 80. Functional dependency describes the relationship between attributes in a relation. For example, if A and B are attributes of relation R, and B is functionally dependent on A ( denoted A B), if each value of A is associated with exactly one value of B. ( A and B may each consist of one or more attributes.) B is functionally A B dependent on A Determinant Refers to the attribute or group of attributes on the left-hand side of the arrow of a functional dependency
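
The definition above translates directly into a check over a set of rows: a dependency A → B is violated as soon as one value of A appears with two different values of B. The sketch below is a hypothetical helper, `fd_holds`, applied to some columns of the StaffBranch rows from Figure 1. A sample of data can only refute a dependency; it cannot prove that the dependency holds for all time.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if, within `rows`, every combination of values for the
    `lhs` attributes is associated with exactly one combination of `rhs` values."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = tuple(row[attr] for attr in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Selected columns of the StaffBranch relation (Figure 1).
staff_branch = [
    {"staffNo": "SL21", "position": "Manager", "salary": 30000, "branchNo": "B005", "bAddress": "22 Deer Rd, London"},
    {"staffNo": "SG37", "position": "Assistant", "salary": 12000, "branchNo": "B003", "bAddress": "163 Main St, Glasgow"},
    {"staffNo": "SG14", "position": "Supervisor", "salary": 18000, "branchNo": "B003", "bAddress": "163 Main St, Glasgow"},
    {"staffNo": "SA9", "position": "Assistant", "salary": 9000, "branchNo": "B007", "bAddress": "16 Argyll St, Aberdeen"},
    {"staffNo": "SG5", "position": "Manager", "salary": 24000, "branchNo": "B003", "bAddress": "163 Main St, Glasgow"},
    {"staffNo": "SL41", "position": "Assistant", "salary": 9000, "branchNo": "B005", "bAddress": "22 Deer Rd, London"},
]

branch_determines_address = fd_holds(staff_branch, ["branchNo"], ["bAddress"])
position_determines_salary = fd_holds(staff_branch, ["position"], ["salary"])
print(branch_determines_address, position_determines_salary)
```

In this data branchNo → bAddress is consistent with every row, while position → salary is refuted because the two Managers earn different salaries.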
  • 81. Trivial functional dependency means that the right-hand side is a subset (not necessarily a proper subset) of the left-hand side. For example (see Figure 1): staffNo, sName → sName and staffNo, sName → staffNo. They do not provide any additional information about possible integrity constraints on the values held by these attributes. We are normally more interested in nontrivial dependencies because they represent integrity constraints for the relation.
  • 82. Main characteristics of functional dependencies in normalization: • have a one-to-one relationship between the attribute(s) on the left- and right-hand sides of a dependency; • hold for all time; • are nontrivial.
  • 83. Identifying the primary key Functional dependency is a property of the meaning or semantics of the attributes in a relation. When a functional dependency is present, the dependency is specified as a constraint between the attributes. An important integrity constraint to consider first is the identification of candidate keys, one of which is selected to be the primary key for the relation using functional dependency.
  • 84. Inference Rules The set of all functional dependencies that are implied by a given set of functional dependencies X is called the closure of X, written X+. A set of inference rules is needed to compute X+ from X. Armstrong's axioms (rules 1 to 3) and further rules that can be derived from them (rules 4 to 7):
      1. Reflexivity: If B is a subset of A, then A → B
      2. Augmentation: If A → B, then A, C → B, C
      3. Transitivity: If A → B and B → C, then A → C
      4. Self-determination: A → A
      5. Decomposition: If A → B, C then A → B and A → C
      6. Union: If A → B and A → C, then A → B, C
      7. Composition: If A → B and C → D, then A, C → B, D
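In practice the inference rules are often applied through the closure of a set of attributes (everything that the attributes determine), which is a related but smaller computation than the full closure X+ of a set of dependencies. The sketch below is illustrative only: the function name attribute_closure is hypothetical, and the two dependencies are a condensed form of those that hold in the StaffBranch relation.

```python
def attribute_closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds.

    fds is a list of (determinant, dependent) pairs of attribute-name tuples.
    Repeatedly applying each dependency whose determinant is already covered
    corresponds to using reflexivity, transitivity and union.
    """
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


fds = [
    (("staffNo",), ("sName", "position", "salary", "branchNo")),
    (("branchNo",), ("bAddress",)),
]
all_attributes = {"staffNo", "sName", "position", "salary", "branchNo", "bAddress"}

print(attribute_closure({"staffNo"}, fds) == all_attributes)  # True
print(sorted(attribute_closure({"branchNo"}, fds)))           # ['bAddress', 'branchNo']
```

Because the closure of staffNo covers every attribute, staffNo is a candidate key of StaffBranch; branchNo is not.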
  • 85. Minimal Sets of Functional Dependencies A set of functional dependencies X is minimal if it satisfies the following conditions: • Every dependency in X has a single attribute on its right-hand side. • We cannot replace any dependency A → B in X with a dependency C → B, where C is a proper subset of A, and still have a set of dependencies that is equivalent to X. • We cannot remove any dependency from X and still have a set of dependencies that is equivalent to X.
  • 86. Example of a Minimal Set of Functional Dependencies A set of functional dependencies for the StaffBranch relation satisfies the three conditions for producing a minimal set.
      staffNo → sName
      staffNo → position
      staffNo → salary
      staffNo → branchNo
      staffNo → bAddress
      branchNo → bAddress
      branchNo, position → salary
      bAddress, position → salary
  • 87. • Multivalued Attributes (or repeating groups): non-key attributes, or groups of non-key attributes, whose values are not uniquely identified (directly or indirectly) by the value of the primary key or part of it; that is, they are not functionally dependent on it.
      STUDENT
      Stud_ID  Name     Course_ID  Units
      101      Lennon   MSI 250    3.00
      101      Lennon   MSI 415    3.00
      125      Johnson  MSI 331    3.00
  • 88. • Partial Dependency: when a non-key attribute is determined by a part, but not the whole, of a COMPOSITE primary key.
      CUSTOMER
      Cust_ID  Name   Order_ID
      101      AT&T   1234
      101      AT&T   156
      125      Cisco  1250
  • 89. • Transitive Dependency: when a non-key attribute determines another non-key attribute.
      EMPLOYEE
      Emp_ID  F_Name  L_Name  Dept_ID  Dept_Name
      111     Mary    Jones   1        Acct
      122     Sarah   Smith   2        Mktg
  • 90. Normalization is often executed as a series of steps. Each step corresponds to a specific normal form that has known properties. • As normalization proceeds, the relations become progressively more restricted in format, and also less vulnerable to update anomalies. • For the relational data model, it is important to recognize that it is only first normal form (1NF) that is critical in creating relations. All the subsequent normal forms are optional.
  • 91. Unnormalized – There are multivalued attributes or repeating groups • 1 NF – No multivalued attributes or repeating groups. • 2 NF – 1 NF plus no partial dependencies • 3 NF – 2 NF plus no transitive dependencies
  • 92. BOOK (ISBN, Title, Publisher, Address) • ISBN → Title • ISBN → Publisher • Publisher → Address All attributes are directly or indirectly determined by the primary key; therefore, the relation is at least in 1NF.
  • 93. BOOK (ISBN, Title, Publisher, Address) • ISBN → Title • ISBN → Publisher • Publisher → Address The relation is at least in 1NF. There is no COMPOSITE primary key, therefore there can't be partial dependencies. Therefore, the relation is at least in 2NF.
  • 94. BOOK (ISBN, Title, Publisher, Address) • ISBN → Title • ISBN → Publisher • Publisher → Address Publisher is a non-key attribute, and it determines Address, another non-key attribute. Therefore, there is a transitive dependency, which means that the relation is NOT in 3NF.
  • 95. BOOK (ISBN, Title, Publisher, Address) • ISBN → Title • ISBN → Publisher • Publisher → Address We know that the relation is at least in 2NF, and it is not in 3NF. Therefore, we conclude that the relation is in 2NF.
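The reasoning applied to the BOOK relation can be written as a short procedure. This is a simplified sketch that follows the slide definitions only: it assumes the relation is already in 1NF, considers a single primary key (other candidate keys are ignored), and treats a dependency as transitive when its determinant consists of non-key attributes. The function name highest_normal_form is hypothetical.

```python
def highest_normal_form(primary_key, fds):
    """Classify a 1NF relation as '1NF', '2NF' or '3NF'.

    fds is a list of (determinant, dependent) pairs, each a set of attribute names.
    """
    key = set(primary_key)
    partial = transitive = False
    for lhs, rhs in fds:
        if not (set(rhs) - key):
            continue  # only dependencies that determine non-key attributes matter here
        lhs = set(lhs)
        if lhs < key:
            partial = True      # determined by part of a composite key
        elif not (lhs & key):
            transitive = True   # determined by non-key attributes
    if partial:
        return "1NF"
    if transitive:
        return "2NF"
    return "3NF"


book_fds = [
    ({"ISBN"}, {"Title"}),
    ({"ISBN"}, {"Publisher"}),
    ({"Publisher"}, {"Address"}),
]
print(highest_normal_form({"ISBN"}, book_fds))  # 2NF
```

Dropping the dependency Publisher → Address (for example, by moving Address to a separate PUBLISHER relation) makes the same call return 3NF.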
  • 96. • Option 2: Remove the entire repeating group from the relation. Create another relation which contains all the attributes of the repeating group, plus the primary key from the first relation. In this new relation, the primary key from the original relation and the determinant of the repeating group together form the primary key.
      STUDENT
      Stud_ID  Name     Course_ID  Units
      101      Lennon   MSI 250    3.00
      101      Lennon   MSI 415    3.00
      125      Johnson  MSI 331    3.00
  • 97. STUDENT
      Stud_ID  Name
      101      Lennon
      125      Johnson
      STUDENT_COURSE
      Stud_ID  Course_ID  Units
      101      MSI 250    3
      101      MSI 415    3
      125      MSI 331    3
  • 98. STUDENT, with composite primary key (Stud_ID, Course_ID)
      Stud_ID  Name     Course_ID  Units
      101      Lennon   MSI 250    3.00
      101      Lennon   MSI 415    3.00
      125      Johnson  MSI 331    3.00
  • 99. • Goal: Remove Partial Dependencies. Composite primary key: (Stud_ID, Course_ID). Partial dependencies: Stud_ID → Name and Course_ID → Units.
      STUDENT
      Stud_ID  Name     Course_ID  Units
      101      Lennon   MSI 250    3.00
      101      Lennon   MSI 415    3.00
      125      Johnson  MSI 331    3.00
  • 100. • Remove the attributes that are dependent on part, but not the whole, of the primary key from the original relation. For each partial dependency, create a new relation, with the corresponding part of the primary key from the original as the primary key.
      STUDENT
      Stud_ID  Name     Course_ID  Units
      101      Lennon   MSI 250    3.00
      101      Lennon   MSI 415    3.00
      125      Johnson  MSI 331    3.00
  • 101. Result of removing the partial dependencies:
      STUDENT_COURSE
      Stud_ID  Course_ID
      101      MSI 250
      101      MSI 415
      125      MSI 331
      STUDENT
      Stud_ID  Name
      101      Lennon
      125      Johnson
      COURSE
      Course_ID  Units
      MSI 250    3.00
      MSI 415    3.00
      MSI 331    3.00
  • 102. • Goal: Get rid of transitive dependencies. Transitive dependency: Dept_ID → Dept_Name.
      EMPLOYEE
      Emp_ID  F_Name  L_Name  Dept_ID  Dept_Name
      111     Mary    Jones   1        Acct
      122     Sarah   Smith   2        Mktg
  • 103. • Remove the attributes that are dependent on a non-key attribute from the original relation. For each transitive dependency, create a new relation with the non-key attribute that is the determinant in the transitive dependency as its primary key, and the dependent non-key attribute as a dependent.
      EMPLOYEE
      Emp_ID  F_Name  L_Name  Dept_ID  Dept_Name
      111     Mary    Jones   1        Acct
      122     Sarah   Smith   2        Mktg
  • 104. EMPLOYEE (before)
      Emp_ID  F_Name  L_Name  Dept_ID  Dept_Name
      111     Mary    Jones   1        Acct
      122     Sarah   Smith   2        Mktg
      EMPLOYEE (after)
      Emp_ID  F_Name  L_Name  Dept_ID
      111     Mary    Jones   1
      122     Sarah   Smith   2
      DEPARTMENT
      Dept_ID  Dept_Name
      1        Acct
      2        Mktg
  • 105. Unnormalized form (UNF): A table that contains one or more repeating groups. Repeating group = (propertyNo, pAddress, rentStart, rentFinish, rent, ownerNo, oName)
      clientNo  cName          propertyNo  pAddress                rentStart  rentFinish  rent  ownerNo  oName
      CR76      John Kay       PG4         6 Lawrence St, Glasgow  1-Jul-00   31-Aug-01   350   CO40     Tina Murphy
                               PG16        5 Novar Dr, Glasgow     1-Sep-02   1-Sep-02    450   CO93     Tony Shaw
      CR56      Aline Stewart  PG4         6 Lawrence St, Glasgow  1-Sep-99   10-Jun-00   350   CO40     Tina Murphy
                               PG36        2 Manor Rd, Glasgow     10-Oct-00  1-Dec-01    370   CO93     Tony Shaw
                               PG16        5 Novar Dr, Glasgow     1-Nov-02   1-Aug-03    450   CO93     Tony Shaw
      Figure 3 ClientRental unnormalized table
  • 106. First Normal Form (1NF) is a relation in which the intersection of each row and column contains one and only one value. There are two approaches to removing repeating groups from unnormalized tables: 1. Remove the repeating groups by entering appropriate data in the empty columns of rows containing the repeating data. 2. Remove the repeating group by placing the repeating data, along with a copy of the original key attribute(s), in a separate relation. A primary key is identified for the new relation.
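The two approaches can be sketched in a few lines of Python. This is an illustration only: the nested "rentals" list stands in for the repeating group, only a few ClientRental attributes are included, and the function names flatten and split are hypothetical.

```python
# Unnormalized input: each client record carries a repeating group of rentals.
unnormalized = [
    {"clientNo": "CR76", "cName": "John Kay", "rentals": [
        {"propertyNo": "PG4", "rent": 350},
        {"propertyNo": "PG16", "rent": 450},
    ]},
    {"clientNo": "CR56", "cName": "Aline Stewart", "rentals": [
        {"propertyNo": "PG4", "rent": 350},
        {"propertyNo": "PG36", "rent": 370},
        {"propertyNo": "PG16", "rent": 450},
    ]},
]


def flatten(records):
    """Approach 1: repeat the client data on every rental row."""
    return [
        {"clientNo": r["clientNo"], "cName": r["cName"], **rental}
        for r in records
        for rental in r["rentals"]
    ]


def split(records):
    """Approach 2: move the repeating group to its own relation with a copy of clientNo."""
    clients = [{"clientNo": r["clientNo"], "cName": r["cName"]} for r in records]
    rentals = [
        {"clientNo": r["clientNo"], **rental}
        for r in records
        for rental in r["rentals"]
    ]
    return clients, rentals


one_table = flatten(unnormalized)
clients, rentals = split(unnormalized)
print(len(one_table), len(clients), len(rentals))  # 5 2 5
```

The first approach keeps one relation but repeats cName on every row; the second avoids that repetition at the cost of a second relation whose primary key is (clientNo, propertyNo).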
  • 107. With the first approach, we remove the repeating group (property rented details) by entering the appropriate client data into each row. The ClientRental relation is defined as follows: ClientRental (clientNo, propertyNo, cName, pAddress, rentStart, rentFinish, rent, ownerNo, oName)
      clientNo  propertyNo  cName          pAddress                rentStart  rentFinish  rent  ownerNo  oName
      CR76      PG4         John Kay       6 Lawrence St, Glasgow  1-Jul-00   31-Aug-01   350   CO40     Tina Murphy
      CR76      PG16        John Kay       5 Novar Dr, Glasgow     1-Sep-02   1-Sep-02    450   CO93     Tony Shaw
      CR56      PG4         Aline Stewart  6 Lawrence St, Glasgow  1-Sep-99   10-Jun-00   350   CO40     Tina Murphy
      CR56      PG36        Aline Stewart  2 Manor Rd, Glasgow     10-Oct-00  1-Dec-01    370   CO93     Tony Shaw
      CR56      PG16        Aline Stewart  5 Novar Dr, Glasgow     1-Nov-02   1-Aug-03    450   CO93     Tony Shaw
      Figure 4 1NF ClientRental relation with the first approach
  • 108. With the second approach, we remove the repeating group (property rented details) by placing the repeating data, along with a copy of the original key attribute (clientNo), in a separate relation. Client (clientNo, cName) PropertyRentalOwner (clientNo, propertyNo, pAddress, rentStart, rentFinish, rent, ownerNo, oName)
      Client
      clientNo  cName
      CR76      John Kay
      CR56      Aline Stewart
      PropertyRentalOwner
      clientNo  propertyNo  pAddress                rentStart  rentFinish  rent  ownerNo  oName
      CR76      PG4         6 Lawrence St, Glasgow  1-Jul-00   31-Aug-01   350   CO40     Tina Murphy
      CR76      PG16        5 Novar Dr, Glasgow     1-Sep-02   1-Sep-02    450   CO93     Tony Shaw
      CR56      PG4         6 Lawrence St, Glasgow  1-Sep-99   10-Jun-00   350   CO40     Tina Murphy
      CR56      PG36        2 Manor Rd, Glasgow     10-Oct-00  1-Dec-01    370   CO93     Tony Shaw
      CR56      PG16        5 Novar Dr, Glasgow     1-Nov-02   1-Aug-03    450   CO93     Tony Shaw
      Figure 5 1NF ClientRental relation with the second approach
  • 109. Full functional dependency indicates that if A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A. A functional dependency A → B is partial if there is some attribute that can be removed from A and the dependency still holds.
  • 110. Second normal form (2NF) is a relation that is in first normal form and in which every non-primary-key attribute is fully functionally dependent on the primary key. The normalization of 1NF relations to 2NF involves the removal of partial dependencies. If a partial dependency exists, we remove the functionally dependent attributes from the relation by placing them in a new relation along with a copy of their determinant.
  • 111. The ClientRental relation has the following functional dependencies:
      fd1  clientNo, propertyNo → rentStart, rentFinish  (Primary key)
      fd2  clientNo → cName  (Partial dependency)
      fd3  propertyNo → pAddress, rent, ownerNo, oName  (Partial dependency)
      fd4  ownerNo → oName  (Transitive dependency)
      fd5  clientNo, rentStart → propertyNo, pAddress, rentFinish, rent, ownerNo, oName  (Candidate key)
      fd6  propertyNo, rentStart → clientNo, cName, rentFinish  (Candidate key)
  • 112. Removing the partial dependencies results in the creation of three new relations called Client, Rental, and PropertyOwner:
      Client (clientNo, cName)
      Rental (clientNo, propertyNo, rentStart, rentFinish)
      PropertyOwner (propertyNo, pAddress, rent, ownerNo, oName)
      Client
      clientNo  cName
      CR76      John Kay
      CR56      Aline Stewart
      Rental
      clientNo  propertyNo  rentStart  rentFinish
      CR76      PG4         1-Jul-00   31-Aug-01
      CR76      PG16        1-Sep-02   1-Sep-02
      CR56      PG4         1-Sep-99   10-Jun-00
      CR56      PG36        10-Oct-00  1-Dec-01
      CR56      PG16        1-Nov-02   1-Aug-03
      PropertyOwner
      propertyNo  pAddress                rent  ownerNo  oName
      PG4         6 Lawrence St, Glasgow  350   CO40     Tina Murphy
      PG16        5 Novar Dr, Glasgow     450   CO93     Tony Shaw
      PG36        2 Manor Rd, Glasgow     370   CO93     Tony Shaw
      Figure 6 2NF ClientRental relations
  • 113. Transitive dependency: A condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C). Third normal form (3NF): A relation that is in first and second normal form, and in which no non-primary-key attribute is transitively dependent on the primary key. The normalization of 2NF relations to 3NF involves the removal of transitive dependencies by placing the attribute(s) in a new relation along with a copy of the determinant.
  • 114. The functional dependencies for the Client, Rental and PropertyOwner relations are as follows:
      Client
      fd2  clientNo → cName  (Primary key)
      Rental
      fd1  clientNo, propertyNo → rentStart, rentFinish  (Primary key)
      fd5  clientNo, rentStart → propertyNo, rentFinish  (Candidate key)
      fd6  propertyNo, rentStart → clientNo, rentFinish  (Candidate key)
      PropertyOwner
      fd3  propertyNo → pAddress, rent, ownerNo, oName  (Primary key)
      fd4  ownerNo → oName  (Transitive dependency)
  • 115. The resulting 3NF relations have the forms:
      Client (clientNo, cName)
      Rental (clientNo, propertyNo, rentStart, rentFinish)
      PropertyOwner (propertyNo, pAddress, rent, ownerNo)
      Owner (ownerNo, oName)
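The four 3NF relations map directly onto tables. The following sketch uses Python's built-in sqlite3 module with an in-memory database; the column types, the NOT NULL constraints and the storage of dates as text are assumptions made for illustration and are not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Client (clientNo TEXT PRIMARY KEY, cName TEXT NOT NULL);
CREATE TABLE Owner (ownerNo TEXT PRIMARY KEY, oName TEXT NOT NULL);
CREATE TABLE PropertyOwner (
    propertyNo TEXT PRIMARY KEY,
    pAddress   TEXT NOT NULL,
    rent       INTEGER NOT NULL,
    ownerNo    TEXT NOT NULL REFERENCES Owner(ownerNo)
);
CREATE TABLE Rental (
    clientNo   TEXT NOT NULL REFERENCES Client(clientNo),
    propertyNo TEXT NOT NULL REFERENCES PropertyOwner(propertyNo),
    rentStart  TEXT NOT NULL,
    rentFinish TEXT,
    PRIMARY KEY (clientNo, propertyNo)
);
""")
conn.executemany("INSERT INTO Client VALUES (?, ?)",
                 [("CR76", "John Kay"), ("CR56", "Aline Stewart")])
conn.executemany("INSERT INTO Owner VALUES (?, ?)",
                 [("CO40", "Tina Murphy"), ("CO93", "Tony Shaw")])
conn.executemany("INSERT INTO PropertyOwner VALUES (?, ?, ?, ?)", [
    ("PG4", "6 Lawrence St, Glasgow", 350, "CO40"),
    ("PG16", "5 Novar Dr, Glasgow", 450, "CO93"),
    ("PG36", "2 Manor Rd, Glasgow", 370, "CO93"),
])
conn.executemany("INSERT INTO Rental VALUES (?, ?, ?, ?)", [
    ("CR76", "PG4", "1-Jul-00", "31-Aug-01"),
    ("CR76", "PG16", "1-Sep-02", "1-Sep-02"),
    ("CR56", "PG4", "1-Sep-99", "10-Jun-00"),
    ("CR56", "PG36", "10-Oct-00", "1-Dec-01"),
    ("CR56", "PG16", "1-Nov-02", "1-Aug-03"),
])
conn.commit()

# Joining the four tables rebuilds the rows of the original ClientRental relation.
rows = conn.execute("""
    SELECT r.clientNo, c.cName, r.propertyNo, p.pAddress,
           r.rentStart, r.rentFinish, p.rent, p.ownerNo, o.oName
    FROM Rental r
    JOIN Client c ON c.clientNo = r.clientNo
    JOIN PropertyOwner p ON p.propertyNo = r.propertyNo
    JOIN Owner o ON o.ownerNo = p.ownerNo
""").fetchall()
print(len(rows))  # 5
```

Each owner name is now stored once, so renaming owner CO93 is a single-row update in Owner rather than a change to every rental row that mentions that owner.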
  • 116. Client
      clientNo  cName
      CR76      John Kay
      CR56      Aline Stewart
      Rental
      clientNo  propertyNo  rentStart  rentFinish
      CR76      PG4         1-Jul-00   31-Aug-01
      CR76      PG16        1-Sep-02   1-Sep-02
      CR56      PG4         1-Sep-99   10-Jun-00
      CR56      PG36        10-Oct-00  1-Dec-01
      CR56      PG16        1-Nov-02   1-Aug-03
      PropertyOwner
      propertyNo  pAddress                rent  ownerNo
      PG4         6 Lawrence St, Glasgow  350   CO40
      PG16        5 Novar Dr, Glasgow     450   CO93
      PG36        2 Manor Rd, Glasgow     370   CO93
      Owner
      ownerNo  oName
      CO40     Tina Murphy
      CO93     Tony Shaw
      Figure 7 3NF ClientRental relations
  • 117. Boyce-Codd normal form (BCNF): A relation is in BCNF if and only if every determinant is a candidate key. The difference between 3NF and BCNF is that for a functional dependency A → B, 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key, whereas BCNF insists that for this dependency to remain in a relation, A must be a candidate key.
  • 118. The ClientInterview relation has the following functional dependencies:
      fd1  clientNo, interviewDate → interviewTime, staffNo, roomNo  (Primary key)
      fd2  staffNo, interviewDate, interviewTime → clientNo  (Candidate key)
      fd3  roomNo, interviewDate, interviewTime → clientNo, staffNo  (Candidate key)
      fd4  staffNo, interviewDate → roomNo  (not a candidate key)
      As a consequence of fd4, the ClientInterview relation may suffer from update anomalies. For example, two tuples have to be updated if the roomNo needs to be changed for staffNo SG5 on 13-May-02.
      ClientInterview
      clientNo  interviewDate  interviewTime  staffNo  roomNo
      CR76      13-May-02      10.30          SG5      G101
      CR56      13-May-02      12.00          SG5      G101
      CR74      13-May-02      12.00          SG37     G102
      CR56      1-Jul-02       10.30          SG5      G102
      Figure 8 ClientInterview relation
  • 119. To transform the ClientInterview relation to BCNF, we must remove the violating functional dependency by creating two new relations called Interview and StaffRoom, as shown below:
      Interview (clientNo, interviewDate, interviewTime, staffNo)
      StaffRoom (staffNo, interviewDate, roomNo)
      Interview
      clientNo  interviewDate  interviewTime  staffNo
      CR76      13-May-02      10.30          SG5
      CR56      13-May-02      12.00          SG5
      CR74      13-May-02      12.00          SG37
      CR56      1-Jul-02       10.30          SG5
      StaffRoom
      staffNo  interviewDate  roomNo
      SG5      13-May-02      G101
      SG37     13-May-02      G102
      SG5      1-Jul-02       G102
      Figure 9 BCNF Interview and StaffRoom relations
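The BCNF test ("every determinant is a candidate key") can be checked mechanically by computing the closure of each determinant. The sketch below uses the ClientInterview dependencies fd1 to fd4; the function names closure and bcnf_violations are hypothetical, the check tests for a superkey (adequate here because each key determinant in the example is minimal), and the dependencies listed for the decomposed relations are written out by hand as an assumption.

```python
def closure(attrs, fds):
    """Return all attributes determined by attrs under fds (pairs of sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the dependencies whose determinant does not determine every attribute."""
    return [(lhs, rhs) for lhs, rhs in fds if closure(lhs, fds) != set(attributes)]


client_interview = {"clientNo", "interviewDate", "interviewTime", "staffNo", "roomNo"}
client_interview_fds = [
    ({"clientNo", "interviewDate"}, {"interviewTime", "staffNo", "roomNo"}),  # fd1
    ({"staffNo", "interviewDate", "interviewTime"}, {"clientNo"}),            # fd2
    ({"roomNo", "interviewDate", "interviewTime"}, {"clientNo", "staffNo"}),  # fd3
    ({"staffNo", "interviewDate"}, {"roomNo"}),                               # fd4
]
violations = bcnf_violations(client_interview, client_interview_fds)
print(len(violations))  # 1 (only fd4 has a determinant that is not a key)

# After the decomposition, every determinant is a key of its own relation.
interview = {"clientNo", "interviewDate", "interviewTime", "staffNo"}
interview_fds = [
    ({"clientNo", "interviewDate"}, {"interviewTime", "staffNo"}),
    ({"staffNo", "interviewDate", "interviewTime"}, {"clientNo"}),
]
staff_room = {"staffNo", "interviewDate", "roomNo"}
staff_room_fds = [({"staffNo", "interviewDate"}, {"roomNo"})]
print(bcnf_violations(interview, interview_fds), bcnf_violations(staff_room, staff_room_fds))  # [] []
```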
  • 120. Multi-valued dependency (MVD) represents a dependency between attributes (for example, A, B and C) in a relation, such that for each value of A there is a set of values for B and a set of values for C. However, the sets of values for B and C are independent of each other. A multi-valued dependency can be further defined as being trivial or nontrivial. An MVD A →→ B in relation R is defined as being trivial if • B is a subset of A, or • A ∪ B = R. An MVD is defined as being nontrivial if neither of the above two conditions is satisfied.
  • 121. Fourth normal form (4NF): A relation that is in Boyce-Codd normal form and contains no nontrivial multi-valued dependencies.
  • 122. Fifth normal form (5NF): A relation that has no join dependency. Lossless-join dependency: A property of decomposition, which ensures that no spurious tuples are generated when relations are reunited through a natural join operation. Join dependency: Describes a type of dependency. For example, for a relation R with subsets of the attributes of R denoted as A, B, …, Z, a relation R satisfies a join dependency if, and only if, every legal value of R is equal to the join of its projections on A, B, …, Z.
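The lossless-join property can be demonstrated by projecting a relation, joining the projections back together, and comparing the result with the original. The sketch below uses four attributes of the StaffBranch rows from Figure 1 and only two projections, which is the simplest case of a join dependency; the helper names project, natural_join and same_relation are hypothetical.

```python
def project(rows, attrs):
    """Project rows onto attrs, removing duplicate tuples."""
    distinct = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in distinct]


def natural_join(left, right):
    """Join two relations on all attribute names they share."""
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                joined.append({**l, **r})
    return joined


def same_relation(a, b):
    """Compare two relations as sets of tuples, ignoring row order."""
    canon = lambda rows: {tuple(sorted(row.items())) for row in rows}
    return canon(a) == canon(b)


columns = ["staffNo", "position", "branchNo", "bAddress"]
staff_branch = [dict(zip(columns, values)) for values in [
    ("SL21", "Manager", "B005", "22 Deer Rd, London"),
    ("SG37", "Assistant", "B003", "163 Main St, Glasgow"),
    ("SG14", "Supervisor", "B003", "163 Main St, Glasgow"),
    ("SA9", "Assistant", "B007", "16 Argyll St, Aberdeen"),
    ("SG5", "Manager", "B003", "163 Main St, Glasgow"),
    ("SL41", "Assistant", "B005", "22 Deer Rd, London"),
]]

# Lossless: the shared attribute branchNo is the key of the Branch projection.
lossless = natural_join(project(staff_branch, ["staffNo", "position", "branchNo"]),
                        project(staff_branch, ["branchNo", "bAddress"]))
print(same_relation(lossless, staff_branch))  # True

# Lossy: joining on position pairs staff with branches they do not work at.
lossy = natural_join(project(staff_branch, ["staffNo", "position"]),
                     project(staff_branch, ["position", "branchNo", "bAddress"]))
print(same_relation(lossy, staff_branch))  # False
```

The second join contains every original row plus spurious ones, for example manager SL21 paired with branch B003.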
  • 123. Atomicity requires that database modifications must follow an "all or nothing" rule. • Each transaction is said to be atomic. If one part of the transaction fails, the entire transaction fails and the database state is left unchanged. • To be compliant with the 'A', a system must guarantee atomicity in each and every situation, including power failures, errors, and crashes. • This guarantees that an incomplete transaction cannot exist.
  • 124. The consistency property ensures that any transaction the database performs will take it from one consistent state to another. • Consistency states that only consistent data (valid according to all the rules defined) will be written to the database. • Quite simply, whatever rows are affected by the transaction will remain consistent with each and every rule that is applied to them (including but not limited to constraints, cascades, and triggers).
  • 125. While this is simple to state, it is worth noting that the consistency requirement applies to everything changed by the transaction, without any limit, including triggers that fire other triggers or launch cascades that eventually fire further triggers.
  • 126. Isolation refers to the requirement that no transaction should be able to interfere with another transaction. • In other words, it should not be possible for two transactions that affect the same rows to run concurrently, as the outcome would be unpredictable and the system therefore unreliable.
  • 127. In effect the only strict way to respect the isolation property is to use a serial model where no two transactions can occur on the same data at the same time and where the result is predictable (i.e. transaction B will happen after transaction A in every single possible case).
  • 128. Durability means that once a transaction has been committed, it will remain so. • In other words, every committed transaction is protected against power loss, crashes, and errors; it cannot be lost by the system and can thus be guaranteed to be completed. • In a relational database, for instance, once a group of SQL statements executes, the results need to be stored permanently. • If the database crashes right after a group of SQL statements executes, it should be possible to restore the database state to the point after the last transaction committed.
  • 129. The transaction subtracts 10 from A and adds 10 to B. • If it succeeds, it would be valid, because the data continues to satisfy the constraint (A + B = 100). • However, assume that after removing 10 from A, the transaction is unable to modify B. • If the database retains A's new value, atomicity and the constraint would both be violated. Atomicity requires that both parts of this transaction complete, or neither.
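The failure scenario above can be reproduced with Python's built-in sqlite3 module. This is a minimal sketch: the table name account, the starting balances (A = 60, B = 40) and the fail_midway switch that simulates the failure are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 60), ("B", 40)])
conn.commit()


def balances(conn):
    """Return the current balances as a dictionary, e.g. {'A': 60, 'B': 40}."""
    return dict(conn.execute("SELECT name, balance FROM account"))


def transfer(conn, amount, fail_midway=False):
    """Move amount from A to B as one transaction; undo everything on failure."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,))
        if fail_midway:
            raise RuntimeError("simulated failure before B is updated")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,))
        conn.commit()
        return True
    except Exception:
        conn.rollback()  # discards the change to A as well
        return False


print(transfer(conn, 10, fail_midway=True), balances(conn))  # False {'A': 60, 'B': 40}
print(transfer(conn, 10), balances(conn))                    # True {'A': 50, 'B': 50}
```

In the failed run the subtraction from A is rolled back, so the database never keeps A's new value without the matching change to B.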
  • 130. Consistency is a very general term that demands the data meets all validation rules. • Also, it may be implied that both A and B must be integers. • A valid range for A and B may also be implied. All validation rules must be checked to ensure consistency. • Assume that a transaction attempts to subtract 10 from A without altering B. • Because consistency is checked after each transaction, it is known that A + B = 100 before the transaction begins.
  • 131. If the transaction removes 10 from A successfully, atomicity will be achieved. • However, a validation check will show that A + B = 90. • That is not consistent according to the rules of the database. • The entire transaction must be cancelled and the affected rows rolled back to their pre-transaction state.
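A consistency check of this kind can be sketched with Python's built-in sqlite3 module by validating the rule A + B = 100 before committing. The table name account, the starting balances (A = 60, B = 40) and the helper run_checked are assumptions made for illustration; a real database would normally enforce such rules itself through constraints or triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 60), ("B", 40)])
conn.commit()


def run_checked(conn, statements):
    """Run the statements as one transaction; commit only if A + B = 100 still holds."""
    for sql, params in statements:
        conn.execute(sql, params)
    (total,) = conn.execute("SELECT SUM(balance) FROM account").fetchone()
    if total != 100:
        conn.rollback()  # cancel the whole transaction, restoring the previous state
        return False
    conn.commit()
    return True


subtract_only = [("UPDATE account SET balance = balance - ? WHERE name = 'A'", (10,))]
print(run_checked(conn, subtract_only))  # False, because A + B would be 90

valid_transfer = subtract_only + [
    ("UPDATE account SET balance = balance + ? WHERE name = 'B'", (10,)),
]
print(run_checked(conn, valid_transfer))  # True, because A + B is still 100
print(dict(conn.execute("SELECT name, balance FROM account")))  # {'A': 50, 'B': 50}
```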