Today’s Agenda
 Any questions about the assignment (due
Mon)?
 Quiz
 Quiz review
 Homework for Friday:
 Watch the two videos on the Coursera db
website that deal with relational algebra
 And/or read up on it from your favorite resource
 Formalize concepts of database anomalies
 If time, finish off our Transit Tables
4-1
Today’s slides from:
Chapter 4
The Database Management
System Concept
Fundamentals of Database Management Systems
by
Mark L. Gillenson, Ph.D.
University of Memphis
John Wiley & Sons, Inc.
4-3
Objectives
 List the three problems created by data
redundancy.
 Describe the nature of data redundancy
among many files.
 Explain the relationship between data
integration and data redundancy in one
file.
4-4
The Database Concept
 Data Integration and Data Redundancy - The
ability to achieve data integration while at the
same time storing data in a nonredundant
fashion. This, alone, is the central, defining
feature of the database approach.
 Multiple Relationships - The ability to store data
representing entities involved in multiple
relationships without introducing data
redundancy or other structural problems.
4-5
Data Integration and Data
Redundancy
 Data integration - the ability to tie together
pieces of related data within an
information system.
 Data redundancy - the same fact about the
business environment is stored more than
once within an information system.
4-6
Data Redundancy - Problems
 Redundant data takes up a great deal of
extra disk space.
 If the redundant data has to be updated, it
takes additional time to do so. This can be
a major performance issue.
 There is the potential for data integrity
problems.
4-7
Data Integrity
 Refers to the accuracy of the data.
 Inaccurate data leaves the whole
information system of limited value.
4-8
Data Redundancy,
Data Integrity
 When not all copies of redundant data are
updated consistently, a data integrity problem
exists.
4-9
Three Files with Redundant Data

Sales file
Customer Number  Customer Name  Customer Address
2746795          John Jones     123 Elm Street

Accounts Receivable file
Customer Number  Customer Name  Customer Address
2746795          John Jones     123 Elm Street

Credit file
Customer Number  Customer Name  Customer Address
2746795          John Jones     123 Elm Street
4-10
Three Files with a Data Integrity Problem

Sales file
Customer Number  Customer Name  Customer Address
2746795          John Jones     456 Oak Street

Accounts Receivable file
Customer Number  Customer Name  Customer Address
2746795          John Jones     456 Oak Street

Credit file
Customer Number  Customer Name  Customer Address
2746795          John Jones     123 Elm Street
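
A minimal Python sketch of how this kind of integrity problem could be detected. The three dictionaries are hypothetical in-memory stand-ins for the files on the slide, and `find_conflicts` is an illustrative helper, not part of any library.

```python
# Hypothetical stand-ins for the three files, keyed by customer number.
sales = {"2746795": {"name": "John Jones", "address": "456 Oak Street"}}
accounts_receivable = {"2746795": {"name": "John Jones", "address": "456 Oak Street"}}
credit = {"2746795": {"name": "John Jones", "address": "123 Elm Street"}}


def find_conflicts(files):
    """Return {customer_number: {file_name: address}} for every customer
    whose address is not the same in all files that contain them."""
    customer_numbers = set()
    for records in files.values():
        customer_numbers.update(records)

    conflicts = {}
    for number in sorted(customer_numbers):
        addresses = {
            file_name: records[number]["address"]
            for file_name, records in files.items()
            if number in records
        }
        # More than one distinct address means the copies disagree.
        if len(set(addresses.values())) > 1:
            conflicts[number] = addresses
    return conflicts


conflicts = find_conflicts(
    {"sales": sales, "accounts_receivable": accounts_receivable, "credit": credit}
)
print(conflicts)
```

The check can only report that the copies disagree. It cannot tell which address is the correct one, which is why redundancy is a risk to integrity.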
4-11
General Hardware Company Files

(a) Salesperson file
Salesperson Number  Salesperson Name  Commission Percentage  Year of Hire
137                 Baker             10                     1995
186                 Adams             15                     2001
204                 Dickens           10                     1998
361                 Carlyle           20                     2001

(b) Customer file
Customer Number  Customer Name        Salesperson Number  HQ City
0121             Main St. Hardware    137                 New York
0839             Jane’s Stores        186                 Chicago
0933             ABC Home Stores      137                 Los Angeles
1047             Acme Hardware Store  137                 Los Angeles
1525             Fred’s Tool Stores   361                 Atlanta
1700             XYZ Stores           361                 Washington
1826             City Hardware        137                 New York
2198             Western Hardware     204                 New York
2267             Central Stores       186                 New York
4-12
General Hardware Company Combined File

Customer Number  Customer Name        Salesperson Number  HQ City      Salesperson Number  Salesperson Name  Commission Percentage  Year of Hire
0121             Main St. Hardware    137                 New York     137                 Baker             10                     1995
0839             Jane’s Stores        186                 Chicago      186                 Adams             15                     2001
0933             ABC Home Stores      137                 Los Angeles  137                 Baker             10                     1995
1047             Acme Hardware Store  137                 Los Angeles  137                 Baker             10                     1995
1525             Fred’s Tool Stores   361                 Atlanta      361                 Carlyle           20                     2001
1700             XYZ Stores           361                 Washington   361                 Carlyle           20                     2001
1826             City Hardware        137                 New York     137                 Baker             10                     1995
2198             Western Hardware     204                 New York     204                 Dickens           10                     1998
2267             Central Stores       186                 New York     186                 Adams             15                     2001
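
One way to see where the redundancy in the combined file comes from is to build it from the two separate files. The sketch below uses plain Python tuples as a stand-in for the files; the variable names are illustrative only.

```python
from collections import Counter

# Salesperson file: number -> (name, commission percentage, year of hire).
salespersons = {
    137: ("Baker", 10, 1995),
    186: ("Adams", 15, 2001),
    204: ("Dickens", 10, 1998),
    361: ("Carlyle", 20, 2001),
}

# Customer file: (customer number, customer name, salesperson number, HQ city).
customers = [
    ("0121", "Main St. Hardware", 137, "New York"),
    ("0839", "Jane's Stores", 186, "Chicago"),
    ("0933", "ABC Home Stores", 137, "Los Angeles"),
    ("1047", "Acme Hardware Store", 137, "Los Angeles"),
    ("1525", "Fred's Tool Stores", 361, "Atlanta"),
    ("1700", "XYZ Stores", 361, "Washington"),
    ("1826", "City Hardware", 137, "New York"),
    ("2198", "Western Hardware", 204, "New York"),
    ("2267", "Central Stores", 186, "New York"),
]

# Combined file: each customer row followed by a full copy of the
# salesperson's record (number, name, commission, year of hire).
combined = [cust + (cust[2],) + salespersons[cust[2]] for cust in customers]

# How many times is each salesperson's record stored in the combined file?
copies = Counter(row[4] for row in combined)
print(copies)
```

Baker's name, commission percentage, and year of hire are stored once in the salesperson file but once per customer in the combined file.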
4-13
Anomalies
 Typically occur in poorly structured files.
 Problems arise when two different kinds of
data, like salesperson and customer data,
are merged into one file.
4-14
Anomalies
 Deletion Anomaly - e.g., if you delete a customer and
that record was the only one for a salesperson, the
salesperson’s data is gone.
 Insertion Anomaly - e.g., General Hardware cannot add
data about a new salesperson the company just hired
until she is assigned at least one customer.
 Update Anomaly - e.g., when a salesperson’s commission
percentage changes, it must be updated in every record
where it appears; missing one leaves the file inconsistent.
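
The three anomalies can be reproduced on a small combined table. This is a sketch using Python's built-in `sqlite3` module; the table name `combined`, the column names, salesperson 388 "Newhire", and the new commission value 12 are assumptions made for illustration, and only three rows of the combined file are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE combined (
           customer_number       TEXT NOT NULL PRIMARY KEY,
           customer_name         TEXT NOT NULL,
           hq_city               TEXT NOT NULL,
           salesperson_number    INTEGER NOT NULL,
           salesperson_name      TEXT NOT NULL,
           commission_percentage INTEGER NOT NULL,
           year_of_hire          INTEGER NOT NULL
       )"""
)
conn.executemany(
    "INSERT INTO combined VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("0121", "Main St. Hardware", "New York", 137, "Baker", 10, 1995),
        ("1826", "City Hardware", "New York", 137, "Baker", 10, 1995),
        ("2198", "Western Hardware", "New York", 204, "Dickens", 10, 1998),
    ],
)

# Deletion anomaly: Western Hardware is Dickens's only customer, so
# deleting the customer also deletes everything known about Dickens.
conn.execute("DELETE FROM combined WHERE customer_number = '2198'")
dickens_rows = conn.execute(
    "SELECT COUNT(*) FROM combined WHERE salesperson_number = 204"
).fetchone()[0]

# Insertion anomaly: a new salesperson (hypothetical 388) has no customer
# yet, so there is no valid row in which to store her.
try:
    conn.execute(
        "INSERT INTO combined VALUES (?, ?, ?, ?, ?, ?, ?)",
        (None, None, None, 388, "Newhire", 10, 2004),
    )
    insertion_blocked = False
except sqlite3.IntegrityError:
    insertion_blocked = True

# Update anomaly: Baker's commission is changed in one row but not the
# other, so the file now holds two different values for the same fact.
conn.execute(
    "UPDATE combined SET commission_percentage = 12 WHERE customer_number = '0121'"
)
baker_commissions = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT commission_percentage FROM combined "
        "WHERE salesperson_number = 137 ORDER BY commission_percentage"
    )
]

print(dickens_rows, insertion_blocked, baker_commissions)
```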
4-15
Database Management
System
 A software utility for storing and retrieving
data that gives the end-user the
impression that the data is well integrated
even though the data can be stored with
no redundancy at all.
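
A small sketch of this idea, assuming SQLite (through Python's `sqlite3` module) as the DBMS. Each salesperson is stored once, and a view named `customer_with_salesperson` (a hypothetical name) presents the integrated picture. The table and column names and the new commission value 12 are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces REFERENCES constraints only when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE salesperson (
        salesperson_number    INTEGER PRIMARY KEY,
        salesperson_name      TEXT NOT NULL,
        commission_percentage INTEGER NOT NULL,
        year_of_hire          INTEGER NOT NULL
    );
    CREATE TABLE customer (
        customer_number    TEXT NOT NULL PRIMARY KEY,
        customer_name      TEXT NOT NULL,
        salesperson_number INTEGER NOT NULL
            REFERENCES salesperson (salesperson_number),
        hq_city            TEXT NOT NULL
    );
    -- The integrated picture is computed on demand, not stored.
    CREATE VIEW customer_with_salesperson AS
        SELECT c.customer_number, c.customer_name, c.hq_city,
               s.salesperson_number, s.salesperson_name,
               s.commission_percentage, s.year_of_hire
        FROM customer AS c
        JOIN salesperson AS s
          ON s.salesperson_number = c.salesperson_number;
    """
)
conn.executemany(
    "INSERT INTO salesperson VALUES (?, ?, ?, ?)",
    [(137, "Baker", 10, 1995), (186, "Adams", 15, 2001)],
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        ("0121", "Main St. Hardware", 137, "New York"),
        ("0839", "Jane's Stores", 186, "Chicago"),
        ("1826", "City Hardware", 137, "New York"),
    ],
)

# Baker's commission is stored in exactly one place, so one update suffices.
conn.execute(
    "UPDATE salesperson SET commission_percentage = 12 "
    "WHERE salesperson_number = 137"
)
baker_rows = conn.execute(
    "SELECT customer_number, salesperson_name, commission_percentage "
    "FROM customer_with_salesperson "
    "WHERE salesperson_number = 137 ORDER BY customer_number"
).fetchall()
print(baker_rows)
```

Every row of the view for Baker reflects the single update, so the update anomaly from the combined file cannot occur here.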
How would you represent
Sales and Customers?
 What is the cardinality/modality of the
relationship?
 How should we handle it?
4-16
In order to avoid anomalies,
we normalize our tables
 But first… we will take a side trip into
Relational Algebra
 Need to know the basic rules for how
relational data is
updated/deleted/combined/selected/projected
in order to fully understand the implications of
normalization
 Do your homework for Friday – dive into
relational algebra
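
As a preview, here is a rough Python sketch of three relational algebra operations (select, project, and natural join) over relations modeled as lists of dictionaries. The function names and this representation are assumptions for illustration, not a standard library or the notation used in the videos.

```python
def select(relation, predicate):
    """Selection: keep only the rows for which the predicate holds."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes and drop duplicate rows."""
    result = []
    for row in relation:
        reduced = {name: row[name] for name in attributes}
        if reduced not in result:
            result.append(reduced)
    return result


def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attribute names."""
    result = []
    for left_row in left:
        for right_row in right:
            shared = set(left_row) & set(right_row)
            if all(left_row[name] == right_row[name] for name in shared):
                result.append({**left_row, **right_row})
    return result


salespersons = [
    {"salesperson_number": 137, "salesperson_name": "Baker"},
    {"salesperson_number": 186, "salesperson_name": "Adams"},
    {"salesperson_number": 204, "salesperson_name": "Dickens"},
]
customers = [
    {"customer_number": "0121", "salesperson_number": 137, "hq_city": "New York"},
    {"customer_number": "0839", "salesperson_number": 186, "hq_city": "Chicago"},
    {"customer_number": "1826", "salesperson_number": 137, "hq_city": "New York"},
    {"customer_number": "2198", "salesperson_number": 204, "hq_city": "New York"},
]

# Which salespersons have at least one customer headquartered in New York?
joined = natural_join(customers, salespersons)
new_york = select(joined, lambda row: row["hq_city"] == "New York")
names = project(new_york, ["salesperson_name"])
print(names)
```

Note that `project` removes duplicates, so Baker appears once even though two of his customers are in New York.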
4-17
4-18
“Copyright 2004 John Wiley & Sons, Inc. All rights reserved. Reproduction
or translation of this work beyond that permitted in Section 117 of the 1976
United States Copyright Act without express permission of the copyright owner
is unlawful. Request for further information should be addressed to the
Permissions Department, John Wiley & Sons, Inc. The purchaser may make
back-up copies for his/her own use only and not for distribution or resale. The
Publisher assumes no responsibility for errors, omissions, or damages caused
by the use of these programs or from the use of the information contained
herein.”
