CLASSROOM OPENER
GREAT BUSINESS DECISIONS – Edgar Codd’s Relational Database Theory

Edgar Frank Codd was born in Portland, Dorset, England. He studied mathematics and chemistry at Exeter College, Oxford, before serving as a pilot in the Royal Air Force during the Second World War. In 1948 he moved to New York to work for IBM as a mathematical programmer. In 1953 Codd moved to Ottawa, Canada. A decade later he returned to the USA and received his doctorate in computer science from the University of Michigan in Ann Arbor. Two years later he moved to San Jose, California, to work at IBM's Almaden Research Center.

In the 1960s and 1970s he worked out his theories of data arrangement, publishing his paper "A Relational Model of Data for Large Shared Data Banks" in 1970, one year after an internal IBM paper on the subject. To his disappointment, IBM proved slow to exploit his suggestions until commercial rivals started implementing them.

Initially, IBM refused to implement the relational model in order to preserve revenue from IMS/DB. Codd then showed IBM customers the potential of an implementation of his model, and they in turn pressured IBM. IBM then included a System R subproject in its Future System project, but it put developers in charge who were not thoroughly familiar with Codd's ideas and isolated the team from Codd. As a result, they did not use Codd's own Alpha language but created a non-relational one, SEQUEL. Even so, SEQUEL was so superior to pre-relational systems that Larry Ellison copied it in his Oracle DBMS, based on pre-launch papers presented at conferences. Oracle actually reached the market before SQL/DS. Because the original name was already proprietary, SEQUEL had been renamed SQL.

Codd continued to develop and extend his relational model, sometimes in collaboration with Chris Date. One of the normal forms, the Boyce-Codd Normal Form, is named after him. Codd also coined the term OLAP and wrote the twelve laws of online analytical processing, although these were never truly accepted after it came out that his white paper on the subject was paid for by a software vendor.

Edgar F. Codd died of heart failure at his home in Williams Island, Florida, at the age of 79 on Friday, April 18, 2003.
7.1 Define the fundamental concepts of the relational database model
• The relational database model stores information in the form of logically related two-dimensional tables
• Entities, attributes, primary keys, and foreign keys are all fundamental concepts included in the relational database model

7.2 Evaluate the advantages of the relational database model
• Database advantages from a business perspective include
 – Increased flexibility
 – Increased scalability and performance
 – Reduced information redundancy
 – Increased information integrity (quality)
 – Increased information security

7.3 Compare relational integrity constraints and business-critical integrity constraints
• Relational integrity constraints are rules that enforce basic and fundamental information-based constraints
• Business-critical integrity constraints are rules that enforce business rules vital to an organization’s success and often require more insight and knowledge than relational integrity constraints

7.4 Describe the benefits of a data-driven Web site
• A data-driven Web site is an interactive Web site kept constantly updated and relevant to the needs of its customers through the use of a database
• Data-driven Web sites are especially useful when the site offers a great deal of information, products, or services
• Web site visitors are frequently angered if they are buried under an avalanche of information when searching a Web site
• A data-driven Web site invites visitors to select and view what they are interested in by entering a query; the Web site analyzes the query and custom builds a Web page in real time that satisfies it

7.5 Describe the two primary methods for integrating information across multiple databases
• Forward integration – takes information entered into a given system and sends it automatically to all downstream systems and processes
• Backward integration – takes information entered into a given system and sends it automatically to all upstream systems and processes
How many of you are familiar with databases?
What kinds of databases can be found around your college?
 – Student registration
 – Course evaluation
 – Payroll
 – Parking services
Explain to your students that almost every business decision is based on information
The information required to make these decisions is typically stored in databases
Most organizations use the relational database model This text focuses on the relational database model Discuss the Coca-Cola Bottling Company of Egypt example in the text
Review Figure 7.1
What kinds of additional entity classes might be found in this database?
 – INVENTORY, MARKETING CAMPAIGN, SALES QUOTE, INVOICE, PAYMENT
What kinds of additional entities might be found in the CUSTOMER table?
 – Any additional customer, such as Joe’s Mexican Restaurant, Fitness Forever, and Summer’s Flower Shop (these are all fictitious)
What kinds of additional attributes might be found in the CUSTOMER table for Dave’s Sub Shop?
 – Any additional customer information: Address, Fax, E-mail, Cell phone
Review Figure 7.1
Explain to your students that the logic that correlates the tables is implemented through the primary keys and foreign keys
For example:
 – Hawkins Shipping in the DISTRIBUTOR table has a primary key called Distributor ID, with the value DEN8001
 – Notice that Hawkins Shipping (Distributor ID DEN8001) is responsible for delivering orders 34561 and 34562
 – Therefore, Distributor ID in the ORDER table acts as a foreign key and creates a logical relationship (who shipped what order) between ORDER and DISTRIBUTOR
Walk your students through the relational database model in Figure 7.1
To ensure your students are grasping the concepts, ask them to answer the following:
• How many orders have been placed for T’s Fun Zone?
 – Ans: One, Order ID 34563
• How many orders have been placed for Pizza Palace?
 – Ans: None
• How many items are included in Dave’s Sub Shop’s two orders?
 – Ans: Order 34561 has three items and order 34562 has one item, for a total of four items in both orders
• Who is responsible for distributing Dave’s Sub Shop’s orders?
 – Ans: Hawkins Shipping
• Which products are included in Order 34562?
 – Ans: 300 Vanilla Coke
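The questions above can also be answered with queries, which may help students see how primary keys and foreign keys do the work. The following is a minimal sketch in Python using the built-in sqlite3 module, not a reproduction of Figure 7.1: the numeric customer IDs are hypothetical, the table is named orders because ORDER is a reserved word in SQL, and the distributor for order 34563 is left empty because these notes do not state it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,      -- primary key (IDs are hypothetical)
    customer_name TEXT NOT NULL
);
CREATE TABLE distributor (
    distributor_id   TEXT PRIMARY KEY,      -- primary key, e.g. DEN8001
    distributor_name TEXT NOT NULL
);
CREATE TABLE orders (
    order_id       INTEGER PRIMARY KEY,
    customer_id    INTEGER NOT NULL REFERENCES customer(customer_id),  -- foreign key
    distributor_id TEXT REFERENCES distributor(distributor_id)         -- foreign key
);
""")

conn.executemany("INSERT INTO customer VALUES (?, ?)", [
    (1, "Dave's Sub Shop"), (2, "Pizza Palace"), (3, "T's Fun Zone"),
])
conn.execute("INSERT INTO distributor VALUES ('DEN8001', 'Hawkins Shipping')")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (34561, 1, "DEN8001"),
    (34562, 1, "DEN8001"),
    (34563, 3, None),  # distributor not stated in these notes
])

# How many orders has each customer placed? LEFT JOIN keeps customers with none.
orders_per_customer = dict(conn.execute("""
    SELECT c.customer_name, COUNT(o.order_id)
    FROM customer AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.customer_name
""").fetchall())

# Who distributes Dave's Sub Shop's orders? Follow both foreign keys.
daves_distributors = [row[0] for row in conn.execute("""
    SELECT DISTINCT d.distributor_name
    FROM orders AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    JOIN distributor AS d ON d.distributor_id = o.distributor_id
    WHERE c.customer_name = ?
""", ("Dave's Sub Shop",))]

print(orders_per_customer)
print(daves_distributors)
```

The joins succeed only because each foreign key value in orders matches a primary key value in another table, which is the logical relationship the figure illustrates.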
All of the above are discussed in the following slides:
A good way to explain databases is to compare them to spreadsheets
What are the limitations when using a spreadsheet?
 – Limited number of rows and columns (Excel 2003 and earlier: 65,536 rows by 256 columns; Excel 2007 and later: 1,048,576 rows by 16,384 columns)
 – Once you need more rows than the limit allows, you have outgrown your spreadsheet
 – Only one user can access the spreadsheet at a time
 – Users can view all information in the spreadsheet
 – Users can change all information in the spreadsheet
All of the disadvantages associated with a spreadsheet are addressed when using a database
These advantages are discussed in detail over the next several slides
The separation between logical and physical views is what allows each user to access database information differently
What would happen if a new database called “RealData” hit the market and allowed only one logical view?
 – The “RealData” database simply would never sell
 – With only one logical view, every person in an entire organization would have the same view
Define two database views for your school’s student database (one for students, and one for instructors)
What does the student view display when a student accesses the school’s student database?
 – Courses enrolled
 – Grades
 – Tuition
 – Credits for graduation
What does the instructor view display when an instructor accesses the school’s student database?
 – Courses teaching
 – Students in each course
 – Payment information
 – Vacation time
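One way to demonstrate the idea of one physical store with several logical views is with SQL views. The sketch below uses Python's built-in sqlite3 module; the enrollment table, its columns, and the sample students and instructors are all hypothetical and far simpler than a real student database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One physical table (hypothetical schema and sample rows)
CREATE TABLE enrollment (
    student    TEXT,
    course     TEXT,
    instructor TEXT,
    grade      TEXT
);
INSERT INTO enrollment VALUES
    ('Ana', 'MIS 101', 'Dr. Lee', 'A'),
    ('Ben', 'MIS 101', 'Dr. Lee', 'B'),
    ('Ana', 'ACC 200', 'Dr. Kim', 'B');

-- Logical view for students: courses enrolled and grades
CREATE VIEW student_view AS
    SELECT student, course, grade FROM enrollment;

-- Logical view for instructors: courses taught and students in each course
CREATE VIEW instructor_view AS
    SELECT instructor, course, student FROM enrollment;
""")


def columns(view_name):
    """Return the column names exposed by one of the views defined above."""
    cursor = conn.execute(f"SELECT * FROM {view_name}")
    return [col[0] for col in cursor.description]


student_columns = columns("student_view")
instructor_columns = columns("instructor_view")

# What Ana sees through the student view
ana_rows = conn.execute(
    "SELECT course, grade FROM student_view WHERE student = ? ORDER BY course",
    ("Ana",),
).fetchall()

print(student_columns, instructor_columns, ana_rows)
```

Both views read the same stored rows, so a change to the table shows up in each view without any copying.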
What happens to a business if it suddenly experiences a 60 percent growth in sales and its IT systems fail under all of the increased activity?
Remind your students that a big part of developing successful IT systems is being able to anticipate future growth

CLASSROOM EXERCISE
Building an ER Diagram
Break your students into groups and ask them to create an entity relationship diagram similar to the one in Figure 7.1 for a company or product of their choice. If the students are uncomfortable with databases, you should recommend that they stick to a company similar to the TCCBCE, perhaps a snack food producer, mountain bike equipment producer, or even a footwear producer. If your students are more comfortable with databases, ask them to choose a company that would challenge them, such as a fast food restaurant, online book seller, or even a university’s course registration system. The important part of this exercise is for your students to begin to understand how the tables in a database relate. Be sure their ER diagrams include primary keys and foreign keys. Have your students present their ER diagrams to the class and ask the students to find any potential errors with the diagrams.
One of the primary goals of a database is to eliminate information redundancy by recording each piece of information in only one place This is a good time to tie the discussion back to the material in the previous chapter, low quality information Recall what happens when a single customer is stored twice with different phone numbers, addresses, or order information in a single database
Relational integrity constraint – rule that enforces basic and fundamental information-based constraints
Business-critical integrity constraint – rule that enforces business rules vital to an organization’s success and often requires more insight and knowledge than relational integrity constraints
Can you define two relational integrity constraints for an ordering system?
 – Users cannot create an order for a nonexistent customer
 – An order cannot be shipped without an address
Can you define two business-critical integrity constraints for an ordering system?
 – Product returns are not accepted for fresh product 15 days after purchase
 – A discount maximum of 20 percent
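Two of the constraints above can be shown directly in a database. The sketch below, in Python with the built-in sqlite3 module, uses a foreign key for "no order for a nonexistent customer" and a CHECK clause for "a discount maximum of 20 percent". The table layout, column names, and the helper try_insert are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id         INTEGER PRIMARY KEY,
    -- Relational integrity constraint: the customer must exist
    customer_id      INTEGER NOT NULL REFERENCES customer(customer_id),
    -- Business-critical integrity constraint: discount may not exceed 20 percent
    discount_percent REAL NOT NULL DEFAULT 0
        CHECK (discount_percent BETWEEN 0 AND 20)
);
INSERT INTO customer VALUES (1, 'Dave''s Sub Shop');
""")


def try_insert(order_id, customer_id, discount_percent):
    """Return True if the order is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders VALUES (?, ?, ?)",
                (order_id, customer_id, discount_percent),
            )
        return True
    except sqlite3.IntegrityError:
        return False


valid_order = try_insert(1, 1, 10)        # existing customer, 10 percent discount
unknown_customer = try_insert(2, 99, 10)  # customer 99 does not exist
excess_discount = try_insert(3, 1, 25)    # 25 percent exceeds the maximum

print(valid_order, unknown_customer, excess_discount)
```

The database enforces both rules in the same way, but the 20 percent limit is a business decision, while the customer reference follows from the structure of the data itself.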
Why would you want to define access level security?
Access levels will typically mimic the hierarchical structure of the organization and protect organizational information from being viewed and manipulated by individuals who should not have access to sensitive or confidential information
 – Low-level employees typically have the lowest levels of access
 – High-level employees typically have access to all types of database information
For example, you would not want analysts viewing all salary information for the entire company. In general:
 – Analysts can usually view only their own salary
 – Managers have higher access and can view the salaries of all their team members, but cannot view other managers’ salaries
 – Directors can view all of their managers’ and analysts’ salaries, but not other directors’ salaries
 – The CFO and CEO can view every employee’s salary
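The salary example can be expressed as a small rule: a person may view their own salary and the salary of anyone below them in their reporting line, and the CEO and CFO may view everyone's. The Python sketch below assumes a hypothetical org chart; the employee names, the EMPLOYEES structure, and both function names are invented for illustration.

```python
# Hypothetical org chart: employee -> (role, manager)
EMPLOYEES = {
    "ceo": ("CEO", None),
    "cfo": ("CFO", "ceo"),
    "dir_a": ("Director", "ceo"),
    "dir_b": ("Director", "ceo"),
    "mgr_a": ("Manager", "dir_a"),
    "mgr_b": ("Manager", "dir_b"),
    "analyst_a": ("Analyst", "mgr_a"),
    "analyst_b": ("Analyst", "mgr_b"),
}


def reports_to(employee, boss):
    """Return True if `boss` appears anywhere above `employee` in the hierarchy."""
    manager = EMPLOYEES[employee][1]
    while manager is not None:
        if manager == boss:
            return True
        manager = EMPLOYEES[manager][1]
    return False


def can_view_salary(viewer, target):
    """Apply the access levels described in the notes."""
    role = EMPLOYEES[viewer][0]
    if role in ("CEO", "CFO"):
        return True  # top executives can view every salary
    # Everyone else: own salary, plus anyone in their own reporting line
    return viewer == target or reports_to(target, viewer)


print(can_view_salary("analyst_a", "analyst_b"))  # analyst looking at another analyst
print(can_view_salary("mgr_a", "analyst_a"))      # manager looking at a team member
print(can_view_salary("dir_a", "dir_b"))          # director looking at another director
```

In a real DBMS these rules would normally be configured through its access level and access control features rather than written in application code.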
Discuss the two primary forms of user interaction with a database
• Direct interaction
 – The user interacts directly with the DBMS
 – The DBMS obtains the information from the database
• Indirect interaction
 – The user interacts with an application (e.g., payroll application, manufacturing application, sales application)
 – The application interacts with the DBMS
 – The DBMS obtains the information from the database
A data-driven Web site is an interactive Web site kept constantly updated and relevant to the needs of its customers through the use of a database. Data-driven Web sites are especially useful when the site offers a great deal of information, products, or services. Web site visitors are frequently angered if they are buried under an avalanche of information when searching a Web site. A data-driven Web site invites visitors to select and view what they are interested in by inserting a query, which the Web site then analyzes and custom builds a Web page in real-time that satisfies the query. The figure displays a Wikipedia user querying business intelligence and the database sending back the appropriate Web page that satisfies the user’s request Ask your students what would happen to a Web site that is not data-driven? The users would need to continually update the Web site data manually as the business data is updated. This would be a redundant effort and most likely result in errors and the Web site could quickly become out of sync with the business data
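The query-then-build cycle described above can be sketched in a few lines. The Python example below (built-in sqlite3 and html modules) builds an HTML fragment from whatever the database holds at the moment of the request. The product table, the prices, and the function name build_page are hypothetical; a real site would sit behind a Web server and a template system.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical product catalog with made-up prices
CREATE TABLE product (name TEXT, price REAL);
INSERT INTO product VALUES
    ('Vanilla Coke', 1.25),
    ('Diet Coke', 1.10),
    ('Sprite', 1.00);
""")


def build_page(search_term):
    """Build an HTML fragment listing the products that match the visitor's query."""
    rows = conn.execute(
        "SELECT name, price FROM product WHERE name LIKE ? ORDER BY name",
        (f"%{search_term}%",),
    ).fetchall()
    items = "".join(
        f"<li>{html.escape(name)}: ${price:.2f}</li>" for name, price in rows
    )
    return f"<h1>Results for {html.escape(search_term)}</h1><ul>{items}</ul>"


page = build_page("coke")
print(page)
```

Because the page is generated from the database on each request, changing a price in the product table changes the page on the next visit without anyone editing HTML.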
Data-Driven Web Site Advantages
• Development: Allows the Web site owner to make changes at any time without having to rely on a developer or know HTML programming. A well-structured, data-driven Web site enables updating with little or no training.
• Content management: A static Web site requires a programmer to make updates. This adds an unnecessary layer between the business and its Web content, which can lead to misunderstandings and slow turnarounds for desired changes.
• Future expandability: Having a data-driven Web site enables the site to grow faster than would be possible with a static site. Changing the layout, displays, and functionality of the site (adding more features and sections) is easier with a data-driven solution.
• Minimizing human error: Even the most competent programmer charged with the task of maintaining many pages will overlook things and make mistakes. This will lead to bugs and inconsistencies that can be time consuming and expensive to track down and fix. Unfortunately, users who come across these bugs will likely become irritated and may leave the site. A well-designed, data-driven Web site will have “error trapping” mechanisms to ensure that required information is filled out correctly and that content is entered and displayed in its correct format.
• Cutting production and update costs: A data-driven Web site can be updated and “published” by any competent data entry or administrative person. In addition to being convenient and more affordable, changes and updates will take a fraction of the time that they would with a static site. While training a competent programmer can take months or even years, training a data entry person can be done in 30 to 60 minutes.
• More efficient: By their very nature, computers are excellent at keeping volumes of information intact. With a data-driven solution, the system keeps track of the templates, so users do not have to. Global changes to layout, navigation, or site structure would need to be programmed only once, in one place, and the site itself will take care of propagating those changes to the appropriate pages and areas. A data-driven infrastructure will improve the reliability and stability of a Web site, while greatly reducing the chance of “breaking” some part of the site when adding new areas.
• Improved stability: Any programmer who has to update a Web site from “static” templates must be very organized to keep track of all the source files. If a programmer leaves unexpectedly, existing work may have to be re-created if those source files cannot be found. Plus, if there were any changes to the templates, the new programmer must be careful to use only the latest version. With a data-driven Web site, there is peace of mind in knowing the content is never lost, even if your programmer is.
Companies can gain business intelligence by viewing the data accessed and analyzed from their Web site. The figure displays how running queries or using analytical tools, such as a Pivot Table, on the database that is attached to the Web site can offer insight into the business, such as items browsed, frequent requests, items bought together, etc.
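An "items bought together" insight of the kind mentioned above can come from a single query on the database behind the Web site. The sketch below uses Python's built-in sqlite3 module with a hypothetical order_item table and made-up products; it counts how often each pair of products appears in the same order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical order lines: which product appeared in which order
CREATE TABLE order_item (order_id INTEGER, product TEXT);
INSERT INTO order_item VALUES
    (1, 'Cola'), (1, 'Chips'),
    (2, 'Cola'), (2, 'Chips'), (2, 'Candy'),
    (3, 'Cola'), (3, 'Candy');
""")

# Join the table to itself on order_id; a.product < b.product lists each pair once.
pairs = conn.execute("""
    SELECT a.product, b.product, COUNT(*) AS times_together
    FROM order_item AS a
    JOIN order_item AS b
      ON a.order_id = b.order_id AND a.product < b.product
    GROUP BY a.product, b.product
    ORDER BY times_together DESC, a.product, b.product
""").fetchall()

for first, second, count in pairs:
    print(f"{first} + {second}: bought together {count} time(s)")
```

A spreadsheet PivotTable over the same rows would give a comparable summary; the query form simply runs directly against the live data.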
One of the biggest benefits of integration is that organizations only have to enter information into the systems once, and it is automatically sent to all of the other systems throughout the organization
This feature alone creates huge advantages for organizations because it reduces information redundancy and ensures accuracy and completeness
Without integrations, an organization would have to enter information into every single system that requires it, from marketing and sales to billing and customer service
For example, customer information would have to be manually entered into the marketing, sales, ordering, inventory, billing, and shipping databases. (Each of these systems is separate and would have its own database if the company does not have a complete ERP installed.)
Entering the same customer information into multiple systems is redundant, and the chances of making a mistake in one of the systems are high
Integrations offer many advantages, but for the most part, the automated flow of information among separate systems is the biggest benefit
Identify the arrows along the top of the figure when explaining forward integrations
Basically, all information flows forward along the business process
 – Sales enters the information when it is negotiating the sale (looking for opportunities)
 – The information is then passed to the order entry system when the order is actually placed
 – The order fulfillment system picks the products from the warehouse, packs the products, labels boxes, etc.
 – Once the order is filled and shipped, the customer is billed
What would happen if users could enter order information directly into the billing system?
 – The systems would quickly become out of sync. There might be bills for nonexistent orders, or orders that do not have any bills (if someone deleted a bill)
 – For this reason, organizations typically place a business-critical integrity constraint on integrated systems: with a forward integration, the information must be entered in the sales system and cannot be entered directly into the billing system
Integrations are expensive to build and maintain
Integrations are difficult to implement
For these reasons, many organizations build only forward integrations and use business-critical integrity constraints to ensure all information is always entered only at the start of the integration (one source of record)
Identify the arrows along the bottom of the figure when explaining backward integrations
Basically, all information flows backward along the business process
 – Billing enters information and this information is passed back to the order fulfillment system
 – The order fulfillment system passes the information back to the order entry system
 – The order entry system passes the information back to the sales system
Why would an organization want to build both forward and backward integrations?
 – This allows users to enter information at any point in the business process, and the information is automatically sent upstream and downstream to all other systems
 – For example, if order fulfillment determined that it could not fulfill an order (the product had been discontinued), it could simply enter this information into the database and it would be sent automatically upstream to the sales representative, who could contact the customer, and downstream to billing to remove the item from the bill
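The difference between forward-only integration and combined forward and backward integration can be sketched with plain dictionaries standing in for the separate systems. In the Python example below, the PROCESS order follows the business process in these notes; the function name integrate, the record keys, and the status values are hypothetical.

```python
# Business-process order from the notes: sales -> order entry -> fulfillment -> billing
PROCESS = ["sales", "order_entry", "order_fulfillment", "billing"]


def integrate(systems, source, key, value, forward=True, backward=False):
    """Store a record in `source` and copy it along the enabled integrations."""
    position = PROCESS.index(source)
    targets = [source]
    if forward:
        targets += PROCESS[position + 1:]  # downstream systems
    if backward:
        targets += PROCESS[:position]      # upstream systems
    for name in targets:
        systems[name][key] = value
    return targets


# Each system has its own separate store (hypothetical stand-ins for databases)
systems = {name: {} for name in PROCESS}

# Forward integration only: an update entered in fulfillment never reaches sales
forward_only = integrate(systems, "order_fulfillment", "order 34562", "discontinued")

# Forward and backward: the same kind of update reaches every system
both = integrate(
    systems, "order_fulfillment", "order 34563", "discontinued", backward=True
)

print(forward_only)
print(both)
```

With forward integration alone, sales and order entry stay unaware of the fulfillment update, which is why organizations that build only forward integrations restrict data entry to the start of the process.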
The above figure displays an example of customer information integrated through a central repository
 – Users can create, read, update, and delete information in the main customer repository, and the changes are automatically sent to all of the other databases
 – This method does not follow the business process when building the integrations
 – Business-critical integrity constraints still need to be built to ensure information is only ever entered into the customer repository; otherwise the information will become out of sync
1. Identify the different types of entity classes that might be stored in Wikipedia’s database.
Entity classes could include: SUBJECT AREA, SEARCH TERM, WEB PAGE, RESOURCE, EDITOR

2. Explain why database technology is so important to Wikipedia’s business model.
Without databases, Wikipedia simply would not exist, for two primary reasons. First, vast amounts of information are at the heart of Wikipedia, and without databases it would be impossible to store and retrieve the information. This is the information that Wikipedia’s customers are editing and researching. Second, Wikipedia uses databases to store its indexes and to find and retrieve the information that its customers are looking for. Again, without databases Wikipedia simply would not exist; its business operates entirely on databases.

3. Explain the difference between logical and physical views and why logical views are important to Wikipedia’s customers.
A well-designed database should handle changes quickly and easily, and provide users with different views. Physical views deal with the physical storage of information on a storage device such as a hard disk. Logical views focus on how users logically access information to meet particular business needs. A database has only one physical view and multiple logical views. The separation between logical and physical views is what allows each user to access database information differently. If Wikipedia’s customers had to access physical views of information, they would be confused and find the site difficult to use and understand. The site provides a logical view for each customer’s queries.
1. How many organizations have your personal information, including your Social Security number, bank account numbers, and credit card numbers?
This number will vary by student. Potential holders could include: Banks, Colleges, Credit Card Companies, Stores that Issue Credit, Insurance Companies, Auto Dealerships, Professors (if Social Security Number is used as Student ID), Government Agencies, Loan Applications, Hospitals, Doctor’s Offices, Dentist Offices

2. What information is stored at your college? Is there any chance your information could be hacked and stolen from your college?
All of your personal information is stored at your college, from date of birth to Social Security number. Absolutely, information can be stolen from any organization. Colleges have numerous college students working at different locations across campuses who could easily access personal information. This is one reason many colleges no longer use Social Security numbers as student identification numbers.

3. What can you do to protect yourself from identity theft?
Regularly checking your credit report, and perhaps purchasing identity theft protection services, is one of the best ways to protect yourself from identity theft. Be careful not to give your information to any individual who does not need it, especially via e-mail or telephone calls. Be aware of phishing scams and other ways people might try to steal your information, and buy a shredder for your documents.

4. Do you agree or disagree with changing laws to hold the company where the data theft occurred accountable? Why or why not?
Student answers to this question will vary. The important part of their answer is the justification as to why or why not the company should be held accountable. One question to get your students thinking: should a bank be held liable if a gunman robs the bank? Is this the same type of theft and situation?

5. What impact would holding the company liable where the data theft occurred have on large organizations?
Companies would take greater actions to ensure the safety of customer information.

6. What impact would holding the company liable where the data theft occurred have on small business?
Small businesses would have to spend more money ensuring the safety of customer data, and it might drain resources that are fundamental to keeping the business running.
Learning Outcomes
7.1 Define the fundamental concepts of the relational database model
7.2 Evaluate the advantages of the relational database model
7.3 Compare relational integrity constraints and business-critical integrity constraints
7.4 Describe the benefits of a data-driven Web site
7.5 Describe the two primary methods for integrating information across multiple databases
Relational Database Fundamentals
• Information is everywhere in an organization
• Information is stored in databases
 – Database – maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses)

Relational Database Fundamentals
• Database models include:
 – Hierarchical database model – information is organized into a tree-like structure (using parent/child relationships) in such a way that it cannot have too many relationships
 – Network database model – a flexible way of representing objects and their relationships
 – Relational database model – stores information in the form of logically related two-dimensional tables

Entities and Attributes
• Entity – a person, place, thing, transaction, or event about which information is stored
 – The rows in each table contain the entities
 – In Figure 7.1 CUSTOMER includes Dave’s Sub Shop and Pizza Palace entities
• Attributes (fields, columns) – characteristics or properties of an entity class
 – The columns in each table contain the attributes
 – In Figure 7.1 attributes for CUSTOMER include Customer ID, Customer Name, Contact Name

Keys and Relationships
• Primary keys and foreign keys identify the various entity classes (tables) in the database
 – Primary key – a field (or group of fields) that uniquely identifies a given entity in a table
 – Foreign key – a primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables

Keys and Relationships
• Potential relational database for Coca-Cola
Relational Database Advantages
• Database advantages from a business perspective include
 – Increased flexibility
 – Increased scalability and performance
 – Reduced information redundancy
 – Increased information integrity (quality)
 – Increased information security

Increased Flexibility
• A well-designed database should:
 – Handle changes quickly and easily
 – Provide users with different views
 – Have only one physical view
  • Physical view – deals with the physical storage of information on a storage device
 – Have multiple logical views
  • Logical view – focuses on how users logically access information

Increased Scalability and Performance
• A database must scale to meet increased demand, while maintaining acceptable performance levels
 – Scalability – refers to how well a system can adapt to increased demands
 – Performance – measures how quickly a system performs a certain process or transaction

Reduced Information Redundancy
• Databases reduce information redundancy
 – Redundancy – the duplication of information or storing the same information in multiple places
• Inconsistency is one of the primary problems with redundant information

Increased Information Integrity (Quality)
• Information integrity – measures the quality of information
• Integrity constraint – rules that help ensure the quality of information
 – Relational integrity constraint
 – Business-critical integrity constraint

Increased Information Security
• Information is an organizational asset and must be protected
• Databases offer several security features including:
 – Password – provides authentication of the user
 – Access level – determines who has access to the different types of information
 – Access control – determines types of user access, such as read-only access
Database Management Systems
• Database management systems (DBMS) – software through which users and application programs interact with a database

Data-Driven Web Sites
• Data-driven Web sites – an interactive Web site kept constantly updated and relevant to the needs of its customers through the use of a database

Data-Driven Web Site Business Advantages
• Development
• Content Management
• Future Expandability
• Minimizing Human Error
• Cutting Production and Update Costs
• More Efficient
• Improved Stability

Data-Driven Business Intelligence
• BI in a data-driven Web site

Integrating Information among Multiple Databases
• Integration – allows separate systems to communicate directly with each other
 – Forward integration – takes information entered into a given system and sends it automatically to all downstream systems and processes
 – Backward integration – takes information entered into a given system and sends it automatically to all upstream systems and processes

Integrating Information among Multiple Databases
• Forward integration

Integrating Information among Multiple Databases
• Building a central repository specifically for integrated information
OPENING CASE STUDY QUESTIONS
It Takes a Village to Write an Encyclopedia
1. Identify the different types of entity classes that might be stored in Wikipedia’s database
2. Explain why database technology is so important to Wikipedia’s business model
3. Explain the difference between logical and physical views and why logical views are important to Wikipedia’s customers

CHAPTER SEVEN CASE
Keeper of the Keys
• Almost 90 million people had their personal information stolen or lost by organizations
 – Bank of America: 1.2 million customers
 – CardSystems: 40 million customers
 – Citigroup: 3.9 million customers
 – DSW Shoe Warehouse: 1.4 million customers
 – TJX Companies: 45.6 million customers
 – Wachovia: 676,000 customers

Chapter Seven Case Questions
1. How many organizations have your personal information, including your Social Security number, bank account numbers, and credit card numbers?
2. What information is stored at your college? Is there any chance your information could be hacked and stolen from your college?
3. What can you do to protect yourself from identity theft?
4. Do you agree or disagree with changing laws to hold the company where the data theft occurred accountable? Why or why not?
5. What impact would holding the company liable where the data theft occurred have on large organizations?
6. What impact would holding the company liable where the data theft occurred have on small business?