Week 6-7.ppt

  • File. A group of related records is known as a data file, or table. Files are frequently classified by the application for which they are primarily used, such as a payroll file or an inventory file, or by the type of data they contain, such as a document file or a graphical image file.
  • To understand databases, it is useful to remember that the elements of data that make up a database are divided into hierarchical levels. These logical data elements are the foundation data concepts upon which a database is built.
      Character. The most basic logical element is the character, which consists of a single alphabetic, numeric, or other symbol. While it may take several bits or bytes to represent a character digitally, these refer to physical storage, not to the logical concept of the character itself.
      Field. A field is a grouping of characters that represents a characteristic of a person, place, thing, or event; a person's name is typically placed in a field. A field is a data item: it represents an attribute of some entity.
      Record. A record is a collection of interrelated fields. For example, an employee's payroll record usually contains several fields, such as the employee's name, social security number, department, and salary. Records may be fixed-length or variable-length.
      File. A file is a collection of interrelated records. For example, a payroll file might contain the payroll records for all the employees of a firm. Files are usually classified by the application for which they are used.
      Database. A database is an integrated collection of logically interrelated records or files. For example, the personnel database of a firm might contain payroll, personnel action, and employee skills files. The data stored in a database is independent of the application programs using it and of the type of secondary storage devices on which it is stored.
      Teaching Tips: This slide corresponds to Figure 5.2 on p. 174 and relates to material on pp. 174-175.
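The hierarchy above (character, field, record, file, database) can be sketched in a few lines of Python. This is only an illustration: the class name `EmployeeRecord`, the variable names, and all sample values are assumptions, not part of the course material.

```python
from dataclasses import dataclass, fields


@dataclass
class EmployeeRecord:
    """A record: a collection of interrelated fields."""
    name: str          # each field is a grouping of characters
    ssn: str
    department: str
    salary: float


# A file: a collection of interrelated records (all values are made up).
payroll_file = [
    EmployeeRecord("Ann Lee", "000-00-0001", "Sales", 52000.0),
    EmployeeRecord("Raj Rao", "000-00-0002", "IT", 61000.0),
]

# A database: an integrated collection of logically related files.
personnel_database = {"payroll": payroll_file, "benefits": []}

# Walk back down the hierarchy: field names, then single characters.
field_names = [f.name for f in fields(EmployeeRecord)]
characters_in_first_name = list(payroll_file[0].name)
```

Here a Python list stands in for a file and a dictionary for the database; a real DBMS would manage storage, relationships, and access for you.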
  • Six major types of databases used by computer-based organizations are illustrated on the slide:
      Operational Databases. These databases store detailed data needed to support the operations of the entire organization. They are also called subject area databases (SADB), transaction databases, and production databases. They also include databases of Internet and electronic commerce activity, such as click stream data or data describing the online behavior of visitors at a company's website.
      Data Warehouse Databases. These store data from current and previous years that has been extracted from the various operational and management databases of the organization. As a standardized and integrated central source of data, warehouses can be used by managers for pattern processing, where key factors and trends about operations can be identified from the historical record.
      Data Marts. These are subsets of the data in a data warehouse that focus on specific aspects of a company, such as a department or a business process.
      Distributed Databases. These are the databases of local workgroups and departments at regional offices, branch offices, and other work sites, needed to complete the task at hand. They include relevant information from other organizational databases combined with data and information generated only at the particular site. These databases can reside on network servers, on the World Wide Web, or on intranets and extranets.
      End User Databases. These consist of a variety of data files developed by end users at their workstations. For example, an end user in sales might combine information on a customer's order history with her own notes and impressions from face-to-face meetings to improve follow-up.
      External Databases. Many organizations make use of privately generated and owned online databases or data banks that specialize in a particular area of interest. Access is usually through a subscription for continuing links or a one-time fee for a specific piece of information (like the results of a single search). Other sources, like those found on the Web, are free.
      Teaching Tips: This slide corresponds to Figure 5.10 on p. 181 and relates to material on pp. 180-182.
  • What makes a database different from a Rolodex is the relationships among its entities.
  • The relationships among the records stored in databases are based upon one of several logical database structures or models. These fundamental database structures are described below.
      Hierarchical Structure. Under this tree-like structure, each data element is related to only one element above it, a so-called one-to-many relationship. All records are dependent and arranged in multilevel structures.
      Network Structure. This structure features a many-to-many arrangement whereby the DBMS can access a data element by following one of several paths.
      Relational Structure. This model has become the most popular structure and is used by most microcomputer database packages. All data elements within the database are viewed as being stored in the form of simple tables. The DBMS can link data elements from various tables to provide information to end users.
      Teaching Tips: This slide corresponds to Figure 5.16 on p. 189 and relates to material on pp. 189-190.
  • Object-Oriented Structure. Objects consist of data values describing the attributes of an entity and the operations that can be performed on the data. This is called encapsulation, and it allows object-oriented database structures to better handle complex types of data such as video and audio. The object-oriented model also supports inheritance, allowing new objects to replicate some or all of the characteristics of one or more parent objects, as shown in the slide. Such capabilities allow developers to copy and combine objects, allowing for very rapid development of new database solutions.
      Multidimensional Structure. This structure uses cells within a multidimensional framework to aggregate data related to elements within a given dimension. Each cell combines with similar cells to form a coherent "cube" of information and data, which in turn is combined with other cubes to form dimensions. As a result, these structures are both compact and easy to understand. Multidimensional structures have fast become the most popular database structure for analytical databases that support online analytical processing (OLAP) applications.
      Teaching Tips: This slide corresponds to Figure 5.17 on p. 191 and Figure 5.18 on p. 192 and relates to material on pp. 190-192.
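Encapsulation and inheritance can be illustrated with the bank account objects from the slide. The sketch below is an assumption-laden example, not the textbook's design: the method names, the `annual_rate` parameter, and the monthly interest formula are all invented for illustration.

```python
class BankAccount:
    """Parent object: encapsulates attributes (customer, balance)
    together with the operations allowed on them."""

    def __init__(self, customer, balance=0.0):
        self.customer = customer
        self.balance = balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


class SavingsAccount(BankAccount):
    """Child object: inherits customer, balance, deposit, and withdraw,
    then adds its own attribute and operation."""

    def __init__(self, customer, balance=0.0, annual_rate=0.02):
        super().__init__(customer, balance)
        self.annual_rate = annual_rate  # hypothetical attribute

    def calculate_interest(self):
        # Simple monthly interest; the formula is an assumption.
        return self.balance * self.annual_rate / 12


account = SavingsAccount("Ann Lee", 1200.0, annual_rate=0.06)
account.withdraw(200.0)            # inherited operation
monthly_interest = account.calculate_interest()
```

`SavingsAccount` never redefines `withdraw`; it reuses the parent's operation, which is the inheritance the note describes.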
  • Under the database management approach, data records are consolidated into databases that can be accessed by many different application programs. A database management system (DBMS) is a set of computer programs that control the creation, maintenance, and use of the databases of an organization and its end users. The four major DBMS facilities are:
      Database Development. A DBMS allows control of development to be placed with database administrators. The administrator uses a data definition language (DDL) to develop and specify the data contents, relationships, and structure of each database, and to modify these specifications when necessary. This approach improves integrity and security for the organizational databases. The information is stored in a data dictionary, which uses data definitions to specify what the records and files are and can be and, if desired, to automatically enforce data element definitions when fields, records, or files are modified.
      Database Interrogation. A DBMS allows end users without programming skills to ask for information from a database using a query language or report generator. Queries are usually made in one of two ways:
        SQL (Structured Query Language). This uses the basic form SELECT ... FROM ... WHERE. After SELECT the user lists the data fields to be retrieved. After FROM the user lists the files or tables from which the data must be retrieved. After WHERE the user specifies conditions that limit the search.
        QBE (Query by Example). This method allows users to point and click on display boxes for each of the data fields in one or more files to specify the rules of the search.
      Database Maintenance. Updating the databases and other maintenance are conducted by transaction processing programs.
      Application Development. A DBMS makes application development much easier and quicker by allowing developers to include data manipulation language (DML) statements in their programs that let the DBMS perform necessary data-handling activities.
      Teaching Tips: This slide corresponds to Figure 5.5 on p. 177 and relates to material on pp. 177-180.
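The SELECT ... FROM ... WHERE form described above can be tried with Python's built-in `sqlite3` module. This is a minimal sketch: the query is the one shown on the SQL slide, but the `employee` table layout and every sample row are assumptions made up for the demonstration.

```python
import sqlite3


def employees_over_50():
    """Create a throwaway in-memory database and run the slide's query."""
    conn = sqlite3.connect(":memory:")
    # DDL: define the structure of the (hypothetical) employee table.
    conn.execute("CREATE TABLE employee (name TEXT, age INTEGER, salary REAL)")
    # DML: load a few made-up rows.
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [("Ann", 54, 61000.0), ("Bob", 38, 48000.0), ("Cho", 61, 75000.0)],
    )
    # Interrogation: fields after SELECT, table after FROM, condition after WHERE.
    rows = conn.execute(
        "SELECT name, age, salary FROM employee WHERE age > 50 ORDER BY name"
    ).fetchall()
    conn.close()
    return rows


result = employees_over_50()
```

With the sample rows above, only Ann and Cho satisfy the WHERE condition. The `ORDER BY` clause is an addition so that the result order is predictable.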
  • Database planning, beyond that of a personal or small business end user database created by a database management package, typically requires a top-down data planning process based upon the systems development model covered earlier:
      1. Data Planning. At this stage, planners develop a model of business processes. This results in an enterprise model of business processes with documentation.
      2. Requirements Specification. This stage defines the information needs of end users in a business process. The description of needs may be provided in natural language or using the tools of a particular design methodology.
      3. Conceptual Design. This stage expresses all information requirements in the form of a high-level model.
      4. Logical Design. This stage translates the conceptual model into the data model of a DBMS.
      5. Physical Design. This stage determines the data storage structures and access methods.
      Teaching Tips: This slide corresponds to Figure 5.22 on p. 197 and relates to material on pp. 196-198.
  • A data warehouse stores data that has been extracted from various operational, external, and other databases within the organization. To create a data warehouse, data from various databases are captured, cleaned (e.g., sorted, filtered, and converted), and transformed into data that can be better used for analysis. The data is then stored in the enterprise data warehouse, from where it can be moved into data marts or to an analytical data store that holds data to support certain types of analysis. Metadata, which defines the data in the data warehouse, is stored in a metadata directory that is used to support data administration. A variety of analytical software tools can then be used to query, report on, and analyze the data. One such means of analyzing data in a data warehouse is called data mining. In data mining, the data in a data warehouse are analyzed to reveal hidden patterns and trends in historical business activity. This can help managers make decisions about strategic changes in business operations.
      Teaching Tips: This slide corresponds to Figure 5.11 on p. 182 and relates to material on pp. 182-183.
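The cleaning step (filter, convert, sort) can be pictured with a small function. This is a rough sketch under stated assumptions: the function name, the `region` and `amount` field names, and the sample rows are hypothetical, and real warehouse loading tools do far more.

```python
def clean_for_warehouse(raw_rows):
    """Filter out incomplete rows, convert values to a standard form,
    and sort the result before loading."""
    cleaned = []
    for row in raw_rows:
        amount = row.get("amount")
        if amount in (None, ""):
            continue  # filter: drop rows with no amount
        cleaned.append({
            "region": row["region"].strip().upper(),  # convert: one spelling per region
            "amount": float(amount),                  # convert: text to number
        })
    # sort: by region, then by amount
    return sorted(cleaned, key=lambda r: (r["region"], r["amount"]))


# Made-up rows as they might arrive from different operational databases.
raw = [
    {"region": " west ", "amount": "120.50"},
    {"region": "East", "amount": None},
    {"region": "east", "amount": "75"},
]
warehouse_rows = clean_for_warehouse(raw)
```

After cleaning, the incomplete East row is gone and the remaining rows share one format, which is what makes later analysis and data mining practical.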
  • The rapid growth of websites on the Internet and corporate intranets and extranets has dramatically increased the use of databases of hypertext and hypermedia documents.
      Hypermedia Database: A website stores information in a hypermedia database consisting of a home page and other hyperlinked pages of multimedia or mixed media (text, graphics and photographic images, video clips, audio segments, and so on).
      Browser: A web browser on your client PC is used to connect with a web network server. This server runs web software to access and transfer the web pages you request.
      Web Site: A website uses a hypermedia database consisting of HTML (Hypertext Markup Language) pages, GIF (Graphics Interchange Format) image files, and video files.
      Web Server Software: Acts as a database management system to manage the use of the interrelated hypermedia pages of the website.
      Teaching Tips: This slide corresponds to Figure 5.13 on p. 184 and relates to material on pp. 183-184.
  • The security and integrity of an organization's databases are the major concerns of data resource management efforts. Key activities of data resource management include:
      Database Administration. This area is responsible for developing and maintaining the organization's data dictionary, as well as designing and monitoring the performance of databases, and enforcing the standards for database use and security.
      Data Planning. Data planning is a corporate planning and analysis function responsible for the overall data architecture of the firm. This role ensures that data resources are developed to support the firm's strategic mission and plans.
      Data Administration. This area involves the establishment and enforcement of policies and procedures for managing data as a strategic corporate resource. This means standardizing data so that it is available to all end users from whatever database they are working in.
  • Efficient access to data is a critical necessity of an effective database system. Key concepts and terms associated with file access include:
      Key Fields. A key field is a unique identifier of the data record.
      URLs. The files and databases on the Internet and corporate intranets and extranets use URLs (Uniform Resource Locators) for data access. Thus, pages of hyperlinked text and multimedia documents on the Web and on intranet/extranet websites are accessed by URLs.
      Sequential Organization. This refers to a structure in which the records are physically stored in a specified order according to a key field in each record.
      Sequential Access. This refers to a predetermined order of processing data. Each record is accessed according to the same set of commands. Access begins at the start of the file and proceeds, in order, to the end. This is a fast and efficient method for processing large volumes of data.
      Direct Access. Under this method, records do not have to be arranged in any particular sequence on storage media; however, the computer must keep track of the storage location of each record.
      Key Transformation. This common direct access technique performs an arithmetic computation on a key field of a record and uses the number that results as an address to store and access that record.
      Indexed Sequential Access Method. This approach combines features of both sequential and direct access. Sequential storage allows for large-volume processing, while indexed addressing allows direct access to smaller amounts of data from individual records.
      Teaching Tips: This slide relates to material on pp. 193-196.
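Key transformation can be sketched as follows. The note only says "an arithmetic computation"; the division-remainder computation used here is one common choice and is an assumption, as are the class name, the slot count of 101, and the sample keys.

```python
class DirectAccessFile:
    """Toy direct-access file: a key field is turned into a storage address."""

    def __init__(self, num_slots=101):
        self.num_slots = num_slots
        # Each slot holds a small list so that two keys mapping to the
        # same address (a collision) can both be stored.
        self.slots = [[] for _ in range(num_slots)]

    def address_of(self, key):
        # Key transformation: remainder after dividing by the slot count.
        return key % self.num_slots

    def store(self, key, record):
        bucket = self.slots[self.address_of(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, record)  # update in place
                return
        bucket.append((key, record))

    def fetch(self, key):
        for existing_key, record in self.slots[self.address_of(key)]:
            if existing_key == key:
                return record
        return None


employee_file = DirectAccessFile()
employee_file.store(1001, "Ann Lee")
employee_file.store(1102, "Raj Rao")   # same address as 1001: a collision
```

`fetch` goes straight to the computed address instead of reading the file from the start, which is the difference between direct and sequential access.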

    1. 1. CIS 1. Ch 10: Database; Ch 11: Programming. Perry Mehta, Solano Community College
    2. 2. What is a Database? Database: An integrated collection of logically related files. (Some databases contain only a single file.) [Figure labels: Employees, Clients, Vendors]
    3. 3. Purpose of Database Management
         • Define organizations, procedures, and analytical methods
         • Allow us to master the richer flow, faster pace, and huge volume of data and information, and to find better, not just faster, analysis and decision-making procedures
    4. 4. Information Management
         • Information Management is the process of acquiring mountains of data and quickly translating them into knowledge and understanding.
         [Figure captions, in slide order:]
         • Achieved by using judgment to give knowledge relevance within a specific situational context.
         • Representation of facts, concepts, or instructions in a formalized manner suitable for communication, interpretation, or processing by humans or by automatic means.
         • Data placed in a subjective context, or data colored by perception.
         • Information tested and accepted as factual, through cognition, through assessment or testing to prove the information, or by acceptance of the information as factual.
    5. 5. Logical Data Elements [Figure: a Personnel Database containing a Payroll File and a Benefits File; the files hold Employee Records 1 to 4, each with Name, SS, and Salary fields filled with data.]
    6. 6. Major Types of Databases [Figure: a Client PC or NC and a Network Server connected to External Databases on the Internet & Online Services, Operational Databases of the Organization, a Data Warehouse, a Data Mart, End User Databases, and Distributed Databases on Intranets & Other Networks.]
    7. 7. Data Dictionary File
         • Data about the database: Metadata
             • List of the files in the database
             • Number of records in each file
             • Field names and field attributes
             • Other similar information
         • No database data; only bookkeeping information for database management
         • Normally hidden from users for security and integrity
    8. 8. Data Model User’s View of Data
    9. 9. Database – Entities and Relationships
    10. 10. Database Terms - Record
         • Record: A collection of interrelated fields
             • Fixed length records
             • Variable length records
         [Figure: a record form with NAME, ADDR, CITY, STATE, PHONE ( ) -, and ZIP fields]
    11. 11. Database Terms - Field
         • Field: A grouping of characters of data (There are nine fields in this example.)
         [Figure: a record form with NAME, ADDR, CITY, STATE, PHONE ( ) -, and ZIP fields]
    12. 12. Database Terms - Field Attributes
         Attributes of a Field
         • Alphabetic, numeric, or both
         • Number of characters
         • Uppercase, lowercase, 1st letter capital, etc.
         • Number of decimal places, dollar format, etc.
         • Justification, color, and many others
         [Figure: a record form with NAME, ADDR, CITY, STATE, PHONE ( ) -, and ZIP fields]
    13. 13. Database Distribution Types [Figure: three example structures. Hierarchical Structure: a Dept above Project A and Project B, with employees below the projects. Network Structure: Dept A and Dept B, Employees 1 to 3, and Projects A and B linked along several paths. Relational Structure: a department table with columns Dept, Dname, Dloc, Dmgr and an employee table with columns Empno, Ename, Etitle, Dept.]
    14. 14. Database Distribution (cont)
         Object-Oriented Database Structure
         • Bank Account Object. Attributes: Customer, Balance. Operations: Deposit, Withdraw.
         • Checking Account Object. Attributes: Credit Line, Mthly Statement. Operations: Calculate Interest, Print Mthly Statement.
         • Savings Account Object. Attributes: Credit Line, Mthly Statement. Operations: Calculate Interest, Print Mthly Statement.
         Multidimensional Database Structure
         [Figure: data cube with labels East, West, Denver; Feb; Actual, Budget; Margin, Sales; TV, VCR]
    15. 15. Object-Oriented Databases
         • Finding increasing use in managing hypermedia for the Web
         • Supports complex data types and applets
             • Documents
             • Graphic images
             • Video clips
             • Audio segments
             • Other subsets of Web pages
         • Major software vendors are adding object-oriented support to their relational database offerings
    16. 16. Database Management Systems
         • Database Development
         • Database Interrogation
         • Database Maintenance
         • Application Development
         [Figure: Database Management, showing the Operating System, Database Management System, Application Programs, Databases, and Data Dictionary]
    17. 17. Structured Query Language (SQL)
         • 1974: Structured English Query Language (SEQUEL) by IBM; 1979: SQL by Oracle
         • "Standard" database query language
         • Simple commands generate both on-screen and printed reports
         Very simple text-based example:
             SELECT name, age, salary FROM employee WHERE age > 50;
    18. 18. Structured Query Language (SQL)
         • Graphical queries provide drag and drop and drop-down menus to select options
             FIELD  OPER  VALUE  REL
             AGE    GT    30     AND
             ZIP    (operator drop-down: EQ, GT, GE, LT, LE, NE)
    19. 19. Structured Query Language (SQL)
         • Graphical queries provide drag and drop and drop-down menus to select options
             FIELD  OPER  VALUE  REL
             AGE    GT    30     AND
             ZIP    GE    98000  AND
             ZIP    LE    98010
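One way to picture what a graphical query tool does with the grid above is to translate each row into part of a SQL WHERE clause. The sketch below is an assumption about how such a tool might work, not a description of any particular product; the `person` table, the function names, and the sample people are hypothetical.

```python
import sqlite3

# Grid operator codes from the slide mapped to SQL comparison operators.
OPERATORS = {"EQ": "=", "GT": ">", "GE": ">=", "LT": "<", "LE": "<=", "NE": "<>"}
ALLOWED_FIELDS = {"age", "zip"}  # only known column names may enter the SQL text


def build_where(grid_rows):
    """Turn (FIELD, OPER, VALUE, REL) rows into a WHERE clause plus parameters."""
    parts, params = [], []
    for i, (field, oper, value, rel) in enumerate(grid_rows):
        column = field.lower()
        if column not in ALLOWED_FIELDS:
            raise ValueError(f"unknown field: {field}")
        parts.append(f"{column} {OPERATORS[oper]} ?")
        params.append(value)
        if i < len(grid_rows) - 1:          # REL joins this row to the next one
            if rel not in ("AND", "OR"):
                raise ValueError(f"unknown connector: {rel}")
            parts.append(rel)
    return " ".join(parts), params


def run_query(grid_rows, people):
    where, params = build_where(grid_rows)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT, age INTEGER, zip INTEGER)")
    conn.executemany("INSERT INTO person VALUES (?, ?, ?)", people)
    rows = conn.execute(
        f"SELECT name FROM person WHERE {where} ORDER BY name", params
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]


SLIDE_GRID = [("AGE", "GT", 30, "AND"), ("ZIP", "GE", 98000, "AND"), ("ZIP", "LE", 98010, None)]
PEOPLE = [("Ann", 45, 98005), ("Bob", 25, 98005), ("Cho", 50, 98020), ("Dee", 31, 98010)]
matches = run_query(SLIDE_GRID, PEOPLE)
```

Values are passed as parameters rather than pasted into the SQL text, and field names are checked against a fixed list. If a grid mixes AND and OR, SQL evaluates AND first, so a real tool would need to add parentheses.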
    20. 20. Database Development [Figure: the five stages 1. Data Planning, 2. Requirements Specifications, 3. Conceptual Design, 4. Logical Design, 5. Physical Design, with the outputs User Needs Description, Enterprise Model, Data Models, Logical Models, and Physical Models.]
    21. 21. P&G uses data mining to cut animal testing (Gary H. Anthes. Computerworld. Framingham: Dec 6, 1999. Vol. 33, Iss. 49; pg. 44, 2 pgs)
         • In June 1999, Procter & Gamble Co. announced it was immediately ending the use of animals for testing several hundred beauty, fabric, home-care, and paper products. P&G has eliminated some 80% of its animal testing since 1984 while tripling in size.
         • Some of the reduction is from the substitution of human and animal cell cultures. But most of it comes from the use of data mining, analysis, and modeling.
         • The key is to use huge databases of information about existing chemicals and past tests to predict whether a new product ingredient will be safe: testing it in software instead of on animals.
    22. 22. Data Warehouse and Data Mining [Figure: Operational Databases feed a Data Acquisition Subsystem and a Data Management Subsystem that load the Enterprise Warehouse, Data Mart, and Analytical Data Store; a Data Access and Delivery Subsystem serves a Client PC or NC and a Web Information System; a Warehouse Design Subsystem and a Metadata Management Subsystem maintain the Metadata Directory and Metadata Repository.]
    23. 23. Web-Based Systems [Figure: Client PCs or NCs running a Web Browser connect over the Internet, Intranets, and Extranets to a Network Server running Web Server Software, which manages Web Objects: HTML pages, GIF image files, and video files.]
    24. 24. Database Administration
         • Planning
         • Design
         • Creation
         • Maintenance
         • Analysis of usage
         • Security
    25. 25. Data Resource Management: Data Administration, Data Planning, Database Administration
    26. 26. Data Management Concepts
         • Data Transformation
         • Data Accreditation
         • Data Organization
         • Data Extraction
         • Data Access
         • Data Exploitation
         • Knowledge
         • Information
         Let's use a tire store to illustrate these concepts!
    27. 27. Data Transformation
         • Data transformation:
             • The process and algorithms used to convert data from one format to another, e.g., the conversion of kilograms to pounds by multiplying kilograms by the constant 2.2046.
         • In the tire store:
             • The owner's manual for a European car may specify tire pressure in kilograms per square centimeter. To provide the correct tire for the customer, this measurement needs to be converted to pounds per square inch.
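Both conversions on the slide can be written as short functions. This is a small sketch: the function names are made up, the 2.2046 constant comes from the slide, and the pressure conversion additionally assumes 2.54 centimeters per inch.

```python
LB_PER_KG = 2.2046     # constant given on the slide
CM_PER_INCH = 2.54     # exact by definition of the inch


def kg_to_lb(kilograms):
    """Convert a mass in kilograms to pounds."""
    return kilograms * LB_PER_KG


def kg_per_cm2_to_psi(pressure):
    """Convert a pressure in kilograms per square centimeter to pounds
    per square inch: convert the mass unit, then the area unit."""
    return pressure * LB_PER_KG * CM_PER_INCH ** 2


# Hypothetical owner's manual value of 2.2 kg per square centimeter.
tire_pressure_psi = kg_per_cm2_to_psi(2.2)
```

One square inch is 2.54 x 2.54 square centimeters, so each kilogram per square centimeter works out to roughly 14.2 pounds per square inch, and the hypothetical 2.2 figure comes to about 31 psi.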
    28. 28. Knowledge
         • Knowledge:
             • The analysis of information to learn, discover, or derive relationships.
             • The analysis of information to create rules to organize the information and create new information.
         • In the tire store:
             • If we know that 60% of customers who buy new tires will also buy mud flaps, we can estimate how much stock in mud flaps is needed based on projected tire sales.
    29. 29. Data Accreditation
         • Data Accreditation:
             • The granting of approval to the definition of data items.
             • The concept that the owner of the data is responsible for the data, e.g., the definition of data items, the format of data items, and the organization of data items.
         • In the tire store:
             • The store manager will decide what type of information should be gathered from customers about their purchases, etc.
    30. 30. Data Organization
         • Data Organization:
             • The process and algorithms used to create physical or logical structures for data.
             • Physical structures determine the location of each data item and its location relative to other data items in the storage media, e.g., on a disk.
             • Logical structures use metadata to create and maintain the logical relationships of data, e.g., a person's name typically consists of a first name, a middle name, and a last name.
             • Each part of the name would be physically stored as a separate data item.
             • The metadata would recognize the name as the first, middle, and last names combined.
         • In the tire store:
             • Using binders, tabs, and indexes to store and organize the various item prices is data organization.
    31. 31. Data Extraction
         • Data Extraction:
             • The process, search algorithms, and other techniques used to draw out useful information from a very large set of information or data.
             • The use of the Dewey decimal system or algorithms to find a particular book in a library, or the use of computer index files to locate a record quickly in a file.
         • In the tire store:
             • When the sales clerk uses the computer to find the price of an item, that is data extraction.
    32. 32. Data Access
         • Data Access:
             • The process of locating and retrieving data in response to some request, or the concern with the ready availability of access to data.
         • In the tire store:
             • When the sales clerk looks up the price of an item requested by the customer, he is accessing data.
    33. 33. Accessing Files and Databases: Key Fields, URLs, Sequential Organization, Sequential Access, Direct Access, Key Transformation, Indexed Sequential Access Method
    34. 34. Data Exploitation
         • Data Exploitation:
             • The concept of using data for strategic advantage.
             • Corporations and government agencies (enterprises) have enormous databases, which can be exploited.
             • Using the data in them for the enterprise's advantage is data exploitation, i.e., answering the question: how can we exploit the data we have to our advantage?
         • In the tire store:
             • Using the information the customer gave about her name and address to send her future sale information would be data exploitation.
    36. 36. Impact on Acquiring Information Technology
         • System performance requirements
         • Program management and acquisition
         • Operation and maintenance of data in systems
    37. 37. Impact on Acquiring Information Technology: System Performance Requirements
         • If the performance requirements are set higher than required to meet the user's requirements, both hardware and software costs increase.
         • Learning Team Exercise
             • If we state that the system must be able to process 100,000 transactions of a certain type per minute, this may be translated into a requirement to buy 10 hardware platforms that must each process 10,000 transactions per minute.
             • What will happen if the real requirement is 50,000 transactions per minute?
             • We really need to buy only 5 platforms; in other words, we have bought twice as much processing capability as we need.
         • Key Point
             • If we require that a system respond immediately (in less than 1 second) when a 10-minute response time is good enough, we are likely to increase both our hardware and software costs.
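The arithmetic in the learning team exercise can be checked with a short calculation. The function name is made up, and the sketch assumes every platform handles the same number of transactions per minute, as the slide does.

```python
def platforms_needed(transactions_per_minute, capacity_per_platform):
    """Smallest whole number of platforms that covers the workload
    (integer ceiling division, so a partial platform counts as one)."""
    return -(-transactions_per_minute // capacity_per_platform)


stated = platforms_needed(100_000, 10_000)   # requirement as written
actual = platforms_needed(50_000, 10_000)    # requirement actually needed
overbuy_factor = stated / actual
```

With the slide's numbers, the stated requirement calls for 10 platforms and the real one for 5, so an overstated requirement doubles the hardware purchased.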
    38. 38. System Performance Requirements: Response Time
         • If the requirement is set too low, for example a one-minute response time is specified when a one-tenth-of-a-second response time is needed, then the users do not get the support that they need.
         • Data management system requirements must be set as high as is needed to meet the users' needs, but not higher.
         • Learning Team
             • Brainstorm some examples where a database design is sensitive to response time.
    39. 39. System Performance Requirements: Redundancy
         • Negative effect on system performance.
         • If each office in an organization keeps its own copy of all the data files, then each office has to update, back up, and otherwise maintain the data.
         • This duplication of effort is expensive and introduces more errors, which in turn are expensive.
         • Learning Team
             • What are some other examples?
    40. 40. System Performance Requirements: Data Maintenance & Data Refresh Requirements
         • Data represent facts about the real world.
         • As facts in the real world change, they must be captured and used to update the data in the database.
         • For example, if a requirement states that the data in the database be no more than 7 days old, then the data must be refreshed at least every 7 days.
         Now let's visit www.fetch.com
    41. 41. References
         • Celano, J. (2004). Personal communication.
