• Like
  • Save
Lectura Capitulo 5. Sistemas de Información Gerencial, James O´Brien
Upcoming SlideShare
Loading in...5
×

Lectura Capitulo 5. Sistemas de Información Gerencial, James O´Brien

  • 45,842 views
Uploaded on

 

More in: Technology
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
    Be the first to like this
No Downloads

Views

Total Views
45,842
On Slideshare
0
From Embeds
0
Number of Embeds
2

Actions

Shares
Downloads
0
Comments
0
Likes
0

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. Management ChallengesCHAPTER 5 Business Applications Module II Information Technologies Development Foundation Processes ConceptsDATA RESOURCE MANAGEMENTChapter Highlights Learning ObjectivesSection I After reading and studying this chapter, you shouldTechnical Foundations of Database Management be able to:Real World Case: Harrah’s Entertainment and Others: 1. Explain the business value of implementing dataProtecting the Data Jewels resource management processes and technologiesDatabase Management in an organization.Fundamental Data Concepts 2. Outline the advantages of a database managementDatabase Structures approach to managing the data resources of aDatabase Development business, compared to a file processing approach.Section II 3. Explain how database management software helpsManaging Data Resources business professionals and supports the operationsReal World Case: Emerson and Sanofi: Data Stewards and management of a business.Seek Data ConformityData Resource Management 4. Provide examples to illustrate each of theTypes of Databases following concepts:Data Warehouses and Data Mining a. Major types of databases.Traditional File Processing b. Data warehouses and data mining.The Database Management Approach c. Logical data elements.Real World Case: Acxiom Corporation: Data d. Fundamental database structures.Demands Respect e. Database development. 149
  • 2. 150 ● Module II / Information Technologies SECTION I Technical Foundations of Database ManagementDatabase Just imagine how difficult it would be to get any information from an information sys- tem if data were stored in an unorganized way, or if there were no systematic way toManagement retrieve them. Therefore, in all information systems, data resources must be organized and structured in some logical manner so that they can be accessed easily, processed ef- ficiently, retrieved quickly, and managed effectively. Data structures and access meth- ods ranging from simple to complex have been devised to efficiently organize and access data stored by information systems. In this chapter, we will explore these con- cepts, as well as the managerial implications and value of data resource management. See Figure 5.1. Read the Real World Case on data resources in the casino gaming and hospitality industry. We can learn a lot from this case about the importance of protecting the data resources of the organization.Fundamental Before we go any further, let’s discuss some fundamental concepts about how data are organized in information systems. A conceptual framework of several levels of data hasData Concepts been devised that differentiates between different groupings, or elements, of data. Thus, data may be logically organized into characters, fields, records, files, and data- bases, just as writing can be organized in letters, words, sentences, paragraphs, and documents. Examples of these logical data elements are shown in Figure 5.2.Character The most basic logical data element is the character, which consists of a single alpha- betic, numeric, or other symbol. You might argue that the bit or byte is a more ele- mentary data element, but remember that those terms refer to the physical storage elements provided by the computer hardware, discussed in Chapter 3. Using that un- derstanding, one way to think of a character is that it is a byte used to represent a par- ticular character. From a user’s point of view (that is, from a logical as opposed to a physical or hardware view of data), a character is the most basic element of data that can be observed and manipulated.Field The next higher level of data is the field, or data item. A field consists of a grouping of related characters. For example, the grouping of alphabetic characters in a person’s name may form a name field (or typically, last name, first name, and middle initial fields), and the grouping of numbers in a sales amount forms a sales amount field. Specifically, a data field represents an attribute (a characteristic or quality) of some entity (object, person, place, or event). For example, an employee’s salary is an attribute that is a typical data field used to describe an entity who is an employee of a business. Generally speaking, fields are organized such that they represent some logi- cal order. For example, last_name, first_name, address, city, state, zipcode, and so on.Record All of the fields used to describe the attributes of an entity are grouped to form a record. Thus, a record represents a collection of attributes that describe an entity. An example is a person’s payroll record, which consists of data fields describing attributes such as the person’s name, Social Security number, and rate of pay. Fixed-length records contain a fixed number of fixed-length data fields. Variable-length records contain a variable number of fields and field lengths. Another way of looking at a record is that it represents a single instance of an entity. Each record in an employee file describes one specific employee.File A group of related records is a data file, or table. Thus, an employee file would contain the records of the employees of a firm. Files are frequently classified by the application
  • 3. Chapter 5 / Data Resource Management ● 151 1REAL WORLD Harrah’s Entertainment and Others: CASE Protecting the Data JewelsI n the casino industry, one of the most valuable assets is the dossier that casinos keep on their affluent customers, the high rollers. But in 2003, casino operator Harrah’s Enter-tainment Inc. filed a lawsuit in Placer County, California,Superior Court charging that a former employee had copied lists. Through these documents, employees “acknowledge that they will be introduced to this information and agree not to disclose it on departure from the company,” says Suzanne Labrit, a partner at law firm Shutts & Bowen LLP in West Palm Beach, Florida.the records of up to 450 wealthy customers before leaving Although most states have enacted trade-secrets laws,the company to work at competitor Thunder Valley Casino Labrit says they have different attitudes about enforcing thesein Lincoln, California. laws with regard to customer lists. “But as a starting point, at The complaint said the employee was seen printing the least you have this understanding [with employees] that thelist—which included names, contact information, and credit customer information is being treated as confidential,” Labritand account histories—from a Harrah’s database. It also says. Then if an employee leaves to work for a competitor andalleged that he tried to lure those players to Thunder Valley. uses this protected customer data, the employer will moreThe employee denies the charge of stealing Harrah’s trade likely be able to take legal action to stop the activity. “If yousecrets, and the case was still pending at this writing, but don’t treat it as confidential information internally,” she says,many similar cases have been filed in the past 20 years, legal “the court will not treat it as confidential information, either.”experts say. It’s also important to educate employees about the While savvy companies are using business intelligence confidentiality of customer lists, because many peopleand customer relationship management systems to identify wrongly assume they’re public information, says Timtheir most profitable customers, there’s a genuine danger Headley, a partner at the Houston law firm of Gardereof that information falling into the wrong hands. Broader Wynne Sewell LLP. “Most people think they can take theaccess to those applications and the trend toward employees lists with them,” he says. “You have to show that you’veswitching jobs more frequently have made protecting cus- kept it a secret and told employees it’s a valuable secret.tomer lists an even greater priority. [Customer lists] are at the core of how you bring revenue Fortunately, there are managerial, legal, and technologi- into the company. These are the decision-makers who arecal steps you can take to help prevent, or at least discourage, willing to buy your product.”departing employees from walking out the door with this From a management and process standpoint, organiza-vital information. tions should try to limit access to customer lists to only For starters, organizations should make sure that certain employees, such as sales representatives, who need theemployees—particularly those with frequent access to cus- information to do their jobs. “If you make it broadly avail-tomer information—sign nondisclosure, noncompete, and able to employees, then it’s not considered confidential,” saysnonsolicitation agreements that specifically mention customer Labrit. Physical security should also be considered, Labrit says.FIGURE 5.1 Visitors such as vendors shouldn’t be permitted to roam free in the hallways or into conference rooms. And security poli- cies, such as a requirement that all computer systems have strong password protection, should be strictly enforced. Companies should instantly shut down access to com- puters and networks when employees leave, whether the rea- son is a layoff or a move to a new job. At the exit interview, the employee should be reminded of any signed agreements and corporate policies regarding customer lists and other confidential information. Employees should be told to turn over anything, including data that belongs to the company. In addition, employers should track the activities of em- ployees who’ve given notice but will be around for a while. This includes monitoring systems to see if the employee is e-mailing company-owned documents outside the company. Some organizations rely on technology to help prevent While data management is a strategic initiative in the loss of customer lists and other critical data. Inflow Inc., every modern organization, those in the gaming a Denver-based provider of managed Web hosting services, industry believe their success lies in the protection uses a product from Opsware Inc. in Sunnyvale, California, and strategic management of their data resources. that lets managers control access to specific systems, such as databases, from a central location.Source: Jose Luis Palaez, Inc./Corbis.
  • 4. 152 ● Module II / Information Technologies The company also uses an e-mail-scanning service that Vijay Sonty, chief technology officer at advertising firmallows it to analyze messages that it suspects might contain Foote Cone & Belding Worldwide in New York, says losingproprietary files, says Lenny Monsour, general manager customer information to competitors is a growing concern,of application hosting and management. Inflow combines particularly in industries where companies go after many ofthe use of this technology with practices such as monitoring the same clients.employees who have access to data considered vital to the “We have a lot of account executives who are very closecompany. to the clients and have access to client lists,” Sonty says. “If A major financial services provider is using a firewall an account executive leaves to join a competitor, he can takefrom San Francisco-based Vontu Inc. that monitors out- all this confidential information.” The widespread sharing ofbound e-mail, Webmail, Web posts, and instant messages to corporate data, such as customer contact information, hasensure that no confidential data leave the company. The made it easier for people to do their jobs, but it has alsosoftware includes search algorithms and can be customized increased the risk of losing confidential data, Sonty says.to automatically detect specific types of data such as lists on He says the firm, which mandates that some employeesa spreadsheet or even something as granular as a customer’s sign noncompete agreements, is looking into policies andSocial Security number. The firm began using the product guidelines regarding the proper use of customer informa-after it went through layoffs in 2000 and 2001. tion, as well as audit trails to see who’s accessing customer “Losing customer information was a primary concern of lists. “I think it makes good business sense to take precau-ours,” says the firm’s chief information security officer, who tions and steps to prevent this from happening,” Sonty says.asked to not be identified. “We were concerned about people “We could lose a lot of money if key people leave.”leaving and sending e-mail to their home accounts.” In fact,he says, before using the firewall, the company had troublewith departing employees taking intellectual property and Source: Adapted from Bob Violino, “Protecting the Data Jewels: Valuableusing it in their new jobs at rival firms, which sometimes led Customer Lists,” Computerworld, July 19, 2004. Copyright © 2004 byto lawsuits. Computerworld Inc., Framingham, MA 01701. All rights reserved. CASE STUDY QUESTIONS REAL WORLD ACTIVITIES 1. Why have developments in IT helped to increase the 1. Companies are increasingly adopting a position that value of the data resources of many companies? data is an asset that must be managed with the same 2. How have these capabilities increased the security chal- level of attention as that of cash and other capital. lenges associated with protecting a company’s data Using the Internet, see if you can find examples of how resources? companies treat their data. Does there seem to be any relationship between companies that look at their data 3. How can companies use IT to meet the challenges of as an asset and companies that are highly successful in data resource security? their respective industries?The Real World Case above illustrates how valuable data re- estimated that any firm in the financial industry would havesources are to the casino industry. Break into small groups a life expectancy of less than 100 hours if they were placed inwith your classmates, and discuss other industries where a position where they could not access their organizationaltheir data are clearly their lifeblood. For example, it has been data. Do you agree with this estimate?
  • 5. Chapter 5 / Data Resource Management ● 153FIGURE 5.2 Examples of the logical data elements in information systems. Note especially the examples of howdata fields, records, files, and databases are related. Human Resource Database Payroll File Benefits File Employee Employee Employee Employee Record 1 Record 2 Record 3 Record 4 Name SS No. Salary Name SS No. Salary Name SS No. Insurance Name SS No. Insurance Field Field Field Field Field Field Field Field Field Field Field FieldJones T. A. 275-32-3874 20,000 Klugman J. L. 349-88-7913 28,000 Alvarez J.S. 542-40-3718 100,000 Porter M.L. 617-87-7915 50,000 for which they are primarily used, such as a payroll file or an inventory file, or the type of data they contain, such as a document file or a graphical image file. Files are also classified by their permanence, for example, a payroll master file versus a payroll weekly transac- tion file. A transaction file, therefore, would contain records of all transactions occur- ring during a period and might be used periodically to update the permanent records contained in a master file. A history file is an obsolete transaction or master file retained for backup purposes or for long-term historical storage called archival storage.Database A database is an integrated collection of logically related data elements. A database consolidates records previously stored in separate files into a common pool of data elements that provides data for many applications. The data stored in a database are independent of the application programs using them and of the type of storage devices on which they are stored. Thus, databases contain data elements describing entities and relationships among entities. For example, Figure 5.3 outlines some of the entities and relationships in aFIGURE 5.3Some of the entities and Electric Utility Databaserelationships in a simplifiedelectric utility database.Note a few of the business Billing Paymentapplications that access the Entities: processingdata in the database. Customers, meters, bills, payments, meter readings Meter Service Relationships: reading start / stop Bills sent to customers, customers make payments, customers use meters, . . . Source: Adapted from Michael V. Mannino, Database Application Development and Design (Burr Ridge, IL: McGraw-Hill/Irwin, 2001), p. 6.
  • 6. 154 ● Module II / Information Technologies database for an electric utility. Also shown are some of the business applications (billing, payment processing) that depend on access to the data elements in the database.Database The relationships among the many individual data elements stored in databases are based on one of several logical data structures, or models. Database management sys-Structures tem packages are designed to use a specific data structure to provide end users with quick, easy access to information stored in databases. Five fundamental database struc- tures are the hierarchical, network, relational, object-oriented, and multidimensional models. Simplified illustrations of the first three database structures are shown in Figure 5.4.Hierarchical Early mainframe DBMS packages used the hierarchical structure, in which the rela-Structure tionships between records form a hierarchy or treelike structure. In the traditional hierarchical model, all records are dependent and arranged in multilevel structures,FIGURE 5.4 Hierarchical StructureExample of three Departmentfundamental database Data Elementstructures. They representthree basic ways todevelop and express therelationships among the Project A Project Bdata elements in a database. Data Element Data Element Employee 1 Employee 2 Data Element Data Element Network Structure Department A Department B Employee Employee Employee 1 2 3 Project Project A B Relational Structure Department Table Employee Table Deptno Dname Dloc Dmgr Empno Ename Etitle Esalary Deptno Dept A Emp 1 Dept A Dept B Emp 2 Dept A Dept C Emp 3 Dept B Emp 4 Dept B Emp 5 Dept C Emp 6 Dept B
  • 7. Chapter 5 / Data Resource Management ● 155 consisting of one root record and any number of subordinate levels. Thus, all of the relationships among records are one-to-many, since each data element is related to only one element above it. The data element or record at the highest level of the hierarchy (the department data element in this illustration) is called the root element. Any data element can be accessed by moving progressively downward from a root and along the branches of the tree until the desired record (for example, the employee data element) is located.Network Structure The network structure can represent more complex logical relationships and is still used by some mainframe DBMS packages. It allows many-to-many relationships among records; that is, the network model can access a data element by following one of several paths, because any data element or record can be related to any number of other data elements. For example, in Figure 5.4, departmental records can be related to more than one employee record, and employee records can be related to more than one project record. Thus, you could locate all employee records for a particular department, or all project records related to a particular employee.Relational Structure The relational model is the most widely used of the three database structures. It is used by most microcomputer DBMS packages, as well as by most midrange and mainframe systems. In the relational model, all data elements within the database are viewed as being stored in the form of simple two-dimensional tables sometimes referred to as relations. The tables in a relational database have rows and columns. Each row repre- sents a single record in the file, and each column represents a field. Figure 5.4 illustrates the relational database model with two tables representing some of the relationships among departmental and employee records. Other tables, or rela- tions, for this organization’s database might represent the data element relationships among projects, divisions, product lines, and so on. Database management system pack- ages based on the relational model can link data elements from various tables to provide information to users. For example, a manager might want to retrieve and display an employee’s name and salary from the employee table in Figure 5.4, and the name of the employee’s department from the department table, by using their common department number field (Deptno) to link or join the two tables. See Figure 5.5. The relational model can relate data in any one file with data in another file if both files share a com- mon data element or field. Because of this, information can be created by retrieving data from multiple files even if they are not all stored in the same physical location.Relational Three basic operations can be performed on a relational database to create useful setsOperations of data. The select operation is used to create a subset of records that meet a stated cri- terion. For example, a select operation might be used on an employee database to create a subset of records that contain all employees who make more than $30,000 per year and who have been with the company more than three years. Another way to think of the select operation is that it temporarily creates a table whose rows have records that meet the selection criteria.FIGURE 5.5 Department Table Employee TableJoining the Employee and Deptno Dname Dloc Dmgr Empno Ename Etitle Esalary DeptnoDepartment tables in a Dept A Emp 1 Dept Arelational database enables Dept B Emp 2 Dept Ayou to selectively access Dept C Emp 3 Dept Bdata in both tables at the Emp 4 Dept Bsame time. Emp 5 Dept C Emp 6 Dept B
  • 8. 156 ● Module II / Information Technologies The join operation can be used to temporarily combine two or more tables so that a user can see relevant data in a form that looks like it is all in one big table. Using this operation, a user can ask for data to be retrieved from multiple files or databases without having to go to each one separately. Finally, the project operation is used to create a subset of the columns contained in the temporary tables created by the select and join operations. Just as the select oper- ation creates a subset of records that meet stated criteria, the project operation creates a subset of the columns, or fields, that the user wants to see. Using a project operation, the user can decide not to view all of the columns in the table but only those that have data necessary to answer a particular question or to construct a specific report. Because of the widespread use of the relational model, an abundance of commer- cial products exists to create and manage them. Leading mainframe relational database applications include Oracle 10g from Oracle Corp. and DB2 from IBM. A very popu- lar midrange database application is SQL server from Microsoft. The most commonly used database application for the PC is Microsoft Access.Multidimensional The multidimensional model is a variation of the relational model that uses multidi-Structure mensional structures to organize data and express the relationships between data. You can visualize multidimensional structures as cubes of data and cubes within cubes of data. Each side of the cube is considered a dimension of the data. Figure 5.6 is an example that shows that each dimension can represent a different category, such as product type, region, sales channel, and time [5].FIGURE 5.6 An example of the different dimensions of a multidimensional database. Denver Profit Los Angeles Total Expenses San Francisco Margin West COGS February March East West East Sales Actual Budget Actual Budget Actual Budget Actual Budget Sales Camera TV January TV February VCR March Audio Qtr 1 Margin Camera VCR January TV February VCR March Audio Qtr 1 April April Qtr 1 Qtr 1 March March February February Actual Budget Sales Margin January January Sales Margin Sales Margin TV VCR TV VCR TV East East Actual West Budget South Forecast Total Variance VCR East West Actual West Budget South Forecast Total Variance
  • 9. Chapter 5 / Data Resource Management ● 157FIGURE 5.7 Bank Account ObjectThe checking and savings Attributesaccount objects can inherit Customercommon attributes and Balanceoperations from the bank Interestaccount object. Operations Deposit (amount) Withdraw (amount) Get owner Inheritance Inheritance Checking Account Object Savings Account Object Attributes Attributes Credit line Number of withdrawals Monthly statement Quarterly statement Operations Operations Calculate interest owed Calculate interest paid Print monthly statement Print quarterly statement Source: Adapted from Ivar Jacobsen, Maria Ericsson, and Ageneta Jacobsen, The Object Advantage: Business Process Reengineering with Object Technology (New York: ACM Press, 1995), p. 65. Copyright © 1995, Association for Computing Machinery. By permission. Each cell within a multidimensional structure contains aggregated data related to elements along each of its dimensions. For example, a single cell may contain the total sales for a product in a region for a specific sales channel in a single month. A major benefit of multidimensional databases is that they are a compact and easy-to-understand way to visualize and manipulate data elements that have many interrelationships. So multidimensional databases have become the most popular database structure for the analytical databases that support online analytical processing (OLAP) applications, in which fast answers to complex business queries are expected. We discuss OLAP applications in Chapter 9.Object-Oriented The object-oriented model is considered to be one of the key technologies of a newStructure generation of multimedia Web-based applications. As Figure 5.7 illustrates, an object consists of data values describing the attributes of an entity, plus the operations that can be performed upon the data. This encapsulation capability allows the object- oriented model to more easily handle complex types of data (graphics, pictures, voice, text) than other database structures. The object-oriented model also supports inheritance; that is, new objects can be automatically created by replicating some or all of the characteristics of one or more parent objects. Thus, in Figure 5.7, the checking and savings account objects can both inherit the common attributes and operations of the parent bank account object. Such capabilities have made object-oriented database management systems (OODBMS) popular in computer-aided design (CAD) and in a growing number of applications. For exam- ple, object technology allows designers to develop product designs, store them as ob- jects in an object-oriented database, and replicate and modify them to create new product designs. In addition, multimedia Web-based applications for the Internet and corporate intranets and extranets have become a major application area for object technology. Object technology proponents argue that an object-oriented DBMS can work with complex data types such as document and graphic images, video clips, audio segments, and other subsets of Web pages much more efficiently than relational database management systems. However, major relational DBMS vendors have countered by
  • 10. 158 ● Module II / Information TechnologiesFIGURE 5.8This claims analysisgraphics display providedby the CleverPathenterprise portal is poweredby the Jasmine ii object-oriented databasemanagement system ofComputer Associates. Source: Courtesy of Computer Associates. adding object-oriented modules to their relational software. Examples include multi- media object extensions to IBM’s DB2, and Oracle’s object-based “cartridges” for Oracle 10g. See Figure 5.8.Evaluation of The hierarchical data structure was a natural model for the databases used for theDatabase Structures structured, routine types of transaction processing characteristic of many business op- erations in the early years of data processing and computing. Data for these operations can easily be represented by groups of records in a hierarchical relationship. However, as time progressed, there were many cases where information was needed about records that did not have hierarchical relationships. For example, in some organizations, employees from more than one department can work on more than one project (refer back to Figure 5.4). A network data structure could easily handle this many-to-many relationship, whereas a hierarchical model could not. As such, the more flexible net- work structure became popular for these types of business operations. However, like the hierarchical structure, because its relationships must be specified in advance, the network model was unable to easily handle ad hoc requests for information, thus pointing out the need for the relational model. Relational databases allow an end user to easily receive information in response to ad hoc requests. That’s because not all of the relationships between the data elements in a relationally organized database need to be specified when the database is created. Database management software (such as Oracle 10g, DB2, Access, and Approach) cre- ates new tables of data relationships by using parts of the data from several tables. Thus, relational databases are easier for programmers to work with and easier to main- tain than the hierarchical and network models. The major limitation of the relational model is that relational database manage- ment systems cannot process large amounts of business transactions as quickly and efficiently as those based on the hierarchical and network models, or process com- plex, high-volume applications as well as the object-oriented model. This performance gap has narrowed with the development of advanced relational database software with object-oriented extensions. The use of database management software based on the object-oriented and multidimensional models is growing steadily, as these tech- nologies are playing a greater role for OLAP and Web-based applications.
  • 11. Chapter 5 / Data Resource Management ● 159Experian Experian Inc. (www.experian.com), a unit of London-based GUS PLC, runs one ofAutomotive: The the largest credit reporting agencies in the United States. But Experian wanted to expand its business beyond credit checks for automobile loans. If it could collectBusiness Value vehicle data from the various motor-vehicle departments in the United States andof Relational blend that with other data, such as change-of-address records, then its ExperianDatabase Automotive division could sell the enhanced data to a variety of customers. For example, car dealers could use the data to make sure their inventory matches localManagement buying preferences. And toll collectors could match license plates to addresses to find motorists who sail past tollbooths without paying. But to offer new services, Experian first needed a way to extract, transfer, and load data from the systems of 50 different U.S. state departments of motor vehicles (DMVs), plus Puerto Rico, into a single database. That was a big challenge. “Unlike the credit industry that writes to a common format, the DMVs do not,” says Ken Kauppila, vice president of IT at Experian Automotive in Costa Mesa, California. Of course, Experian didn’t want to replicate the hodgepodge of file formats it inherited when the project began in January 1999—175 formats among 18,000 files. So Kauppila decided to transform and map the data to a common relational database format. Fortunately, off-the-shelf software tools for extracting, transforming, and loading data (called ETL tools) make it economical to combine very large data repositories. Using ETL Extract from Evolutionary Technologies, Experian created a database that can incorporate vehicle information within 48 hours of its entry into any of the nation’s DMV computers. This is one of the areas in which data management soft- ware tools can excel, says Guy Creese, analyst at Aberdeen Group in Boston. “It can simplify the mechanics of multiple data feeds, and it can add to data quality, making fixes possible before errors are propagated to data warehouses,” he says. Using the ETL extraction and transformation tools along with IBM’s DB2 data- base system, Experian Automotive created a database that processes 175 million transactions per month and has created a variety of profitable new revenue streams. Experian’s automotive database is the 10th largest database in the world—now, with up to 16 billion rows of data. But the company says the relational database is man- aged by just three IT professionals. Experian says this demonstrates how efficiently database software like DB2 and the ETL tools can work with a large database to handle vast amounts of data quickly.Database Database management packages like Microsoft Access or Lotus Approach allow end users to easily develop the databases they need. See Figure 5.9. However, large orga-Development nizations usually place control of enterprisewide database development in the hands of database administrators (DBAs) and other database specialists. This improves the in- tegrity and security of organizational databases. Database developers use the data def- inition language (DDL) in database management systems like Oracle 10g or IBM’s DB2 to develop and specify the data contents, relationships, and structure of each database, and to modify these database specifications when necessary. Such information is cata- loged and stored in a database of data definitions and specifications called a data dictio- nary, or metadata repository, which is managed by the database management software and maintained by the DBA. A data dictionary is a database management catalog or directory containing metadata, that is, data about data. A data dictionary relies on a specialized database software component to manage a database of data definitions, that is, metadata about the structure, data elements, and other characteristics of an organization’s databases. For example, it contains the names and descriptions of all types of data records and their interrelationships, as well as information outlining requirements for end users’ access and use of application programs, and database maintenance and security.
  • 12. 160 ● Module II / Information TechnologiesFIGURE 5.9Creating a database tableusing the Table Wizardof Microsoft Access. Source: Courtesy of Microsoft Corp. Data dictionaries can be queried by the database administrator to report the status of any aspect of a firm’s metadata. The administrator can then make changes to the definitions of selected data elements. Some active (versus passive) data dictionaries automatically enforce standard data element definitions whenever end users and ap- plication programs access an organization’s databases. For example, an active data dic- tionary would not allow a data entry program to use a nonstandard definition of a customer record, nor would it allow an employee to enter a name of a customer that exceeded the defined size of that data element. Developing a large database of complex data types can be a complicated task. Data- base administrators and database design analysts work with end users and systems analysts to model business processes and the data they require. Then they determine (1) what data definitions should be included in the database and (2) what structure or relationships should exist among the data elements.Data Planning and As Figure 5.10 illustrates, database development may start with a top-down data plan-Database Design ning process. Database administrators and designers work with corporate and end user management to develop an enterprise model that defines the basic business process of the enterprise. Then they define the information needs of end users in a business process, such as the purchasing/receiving process that all businesses have. Next, end users must identify the key data elements that are needed to perform their specific business activities. This frequently involves developing entity relationship diagrams (ERDs) that model the relationships among the many entities involved in business processes. For example, Figure 5.11 illustrates some of the relationships in a purchasing/receiving process. ERDs are simply graphical models of the various files and their relationships contained within a database system. End users and data- base designers could use database management or business modeling software to help them develop ERD models for the purchasing/receiving process. This would help identify what supplier and product data are required to automate their purchasing/ receiving and other business processes using enterprise resource management (ERM) or supply chain management (SCM) software. You will learn about ERDs and other data modeling tools in much greater detail if you ever take a course in systems analysis and design.
  • 13. Chapter 5 / Data Resource Management ● 161FIGURE 5.10Database development 1. Data Planning Physical Data Modelsinvolves data planning and Develops a model of business Storage representations anddatabase design activities. processes access methodsData models that supportbusiness processes are usedto develop databases thatmeet the information needs 5. Physical Designof users. Enterprise model of business Determines the data storage processes with documentation structures and access methods Logical Data Models 2. Requirements Specification E.g., relational, network, Defines information needs of end hierarchical, multidimensional, users in a business process or object-oriented models Description of users’ needs may 4. Logical Design be represented in natural Translates the conceptual language or using the tools of a models into the data model of particular design methodology a DBMS 3. Conceptual Design Conceptual Data Models Expresses all information Often expressed as entity requirements in the form of a relationship models high-level model Such user views are a major part of a data modeling process where the relation- ships between data elements are identified. Each data model defines the logical rela- tionships among the data elements needed to support a basic business process. For example, can a supplier provide more than one type of product to us? Can a customer have more than one type of account with us? Can an employee have several pay rates or be assigned to several project workgroups? Answering such questions will identify data relationships that have to be repre- sented in a data model that supports a business process. These data models then serve as logical frameworks (called schemas and subschemas) on which to base the physical de- sign of databases and the development of application programs to support the business processes of the organization. A schema is an overall logical view of the relationshipsFIGURE 5.11 Ordered on Supplies PurchaseThis entity relationship Product Supplier Order Itemdiagram illustrates some ofthe relationships among the Stocked asentities (product, supplier, Containswarehouse, etc.) in apurchasing/receivingbusiness process. Purchase Product Holds Warehouse Order Stock
  • 14. 162 ● Module II / Information TechnologiesFIGURE 5.12 Example of the logical and physical database views and the software interface of a banking servicesinformation system. Installment Checking Savings Loan Application Application Application Logical User Views Checking and Installment Data elements and relationships (the subschemas) needed Savings Loan for checking, savings, or installment loan processing Data Model Data Model Data elements and relationships (the schema) Banking Services Data Model needed for the support of all bank services Software Interface Database Management System The DBMS provides access to the bank’s databases Physical Data Views Organization and location of data on the storage media Bank Databases among the data elements in a database, while the subschema is a logical view of the data relationships needed to support specific end user application programs that will access that database. Remember that data models represent logical views of the data and relationships of the database. Physical database design takes a physical view of the data (also called the internal view) that describes how data are to be physically stored and accessed on the storage devices of a computer system. For example, Figure 5.12 illustrates these dif- ferent database views and the software interface of a bank database processing system. In this example, checking, savings, and installment lending are the business processes whose data models are part of a banking services data model that serves as a logical data framework for all bank services. Aetna: Insuring On a daily basis the operational services central support area at Aetna Inc. is Tons of Data responsible for 21.8 tons of data (174.6 terabytes [TB]). Over 119.2TB reside on mainframe-connected disk drives, while the remaining 55.4TB sit on disks attached to midrange computers. Almost all of this data are located in the com- pany’s headquarters in Hartford, Connecticut—with most of the information in relational databases. To make matters even more interesting, outside customers have access to about 20TB of the information. Four interconnected data centers containing 14 mainframes and more than 1,000 midrange servers process the data. It takes more than 4,100 direct-access storage devices to hold Aetna’s key databases.
  • 15. Chapter 5 / Data Resource Management ● 163 Most of Aetna’s ever-growing mountain of data is health care information. Theinsurance company maintains records for both health maintenance organizationparticipants and customers covered by insurance policies. Aetna has detailedrecords of providers, such as doctors, hospitals, dentists, and pharmacies, and itkeeps track of all the claims it has processed. Some of Aetna’s larger customers sendtapes containing insured employee data; the firm is moving toward using the Internetto collect such data. If managing gigabytes of data is like flying a hang glider, managing multipleterabytes of data is like piloting a space shuttle: a thousand times more complex.You can’t just extrapolate from experiences with small and medium data stores tounderstand how to successfully manage tons of data. Even an otherwise mundaneoperation such as backing up a database can be daunting if the time needed to finishcopying the data exceeds the time available. Data integrity, backup, security, and availability are collectively the HolyGrail of dealing with large data stores. The sheer volume of data makes thesegoals a challenge, and a highly decentralized environment complicates matterseven more. Developing and adhering to standardized data maintenance proce-dures always provide an organization with the best return on their data dollarinvestment [9, 11].
  • 16. 164 ● Module II / Information Technologies SECTION II Managing Data ResourcesData Resource Data are a vital organizational resource that needs to be managed like other important business assets. Today’s business enterprises cannot survive or succeed without qualityManagement data about their internal operations and external environment. With each online mouse click, either a fresh bit of data is created or already-stored data are retrieved from all those business websites. All that’s on top of the heavy demand for indus- trial-strength data storage already in use by scores of big corporations. What’s driving the growth is a crushing imperative for corporations to analyze every bit of information they can extract from their huge data warehouses for competitive advantage. That has turned the data storage and management function into a key strategic role of the information age [8]. That’s why organizations and their managers need to practice data resource man- agement, a managerial activity that applies information systems technologies like data- base management, data warehousing, and other data management tools to the task of managing an organization’s data resources to meet the information needs of their busi- ness stakeholders. This chapter will show you the managerial implications of using data resource management technologies and methods to manage an organization’s data assets to meet business information requirements. Read the Real World Case on data administration. We can learn a lot from this case about the challenges of managing the data within an organization. See Figure 5.13.Types of Continuing developments in information technology and its business applications have resulted in the evolution of several major types of databases. Figure 5.14 illus-Databases trates several major conceptual categories of databases that may be found in many organizations. Let’s take a brief look at some of them now.Operational Operational databases store detailed data needed to support the business processesDatabases and operations of a company. They are also called subject area databases (SADB), trans- action databases, and production databases. Examples are a customer database, human re- source database, inventory database, and other databases containing data generated by business operations. For example, a human resource database like that shown earlier in Figure 5.2 would include data identifying each employee and his or her time worked, compensation, benefits, performance appraisals, training and development status, and other related human resource data. Figure 5.15 illustrates some of the common oper- ational databases that can be created and managed for a small business using Microsoft Access database management software.Distributed Many organizations replicate and distribute copies or parts of databases to networkDatabases servers at a variety of sites. These distributed databases can reside on network servers on the World Wide Web, on corporate intranets or extranets, or on other company networks. Distributed databases may be copies of operational or analytical databases, hypermedia or discussion databases, or any other type of database. Replication and dis- tribution of databases are done to improve database performance at end user worksites. Ensuring that the data in an organization’s distributed databases are consistently and concurrently updated is a major challenge of distributed database management. Distributed databases have both advantages and disadvantages. One primary ad- vantage of a distributed database lies with the protection of valuable data. If all of an organization’s data reside in a single physical location, any catastrophic event like a fire or damage to the media holding the data would result in an equally catastrophic loss of use of that data. By having databases distributed in multiple locations, the negative impact of such an event can be minimized.
  • 17. Chapter 5 / Data Resource Management ● 165 2REAL WORLD Emerson and Sanofi: Data CASE Stewards Seek Data ConformityA customer is a customer is a customer, right? Actu- ally, it’s not that simple. Just ask Emerson Process Management, an Emerson Electric Co. unit inAustin that supplies process automation products. In 2000 thecompany attempted to build a data warehouse to store cus- “It’s usually a seesaw effect,” says Chris Enger, formerly manager of information management at Philip Morris USA Inc. “When something goes wrong, they put someone in charge of data quality, and when things get better, they pull those resources away.”tomer information from over 85 countries. The effort failed Creating a data quality team requires gathering peoplein large part because the structure of the warehouse couldn’t with an unusual mix of business, technology, and diplomaticaccommodate the many variations on customers’ names. skills. It’s even difficult to agree on a job title. In Rybeck’s For instance, different users in different parts of the world department, they’re called “data analysts,” but titles at othermight identify Exxon as Exxon, Mobil, Esso, or ExxonMobil, companies include “data quality control supervisor,” “datato name a few variations. The warehouse would see them as coordinator,” or “data quality manager.”separate customers, and that would lead to inaccurate results “When you say you want a data analyst, they’ll comewhen business users performed queries. back with a DBA [database administrator]. But it’s not the That’s when the company hired Nancy Rybeck as data same at all,” Rybeck says. “It’s not the data structure, it’s theadministrator. Rybeck is now leading a renewed data ware- content.”house project that ensures not only the standardization of At Emerson, data analysts in each business unit reviewcustomer names, but also the quality and accuracy of cus- data and correct errors before it’s put into the operationaltomer data, including postal addresses, shipping addresses, systems. They also research customer relationships, loca-and province codes. tions, and corporate hierarchies; train overseas workers to fix To accomplish this, Emerson has done something unusual: data in their native languages; and serve as the main contactIt has started to build a department with 6 to 10 full-time “data with the data administrator and database architect for newstewards” dedicated to establishing and maintaining the quality requirements and bug fixes.of data entered into the operational systems that feed the data As the leader of the group, Rybeck plays a role thatwarehouse. includes establishing and communicating data standards, The practice of having formal data stewards is uncom- ensuring data integrity is maintained during database con-mon. Most companies recognize the importance of data versions, and doing the logical design for the data ware-quality, but many treat it as a “find-and-fix” effort, to be con- house tables.ducted at the end of a project by someone in IT. Others The stewards have their work cut out for them. Bringingcasually assign the job to the business users who deal with the together customer records from the 75 business units yieldeddata head-on. Still others may throw resources at improving a 75 percent duplication rate, misspellings, and fields withdata only when a major problem occurs. incorrect or missing data. “Most of the divisions would have sworn they had greatFIGURE 5.13 processes and standards in place,” Rybeck says. “But when you show them they entered the customer name 17 different ways, or someone had entered, ‘Loading dock open 8:00–4:00’ into the address field, they realize it’s not as clean as they thought.” Although the data steward may report to IT—as is the case at Emerson and at pharmaceuticals company Sanofi- Synthelabo Inc.—it’s not a job for someone steeped in tech- nical knowledge. Yet it’s not right for a businessperson who’s a technophobe, either. Seth Cohen is the first data quality control supervisor at Sanofi in New York. He was hired in 2003 to help design au- tomated processes to ensure the data quality of the customer knowledge base that Sanofi was beginning to build. Data stewards at Sanofi need to have business knowledge because they need to make frequent judgment calls, Cohen says. Indeed, judgment is a big part of the data steward’s job—including the ability to determine where you don’t need 100 percent perfection. Cohen says that task is one of the biggest challenges of the job. “One-hundred percent accuracy is just not achievable,”Source: Flying Colours Ltd./Digital Vision/Getty Images
  • 18. 166 ● Module II / Information Technologieshe says. “Some things you’re just going to have to let go or didn’t see why he was “causing them so many headaches andyou’d have a data warehouse with only 15 to 20 records.” adding several extra steps to the process,” he says. A good example is when Sanofi purchases data on doctors There are many political traps as well. Take the issue ofthat includes their birth dates, Cohen says. If a birth date is defining “customer address.” If data comes from a varietygiven as February 31 or the number of the month is listed as of sources, you’re likely to get different types of coding13 but the rest of the data are good, do you throw out all of schemes, some of which overlap.the data or just figure the birth date isn’t all that important? People may also argue about how data should be pro- It comes down to knowing how much it costs to fix the duced, he says. Should field representatives enter it fromdata versus the payback. “You can pay millions of dollars a their laptops? Or should it first be independently checked foryear to get it perfect, but if the returns are in the hundreds of quality? Should it be uploaded hourly or weekly?thousands, is it worth it?” asks Chuck Kelley, senior advisory Most of all, data stewards need to understand that dataconsultant at Navigator Systems Inc., a corporate perfor- quality is a journey, not a destination. “It’s not a one-shotmance management consultancy in Addison, Texas. deal—it’s ongoing,” Rybeck of Emerson says. “You can’t quit Data stewards also need to be politically astute, diplo- after the first task.”matic, and good at conflict resolution—in part because the Source: Adapted from Mary Brandel, “Data Stewards Seek Data Conformity,”environment isn’t always friendly. When Cohen joined Computerworld, March 15, 2004. Copyright © 2004 by Computerworld Inc.,Sanofi, some questioned why he was there. In particular, IT Framingham, MA 01701. All rights reserved. CASE STUDY QUESTIONS REAL WORLD ACTIVITIES 1. Why is the role of a data steward considered to be 1. As discussed in the case, the role of data steward is innovative? Explain. relatively new, and its creation is motivated by the 2. What are the business benefits associated with the data desire to protect the valuable data assets of the firm. steward program at Emerson? There are many job descriptions in the modern organi- zation associated with the strategic management of data 3. How does effective data resource management resources. Using the Internet, see if you can find evi- contribute to the strategic goals of an organization? dence of other job roles that are focused on the man- Provide examples from Emerson and others. agement of an organization’s data. How might a person train for these new jobs? 2. As more and more data are collected stored, processed, and disseminated by organizations, new and innovative ways to manage them must be developed. Break into small groups with your classmates, and discuss how the data resource management methods of today will need to evolve as more types of data emerge. Will we ever get to the point where we can manage our data in a completely automated manner?
  • 19. Chapter 5 / Data Resource Management ● 167FIGURE 5.14 Examples of some of the major types of databases used by organizations and end users. External Databases on the Internet and Online Client PC Services Network Server Distributed Databases Operational on Intranets Databases and Other of the Networks Organization End User Data Data Databases Warehouse Marts Another advantage of distributed databases is found in their storage requirements. Often, a large database system may be distributed into smaller databases based on some logical relationship between the data and the location. For example, a company with several branch operations may distribute its data so that each branch operation location is also the location of its branch database. Because multiple databases in a distributed system can be joined together, each location has control of its local data while all other locations can access any database in the company if so desired. Distributed databases are not without some challenges, however. The primary chal- lenge is the maintenance of data accuracy. If a company distributes its database toFIGURE 5.15Examples of operationaldatabases that can becreated and managedfor a small business bymicrocomputer databasemanagement software likeMicrosoft Access. Source: Courtesy of Microsoft Corp.
  • 20. 168 ● Module II / Information Technologies multiple locations, any change to the data in one location must somehow be updated in all other locations. This can be accomplished in one of two ways: replication or duplication. Updating a distributed database using replication involves using a specialized soft- ware application that looks at each distributed database and then finds the changes made to it. Once these changes have been identified, the replication process makes all of the distributed databases look the same by making the appropriate changes to each one. The replication process is very complex and, depending upon the number and size of the distributed databases, can consume a lot of time and computer resources. The duplication process, in contrast, is much less complicated. It basically identi- fies one database as a master and then duplicates that database at a prescribed time af- ter hours so that each distributed location has the same data. One drawback to the duplication process is that no changes can ever be made to any database other than the master to avoid having local changes overwritten during the duplication process. Nonetheless, properly used, duplication and replication can keep all distributed locations current with the latest data. One additional challenge associated with distributed databases is the extra com- puting power and bandwidth necessary to access multiple databases in multiple loca- tions. We will look more closely at the issue of bandwidth in Chapter 6 when we focus on telecommunications and networks.External Databases Access to a wealth of information from external databases is available for a fee from commercial online services, and with or without charge from many sources on the World Wide Web. Websites provide an endless variety of hyperlinked pages of multi- media documents in hypermedia databases for you to access. Data are available in the form of statistics on economic and demographic activity from statistical databanks. Or you can view or download abstracts or complete copies of hundreds of newspapers, magazines, newsletters, research papers, and other published material and other peri- odicals from bibliographic and full text databases. Whenever you use a search engine like Google or Yahoo to look up something on the Internet, you are using an external database—a very, very large one!Hypermedia The rapid growth of websites on the Internet and corporate intranets and extranets hasDatabases dramatically increased the use of databases of hypertext and hypermedia documents. A website stores such information in a hypermedia database consisting of hyper- linked pages of multimedia (text, graphic, and photographic images, video clips, audio segments, and so on). That is, from a database management point of view, the set of interconnected multimedia pages at a website is a database of interrelated hypermedia page elements, rather than interrelated data records [2]. Figure 5.16 shows how you might use a Web browser on your client PC to connect with a Web network server. This server runs Web server software to access and transfer theFIGURE 5.16 The components of a Web-based information system include Web browsers,servers, and hypermedia databases. The Internet Intranets Web Extranets HTML Browser Web XML Server Web Pages Software Image Files Video Files Audio Files Client PCs Network Hypermedia Server Database
  • 21. Chapter 5 / Data Resource Management ● 169FIGURE 5.17 The components of a complete data warehouse system.Operational, External,and Other Databases Analytical Data Store Data Enterprise Management Warehouse Data Marts Data Acquisition Data Analysis (Capture, clean, (Query, report, transform, transport, analyze, mine, load/apply) deliver) Metadata Metadata Directory Management Warehouse Metadata Repository Web Information Design SystemsSource: Adapted courtesy of Hewlett-Packard. Web pages you request. The website illustrated in Figure 5.17 uses a hypermedia database consisting of Web page content described by HTML (Hypertext Markup Language) code or XML (Extensible Markup Language) labels, image files, video files, and audio. The Web server software acts as a database management system to manage the transfer of hypermedia files for downloading by the multimedia plug-ins of your Web browser.Data A data warehouse stores data that have been extracted from the various operational, external, and other databases of an organization. It is a central source of the data thatWarehouses have been cleaned, transformed, and cataloged so they can be used by managers andand Data other business professionals for data mining, online analytical processing, and otherMining forms of business analysis, market research, and decision support. (We’ll talk in depth about all of these activities in Chapter 9.) Data warehouses may be subdivided into data marts, which hold subsets of data from the warehouse that focus on specific aspects of a company, such as a department or a business process. Figure 5.17 illustrates the components of a complete data warehouse system. No- tice how data from various operational and external databases are captured, cleaned, and transformed into data that can be better used for analysis. This acquisition process might include activities like consolidating data from several sources, filtering out un- wanted data, correcting incorrect data, converting data to new data elements, and aggregating data into new data subsets. This data is then stored in the enterprise data warehouse, from where it can be moved into data marts or to an analytical data store that holds data in a more useful form for cer- tain types of analysis. Metadata (data that defines the data in the data warehouse) is stored in a metadata repository and cataloged by a metadata directory. Finally, a variety of ana- lytical software tools can be provided to query, report, mine, and analyze the data for delivery via Internet and intranet Web systems to business end users. See Figure 5.18. Revenue: Closing In the late 1990s the state of Iowa had a tax gap, a polite way of describing compa- the Gap with a nies and individuals who either didn’t file state tax returns or who underreported their earnings. To identify noncompliant taxpayers, the Iowa Department of Data Warehouse Revenue and Finance (IDRF) relied on a jumble of nonintegrated mainframe applications, file extracts, and over 20 disparate stand-alone systems (databases,
  • 22. 170 ● Module II / Information TechnologiesFIGURE 5.18 Applications Data MartsA data warehouse and itsdata mart subsets hold datathat have been extracted Finance ERPfrom various operationaldatabases for businessanalysis, market research,decision support, and datamining applications. Inventory control Marketing Logistics Data Warehouse Sales Shipping Accounting Purchasing CRM Management reporting mainframe data, and information on individual spreadsheets, to name a few). The real problem was that none of these systems could communicate with each other. What was needed was a central data warehouse to pull together information from all those systems for analysis. But getting funding from the state for such a large-scale project wasn’t an option. So the IDRF came up with a plan the Iowa Legislature couldn’t help but ap- prove. The plan was simple: Build a data warehouse that would be entirely funded using the additional tax revenue it generated by catching tax scofflaws. Development of the data warehouse began in November 1999, and it became operational five months later. The system combines data from the department’s own tax and accounts receivable systems, tax files shared by the federal Internal Revenue Service, the Iowa Workforce Development Agency, and a number of other sources. Revenue- and finance-department employees analyze the data using com- mercially available reporting software. In the three years since it went live, the IDRF data warehouse has generated $28 million in tax revenue and is expected to generate $10 million each year from now on. There’s no question the project has paid for itself many times over, and the state of Iowa is sold on the value of data warehousing. The next step is to use the data warehouse to better understand why taxpayers might be in noncompliance. That will involve analyzing taxpayer demographics and changes in tax laws and policies. This phase of the project is also expected to generate revenues for the state while simulta- neously helping to improve the tax laws for the citizens of Iowa [12, 13].
  • 23. Chapter 5 / Data Resource Management ● 171FIGURE 5.19 How data mining extracts business knowledge from a data warehouse. Data Trans- Data Interpretation/ Selection formation Mining Evaluation Target Data Patterns Business Data Warehouse Knowledge Databases One important characteristic about the data in a data warehouse is that, unlike a typical database where changes can occur constantly, data in a data warehouse is static. This means that once the data is gathered up, formatted for storage, and stored in the data warehouse, it will never change. This is so that queries can be made on the data to look for complex patterns or historical trends that might otherwise go unnoticed with dynamic data that changes constantly as a result of new transactions and updates.Data Mining Data mining is a major use of data warehouse databases and the static data they con- tain. In data mining, the data in a data warehouse are analyzed to reveal hidden pat- terns and trends in historical business activity. This can be used to help managers make decisions about strategic changes in business operations to gain competitive advan- tages in the marketplace [2]. See Figure 5.19. Data mining can discover new correlations, patterns, and trends in vast amounts of business data (frequently several terabytes of data) stored in data warehouses. Data mining software uses advanced pattern recognition algorithms as well as a variety of mathemati- cal and statistical techniques to sift through mountains of data to extract previously un- known strategic business information. For example, many companies use data mining to: • Perform “market-basket analysis” to identify new product bundles. • Find root causes to quality or manufacturing problems. • Prevent customer attrition and acquire new customers. • Cross-sell to existing customers. • Profile customers with more accuracy [6]. We will discuss data mining further, as well as online analytical processing (OLAP) and other technologies that analyze the data in databases and data warehouses to pro- vide vital support for business decisions in Chapter 9. In the meantime, let’s look at a real world example. Bank of America: In the world of banking, bigger is not always better. Bank of America (BofA) knows Benefits of Data this all too well. A string of some 60 mergers since the mid-1980s has made BofA the nation’s largest bank, servicing 27 million households at approximately 4,400 Mining domestic offices, 13,000 ATMs, and 30 international offices in more than 150 coun- tries. And it provides online banking access to 4 million active users. As BofA grew, however, so did problems with managing data across the enter- prise. During its growth phase, the bank acquired at least 30 disparate, noninte- grated customer databases. One of them, a data warehouse acquired through Bank of America’s merger with NationsBank in 1998, was operating completely au- tonomously with no integration to any other BofA system. The challenge was made greater by the fact that each data warehouse was operating in a nonstandard environment. Resources were being squandered on storing and maintaining dupli- cate data across these repositories.
  • 24. 172 ● Module II / Information Technologies In 2001, Bank of America decided to solve the problems by developing a central data capture and storage point. Through a single enterprise data model and stan- dards, the bank could achieve the consistent, timely, and accurate data it was lacking across its franchise. The goal was to create a “single version of the truth” with a single source of data. Reaching the goal of being smaller and better provided hefty operational sav- ings. Since transitioning to a single data warehouse platform, Bank of America has dropped its operating costs from $11 million to $4 million per year. The savings came primarily through their new economies of scale and through the elimination of duplicate technology and support staff. BofA didn’t stop there, though. Recently they added a business intelligence ac- cess capability to achieve greater accuracy in marketing and pricing financial prod- ucts, such as home equity loans. Now that they had all their data in one spot, they were able to mine that data to explore new relationships and to learn more about their customers and markets. To use the data mining function, data extracted from the data warehouse are an- alyzed by data mining software to discover hidden patterns. For example, data min- ing discovered that a certain set of customers was 15 times more likely to purchase a high-margin lending product. The bank also wanted to determine the sequence of events leading to purchasing. They fed the parameters to the data mining software and built a model for finding other customers. This model proved to be so accurate that it discovered people already in the process of applying and being approved for the lending product. Using this profile, a final list of quality prospects for solicita- tion was prepared. The resulting direct marketing response rates dramatically exceeded past results [1, 11].Traditional File How would you feel if you were an executive of a company and were told that some information you wanted about your employees was too difficult and too costly toProcessing obtain? Suppose the vice president of information services gave you the following reasons: • The information you want is in several different files, each organized in a different way. • Each file has been organized to be used by a different application program, none of which produces the information you want in the form you need. • No application program is available to help get the information you want from these files. That’s how end users can be frustrated when an organization relies on file processing systems in which data are organized, stored, and processed in independent files of data records. In the traditional file processing approach that was used in business data pro- cessing for many years, each business application was designed to use one or more spe- cialized data files containing only specific types of data records. For example, a bank’s checking account processing application was designed to access and update a data file containing specialized data records for the bank’s checking account customers. Simi- larly, the bank’s installment loan–processing application needed to access and update a specialized data file containing data records about the bank’s installment loan cus- tomers. See Figure 5.20.Problems of File The file processing approach finally became too cumbersome, costly, and inflexible toProcessing supply the information needed to manage modern business, and as we shall soon see, was replaced by the database management approach. Despite its apparent logic and sim- plicity, file processing systems had the following major problems:
  • 25. Chapter 5 / Data Resource Management ● 173FIGURE 5.20 CheckingExamples of file processing Account Customersystems in banking. Note Statements Processingthe use of separatecomputer programs andindependent data files in afile processing approach to Checking Checkingthe savings, installment Account Fileloan, and checking account Program Updateapplications. Savings Customer Customer Processing Transactions Receipts Savings Savings Program File Update Installment Loan Analysis Loan Reports Processing Installment Installment Loan Loan File Program Update Data Redundancy. Independent data files included a lot of duplicated data; the same data (such as a customer’s name and address) were recorded and stored in several files. This data redundancy caused problems when data had to be updated. Separate file main- tenance programs had to be developed and coordinated to ensure that each file was properly updated. Of course, this proved difficult in practice, so a lot of inconsistency occurred among data stored in separate files. Lack of Data Integration. Having data in independent files made it difficult to provide end users with information for ad hoc requests that required accessing data stored in several different files. Special computer programs had to be written to retrieve data from each independent file. This was so difficult, time-consuming, and costly for some organizations that it was impossible to provide end users or management with such information. End users had to manually extract the required information from the various reports produced by each separate application and prepare customized reports for management. Data Dependence. In file processing systems, major components of a system—the or- ganization of files, their physical locations on storage hardware, and the application software used to access those files—depended on one another in significant ways. For example, application programs typically contained references to the specific format of the data stored in the files they used. Thus, changes in the format and structure of data and records in a file required that changes be made to all of the programs that used that file. This program maintenance effort was a major burden of file processing
  • 26. 174 ● Module II / Information Technologies systems. It proved difficult to do properly, and it resulted in a lot of inconsistency in the data files. Lack of Data Integrity or Standardization. In file processing systems, it was easy for data elements such as stock numbers and customer addresses to be defined differently by different end users and applications. This caused serious inconsistency problems in the development of programs to access such data. In addition, the integrity (that is, the accuracy and completeness) of the data was suspect because there was no control over their use and maintenance by authorized end users. Thus, a lack of standards caused major problems in application program development and maintenance, and in the se- curity and integrity of the data files needed by the organization.The Database To solve the problems encountered with the file processing approach, the database management approach was conceived as the foundation of modern methods of man-Management aging organizational data. The database management approach consolidates dataApproach records formerly in separate files into databases that can be accessed by many different application programs. In addition, a database management system (DBMS) serves as a software interface between users and databases. This helps users easily access the data in a database. Thus, database management involves the use of database management software to control how databases are created, interrogated, and maintained to provide information needed by end users. For example, customer records and other common types of data are needed for several different applications in banking, such as check processing, automated teller systems, bank credit cards, savings accounts, and installment loan accounting. These data can be consolidated into a common customer database, rather than being kept in separate files for each of those applications. See Figure 5.21.FIGURE 5.21An example of a database Inquirymanagement approach Customer Transaction Processingin a banking informationsystem. Note how thesavings, checking, andinstallment loan programsuse a database managementsystem to share a customerdatabase. Note also that theDBMS allows a user to Checking Savings Installment Account Account Loanmake a direct, ad hoc Program Program Programinterrogation of thedatabase without usingapplication programs. Database Management System Checking Account Data Savings Account Data Customer Database Installment Loan Data Other Customer Data
  • 27. Chapter 5 / Data Resource Management ● 175Kingslake Sales reps on motor scooters, on the road for weeks at a time. Field forces of a util-International: ity, journeying for days from their office to read meters. An environment where people and businesses are so dispersed, and communication so vital, that someConnecting would choose a mobile phone over a refrigerator. It’s business as usual in Sri Lanka.Customers and These scenarios explain why companies in Sri Lanka are clamoring for ways to im-Data in Sri Lanka prove the productivity of their sales and field forces and welcoming technology as the tool for doing so. They also explain why Kingslake International in Sri Lanka, a UK-based consulting and project management company for Europe, focusing on innovative IT solutions in Sri Lanka, has put quite a bit of stock in developing mobile computing and data management solutions. Kingslake found itself limited in the type of mobile sales force automation applications it could develop because no databases worked on the 16-bit devices. Without database software, applications had to rely on basic, low-level data management functions—they couldn’t sync and didn’t allow users to manipulate data on their hand-held devices. Further, Kingslake had four nonnegotiable criteria—reliability, ability to work with battery power, a small footprint, and excellent support from the vendor. Their criteria, and the opportunity to expand their ability to develop sophisticated mobile solutions, came from IBM’s DB2 Everyplace. Examples of the benefits of mobile data management are many. One is a sales force automation solution for a company that sends its sales representatives far afield with vans laden with goods. Previously, these representatives spent 30 percent of their time on paperwork—writing orders for products not on the van, issuing invoices, and then trekking back to headquarters to make sure the paperwork was processed. The DB2 Everyplace solution allowed them to work more productively 100 percent of the time. Carrying mobile devices, sales representatives were able to connect to their back-office system to place orders directly from the field, issue invoices, and do on-the-spot credit checks. Another application tracks the entire process of a company that picks up and retreads truck tires for customers and then delivers them to the customer. In this case, the integration between the hand-held application and the company’s enter- prise resource planning (ERP) application is critical because the sales force has to confirm manufacturing schedules and delivery dates right from the field. Still another solution, for an electric utility, allows the meter readers who travel far and wide to download customer account and subsidy codes, enter and store weeks’ worth of data, accurately calculate costs on their hand-held devices, and issue invoices on the spot. The most important advantage is that the citizens of an economy challenged by distance have come closer to each other, and to their data. Business is booming! [4, 7]Database A database management system (DBMS) is the main software tool of the databaseManagement System management approach, since it controls the creation, maintenance, and use of the databases of an organization and its end users. As we saw in Figure 5.16, microcom- puter database management packages such as Microsoft Access, Lotus Approach, or Corel Paradox allow you to set up and manage databases on your PC, network server, or the World Wide Web. In mainframe and server computer systems, the database management system is an important system software package that controls the devel- opment, use, and maintenance of the databases of computer-using organizations. Examples of popular mainframe and server versions of DBMS software are IBM’s DB2 Universal Database, Oracle 10g by Oracle Corp., and MySQL, a popular open- source DBMS. See Figure 5.22. Common DBMS components and functions are sum- marized in Figure 5.23.
  • 28. 176 ● Module II / Information TechnologiesFIGURE 5.22Database managementsoftware like MySQL, apopular open-sourceDBMS, supports thedevelopment, maintenance,and use of the databases ofan organization. Source: Courtesy of MySQL.com. The three major functions of a database management system are (1) to create new databases and database applications, (2) to maintain the quality of the data in an organization’s databases, and (3) to use the databases of an organization to provide the information needed by its end users. See Figure 5.24. Database development involves defining and organizing the content, relationships, and structure of the data needed to build a database. Database application development involves using a DBMS to develop prototypes of queries, forms, reports, and Web pages for a proposed business application. Database maintenance involves us- ing transaction processing systems and other tools to add, delete, update, and correct the data in a database. The primary use of a database by end users involves employing the database interrogation capabilities of a DBMS to access the data in a database to selectively retrieve and display information and produce reports, forms, and other documents.FIGURE 5.23 Common software components and functions of a database management system. Common DBMS Software Components • Database definition Language and graphical tools to define entities, relationships, integrity constraints, and authorization rights • Nonprocedural access Language and graphical tools to access data without complicated coding • Application development Graphical tools to develop menus, data entry forms, and reports • Procedural language interface Language that combines nonprocedural access with full capabilities of a programming language • Transaction processing Control mechanisms to prevent interference from simultaneous users and recover lost data after a failure • Database tuning Tools to monitor and improve database performanceSource: Adapted from Michael V. Mannino, Database Application Development and Design (Burr Ridge, IL: McGraw-Hill/Irwin, 2001), p. 7.
  • 29. Chapter 5 / Data Resource Management ● 177FIGURE 5.24 OperatingThe three major uses of SystemDBMS software are to Databasecreate, maintain, and use Management Databasethe databases of an Management Databasesorganization. System Application Programs Create: Database and Application Development Maintain: Database Maintenance Use: Database InterrogationDatabase A database interrogation capability is a major benefit of the database managementInterrogation approach. End users can use a DBMS by asking for information from a database using a query feature or a report generator. They can receive an immediate response in the form of video displays or printed reports. No difficult programming is required. The query language feature lets you easily obtain immediate responses to ad hoc data requests: You merely key in a few short inquiries—in some cases, using common sentence structures just like you would use to ask a question. The report generator feature allows you to quickly specify a report format for information you want pre- sented as a report. Figure 5.25 illustrates the use of a DBMS report generator. SQL Queries. SQL (pronounced “see quill”), or Structured Query Language, is an international standard query language found in many DBMS packages. In most cases, SQL is the language structure used to “ask a question” that the DBMS will retrieve the data to answer. The basic form of a SQL query is: SELECT . . . FROM . . . WHERE . . . After SELECT you list the data fields you want retrieved. After FROM you list the files or tables from which the data must be retrieved. After WHERE you specify con- ditions that limit the search to only those data records in which you are interested. Figure 5.26 compares a SQL query to a natural language query for information on customer orders. Graphical and Natural Queries. Many end users (and IS professionals) have diffi- culty correctly phrasing SQL and other database language queries. So most end user database management packages offer GUI (graphical user interface) point-and-clickFIGURE 5.25Using the report generatorof Microsoft Access tocreate an employee report. Source: Courtesy of Microsoft Corp.
  • 30. 178 ● Module II / Information TechnologiesFIGURE 5.26 A Sample Natural Language-to-SQL Translation for Microsoft AccessComparing a naturallanguage query with Natural Languagea SQL query. WHAT CUSTOMERS HAD NO ORDERS LAST MONTH? SQL SELECT [Customers].[Company Name],[Customers].[Contact Name] FROM [Customers] WHERE not Exists {SELECT [Ship Name] FROM [Orders] WHERE Month {[Order Date]}=l and Year {[Order Date]}=2004 and [Customers].[Customer ID]=[Orders].{[Customer ID]} methods, which are easier to use and are translated by the software into SQL commands. See Figure 5.27. Other packages are available that use natural language query statements similar to conversational English (or other languages), as was illus- trated in Figure 5.26.Database The database maintenance process is accomplished by transaction processing systems andMaintenance other end user applications, with the support of the DBMS. End users and informa- tion specialists can also employ various utilities provided by a DBMS for database maintenance. The databases of an organization need to be updated continually to re- flect new business transactions (such as sales made, products produced, or inventory shipped) and other events. Other miscellaneous changes must also be made to update and correct data (such as customer or employee name and address changes) to ensure accuracy of the data in the databases. We introduced transaction processing systems in Chapter 1 and will discuss them in more detail in Chapter 7.Application DBMS packages also play a major role in application development. End users, sys-Development tems analysts, and other application developers can use the internal 4GL program- ming language and built-in software development tools provided by many DBMS packages to develop custom application programs. For example, you can use a DBMS to easily develop the data entry screens, forms, reports, or Web pages of a business application that accesses a company database to find and update the data it needs. A DBMS also makes the job of application software developers easier, since they do not have to develop detailed data-handling procedures using conventional program- ming languages every time they write a program. Instead, they can include features such as data manipulation language (DML) statements in their software that call on the DBMS to perform necessary data-handling activities.FIGURE 5.27Using the Query Wizardof the Microsoft Accessdatabase managementpackage to develop a queryabout employee health planchoices. Source: Courtesy of Microsoft Corp.
  • 31. Chapter 5 / Data Resource Management ● 179 Summary• Data Resource Management. Data resource manage- distributed, and external databases. Data warehouses ment is a managerial activity that applies information are a central source of data from other databases that technology and software tools to the task of managing have been cleaned, transformed, and cataloged for busi- an organization’s data resources. Early attempts to man- ness analysis and decision support applications. That age data resources used a file processing approach in includes data mining, which attempts to find hidden which data were organized and accessible only in spe- patterns and trends in the warehouse data. Hypermedia cialized files of data records that were designed for pro- databases on the World Wide Web and on corporate cessing by specific business application programs. This intranets and extranets store hyperlinked multimedia approach proved too cumbersome, costly, and inflexible pages at a website. Web server software can manage to supply the information needed to manage modern such databases for quick access and maintenance of the business processes and organizations. Thus, the data- Web database. base management approach was developed to solve the problems of file processing systems. • Data Access. Data must be organized in some logical manner on physical storage devices so that they can• Database Management. The database management be efficiently processed. For this reason, data are com- approach affects the storage and processing of data. monly organized into logical data elements such as The data needed by different applications are consoli- characters, fields, records, files, and databases. Database dated and integrated into several common databases structures, such as the hierarchical, network, relational, instead of being stored in many independent data files. and object-oriented models, are used to organize the Also, the database management approach emphasizes relationships among the data records stored in data- updating and maintaining common databases, having bases. Databases and files can be organized in either a users’ application programs share the data in the data- sequential or direct manner and can be accessed and base, and providing a reporting and an inquiry/response maintained by either sequential access or direct access capability so end users can easily receive reports and processing methods. quick responses to requests for information. • Database Development. The development of data-• Database Software. Database management systems are bases can be easily accomplished using microcomputer software packages that simplify the creation, use, and database management packages for small end-user appli- maintenance of databases. They provide software tools cations. However, the development of large corporate so end users, programmers, and database administrators databases requires a top-down data planning effort. This can create and modify databases, interrogate a database, may involve developing enterprise and entity relation- generate reports, do application development, and per- ship models, subject area databases, and data models form database maintenance. that reflect the logical data elements and relationships needed to support the operation and management of the• Types of Databases. Several types of databases are basic business processes of the organization. used by business organizations, including operational, K e y Te r m s a n d C o n c e p t sThese are the key terms and concepts of this chapter. The page number of their first explanation is in parentheses. 1. Data dependence (•••) 12. Database management system (•••) d. File (•••) 2. Data dictionary (•••) 13. Database structures (•••) e. Database (•••) f. Attribute (•••) 3. Data integration (•••) a. Hierarchical (•••) g. Entity (•••) 4. Data integrity (•••) b. Multidimensional (•••) c. Network (•••) 17. Metadata (•••) 5. Data mining (•••) d. Object-oriented (•••) 18. Structured Query Language (•••) 6. Data modeling (•••) e. Relational (•••) 19. Replication (•••) 7. Data redundancy (•••) 14. Duplication (•••) 20. Types of databases (•••) 8. Data resource management (•••) 15. File processing (•••) a. Data warehouse (•••) 9. Database administrator (•••) 16. Logical data elements (•••) b. Distributed (•••)10. Database interrogation (•••) a. Character (•••) c. External (•••)11. Database management b. Field (•••) d. Hypermedia (•••) approach (•••) c. Record (•••) e. Operational (•••)
  • 32. 180 ● Module II / Information Technologies Review QuizMatch one of the key terms and concepts listed previously with one of the brief examples or definitions that follow. Try tofind the best fit for answers that seem to fit more than one term or concept. Defend your choices. 1. The use of integrated collections of data records 13. Developing conceptual views of the relationships and files for data storage and processing. among data in a database. 2. A DBMS allows you to create, interrogate, and 14. A customer’s name. maintain a database, create reports, and develop 15. A customer’s name, address, and account balance. application programs. 16. The names, addresses, and account balances of all 3. A specialist in charge of the databases of an of your customers. organization. 17. An integrated collection of all of the data about 4. This DBMS feature allows users to easily your customers. interrogate a database. 18. Business application programs use specialized data 5. Defines and catalogs the data elements and data files. relationships in an organization’s database. 19. A treelike structure of records in a database. 6. Helps you specify and produce reports from a database. 20. A tabular structure of records in a database. 7. The main software package that supports a 21. Records are organized as cubes within cubes in a database management approach. database. 8. Databases are dispersed to the Internet and 22. Databases that support the major business corporate intranets and extranets. processes of an organization. 9. Databases that organize and store data as objects. 23. A centralized and integrated database of current and historical data about an organization. 10. Databases of hyperlinked multimedia documents on the Web. 24. Databases available on the Internet or provided by commercial information services. 11. The management of all the data resources of an organization. 12. Processing data in a data warehouse to discover key business factors and trends. Discussion Questions 1. How should a business store, access, and distribute data 7. Refer to the Real World Case on Emerson and Sanofi and information about their internal operations and and others in the chapter. What are the advantages to external environment? avoiding the find-it-and-fix-it approach to data resource 2. What role does database management play in managing management? data as a business resource? 8. What are the benefits and limitations of the relational 3. What are the advantages of a database management database model for business applications today? approach to the file processing approach? Give 9. Why is the object-oriented database model gaining examples to illustrate your answer. acceptance for developing applications and managing 4. Refer to the Real World Case on Harrah’s Entertainment the hypermedia databases at business websites? in the chapter. Why do analysts, users, and vendors feel 10. How have the Internet, intranets, and extranets there is a danger associated with business intelligence? affected the types and uses of data resources available Use Harrah’s Entertainment as an example. to business professionals? What other database trends 5. What is the role of a database management system in a are also affecting data resource management in business information system? business? 6. Databases of information about a firm’s internal operations were formerly the only databases that were considered to be important to a business. What other kinds of databases are important for a business today?
  • 33. Chapter 5 / Data Resource Management ● 181 Analysis ExercisesComplete the following exercises as individual or group projects that apply chapter concepts to real world business. 1. Training-Cost Management Part 1 notice in the event that you need to reschedule a class. You have the responsibility for managing technical You want to make sure your classes are at least two- training classes within your organization. These classes thirds full before this deadline. You also want to make fall under two general types: highly technical training sure you send timely reminders to all attendees so they and end user training. Software engineers sign up for do not forget to show up. Use the database you created the former, and administrative staff sign up for the in exercise 1 above to perform the following activities. latter. Your supervisor measures your effectiveness in a. Using the information provided in the sample below, part based upon the average cost per training hour and add a course schedule table to your training database. type of training. In short, your supervisor expects the Designate the ScheduleID field as a “Primary Key,” best training for the least cost. and allow your database program to automatically To meet this need, you have negotiated an exclusive generate a value for this field. Make the CourseID on-site training contract with Hands On Technology field a number field and the StartDate field a date field. Transfer Inc. (HOTT) (www.traininghott.com), a high- b. Using the information provided in the sample below, quality technical training provider. Your negotiated add a class roster table to your training database. rates are reproduced below in the pricing table. A Make the ScheduleID field a number field. Make the separate table contains a sample list of courses you Reminder and Confirmed fields both “yes/no” routinely make available for your organization. (Boolean) fields. a. Using the data below, design and populate a table that c. Since the Class Schedule table relates back to the includes basic training rate information. Designate Course Table and the Course Table relates to the the “Technical” field type as “Yes/No” (Boolean). Pricing Table, why is it appropriate to record the Price b. Using the data below, design and populate a course Per Day information in the Class Schedule table, too? table. Designate the CourseID field as a “Primary d. What are the advantages and disadvantages of using Key,” and allow your database to automatically the participant’s name and e-mail address in the generate a value for this field. Designate the Class Roster table? What other database design “Technical” field type as “Yes/No” (Boolean). might you use to record this information? c. Prepare a query that lists each course name and its e. Write a query that shows how many people have cost per day of training. registered for each scheduled class. Include the class d. Prepare a query that lists the cost per student for each name, capacity, date, and count of attendees. class. Assume maximum capacity and that you will schedule two half-day classes on the same day to take Class Schedule full advantage of HOTT’s per-day pricing schedule. Schedule Course Price Per ID ID Location Start Date Day Pricing Table 1 1 101-A 7/12/2004 2680 Technical Price Per Day Capacity 2 1 101-A 7/19/2004 2680 Yes 2680 15 3 1 101-B 7/19/2004 2680 No 2144 30 4 4 101-A&B 7/26/2004 2144 5 5 101-A&B 8/2/2004 2144 Course Table ... Course Class Roster ID Course Name Duration Technical Schedule 1 ASP Programming 5 Yes ID Participant e-mail Reminder Confirmed 2 XML Programming 5 Yes 1 Linda adams.l@ . . . Yes Yes 3 PHP Programming 4 Yes Adams 4 Microsoft Word–Advanced .5 No 1 Fatima ahmad.f@ . . . Yes No 5 Microsoft Excel–Advanced .5 No Ahmad ... 1 Adam Alba alba.a@ . . . Yes Yes 2. Training-Cost Management Part 2 4 Denys Alyea alyea.d@ . . . No No Having determined the cost per student for each of the 4 Kathy Bara bara.k@ . . . Yes No ... classes in the previous problem, you now must carefully manage class registration. Since you pay the same 3. Tracking Project Work at AAA Systems flat rates no matter how many students attend (up You are responsible for managing information to capacity), you want to do all you can to ensure systems development projects at AAA Systems Inc. maximum attendance. Your training provider, Hands To better track progress in completing projects, you On Technology Transfer Inc., requires two weeks’ have decided to maintain a simple database table to
  • 34. 182 ● Module II / Information Technologies track the time your employees spend on various tasks b. Create a query that will list the hours worked for all and the projects with which they are associated. It will workers who worked more than 40 hours during also allow you to keep track of employees’ billable production week 20. hours each week. The following table provides a c. Create a report grouped by project that will show sample data set. the number of hours devoted to each task on the a. Build a database table to store the data shown, and project and the total number of hours devoted to enter the records as a set of sample data. each project as well as a grand total of hours worked. Project_Name Task_Name Employee_ID Production_Week Hours_Worked Fin-Goods-Inv App. Devel. 456 Fin-Goods-Inv DB Design 345 20 Fin-Goods-Inv UI Design 234 16 HR Analysis 234 24 HR Analysis 456 48 HR UI Design 123 8 HR UI Design 123 40 HR UI Design 234 32 Shipmt-Tracking DB Design 345 24 Shipmt-Tracking DB Design 345 16 Shipmt-Tracking DB Development 345 20 Shipmt-Tracking UI Design 123 32 Shipmt-Tracking UI Design 234 24 d. Create a report grouped by employee that will show The user should be able to select a production week hours worked on each task and the total hours worked. and have data for just that week presented.
  • 35. Chapter 5 / Data Resource Management ● 183 3REAL WORLD Acxiom Corporation: Data CASE Demands RespectA cxiom Corp. has always been good at managing data. Lots of data. By its own estimate, Acxiom manages more than 20 billion customer and prospect records. “We do three things really well,” says Alex Dietz, who’sreferred to internally as “products and infrastructure tech- ing to five years of data. “We like to think that these cus- tomers look at the Acxiom data center as an extension of their own data centers,” he says. Acxiom isn’t satisfied with just managing other compa- nies’ data; it wants to manage their data centers as well. Innology leader” and functions as Acxiom’s CIO. (Acxiom has late 2004, Acxiom signed one of its most significant IT out-no traditional titles.) Those three things, Dietz says, are sourcing deals, with Information Resources Inc. (IRI), whichmanaging large volumes of data; cleaning, transforming, and collects and sells terabytes of retail point-of-sale and con-enhancing that data; and distilling business intelligence from sumer spending data. Under a contract potentially worththe data to drive smart decisions. That data is used by hundreds of millions of dollars, Acxiom will run IRI’s data-Acxiom’s approximately 1,000 clients for everything from center operations and build a real-time content delivery sys-developing telemarketing lists and identifying prospects for tem for the company.credit-card offers to screening prospective employees and Outsourcing isn’t a new venture for Acxiom. Its first data-detecting fraudulent financial transactions. center-services customer was Norman Vincent Peale’s Guide- No one at Acxiom seems to know exactly how much data posts organization some 15 years ago. Acxiom’s outsourcingthe company manages in its 11 tightly guarded data centers. division was created officially in 1998 following its acquisitionThe company’s central data center is located north of Little of May & Speh Inc., an information-management servicesRock, in Conway, Arkansas, where Acxiom was founded in provider. Outsourcing deals now account for a quarter of the1969 as a spin-off of a local bus manufacturer. Acxiom also company’s sales.has data centers in Downer’s Grove, Illinois, outside of Acxiom sees outsourcing as a growth area, targetingChicago, and as far away as Sunderland in the United companies in data-intensive industries such as consumerKingdom. The company recently opened a data center in packaged goods, financial services, and retail, where itsPhoenix, and an additional facility is under construction in data-management expertise and technology give it an edgeWest Little Rock. The best estimates are that the Conway over IT-services stalwarts such as IBM and EDS, says Leesystems alone store between 1.5 and 2.0 petabytes of data, or Hodges, chief operations leader. Acxiom holds up the IRIup to 2,000 terabytes. deal as emblematic of the company’s outsourcing strategy. A portion of that data makes up Acxiom’s information IRI collects huge amounts of product-sales data fromproducts, such as its InfoBase database of consumer data and supermarkets, pharmacies, and other retailers, as well asits Personicx list of U.S. households segmented into 70 cate- from 70,000 volunteer households, and sells it to packaged-gories. Acxiom clients use those offerings to build marketing goods manufacturers. As part of the deal, worth $25 millionprospect lists, check the accuracy of names and phone num- to $30 million a year for up to 15 years, Acxiom is develop-bers in their customer databases, and to add demographic ing a content-on-demand system to process retail data indetails or verify personnel data. Acxiom continues to add to near–real time, combine it with demographic data, andits portfolio: In August, it debuted Personicx LifeChanges, a sell it to IRI’s clients to identify and exploit consumersystem that tracks U.S. households through life stages such buying trends as quickly as possible. IRI and Acxiom areas marriage or the purchase of a home. Those and other data developing applications that will help retailers and manu-products account for just over a fifth of Acxiom’s revenue. facturers manage new product introductions, analyze price To build its data library, Acxiom collects information and promotion effectiveness, and track products that arefrom a wide range of public and private sources. The com- out of stock.pany has property deed–registration information from 930 But its business strategy isn’t what has garnered publicitycounties across the United States, as well as from 3,500 tele- for Acxiom. The company’s involvement in situations thatphone directories. It also purchases data from private raise privacy implications has served to remind everyone thatsources, such as catalog and magazine subscriber lists and Acxiom’s business, at its core, is collecting and selling dataresearch from consumer surveys. about people. But a lot of the data Acxiom manages belongs to other In September 2003 a controversy erupted over JetBluecompanies. More than half of its revenue is generated by turning over customer data to the federal government. Adata-related services, such as building and hosting data ware- year earlier, the airline had provided data on 1.5 million pas-houses, integrating and cleaning customer data, running cus- sengers to Torch Concepts Inc., a military contractor devel-tomer relationship management applications, developing oping a system for screening visitors to military bases.customer marketing lists, and analyzing data or providing JetBlue and Torch Concepts were both Acxiom clients, andclients with the means to analyze it themselves. Clients typi- Acxiom provided the military contractor with demographiccally store three years of complete customer-history data data and Social Security numbers to use with the passengerwith Acxiom, CEO Charles Morgan says, but that’s expand- list provided by the airline.
  • 36. 184 ● Module II / Information Technologies The case has drawn Acxiom into the debate over when 2003 and named Frank Caserta, a senior technical adviser init’s appropriate for companies to supply customer data to the database and data warehouse group, to the position. Hisgovernment agencies. Acxiom does provide data to govern- job is to make sure Acxiom has a centralized, strategic viewment agencies, acknowledges the company’s privacy leader, of data security and to champion best data-security practicesJennifer Barrett, though she declines to offer specifics. The within Acxiom and among its clients, Caserta says.company also provides privacy-consulting services to the An increasingly security-conscious world is opening upgovernment, Barrett says, pointing out that the government’s new opportunities for Acxiom to provide data in areas such asprivacy efforts and policies lag the private sector by about 10 employee background checks, compliance with regulations likeyears. “So when government agencies ask companies for the USA Patriot Act and the Do Not Call Registry, and frauddata, they often haven’t thought about these issues the way detection. In February, 2004, credit-data provider and AcxiomAcxiom has,” she says. Still, government contracts account customer TransUnion LLC and Acxiom introduced a systemfor less than 1 percent of the company’s revenue. for fraud prevention and regulatory compliance for financial Acxiom considers privacy a core competency, and provid- services, insurance, and telecommunications companies.ing clients with privacy-consulting services is a lucrative part Acxiom has more ambitious plans on the global front.of its business. It was among the first companies to create a CEO Morgan says information systems for checking cus-chief privacy officer post and has nine full-time employees de- tomer credit are woefully underdeveloped in China, wherevoted to privacy issues, along with dedicated resources in each consumers can wait months to get approval for car loans.country where Acxiom does business. Also, Acxiom is working “You can imagine the fees,” he says. “And you wouldn’twith several members of Congress to help shape a national believe the paperwork.”privacy policy, says the company’s legal leader, Jerry Jones. Privacy’s twin is security, and Acxiom is working hard on Adapted from Rick Whiting, “Data Demands Respect,” InformationWeek,that, too. The company created a security chief post in late October 25, 2004. Copyright © 2005 CMP Media LLC. All rights reserved. CASE STUDY QUESTIONS REAL WORLD ACTIVITIES 1. Acxiom is in a unique type of business. How would you 1. In the case, it was stated that Acxiom started as the re- describe the business of Acxiom? Are they a service- or sult of a spin-off from a bus company. Using the Inter- a product-oriented business? net, see if you can find the history of Acxiom. How does 2. From the case, it is easy to see that Acxiom has focused a bus company evolve into a data collection and dissem- on a wide variety of data from different sources. How ination company? does Acxiom decide which data to collect and for 2. The privacy problems faced by Acxiom were associated whom? with the accidental dissemination of data deemed sensi- 3. Acxiom’s business raises many issues related to privacy. tive by a third party. Break into small groups with your Is the data collected by Acxiom really private? classmates, and discuss the privacy issues associated with Acxiom’s business. Do you think they are doing anything wrong?