2. Data
Data
A necessity for almost any enterprise to carry out its business. Consists of raw
facts that, when organized, may be transformed into information
Database
A collection of data organized to meet users’ needs
Database management system (DBMS)
A group of programs that manipulate the database and provide an interface
between the database and the user of the database or other application programs
3. Hierarchy of Data
Hierarchy of data, with an example at each level
Database
Project database (Personnel file, Department file, Payroll file)
File
Personnel file
005-10-6321 Johns Francine 10-7-65
549-77-1001 Buckley Bill 2-17-79
098-40-1370 Fiske Steven 1-5-85
Record
098-40-1370 Fiske Steven 1-5-85 (SSN, last name, first name, date of hire)
Field
FISKE (last name field)
Character (byte)
1000110 (letter ‘F’ in ASCII)
4. Terminology
Database
A collection of integrated and related files
File
A collection of related records
Record
A collection of related fields
Field
A group of characters
Character
Basic building block of information, represented by a byte
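A minimal sketch of the hierarchy in Python, assuming an in-memory model; the variable names are illustrative and not part of the original material. Each level is built from the one below it.

```python
# Character: one byte. In 7-bit ASCII, 'F' is code 70, binary 1000110.
char_bits = format(ord("F"), "07b")

# Field: a group of characters.
last_name = "FISKE"

# Record: a collection of related fields.
record = {"ssn": "098-40-1370", "last_name": last_name,
          "first_name": "Steven", "hire_date": "1-5-85"}

# File: a collection of related records.
personnel_file = [
    {"ssn": "005-10-6321", "last_name": "JOHNS",
     "first_name": "Francine", "hire_date": "10-7-65"},
    {"ssn": "549-77-1001", "last_name": "BUCKLEY",
     "first_name": "Bill", "hire_date": "2-17-79"},
    record,
]

# Database: a collection of integrated and related files.
# The department and payroll files are left empty in this sketch.
project_database = {"personnel": personnel_file, "department": [], "payroll": []}

print(char_bits)
print(len(last_name.encode("ascii")), "bytes in the last name field")
```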
5. Data Entities, Attributes, and Keys
Entity
A generalized class of people, places, or things (objects) for which data are
collected, stored, and maintained
E.g., Customer, Employee
Attribute
A characteristic of an entity; something the entity is identified by
E.g., Customer name, Employee name
Key
A field or set of fields in a record that is used to identify the record
Primary key
A field or set of fields that uniquely identifies the record
6. Keys and Attributes
Each row is an entity (record); each column is an attribute (field); Employee # is the key field
Employee #    Last name   First name   Hire date   Dept. #
005-10-6321   Johns       Francine     10-7-65     257
549-77-1001   Buckley     Bill         2-17-79     650
098-40-1370   Fiske       Steven       1-5-85      598
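A hedged sketch of how a key field behaves, using Python's built-in sqlite3 module. The table and column names are assumptions, and the rejected record ("Doe, Pat") is invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        employee_no TEXT PRIMARY KEY,   -- key field
        last_name   TEXT,
        first_name  TEXT,
        hire_date   TEXT,
        dept_no     INTEGER
    )
""")
rows = [
    ("005-10-6321", "Johns", "Francine", "10-7-65", 257),
    ("549-77-1001", "Buckley", "Bill", "2-17-79", 650),
    ("098-40-1370", "Fiske", "Steven", "1-5-85", 598),
]
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?)", rows)

# The key value identifies exactly one record.
found = conn.execute(
    "SELECT last_name FROM employee WHERE employee_no = ?", ("098-40-1370",)
).fetchall()

# A second record with an existing key value is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?, ?)",
                 ("098-40-1370", "Doe", "Pat", "3-1-90", 257))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(found, duplicate_rejected)
```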
7. The Traditional Approach
The traditional approach…
Separate files are created and stored for each application program
9. Drawbacks
Data redundancy
Duplication of data in separate files
Lack of data integrity
Data integrity is the degree to which the data in any one file is accurate;
duplicate copies that are updated separately undermine it
Program-data dependence
A situation in which programs and data organized for one application
are incompatible with programs and data organized differently for
another application
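The redundancy and integrity drawbacks can be sketched in a few lines of Python. This is an illustration under assumed data: the two "files" are plain dictionaries, and the department transfer is hypothetical.

```python
# Two application-specific files, each holding its own copy of the same employee.
payroll_file = {"549-77-1001": {"last_name": "Buckley", "dept_no": 650}}
personnel_file = {"549-77-1001": {"last_name": "Buckley", "dept_no": 650}}

# The personnel application records a transfer; the payroll copy is not touched.
personnel_file["549-77-1001"]["dept_no"] = 257  # hypothetical transfer


def inconsistent_keys(file_a, file_b):
    """Return the keys whose records differ between the two files."""
    return [k for k in file_a if k in file_b and file_a[k] != file_b[k]]


conflicts = inconsistent_keys(payroll_file, personnel_file)
print(conflicts)
```

With a single shared copy there would be nothing to reconcile, which is the motivation for the database approach.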
10. Database Approach
The database approach…
A pool of related data is shared by multiple application programs
Rather than having separate data files, each application uses a
collection of data that is either joined or related in the database
12. Advantages
Improved strategic use of corporate data
Reduced data redundancy
Improved data integrity
Easier modification and updating
Data and program independence
Better access to data and information
Standardization of data access
A framework for program development
Better overall protection of the data
Shared data and information resources
13. Disadvantages
Relatively high cost of purchasing and operating a DBMS in a
mainframe operating environment
Increased cost of specialized staff
Increased vulnerability
14. Data Modeling and Database Models (1)
Planned data redundancy
A way of organizing data in which the logical database design is
altered so that certain data entities are combined
Summary totals are carried in the data records rather than calculated
from elemental data
Some data attributes are repeated in more than one data entity to
improve database performance
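A small sketch of a summary total carried in the record, assuming hypothetical order data. The stored total saves a scan of the elemental data on every read, at the cost of keeping it in sync.

```python
# Elemental data: individual order lines (hypothetical values).
order_lines = [
    {"order_id": 1, "amount": 40.0},
    {"order_id": 1, "amount": 60.0},
    {"order_id": 2, "amount": 25.0},
]

# Planned redundancy: each order record also carries a summary total.
orders = {1: {"total": 0.0}, 2: {"total": 0.0}}
for line in order_lines:
    orders[line["order_id"]]["total"] += line["amount"]


def total_from_lines(order_id):
    """Recompute a total from the elemental data (the non-redundant route)."""
    return sum(l["amount"] for l in order_lines if l["order_id"] == order_id)


# The stored totals agree with the recomputed ones as long as they are maintained.
stored_matches = all(orders[o]["total"] == total_from_lines(o) for o in orders)
print(orders[1]["total"], stored_matches)
```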
15. Data Modeling and Database Models (2)
Data model
A map or diagram of entities and their relationships
Enterprise data modeling
Data modeling done at the level of the entire organization
Entity-relationship (ER) diagrams
A data model that uses basic graphical symbols to show the organization of and
relationships between data
18. Hierarchical Database Model
Hierarchical database model
A data model in which data are organized in a top-down, or inverted tree structure
Figure: inverted tree with Projects at the top, Departments A, B, and C below it,
and Employees 1 through 6 at the bottom level
19. Network Data Model
Network data model
An expansion of the hierarchical database model with an owner-member
relationship in which a member may have many owners
Figure: Projects 1 and 2 linked to Departments A, B, and C
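The difference between the two models can be sketched with plain Python mappings. The specific project-to-department links below are assumed for illustration; the original figures do not survive in enough detail to confirm them.

```python
# Hierarchical: each child has exactly one parent (department -> project).
hierarchical_parent = {
    "Department A": "Project 1",
    "Department B": "Project 1",
    "Department C": "Project 1",
}

# Network: a member may have many owners (department -> set of projects).
network_owners = {
    "Department A": {"Project 1"},
    "Department B": {"Project 1", "Project 2"},
    "Department C": {"Project 2"},
}


def members_of(owner):
    """List the members (departments) linked to a given owner (project)."""
    return sorted(m for m, owners in network_owners.items() if owner in owners)


print(members_of("Project 2"))
```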
20. Relational Data Model
Relational data model
All data elements are placed in two-dimensional tables, called
relations, that are the logical equivalent of files
21. Relational Data Model
Data Table 1: Project Table
Project Number   Description     Dept. Number
155              Payroll         257
498              Widgets         632
226              Sales manager   598
Data Table 2: Department Table
Dept. Number   Dept. Name      Manager SSN
257            Accounting      421-55-9993
632            Manufacturing   765-00-3192
598            Marketing       098-40-1370
Data Table 3: Manager Table
SSN           Last Name   First Name   Hire Date   Dept. Number
005-10-6321   Johns       Francine     10-7-65     257
549-77-1001   Buckley     Bill         2-17-79     650
098-40-1370   Fiske       Steven       1-5-85      598
22. Relational Database Terminology
Selecting
Data manipulation that eliminates rows according to certain criteria
Projecting
Data manipulation that eliminates columns in a table
Joining
Data manipulation that combines two or more tables
Linking
Relating tables in a relational database together
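The three manipulations can be sketched in plain Python over lists of dictionaries. The function names are illustrative choices, not standard library calls, and the data is taken from the project and department tables in this deck.

```python
project = [
    {"project_no": 155, "description": "Payroll", "dept_no": 257},
    {"project_no": 498, "description": "Widgets", "dept_no": 632},
    {"project_no": 226, "description": "Sales manager", "dept_no": 598},
]
department = [
    {"dept_no": 257, "dept_name": "Accounting"},
    {"dept_no": 632, "dept_name": "Manufacturing"},
    {"dept_no": 598, "dept_name": "Marketing"},
]


def select(table, predicate):
    """Selecting: keep only the rows that satisfy the criteria."""
    return [row for row in table if predicate(row)]


def project_columns(table, columns):
    """Projecting: keep only the named columns."""
    return [{c: row[c] for c in columns} for row in table]


def join(left, right, key):
    """Joining: combine rows from two tables that share a value in `key`."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


selected = select(project, lambda r: r["dept_no"] == 598)
projected = project_columns(project, ["description"])
joined = join(project, department, "dept_no")
print(selected, projected, joined, sep="\n")
```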
23. Linking Data Tables to Answer an Inquiry
Data Table 1: Project Table
Project Number   Description     Dept. Number
155              Payroll         257
498              Widgets         632
226              Sales manager   598
Data Table 2: Department Table
Dept. Number   Dept. Name      Manager SSN
257            Accounting      421-55-9993
632            Manufacturing   765-00-3192
598            Marketing       098-40-1370
Data Table 3: Manager Table
SSN           Last Name   First Name   Hire Date   Dept. Number
005-10-6321   Johns       Francine     10-7-65     257
549-77-1001   Buckley     Bill         2-17-79     650
098-40-1370   Fiske       Steven       1-5-85      598
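A sketch of answering one inquiry by linking the three tables, using Python's sqlite3 module. The inquiry itself (who manages the department responsible for the Sales manager project, and when were they hired) is assumed for illustration, as are the table and column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (project_no INTEGER, description TEXT, dept_no INTEGER);
    CREATE TABLE department (dept_no INTEGER, dept_name TEXT, manager_ssn TEXT);
    CREATE TABLE manager (ssn TEXT, last_name TEXT, first_name TEXT,
                          hire_date TEXT, dept_no INTEGER);
    INSERT INTO project VALUES (155, 'Payroll', 257), (498, 'Widgets', 632),
                               (226, 'Sales manager', 598);
    INSERT INTO department VALUES (257, 'Accounting', '421-55-9993'),
                                  (632, 'Manufacturing', '765-00-3192'),
                                  (598, 'Marketing', '098-40-1370');
    INSERT INTO manager VALUES ('005-10-6321', 'Johns', 'Francine', '10-7-65', 257),
                               ('549-77-1001', 'Buckley', 'Bill', '2-17-79', 650),
                               ('098-40-1370', 'Fiske', 'Steven', '1-5-85', 598);
""")

# Project -> Department via dept_no, then Department -> Manager via the manager's SSN.
answer = conn.execute("""
    SELECT m.first_name, m.last_name, m.hire_date
    FROM project AS p
    JOIN department AS d ON d.dept_no = p.dept_no
    JOIN manager AS m ON m.ssn = d.manager_ssn
    WHERE p.description = 'Sales manager'
""").fetchall()
print(answer)
```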
25. Schemas and Subschemas
Schema
A description of the entire database
Subschema
A file that contains a description of a subset of the database and
identifies which users can perform modifications on the data items in
that subset
27. Data Definition Language
Data Definition Language (DDL)
A collection of instructions and commands used to define and describe data and data relationships in
a specific database
DDL statements are used to define the database structure or schema. Some examples:
CREATE - create objects in the database
ALTER - alter the structure of the database
DROP - delete objects from the database
TRUNCATE - remove all records from a table and release the space allocated for
them
COMMENT - add comments to the data dictionary
RENAME - rename an object
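A hedged sketch of a few DDL statements run through Python's sqlite3 module. SQLite is used only because it ships with Python; it has no TRUNCATE or COMMENT statement, and renaming is done with ALTER TABLE ... RENAME TO, so this does not mirror every command listed above. Table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names():
    """Return the names of the tables currently defined in the schema."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [r[0] for r in rows]


# CREATE: define a new object.
conn.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, last_name TEXT)")

# ALTER: change the structure of an existing object.
conn.execute("ALTER TABLE employee ADD COLUMN hire_date TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]

# RENAME (SQLite spelling): give the table a new name.
conn.execute("ALTER TABLE employee RENAME TO staff")
after_rename = table_names()

# DROP: delete the object.
conn.execute("DROP TABLE staff")
after_drop = table_names()

print(columns, after_rename, after_drop)
```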
28. Data Manipulation Language (DML)
Data Manipulation Language (DML) statements are used for managing data within
schema objects. Some examples:
SELECT - retrieve data from a database
INSERT - insert data into a table
UPDATE - update existing data within a table
DELETE - delete records from a table (all of them unless a condition is given); the space for the records remains
MERGE - UPSERT operation (insert or update)
CALL - call a PL/SQL subprogram
LOCK TABLE - control concurrency
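The four core DML statements can be tried with Python's sqlite3 module. This is a minimal sketch: the table layout is assumed, and the department change in the UPDATE is hypothetical. MERGE, CALL, and LOCK TABLE are not available in SQLite and are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (ssn TEXT PRIMARY KEY, last_name TEXT, dept_no INTEGER)"
)

# INSERT: add rows.
conn.execute("INSERT INTO employee VALUES ('005-10-6321', 'Johns', 257)")
conn.execute("INSERT INTO employee VALUES ('549-77-1001', 'Buckley', 650)")

# UPDATE: change existing data in place.
conn.execute("UPDATE employee SET dept_no = 632 WHERE ssn = '549-77-1001'")

# DELETE: remove only the rows matching the condition.
conn.execute("DELETE FROM employee WHERE ssn = '005-10-6321'")

# SELECT: retrieve what is left.
remaining = conn.execute("SELECT ssn, last_name, dept_no FROM employee").fetchall()
print(remaining)
```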
29. Transaction and data control language
Data Control Language (DCL) statements are used to control access to the data. Some
examples:
GRANT - give users access privileges to the database
REVOKE - withdraw access privileges given with the GRANT command
Transaction Control Language (TCL) statements are used to manage the changes made by DML
statements. They allow statements to be grouped together into logical transactions.
COMMIT - save work done
SAVEPOINT - identify a point in a transaction to which you can later roll back
ROLLBACK - restore the database to its state as of the last COMMIT
SET TRANSACTION - change transaction options such as isolation level and which
rollback segment to use
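A sketch of COMMIT, SAVEPOINT, and ROLLBACK using Python's sqlite3 module. The account table and balances are invented for illustration. SQLite has no GRANT or REVOKE, so the DCL statements are not shown, and it has no SET TRANSACTION statement either.

```python
import sqlite3

# isolation_level=None leaves transaction control to the explicit statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO account VALUES ('A', 100)")
conn.execute("COMMIT")  # save work done

conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 150 WHERE name = 'A'")
conn.execute("SAVEPOINT after_first_update")
conn.execute("UPDATE account SET balance = 999 WHERE name = 'A'")

# Roll back to the savepoint: only the second update is undone.
conn.execute("ROLLBACK TO after_first_update")
balance_after_savepoint = conn.execute("SELECT balance FROM account").fetchall()[0][0]

# Roll back the transaction: everything since the last COMMIT is undone.
conn.execute("ROLLBACK")
balance_after_rollback = conn.execute("SELECT balance FROM account").fetchall()[0][0]

print(balance_after_savepoint, balance_after_rollback)
```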
30. Data Dictionary Features
Provide a standard definition of terms and data elements
Assist programmers in designing and writing programs
Simplify database modification
Reduce data redundancy
Increase data reliability
Faster program development
Easier modification of data and information
31. Logical and Physical Access Paths
Logical access path (LAP)
Application requires information from the DBMS
Physical access path (PAP)
DBMS accesses a storage device to retrieve data
33. Manipulating Data
Concurrency control
A method of dealing with a situation in which two or more people need
to access the same record in a database at the same time
Data manipulation language (DML)
The commands that are used to manipulate the data in a database
Structured query language (SQL)
A standardized data manipulation language
34. Structured Query Language (SQL)
“Invented” at IBM’s Almaden Research Centre (San Jose,
CA) in the 1970s
E.g.,
Select all (“*”) columns from the EMPLOYEE table in
which the JOB_CLASSIFICATION field is equal to “C2”
SELECT * FROM EMPLOYEE WHERE
JOB_CLASSIFICATION = 'C2'
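The query can be run against a throwaway table with Python's sqlite3 module. This is a sketch: the EMPLOYEE table here has only two columns, and the job classifications assigned to each name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (NAME TEXT, JOB_CLASSIFICATION TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [("Johns", "C2"), ("Buckley", "B1"), ("Fiske", "C2")],
)

# String literals in SQL take straight single quotes.
matches = conn.execute(
    "SELECT * FROM EMPLOYEE WHERE JOB_CLASSIFICATION = 'C2'"
).fetchall()
print(sorted(matches))
```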
36. Popular Database Management Systems for End Users
Microsoft Access
Lotus Approach
Inprise (formerly Borland) dBASE
DBMS Selection Criteria
Database size
Number of concurrent users
Performance
Integration
Features
The vendor
Cost
37. Distributed Databases
Distributed database…
A database in which the actual data may be spread across several
smaller databases connected via telecommunications devices
39. Data Warehouse
Data warehouse
A relational database management system designed specifically to support
management decision making
Current evolution of Decision Support Systems (DSSs)
Data mart
A subset of a data warehouse for small and medium-size businesses or
departments within larger companies
41. Designing a Customer Data Warehouse
Sharply define your goals and objectives before you build the
warehouse
Choose the software that best fits your goals
Determine who/what should be in the database
Develop a plan
Measure results
42. Data Mining Applications
Data mining
The automated discovery of patterns and relationships in a data warehouse
Data mining applications
Market segmentation
Customer queries
Fraud detection
Direct marketing
Market basket analysis
Trend analysis
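As one illustration of the applications listed above, market basket analysis can be sketched as counting how often pairs of items are bought together. The baskets are hypothetical, and real tools use more elaborate measures than this simple co-occurrence count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each is the set of items in one purchase.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

# Count every unordered pair of items that appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1


def support(pair):
    """Fraction of baskets that contain both items of the pair."""
    return pair_counts[tuple(sorted(pair))] / len(baskets)


print(pair_counts.most_common())
print(support(("milk", "bread")))
```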
43. On-Line Analytical Processing (OLAP)
On-line analytical processing (OLAP)
Access to multidimensional databases providing managerially useful
display techniques
Now used to store and deliver data warehouse information
Data warehouse and OLAP
Provides top-down, query-driven analysis
Data mining
Provides bottom-up, discovery-driven analysis
44. Open Database Connectivity (ODBC)
Open database connectivity (ODBC)
A set of standards that ensures software written to comply with
these standards can be used with any ODBC-compliant database