DBF-Lecture11-Chapter12.ppt
Database Principles: Fundamentals of Design, Implementation and Management
Lecture 11 - CHAPTER 12: Transaction Management and Concurrency Control
Presented by Rabia Cherouk
*
Objectives
In this chapter, you will learn:
- About database transactions and their properties
- What concurrency control is and what role it plays in maintaining the database's integrity
- What locking methods are and how they work
- How time stamping methods are used for concurrency control
- How optimistic methods are used for concurrency control
- How database recovery management is used to maintain database integrity
*
What is a Transaction?
- A transaction is a logical unit of work that must be either entirely completed or aborted
- A successful transaction changes the database from one consistent state to another
  - One in which all data integrity constraints are satisfied
- Most real-world database transactions are formed by two or more database requests
  - A database request is the equivalent of a single SQL statement in an application program or transaction
Same as Fig. 12.1 in your book
*
Same as Fig. 12.1 in your book
*
Evaluating Transaction Results
- Not all transactions update the database
- SQL code represents a transaction because the database was accessed
- Improper or incomplete transactions can have a devastating effect on database integrity
- Some DBMSs provide means by which users can define enforceable constraints
- Other integrity rules are enforced automatically by the DBMS
Same as Fig. 12.2 in your book
*
Same as Fig. 12.2 in your book
*
Transaction Properties
All transactions must display atomicity, consistency, isolation, durability, and serializability (ACIDS).
- Atomicity: all operations of a transaction must be completed
- Consistency: permanence of the database's consistent state
- Isolation: data used during a transaction cannot be used by a second transaction until the first is completed
*
Transaction Properties (cont..)
- Durability: once transactions are committed, they cannot be undone
- Serializability: concurrent execution of several transactions yields consistent results
- Multiuser databases are subject to multiple concurrent transactions
*
Transaction Management with SQL
- ANSI (American National Standards Institute) has defined standards that govern SQL database transactions
- Transaction support is provided by two SQL statements: COMMIT and ROLLBACK
- A transaction sequence must continue until:
  - A COMMIT statement is reached
  - A ROLLBACK statement is reached
  - The end of the program is reached
  - The program is abnormally terminated
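The COMMIT/ROLLBACK behaviour above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not from the book; the ACCOUNT table and its values are hypothetical.

```python
# Minimal sketch (not from the book): COMMIT and ROLLBACK with Python's
# built-in sqlite3 module and a hypothetical ACCOUNT table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ACCOUNT (ACC_NUM INTEGER PRIMARY KEY, ACC_BALANCE REAL)")
conn.execute("INSERT INTO ACCOUNT VALUES (1, 100.0), (2, 50.0)")
conn.commit()

# A transfer is one logical unit of work: both updates succeed or neither does.
try:
    conn.execute("UPDATE ACCOUNT SET ACC_BALANCE = ACC_BALANCE - 30 WHERE ACC_NUM = 1")
    conn.execute("UPDATE ACCOUNT SET ACC_BALANCE = ACC_BALANCE + 30 WHERE ACC_NUM = 2")
    conn.commit()          # COMMIT: changes become permanent
except sqlite3.Error:
    conn.rollback()        # ROLLBACK: database returns to the prior consistent state

balances = dict(conn.execute("SELECT ACC_NUM, ACC_BALANCE FROM ACCOUNT"))
print(balances)  # {1: 70.0, 2: 80.0}
```

If either UPDATE raised an error, the rollback would leave both balances at their original, consistent values.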
*
The Transaction Log
A DBMS uses a transaction log to store:
- A record for the beginning of the transaction
- For each transaction component:
  - Type of operation being performed (update, delete, insert)
  - Names of objects affected by the transaction
  - "Before" and "after" values for updated fields
  - Pointers to previous and next transaction log entries for the same transaction
- The ending (COMMIT) of the transaction
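The fields listed above can be sketched as a simple record structure. This is an illustrative Python sketch, not the book's Table 12.1; the identifiers and values are hypothetical.

```python
# Illustrative sketch (not from the book): the fields a transaction log
# record might carry, per the slide, as a simple Python structure.
from dataclasses import dataclass
from typing import Optional

@dataclass
class LogRecord:
    trx_id: int                    # transaction identifier
    prev_ptr: Optional[int]        # pointer to previous log entry for this transaction
    next_ptr: Optional[int]        # pointer to next log entry for this transaction
    operation: str                 # START, UPDATE, DELETE, INSERT, or COMMIT
    table: Optional[str] = None    # object affected by the operation
    row_id: Optional[str] = None
    attribute: Optional[str] = None
    before: Optional[str] = None   # "before" value for an update
    after: Optional[str] = None    # "after" value for an update

log = [
    LogRecord(101, None, 352, "START"),
    LogRecord(101, 352, 363, "UPDATE", "PRODUCT", "1558-QW1", "PROD_QOH", "25", "23"),
    LogRecord(101, 363, None, "COMMIT"),
]
print(len(log))  # 3
```

During recovery the DBMS walks these pointers forward (redo, using "after" values) or backward (undo, using "before" values).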
Table 12.1 in your book
*
The Transaction Log
Table 12.1 in your book
*
Concurrency Control
- Is the coordination of simultaneous transaction execution in a multiprocessing database
- The objective is to ensure serializability of transactions in a multiuser environment
- Simultaneous execution of transactions over a shared database can create several data integrity and consistency problems:
  - Lost updates
  - Uncommitted data
  - Inconsistent retrievals
*
Lost Updates
The lost update problem:
- Two concurrent transactions update the same data element
- One of the updates is lost
  - Overwritten by the other transaction
Lost Updates
*
Lost Updates (cont..)
*
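The lost update can be simulated without a DBMS at all. This is an illustrative sketch, not the book's example; the quantity-on-hand values are hypothetical.

```python
# Illustrative sketch (not from the book): an interleaving that loses an update.
# Two transactions each read PROD_QOH = 35, then write back their own result.
prod_qoh = 35                 # stored value in the database

t1_read = prod_qoh            # T1 reads 35
t2_read = prod_qoh            # T2 reads 35 (before T1 writes)

prod_qoh = t1_read + 100      # T1 writes 135 (a purchase adds 100 units)
prod_qoh = t2_read - 30       # T2 writes 5, overwriting T1's update

print(prod_qoh)  # 5 -- the +100 update is lost; a serial order would give 105
```

Run serially, T2 would have read 135 and written 105; the interleaving silently discards T1's work, which is exactly what locking prevents.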
*
Uncommitted Data
The uncommitted data phenomenon:
- Two transactions are executed concurrently
- The first transaction is rolled back after the second has already accessed the uncommitted data
Uncommitted Data
*
Uncommitted Data (cont..)
*
*
Inconsistent Retrievals
Inconsistent retrievals occur when:
- The first transaction accesses data
- A second transaction alters the data
- The first transaction accesses the data again
- The transaction might read some data before they are changed and other data after they are changed
  - Yields inconsistent results
*
*
*
The Scheduler
- Special DBMS program
- Its purpose is to establish the order of operations within which concurrent transactions are executed
- Interleaves execution of database operations:
  - Ensures serializability
  - Ensures isolation
- Serializable schedule
  - Interleaved execution of transactions yields the same results as serial execution

The Scheduler (cont..)
- Bases its actions on concurrency control algorithms
- Ensures the computer's central processing unit (CPU) is used efficiently
- Facilitates data isolation to ensure that two transactions do not update the same data element at the same time
*
*
Database Recovery Management
- Database recovery restores the database from a given state, usually inconsistent, to a previously consistent state
- Based on the atomic transaction property
  - All portions of a transaction are treated as a single logical unit of work
  - All operations are applied and completed to produce a consistent database
- If a transaction operation cannot be completed, the transaction must be aborted, and any changes to the database must be rolled back (undone)
Transaction Recovery
- Makes use of deferred-write and write-through techniques
- Deferred write
  - Transaction operations do not immediately update the physical database
  - Only the transaction log is updated
  - The database is physically updated only after the transaction reaches its commit point, using the transaction log information
*
*
Transaction Recovery (cont..)
- Write-through technique
  - The database is immediately updated by transaction operations during the transaction's execution, even before the transaction reaches its commit point
- Recovery process:
  - Identify the last checkpoint
  - If the transaction was committed before the checkpoint: do nothing
  - If the transaction committed after the last checkpoint: the DBMS redoes the transaction using the "after" values
  - If the transaction had a ROLLBACK or was left active: do nothing, because no updates were made
Transaction Recovery (cont..)
*
*
Summary
- Transaction: a sequence of database operations that access the database
  - Logical unit of work
  - No portion of a transaction can exist by itself
  - Five main properties: atomicity, consistency, isolation, durability, and serializability
- COMMIT saves changes to disk
- ROLLBACK restores the previous database state
- SQL transactions are formed by several SQL statements or database requests
*
Summary (cont..)
- The transaction log keeps track of all transactions that modify the database
- Concurrency control coordinates the simultaneous execution of transactions
- The scheduler establishes the order in which concurrent transaction operations are executed
- A lock guarantees unique access to a data item by a transaction
- Two types of locks: binary locks and shared/exclusive locks
*
Summary (cont..)
- Serializability of schedules is guaranteed through the use of two-phase locking
- Deadlock: when two or more transactions wait indefinitely for each other to release a lock
- Three deadlock control techniques: prevention, detection, and avoidance
- Time stamping methods assign a unique time stamp to each transaction and schedule the execution of conflicting transactions in time stamp order
*
Summary (cont..)
- Optimistic methods assume the majority of database transactions do not conflict
- Transactions are executed concurrently, using private copies of the data
- Database recovery restores the database from a given state to a previous consistent state
CHAPTER 12: Transaction Management and Concurrency Control
ADDITIONAL SLIDES (pages 635 to 644 in your book)
*
*
Two-Phase Locking to Ensure Serializability (cont..)
Governed by the following rules:
- Two transactions cannot have conflicting locks
- No unlock operation can precede a lock operation in the same transaction
- No data are affected until all locks are obtained; that is, until the transaction is in its locked point
*
Concurrency Control with Locking Methods
- Lock
  - Guarantees exclusive use of a data item to a current transaction
  - Required to prevent another transaction from reading inconsistent data
- Lock manager
  - Responsible for assigning and policing the locks used by transactions
*
Lock Granularity
- Indicates the level of lock use
- Locking can take place at the following levels:
  - Database
  - Table
  - Page
  - Row
  - Field (attribute)
*
Lock Granularity (cont..)
- Database-level lock: the entire database is locked
- Table-level lock: the entire table is locked
- Page-level lock: an entire disk page is locked
- Row-level lock: allows concurrent transactions to access different rows of the same table, even if the rows are located on the same page
- Field-level lock: allows concurrent transactions to access the same row, as long as they require the use of different fields (attributes) within the row
Fig 12.3 in your book
*
Fig 12.3 in your book
Fig 12.4 in your book
*
Fig 12.4 in your book
Fig. 12.5 in your book
*
Lock Granularity (cont..)
Fig. 12.5 in your book
Fig. 12.6 in your book
*
Lock Granularity (cont..)
Fig. 12.6 in your book
*
Lock Types
- Binary lock
  - Two states: locked (1) or unlocked (0)
- Exclusive lock
  - Access is specifically reserved for the transaction that locked the object
  - Must be used when the potential for conflict exists
- Shared lock
  - Concurrent transactions are granted read access on the basis of a common lock
Table 12.10 in your book
*
Table 12.10 in your book
*
Two-Phase Locking to Ensure Serializability
- Defines how transactions acquire and relinquish locks
- Guarantees serializability, but does not prevent deadlocks
- Growing phase
  - The transaction acquires all required locks without unlocking any data
- Shrinking phase
  - The transaction releases all locks and cannot obtain any new lock
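The two-phase rule can be sketched as a small guard object. This is an illustrative Python sketch (not the book's algorithm); the class and item names are hypothetical.

```python
# Minimal sketch (not from the book): a transaction object that enforces the
# two-phase rule -- once any lock is released, no new lock may be acquired.
class TwoPhaseTransaction:
    def __init__(self):
        self.held = set()
        self.shrinking = False   # becomes True after the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violated: lock after unlock")
        self.held.add(item)      # growing phase

    def unlock(self, item):
        self.shrinking = True    # shrinking phase begins
        self.held.discard(item)

t = TwoPhaseTransaction()
t.lock("X")
t.lock("Y")          # growing phase: acquire everything first
t.unlock("X")        # shrinking phase begins
violated = False
try:
    t.lock("Z")      # illegal under 2PL
except RuntimeError:
    violated = True
print(violated)  # True
```

A real lock manager would also block conflicting lock requests from other transactions; this sketch only enforces the growing/shrinking discipline within one transaction.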
Deadlocks (cont..)
*
*
Deadlocks
- A condition that occurs when two transactions wait for each other to unlock data
- Possible only if one of the transactions wants to obtain an exclusive lock on a data item
- No deadlock condition can exist among shared locks
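Deadlock detection is commonly framed as finding a cycle in a wait-for graph. This is an illustrative sketch, not a specific DBMS's implementation; the function and transaction names are hypothetical.

```python
# Illustrative sketch (not from the book): deadlock detection as cycle
# detection in a wait-for graph (edge T1 -> T2 means "T1 waits for T2").
def has_deadlock(waits_for):
    visited, on_path = set(), set()

    def dfs(t):
        if t in on_path:
            return True          # cycle found: deadlock
        if t in visited:
            return False
        visited.add(t)
        on_path.add(t)
        if any(dfs(n) for n in waits_for.get(t, [])):
            return True
        on_path.discard(t)
        return False

    return any(dfs(t) for t in list(waits_for))

# T1 waits for T2's lock and T2 waits for T1's lock: the classic deadlock.
d1 = has_deadlock({"T1": ["T2"], "T2": ["T1"]})
d2 = has_deadlock({"T1": ["T2"], "T2": []})
print(d1, d2)  # True False
```

When a cycle is found, a detection-based DBMS breaks it by rolling back one of the transactions in the cycle (the "victim").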
*
Table 12.11 in your book
Deadlocks (cont..)
*
Deadlocks (cont..)
- Three techniques to control deadlock:
  - Prevention
  - Detection
  - Avoidance
- The choice of deadlock control method depends on the database environment:
  - Low probability of deadlock: detection is recommended
  - High probability: prevention is recommended
*
Concurrency Control with Time Stamping Methods
- Assigns a global, unique time stamp to each transaction
  - Produces an explicit order in which transactions are submitted to the DBMS
- Uniqueness
  - Ensures that no equal time stamp values can exist
- Monotonicity
  - Ensures that time stamp values always increase
*
Wait/Die and Wound/Wait Schemes
- Wait/die
  - The older transaction waits, and the younger is rolled back and rescheduled
- Wound/wait
  - The older transaction rolls back (wounds) the younger transaction and reschedules it
Wait/Die and Wound/Wait Schemes (cont..)
*
*
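The two schemes above can be captured in one decision function. This is an illustrative sketch, not from the book; lower time stamp means older transaction, and the function name is hypothetical.

```python
# Illustrative sketch (not from the book): which transaction is rolled back
# when a requester asks for a lock the holder already owns.
# A lower time stamp means an older transaction.
def resolve(scheme, requester_ts, holder_ts):
    if scheme == "wait/die":
        # Older requester waits; younger requester dies (is rolled back).
        return "wait" if requester_ts < holder_ts else "die"
    if scheme == "wound/wait":
        # Older requester wounds (rolls back) the holder; younger waits.
        return "wound holder" if requester_ts < holder_ts else "wait"
    raise ValueError(scheme)

r1 = resolve("wait/die", 1, 2)     # older requester: waits
r2 = resolve("wait/die", 2, 1)     # younger requester: dies
r3 = resolve("wound/wait", 1, 2)   # older requester: wounds the holder
r4 = resolve("wound/wait", 2, 1)   # younger requester: waits
print(r1, r2, r3, r4)
```

In both schemes it is always the younger transaction that is rolled back, which guarantees that every transaction eventually becomes the oldest and completes.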
Concurrency Control with Optimistic Methods
- Optimistic approach
  - Based on the assumption that the majority of database operations do not conflict
  - Does not require locking or time stamping techniques
  - A transaction is executed without restrictions until it is committed
  - Phases: read, validation, and write
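The read/validation/write phases can be sketched with a version counter. This is an illustrative Python sketch, not the book's algorithm; the class, function, and values are hypothetical.

```python
# Illustrative sketch (not from the book): the read / validation / write
# phases of optimistic concurrency, using a version counter on the data item.
class Item:
    def __init__(self, value):
        self.value, self.version = value, 0

def optimistic_update(item, transform):
    # Read phase: work on a private copy, remembering the version read.
    read_version = item.version
    private = transform(item.value)
    # Validation phase: commit only if no other transaction wrote meanwhile.
    if item.version != read_version:
        return False              # conflict: the transaction must restart
    # Write phase: make the private copy permanent.
    item.value, item.version = private, item.version + 1
    return True

qty = Item(35)
ok1 = optimistic_update(qty, lambda v: v + 100)   # no interleaved write: commits
print(ok1, qty.value)  # True 135

def conflicting(v):
    qty.version += 1   # simulate another transaction committing mid-flight
    return v - 30

ok2 = optimistic_update(qty, conflicting)         # validation fails
print(ok2, qty.value)  # False 135
```

The failed transaction simply retries from the read phase, which is cheap when conflicts are rare, exactly the assumption the optimistic approach makes.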
Database Principles: Fundamentals of Design, Implementation and Management
Lecture 7 - CHAPTER 8: Beginning Structured Query Language
Presented by Rabia Cherouk
*
Objectives
In this chapter, you will learn:
- The basic commands and functions of SQL
- How to use SQL for data administration (to create tables, indexes, and views)
- How to use SQL for data manipulation (to add, modify, delete, and retrieve data)
- How to use SQL to query a database for useful information
*
Introduction to SQL
- SQL functions fit into two broad categories:
  - Data definition language (DDL)
    - Create database objects, such as tables, indexes, and views
    - Define access rights to those database objects
  - Data manipulation language (DML)
- SQL is relatively easy to learn
  - The basic command set has a vocabulary of fewer than 100 words
  - Non-procedural language
- The American National Standards Institute (ANSI) prescribes a standard SQL, and the standards are accepted by the ISO (International Organisation for Standardisation)
- Several SQL dialects exist
Introduction to SQL (cont..)
*
Introduction to SQL (cont..)
*
*
Data Definition Commands
- The database model
  - In this chapter, a simple database with these tables is used to illustrate commands: CUSTOMER, INVOICE, LINE, PRODUCT, VENDOR
  - Focus on the PRODUCT and VENDOR tables
*
The Database Model
Figure 8.1 in your book
The Database Model (cont..)
*
*
Creating the Database
Two tasks must be completed:
1/ Create the database structure
2/ Create the tables that will hold end-user data
First task:
- The RDBMS creates the physical files that will hold the database
- Differs substantially from one RDBMS to another
*
The Database Schema
- Authentication
  - The process through which the DBMS verifies that only registered users are able to access the database
  - Log on to the RDBMS using a user ID and password created by the database administrator
- Schema
  - A group of related database objects, such as tables and indexes
  - Usually a schema belongs to a single user or application
  - A single database can hold multiple schemas belonging to different users or applications
*
Data Types
- Data type selection is usually dictated by the nature of the data and by the intended use
- Pay close attention to the expected use of attributes for sorting and data retrieval purposes
- Supported data types:
  - Number(L,D), Integer, Smallint, Decimal(L,D)
  - Char(L), Varchar(L), Varchar2(L)
  - Date, Time, Timestamp
  - Real, Double, Float
  - Interval day to hour
  - Many other types
Data Types (cont..)
*
*
Creating Table Structures
- Use one line per column (attribute) definition
- Use spaces to line up attribute characteristics and constraints
- Table and attribute names are capitalized
- NOT NULL specification
- UNIQUE specification
- Primary key attributes contain both a NOT NULL and a UNIQUE specification
- The RDBMS will automatically enforce referential integrity for foreign keys
*
Creating Table Structures (cont..)
- The command sequence ends with a semicolon
Example:
CREATE TABLE EMP_2
( EMP_NUM CHAR(3) NOT NULL UNIQUE,
EMP_LNAME VARCHAR(15) NOT NULL,
EMP_FNAME VARCHAR(15) NOT NULL,
EMP_INITIAL CHAR(1),
EMP_HIRE DATE NOT NULL,
JOB_CODE CHAR(3) NOT NULL,
PRIMARY KEY (EMP_NUM),
FOREIGN KEY (JOB_CODE) REFERENCES JOB);
*
SQL Constraints
- NOT NULL constraint: ensures that a column does not accept nulls
- UNIQUE constraint: ensures that all values in a column are unique
- DEFAULT constraint: assigns a value to an attribute when a new row is added to the table
- CHECK constraint: validates data when an attribute value is entered
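The DEFAULT and CHECK constraints can be demonstrated with Python's built-in sqlite3 module. This is a minimal sketch, not from the book; the PRODUCT columns and values are hypothetical.

```python
# Minimal sketch (not from the book): DEFAULT and CHECK constraints,
# demonstrated with Python's built-in sqlite3 and a hypothetical PRODUCT table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PRODUCT (
        P_CODE   CHAR(10)     NOT NULL UNIQUE,
        P_DESCR  VARCHAR(35)  NOT NULL,
        P_QOH    INTEGER      DEFAULT 0,             -- DEFAULT constraint
        P_PRICE  DECIMAL(8,2) CHECK (P_PRICE >= 0)   -- CHECK constraint
    )""")

# P_QOH is omitted, so the DEFAULT value 0 is stored for it.
conn.execute("INSERT INTO PRODUCT (P_CODE, P_DESCR, P_PRICE) "
             "VALUES ('11QER/31', 'Hammer', 9.95)")
qoh = conn.execute("SELECT P_QOH FROM PRODUCT").fetchone()[0]
print(qoh)  # 0

# A negative price violates the CHECK constraint and is rejected.
rejected = False
try:
    conn.execute("INSERT INTO PRODUCT VALUES ('23109-HB', 'Sledge', 5, -1.0)")
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```

The DBMS, not the application, enforces both rules, which is the point of declaring constraints in the table definition.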
*
SQL Indexes
- When a primary key is declared, the DBMS automatically creates a unique index
- Additional indexes are often needed
- Using the CREATE INDEX command, SQL indexes can be created on the basis of any selected attribute
- Composite index
  - An index based on two or more attributes
  - Often used to prevent data duplication
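CREATE INDEX, including a composite unique index, can be shown with sqlite3. This is a minimal sketch, not from the book; the index and table names are hypothetical.

```python
# Minimal sketch (not from the book): CREATE INDEX, including a composite
# unique index, using Python's built-in sqlite3 module.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCR TEXT, V_CODE INTEGER)")

# Simple index on one attribute.
conn.execute("CREATE INDEX PROD_VX ON PRODUCT (V_CODE)")

# Composite UNIQUE index on two attributes: prevents duplicate combinations.
conn.execute("CREATE UNIQUE INDEX PROD_UX ON PRODUCT (P_DESCR, V_CODE)")

conn.execute("INSERT INTO PRODUCT VALUES ('A1', 'Hammer', 21225)")
dup_rejected = False
try:
    conn.execute("INSERT INTO PRODUCT VALUES ('B2', 'Hammer', 21225)")  # same pair
except sqlite3.IntegrityError:
    dup_rejected = True
print(dup_rejected)  # True
```

The unique composite index rejects the second row because the (P_DESCR, V_CODE) combination already exists, even though the P_CODE differs.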
SQL Indexes (cont..)
*
Data Manipulation Commands
- Adding table rows
- Saving table changes
- Listing table rows
- Updating table rows
- Restoring table contents
- Deleting table rows
- Inserting table rows with a select subquery
*
*
*
Data Manipulation Commands
- INSERT
- SELECT
- COMMIT
- UPDATE
- ROLLBACK
- DELETE
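The six commands above can be round-tripped in one short sqlite3 session. This is a minimal sketch, not from the book; the VENDOR rows are hypothetical.

```python
# Minimal sketch (not from the book): the basic DML commands round-tripped
# through Python's built-in sqlite3 module on a hypothetical VENDOR table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME VARCHAR(35))")

conn.execute("INSERT INTO VENDOR VALUES (21225, 'Bryson, Inc.')")   # INSERT
conn.execute("INSERT INTO VENDOR VALUES (21226, 'SuperLoo, Inc.')")
conn.commit()                                                       # COMMIT

conn.execute("UPDATE VENDOR SET V_NAME = 'Bryson Ltd' WHERE V_CODE = 21225")  # UPDATE
conn.rollback()                                                     # ROLLBACK undoes it

conn.execute("DELETE FROM VENDOR WHERE V_CODE = 21226")             # DELETE
conn.commit()

rows = conn.execute("SELECT V_CODE, V_NAME FROM VENDOR").fetchall() # SELECT
print(rows)  # [(21225, 'Bryson, Inc.')]
```

Note that the ROLLBACK undid only the uncommitted UPDATE; the earlier committed INSERTs and the later committed DELETE stand.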
*
Adding Table Rows
- INSERT: used to enter data into a table
- Syntax:
  INSERT INTO tablename
  VALUES (value1, value2, … , valueN);
*
Adding Table Rows (cont..)
When entering values, notice that:
- Row contents are entered between parentheses
- Character and date values are entered between apostrophes
- Numerical entries are not enclosed in apostrophes
- Attribute entries are separated by commas
- A value is required for each column
- Use NULL for unknown values
*
Saving Table Changes
- Changes made to table contents are not physically saved on disk until:
  - The database is closed
  - The program is closed
  - The COMMIT command is used
- Syntax: COMMIT [WORK];
- Will permanently save any changes made to any table in the database
*
Listing Table Rows
- SELECT: used to list the contents of a table
- Syntax:
  SELECT columnlist
  FROM tablename;
- Columnlist represents one or more attributes, separated by commas
- An asterisk can be used as a wildcard character to list all attributes
Listing Table Rows (cont..)
*
*
Updating Table Rows
- UPDATE: modifies data in a table
- Syntax:
  UPDATE tablename
  SET columnname = expression [, columnname = expression]
  [WHERE conditionlist];
- If more than one attribute is to be updated in a row, separate the corrections with commas
*
Restoring Table Contents
- ROLLBACK
  - Used to restore the database to its previous condition
  - Only applicable if the COMMIT command has not been used to permanently store the changes in the database
- Syntax: ROLLBACK;
- COMMIT and ROLLBACK only work with data manipulation commands that are used to add, modify, or delete table rows
*
Deleting Table Rows
- DELETE: deletes a table row
- Syntax:
  DELETE FROM tablename
  [WHERE conditionlist];
- The WHERE condition is optional
- If the WHERE condition is not specified, all rows from the specified table will be deleted
*
Inserting Table Rows with a SELECT Subquery
- INSERT
  - Inserts multiple rows from another table (source)
  - Uses a SELECT subquery
  - Subquery: a query that is embedded (or nested) inside another query
  - The subquery is executed first
- Syntax:
  INSERT INTO tablename SELECT columnlist FROM tablename;
*
SELECT Queries
Fine-tune the SELECT command by adding restrictions to the search criteria, using:
- Conditional restrictions
- Arithmetic operators
- Logical operators
- Special operators
*
Selecting Rows with Conditional Restrictions
- Select partial table contents by placing restrictions on the rows to be included in the output
- Add conditional restrictions to the SELECT statement, using the WHERE clause
- Syntax:
  SELECT columnlist
  FROM tablelist
  [WHERE conditionlist];
Selecting Rows with
Conditional Restrictions (continued)
*
*
Arithmetic Operators: The Rule of Precedence
1. Perform operations within parentheses
2. Perform power operations
3. Perform multiplications and divisions
4. Perform additions and subtractions
Table 8.7 in your book
*
Logical Operators: AND, OR, and NOT
- Searching data involves multiple conditions
- Logical operators: AND, OR, and NOT
- Can be combined
  - Parentheses are placed to enforce precedence order
  - Conditions in parentheses are always executed first
- Boolean algebra: the mathematical field dedicated to the use of logical operators
- NOT negates the result of a conditional expression
*
Special Operators
- BETWEEN: checks whether an attribute value is within a range
- IS NULL: checks whether an attribute value is null
- LIKE: checks whether an attribute value matches a given string pattern
- IN: checks whether an attribute value matches any value within a value list
- EXISTS: checks whether a subquery returns any rows
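Three of these operators can be demonstrated in one sqlite3 session. This is a minimal sketch, not from the book; the PRODUCT rows are hypothetical.

```python
# Minimal sketch (not from the book): BETWEEN, LIKE, and IN demonstrated
# with Python's built-in sqlite3 module on a hypothetical PRODUCT table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT, P_DESCR TEXT, P_PRICE REAL, V_CODE INTEGER)")
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?, ?)", [
    ("A1", "Hammer",       9.95, 21225),
    ("B2", "Hacksaw",     14.40, 21231),
    ("C3", "Screwdriver",  4.99, 21344),
])

between = conn.execute(
    "SELECT P_CODE FROM PRODUCT WHERE P_PRICE BETWEEN 5.00 AND 15.00 "
    "ORDER BY P_CODE").fetchall()
print(between)   # [('A1',), ('B2',)]

like = conn.execute(
    "SELECT P_CODE FROM PRODUCT WHERE P_DESCR LIKE 'Ha%' ORDER BY P_CODE").fetchall()
print(like)      # [('A1',), ('B2',)]

in_list = conn.execute(
    "SELECT P_CODE FROM PRODUCT WHERE V_CODE IN (21225, 21344) "
    "ORDER BY P_CODE").fetchall()
print(in_list)   # [('A1',), ('C3',)]
```

Note that BETWEEN is inclusive of both endpoints, and `%` in LIKE matches any number of characters.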
*
Advanced Data Definition Commands
- All changes in table structure are made by using the ALTER command
- Three options:
  - ADD adds a column
  - MODIFY changes column characteristics
  - DROP deletes a column
- Can also be used to:
  - Add table constraints
  - Remove table constraints
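The ADD option can be shown with sqlite3. This is a minimal sketch, not from the book; note that MODIFY, as presented on these slides, is Oracle syntax and is not available in SQLite.

```python
# Minimal sketch (not from the book): adding a column with ALTER TABLE,
# shown in Python's sqlite3. (SQLite supports ADD; the MODIFY option on the
# slide is Oracle syntax and is not available in SQLite.)
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME VARCHAR(35))")

# ADD a column; note the later slide's advice: do not use NOT NULL for a new
# column, because existing rows would have no value for it.
conn.execute("ALTER TABLE VENDOR ADD COLUMN V_STATE CHAR(2)")

cols = [row[1] for row in conn.execute("PRAGMA table_info(VENDOR)")]
print(cols)  # ['V_CODE', 'V_NAME', 'V_STATE']
```

Existing rows simply get NULL in the new column until they are updated.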
*
Changing a Column's Data Type
- ALTER can be used to change the data type
- Some RDBMSs do not permit changes to data types unless the column is empty
Changing a Column's Data Characteristics
- Use ALTER to change data characteristics
- Changes in a column's characteristics are permitted if the changes do not alter the existing data type
*
Adding a Column / Dropping a Column
- Use ALTER to add a column
  - Do not include the NOT NULL clause for the new column
- Use ALTER to drop a column
  - Some RDBMSs impose restrictions on the deletion of an attribute
*
Summary
- SQL commands can be divided into two overall categories:
  - Data definition language commands
  - Data manipulation language commands
- The ANSI standard data types are supported by all RDBMS vendors in different ways
- Basic data definition commands allow you to create tables, indexes, and views
*
Summary (cont..)
- DML commands allow you to add, modify, and delete rows from tables
- The basic DML commands: SELECT, INSERT, UPDATE, DELETE, COMMIT, and ROLLBACK
- The SELECT statement is the main data retrieval command in SQL
*
Summary (cont..)
- The WHERE clause can be used with SELECT, UPDATE, and DELETE statements
- Aggregate functions
  - Special functions that perform arithmetic computations over a set of rows
- ORDER BY clause
  - Used to sort the output of a SELECT statement
  - Can sort by one or more columns
  - Ascending or descending order
*
Summary (cont..)
- Join the output of multiple tables with the SELECT statement
- A join is performed every time you specify two or more tables in the FROM clause
- If no join condition is specified, the DBMS performs a Cartesian product
- A natural join uses the join condition to match only rows with equal values in the specified columns
- Right outer join and left outer join select rows with no matching values in the other related table
*
Advanced Data Updates
- The UPDATE command updates only data in existing rows
- If a relationship exists between the entries and existing columns, values can be assigned to the slots
- Arithmetic operators are useful in data updates
- In Oracle, the ROLLBACK command undoes the changes made by the last two UPDATE statements
*
Advanced Data Updates
*
Copying Parts of Tables
- SQL permits copying the contents of selected table columns
  - Data need not be reentered manually into the newly created table(s)
- First create the table structure
- Next add rows to the new table, using table rows from another table
Copying Parts of Tables (cont..)
*
*
Adding Primary and Foreign Key Designations
- When a table is copied, the integrity rules do not copy
  - Primary and foreign keys must be manually defined on the new table
- Use the ALTER TABLE command
- Syntax:
  ALTER TABLE tablename
  ADD PRIMARY KEY (fieldname);
- For a foreign key, use FOREIGN KEY in place of PRIMARY KEY
*
Deleting a Table from the Database
- DROP: deletes a table from the database
- Syntax:
  DROP TABLE tablename;
- Can drop a table only if it is not the "one" side of any relationship
  - Otherwise the RDBMS generates an error message
  - Foreign key integrity violation
*
Advanced SELECT Queries
- Logical operators work well in the query environment
- SQL provides useful functions that:
  - Count
  - Find minimum and maximum values
  - Calculate averages, etc.
- SQL allows the user to limit queries to:
  - Entries having no duplicates
  - Entries whose duplicates may be grouped
*
Ordering a Listing
- The ORDER BY clause is useful when the listing order is important
- Syntax:
  SELECT columnlist
  FROM tablelist
  [WHERE conditionlist]
  [ORDER BY columnlist [ASC | DESC]];
- Ascending order by default
Ordering a Listing
*
Ordering a Listing (cont..)
*
Ordering a Listing (cont..)
*
*
Listing Unique Values
- The DISTINCT clause produces a list of only those values that are different from one another
- Example:
  SELECT DISTINCT V_CODE
  FROM PRODUCT;
- Access places nulls at the top of the list; Oracle places them at the bottom
- The placement of nulls does not affect list contents
Listing Unique Values
*
*
Aggregate Functions
- The COUNT function tallies the number of non-null values of an attribute
  - Takes one parameter: usually a column name
- MAX and MIN find the highest (lowest) value in a table
  - Compute the MAX value in an inner query
  - Compare it to each value returned by the query
- SUM computes the total sum for any specified attribute
- The AVG function format is similar to MIN and MAX
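All five aggregate functions can be run in one query with sqlite3. This is a minimal sketch, not from the book; the PRODUCT rows are hypothetical.

```python
# Minimal sketch (not from the book): COUNT, MIN, MAX, SUM, and AVG with
# Python's built-in sqlite3 on a hypothetical PRODUCT table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT, P_PRICE REAL, V_CODE INTEGER)")
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?)", [
    ("A1", 10.0, 21225),
    ("B2", 20.0, 21225),
    ("C3", 30.0, None),   # null V_CODE: not tallied by COUNT(V_CODE)
])

row = conn.execute("""
    SELECT COUNT(V_CODE), MIN(P_PRICE), MAX(P_PRICE), SUM(P_PRICE), AVG(P_PRICE)
    FROM PRODUCT""").fetchone()
print(row)  # (2, 10.0, 30.0, 60.0, 20.0)
```

COUNT(V_CODE) returns 2, not 3, because aggregate functions skip nulls; COUNT(*) would count every row.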
Aggregate Functions
*
Aggregate Functions (cont..)
Figure 8.21 COUNT function output examples
*
Aggregate Functions (cont..)
Figure 8.22 MIN and MAX Output Examples
*
Aggregate Functions (cont..)
Figure 8.23 The total values of all items in the PRODUCT table
*
Aggregate Functions (cont..)
Figure 8.24 AVG Function Output Examples
*
*
Grouping Data
- Frequency distributions are created by the GROUP BY clause within the SELECT statement
- Syntax:
  SELECT columnlist
  FROM tablelist
  [WHERE conditionlist]
  [GROUP BY columnlist]
  [HAVING conditionlist]
  [ORDER BY columnlist [ASC | DESC]];
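GROUP BY with a HAVING restriction can be shown with sqlite3. This is a minimal sketch, not from the book; the PRODUCT rows and the threshold are hypothetical.

```python
# Minimal sketch (not from the book): GROUP BY with a HAVING restriction,
# using Python's built-in sqlite3 on a hypothetical PRODUCT table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT, P_PRICE REAL, V_CODE INTEGER)")
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?)", [
    ("A1", 10.0, 21225),
    ("B2", 20.0, 21225),
    ("C3", 30.0, 21231),
])

# Average price per vendor, keeping only vendors averaging more than 12.
rows = conn.execute("""
    SELECT V_CODE, AVG(P_PRICE)
    FROM PRODUCT
    GROUP BY V_CODE
    HAVING AVG(P_PRICE) > 12
    ORDER BY V_CODE""").fetchall()
print(rows)  # [(21225, 15.0), (21231, 30.0)]
```

WHERE filters rows before grouping; HAVING filters the groups themselves, which is why the aggregate can appear in it.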
Grouping Data
Figure 8.25 GROUP BY Clause Output Examples
*
Grouping Data (cont..)
Figure 8.27 An application of the HAVING clause
*
*
Virtual Tables: Creating a View
- A view is a virtual table based on a SELECT query
- Create a view by using the CREATE VIEW command
- Special characteristics of a relational view:
  - The name of the view can be used anywhere a table name is expected
  - The view is dynamically updated
  - Restricts users to only specified columns and rows
  - Views may be used as the basis for reports
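CREATE VIEW can be demonstrated with sqlite3, including using the view name where a table name is expected. This is a minimal sketch, not from the book; the view name and data are hypothetical.

```python
# Minimal sketch (not from the book): CREATE VIEW with Python's sqlite3;
# the view name is then used exactly where a table name is expected.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT, P_DESCR TEXT, P_PRICE REAL)")
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?)", [
    ("A1", "Hammer",   9.95),
    ("B2", "Hacksaw", 14.40),
])

conn.execute("""
    CREATE VIEW PRICEGT10 AS
        SELECT P_CODE, P_PRICE FROM PRODUCT WHERE P_PRICE > 10""")

rows = conn.execute("SELECT * FROM PRICEGT10").fetchall()
print(rows)  # [('B2', 14.4)]
```

Because the view stores only the query, inserting another product above 10 into PRODUCT would immediately appear in PRICEGT10: the view is dynamically updated.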
Virtual Tables: Creating a View (cont..)
Figure 8.28 Creating a virtual table using the CREATE VIEW
command
*
*
Joining Database Tables
- The ability to combine (join) tables on common attributes is the most important distinction between a relational database and other databases
- A join is performed when data are retrieved from more than one table at a time
  - Equality comparison between the foreign key and the primary key of related tables
- Join tables by listing the tables in the FROM clause of the SELECT statement
  - The DBMS creates the Cartesian product of every table
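The foreign-key-to-primary-key join can be shown with sqlite3. This is a minimal sketch, not from the book; the rows are hypothetical.

```python
# Minimal sketch (not from the book): joining PRODUCT and VENDOR on the
# common V_CODE attribute, using Python's built-in sqlite3 module.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME TEXT)")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT, P_DESCR TEXT, V_CODE INTEGER)")
conn.execute("INSERT INTO VENDOR VALUES (21225, 'Bryson, Inc.')")
conn.execute("INSERT INTO PRODUCT VALUES ('A1', 'Hammer', 21225)")

# The join condition in the WHERE clause restricts the Cartesian product
# to rows whose foreign key equals the related primary key.
rows = conn.execute("""
    SELECT P_DESCR, V_NAME
    FROM PRODUCT, VENDOR
    WHERE PRODUCT.V_CODE = VENDOR.V_CODE""").fetchall()
print(rows)  # [('Hammer', 'Bryson, Inc.')]
```

Without the WHERE clause the result would be the full Cartesian product, every PRODUCT row paired with every VENDOR row.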
Joining Database Tables (cont..)
*
Joining Database Tables (cont..)
*
*
Joining Tables with an Alias
- An alias identifies the source table from which the data are taken
- Any legal table name can be used as an alias
- Add the alias after the table name in the FROM clause:
  FROM tablename alias
Joining Database Tables (cont..)
*
*
Recursive Joins - Outer Joins
- An alias is especially useful when a table must be joined to itself
  - Recursive query
  - Use aliases to differentiate the table from itself
- Two types of outer join:
  - Left outer join
  - Right outer join
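A recursive (self) join with aliases, and a left outer join, can be shown together with sqlite3. This is a minimal sketch, not from the book; the EMP rows are hypothetical.

```python
# Minimal sketch (not from the book): a recursive (self) join using aliases,
# and a LEFT OUTER JOIN, on a hypothetical EMP table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (EMP_NUM INTEGER, EMP_LNAME TEXT, EMP_MGR INTEGER)")
conn.executemany("INSERT INTO EMP VALUES (?, ?, ?)", [
    (100, 'Kolmycz', None),   # has no manager
    (101, 'Lewis',   100),
    (102, 'Vandam',  100),
])

# Self-join: aliases E (employee) and M (manager) refer to the same table.
pairs = conn.execute("""
    SELECT E.EMP_LNAME, M.EMP_LNAME
    FROM EMP E, EMP M
    WHERE E.EMP_MGR = M.EMP_NUM
    ORDER BY E.EMP_NUM""").fetchall()
print(pairs)     # [('Lewis', 'Kolmycz'), ('Vandam', 'Kolmycz')]

# LEFT OUTER JOIN also keeps rows with no matching manager.
all_rows = conn.execute("""
    SELECT E.EMP_LNAME, M.EMP_LNAME
    FROM EMP E LEFT OUTER JOIN EMP M ON E.EMP_MGR = M.EMP_NUM
    ORDER BY E.EMP_NUM""").fetchall()
print(all_rows)  # [('Kolmycz', None), ('Lewis', 'Kolmycz'), ('Vandam', 'Kolmycz')]
```

The inner self-join drops the employee with no manager; the left outer join keeps that row and fills the unmatched side with NULL.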
Recursive Joins
*
Recursive Joins (cont..)
*
Outer Joins
*
Outer Joins (cont..)
*
Grouping Data (cont..)
Figure 8.26 Incorrect and Correct use of the GROUP BY Clause
*
Lect10-Conceptual, Logical and Physical.ppt
*
Database Principles: Fundamentals of Design, Implementation and Management
Lecture 10 - CHAPTER 11: CONCEPTUAL, LOGICAL AND PHYSICAL DATABASE DESIGN
*
*
In this chapter, you will learn:
- About the three stages of database design: conceptual, logical and physical
- How to design a conceptual model to represent the business and its key functional areas
- How the conceptual model can be transformed into a logically equivalent set of relations
- How to translate the logical data model into a set of specific DBMS table specifications
- About different types of file organization
- How indexes can be applied to improve data access and retrieval
- How to estimate data storage requirements
*
*
Database Design
- Necessary to focus on the data
- Must concentrate on the data characteristics required to build the database model
- At this point there are two views of data within the system:
  - The business view of data as an information source
  - The designer's view of the data structure, its access, and the activities required to transform the data into information
*
Database Design (cont..)
*
Database Design (cont..)
To complete the design phase, we must remember these points:
- The process of database design is loosely related to the analysis and design of the larger system
  - The data component is only one element of a larger system
  - Systems analysts or systems programmers are in charge of designing the other system components
  - Their activities create procedures that will help transform the data within the database into useful information
- Database design does not constitute a sequential process
  - It is an iterative process that provides continuous feedback designed to trace previous steps
*
Database Design (cont..)
*
3 Stages of Database Design
*
*
I. Conceptual Design (CD)
- In the CD, data modeling is used to create an abstract database structure that represents real-world objects in the most realistic way possible
- The CD must embody a clear understanding of the business and its functional areas
- Ensure that all data needed are in the model, and that all data in the model are needed
- Requires four steps:
  - Data analysis and requirements
  - Entity relationship modeling and normalisation
  - Data model verification
  - Distributed database design
*
I. Conceptual Design (cont..)
Data Analysis and Requirements
- The first step is to discover the data element characteristics
- Obtains characteristics from different sources
- Must take into account the business rules
  - Derived from the description of operations, which is a document that provides a precise, detailed, up-to-date, and thoroughly reviewed description of the activities that define the organization's operating environment
*
I. Conceptual Design (cont..)
Entity Relationship (ER) Modeling and Normalization
- The designer must communicate and enforce appropriate standards to be used in the documentation of the design:
  - Use of diagrams and symbols
  - Documentation writing style
  - Layout
  - Other conventions to be followed during documentation
*
I. Conceptual Design (cont..)
*
*
I. Conceptual Design (cont..)
Fig 11.2 in your book
*
I. Conceptual Design (cont..)
*
I. Conceptual Design (cont..)
*
*
I. Conceptual Design (cont..)
*
*
I. Conceptual Design (cont..)
Entity Relationship (ER) Modeling and Normalization (cont..)
- Data dictionary
  - Defines all objects (entities, attributes, relations, views, and so on)
  - Used in tandem with the normalization process to help eliminate data anomalies and redundancy problems
*
I. Conceptual Design (cont..)
The Data Model Verification
- The ER model must be verified against the proposed system processes to corroborate (confirm) that the intended processes can be supported by the database model
- A revision of the original design starts with a careful reevaluation of the entities, followed by a detailed examination of the attributes that describe these entities
- Define the design's major components as modules:
  - A module is an information system component that handles a specific function
*
I. Conceptual Design (cont..)
*
*
I. Conceptual Design (cont..)
*
*
I. Conceptual Design (cont..)
Data Model Verification (cont..)
- The verification process starts with selecting the central (most important) entity
  - Defined in terms of its participation in most of the model's relationships
- The next step is to identify the module or subsystem to which the central entity belongs and to define its boundaries and scope
- Once the module is identified, the central entity is placed within the module's framework
*
I. Conceptual Design (cont..)
Distributed Database Design
- Portions of the database may reside in different physical locations
- The designer must also develop data distribution and allocation strategies
*
II. DBMS Software Selection
- The selection of the software is critical to an information system's smooth operation
- Advantages and disadvantages should be carefully studied
- Some common factors that may affect the purchasing decision are:
  - Cost
  - DBMS features and tools
  - Underlying model: hierarchical, network, relational, etc.
  - Portability
  - DBMS requirements
*
III. Logical Design
- Used to translate the conceptual design into the internal model for the selected database management system
- Logical design is software-dependent
- Requires that all objects in the model be mapped to the specific constructs used by the selected database software
*
III. Logical Design (cont..)
The logical design stage consists of the following phases:
- Creating the logical data model
- Validating the logical data model using normalization
- Assigning and validating integrity constraints
- Merging the logical models constructed for different parts of the database
- Reviewing the logical data model with the user
*
*
III. Logical Design (cont..)
*
III. Logical Design (cont…)
*
Review the Complete Logical Model with the User
- Reviewing the completed logical model with the users ensures that all the data requirements have been modelled
- Ensure that all the transactions are supported within the different user views
- This stage is very important, as any problems need to be solved before beginning the physical database design stage
IV. Physical Database Design
Physical database design requires the definition of the specific storage or access methods that will be used by the database
Involves the translation of the logical model into a set of specific DBMS specifications for storing and accessing data
The ultimate goal is to ensure that data storage is effective (ensuring integrity and security) and efficient (in terms of query response time)
IV. Physical Design (cont..)
The process of selecting the data storage and data access characteristics of the database
The storage characteristics are a function of the device types supported by the hardware, the types of data access methods supported by the system, and the DBMS
Particularly important in the older hierarchical and network models
Becomes more complex when data are distributed at different locations
IV. Physical Database Design (cont..)
The following information needs to have been collected:
A set of normalized relations derived from the ER model and the normalization process.
An estimate of the volume of data that will be stored in each database table, and the usage statistics.
An estimate of the physical storage requirements for each field (attribute) within the database.
The physical storage characteristics of the DBMS being used.
Stages of Physical Database Design
Analysing data volume and database usage.
Translate each relation identified in the logical data model into
a table.
Determine a suitable file organization.
Define indexes.
Define user views.
Estimate data storage requirements.
Determine database security for users.
Analysing Data Volume and Database Usage
The steps required to carry out this phase are:
Identifying the most frequent and critical transactions.
Analysing the critical transactions to determine which relations in the database participate in these transactions.
Analysing Data Volume and Database Usage (cont..)
Data volume and data usage statistics are usually shown on a simplified version of the ERD. This diagram is known as a composite usage map or a transaction usage map.
Translate logical relations into tables
Identify the primary and any foreign keys for each table.
Identify those attributes which are not allowed to contain NULL
values and those which should be UNIQUE. You can exclude
the primary key attribute(s) here as the PRIMARY KEY
constraint automatically imposes the NOT NULL and UNIQUE
constraints.
Translate logical relations into tables
For each relation you should:
Identify each attribute name and its domain from the data dictionary. Note any attributes which require DEFAULT values to be inserted whenever new rows are inserted into the database.
Determine any attributes that require a CHECK constraint in order to validate the value of the attribute.
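These constraint decisions can be sketched against SQLite from Python; the DVD table below, its columns, and the DEFAULT/CHECK choices are illustrative assumptions rather than the case study's actual schema:

```python
import sqlite3

# Sketch: mapping a logical relation to a table with NOT NULL, UNIQUE,
# DEFAULT and CHECK constraints (column names are invented for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE DVD (
        DVD_ID     INTEGER PRIMARY KEY,          -- NOT NULL + UNIQUE implied
        TITLE      TEXT    NOT NULL,
        BARCODE    TEXT    UNIQUE,               -- must be unique when present
        STATUS     TEXT    DEFAULT 'AVAILABLE',  -- DEFAULT fills omitted values
        DAILY_RATE REAL    CHECK (DAILY_RATE >= 0)  -- CHECK validates values
    )
""")
conn.execute("INSERT INTO DVD (DVD_ID, TITLE, DAILY_RATE) VALUES (1, 'Casablanca', 2.5)")
status = conn.execute("SELECT STATUS FROM DVD WHERE DVD_ID = 1").fetchone()[0]
print(status)  # the DEFAULT constraint supplied the value
```

Inserting a row with a negative DAILY_RATE would be rejected by the CHECK constraint with an integrity error.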
Determine Suitable File Organisation
Selecting the most suitable file organization is very important to ensure that data is stored efficiently and can be retrieved as quickly as possible. To retrieve a record, the DBMS must know where the record is stored and how it can be identified. Also consider the future growth of the database and whether the type of file organization provides some protection against data loss.
Determine Suitable File Organisation (cont..)
There are three categories of file organization:
files which contain randomly ordered records, known as heap files
files which are sorted on one or more fields, such as file organizations based on indexes
files hashed on one or more fields, known as hash files
Determine Suitable File Organisation (cont..)
Sequential File Organizations
Records are stored in a sequence based on the value of one or more fields, often the primary key. To locate a specific record, every record in the file must be read in turn until the required record is found.
Determine Suitable File Organisation (cont..)
Heap File Organizations
Records are unordered and inserted into the file as they come. Only used when a large quantity of data needs to be inserted into a table for the first time.
Determine Suitable File Organisation (cont..)
Indexed File Organizations
Records can be stored in a sorted or unsorted sequence, and an index is created to locate specific records quickly.
Determine Suitable File Organisation (cont..)
Types of Indexes
Primary index: placed on unique fields such as the primary key.
Secondary index: can be placed on any field in the file that is unordered.
Multi-level index: used where one index becomes too large; it is split into a number of separate indexes in order to reduce the search time.
Determine Suitable File Organisation (cont..)
B-trees
Balanced trees, or B-trees, are used to maintain an ordered set of indexes or data, allowing efficient select, delete and insert operations. A special kind of B-tree is the B+-tree, in which all keys reside in the leaves. This tree is most often used to represent indexes, which act as a 'road map' so that each record can be quickly located.
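The ordered-key behaviour that makes B+-tree leaves efficient for lookups can be sketched with Python's bisect module; this is only the sorted-key idea, not a real B+-tree (no node splitting or balancing), and the keys and record ids are invented:

```python
import bisect

# Parallel lists play the role of a B+-tree leaf level: keys kept sorted,
# with the matching record ids alongside them.
keys, rids = [], []

def index_insert(key, rid):
    pos = bisect.bisect_left(keys, key)  # find the sorted position
    keys.insert(pos, key)
    rids.insert(pos, rid)

def index_lookup(key):
    pos = bisect.bisect_left(keys, key)  # binary search, O(log n) comparisons
    if pos < len(keys) and keys[pos] == key:
        return rids[pos]
    return None

for k, r in [(30, "r3"), (10, "r1"), (20, "r2")]:
    index_insert(k, r)

print(index_lookup(20))  # finds record id "r2" without scanning the file
```

Because the keys stay sorted, range scans (e.g. all keys between 10 and 25) are also cheap, which is the main reason B+-trees dominate database indexing.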
Conceptual Design of the DVD rental store
Determine Suitable File Organisation (cont..)
Bitmap Indexes
Bitmap indexes are usually applied to attributes which are sparse in their given domain. A two-dimensional array is constructed: one column (bit vector) is generated for each distinct value within the bitmapped index, with one entry for every row in the table we want to index. The array therefore holds the number of distinct values in the index multiplied by the number of rows in the table.
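A minimal sketch of building and querying such a bitmap index in Python, assuming an illustrative low-cardinality GENRE column (the column and its values are not from the case study):

```python
# One bit vector per distinct value; bit i is 1 when row i holds that value.
genres = ["Drama", "Comedy", "Drama", "Horror", "Comedy", "Drama"]

bitmap = {}
for row, value in enumerate(genres):
    bitmap.setdefault(value, [0] * len(genres))[row] = 1

print(bitmap["Drama"])   # [1, 0, 1, 0, 0, 1]

# WHERE GENRE = 'Drama' OR GENRE = 'Horror' becomes a bitwise OR of vectors
match = [a | b for a, b in zip(bitmap["Drama"], bitmap["Horror"])]
rows = [i for i, bit in enumerate(match) if bit]
print(rows)              # [0, 2, 3, 5]
```

The two-dimensional array here is (number of distinct values) x (number of rows), and predicates combine with cheap bitwise operations, which is why bitmap indexes suit low-cardinality columns.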
Determine Suitable File Organisation (cont..)
Bitmap indexes are usually used when:
A column in the table has low cardinality.
The table is large and not often used for data manipulation activities.
Specific SQL queries reference a number of low-cardinality values in their WHERE clauses.
Determine Suitable File Organisation (cont..)
Join Index
Can be applied to columns from two or more tables whose values come from the same domain. Often referred to as a bitmap join index, it is a way of saving space by reducing the volume of data that must be joined. The bitmap join index stores the ROWIDs of corresponding rows in a separate table.
Determine Suitable File Organisation (cont..)
Hashed File Organizations
Uses a hashing algorithm to map a primary key value onto a specific record address in the file. Records are stored in a random order throughout the file. Often referred to as random or direct files.
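A minimal sketch of the hashing idea, assuming a simple modulo hash and an in-memory list of buckets; real hashed files also need an overflow/collision strategy on disk:

```python
# A hash function maps a primary key straight to a bucket (page) address,
# so a record can be fetched without scanning the file or using an index.
NUM_BUCKETS = 8
buckets = [[] for _ in range(NUM_BUCKETS)]

def bucket_of(key):
    return key % NUM_BUCKETS          # the hashing algorithm (an assumption)

def insert(key, record):
    buckets[bucket_of(key)].append((key, record))

def fetch(key):
    for k, record in buckets[bucket_of(key)]:  # search only one bucket
        if k == key:
            return record
    return None

insert(1042, "DVD 1042 record")
insert(77, "DVD 77 record")
print(fetch(1042))   # direct access: only bucket 1042 % 8 is examined
```

Equality lookups on the hashed key are effectively constant time, but records come back in no useful order, so hashed files are a poor fit for range queries.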
Define Indexes (cont..)
In SQL, indexes are created using the CREATE INDEX statement. For example, to create a primary index on the DVD_ID primary key field of the DVD table:
CREATE UNIQUE INDEX DVDINDEX ON DVD(DVD_ID);
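The statement above can be tried against SQLite from Python to confirm that the optimizer actually picks the index; the sample table, its columns, and the data are assumptions for illustration (in this sketch DVD_ID is deliberately not declared PRIMARY KEY, so the lookup must go through DVDINDEX):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DVD (DVD_ID INTEGER, TITLE TEXT)")
conn.execute("CREATE UNIQUE INDEX DVDINDEX ON DVD(DVD_ID)")
conn.executemany("INSERT INTO DVD VALUES (?, ?)",
                 [(1, "Casablanca"), (2, "Vertigo"), (3, "Metropolis")])

# Ask the optimizer how it would run an equality lookup on the indexed column.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT TITLE FROM DVD WHERE DVD_ID = 2"
).fetchall()
detail = plan[0][-1]
print(detail)   # e.g. SEARCH DVD USING INDEX DVDINDEX (DVD_ID=?)
```

The plan text shows a SEARCH using DVDINDEX rather than a full-table SCAN, which is exactly the behaviour the index exists to provide.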
Define Indexes (cont..)
As a general rule, indexes are likely to be used:
When an indexed column appears by itself in the search criteria of a WHERE or HAVING clause.
When an indexed column appears by itself in a GROUP BY or ORDER BY clause.
When a MAX or MIN function is applied to an indexed column.
When the data sparsity of the indexed column is high.
Guidelines for creating Indexes
Create indexes for each single attribute used in a WHERE, HAVING, ORDER BY, or GROUP BY clause.
Do not use indexes in small tables or tables with low sparsity.
Declare primary and foreign keys so the query optimizer within a specific DBMS can use the indexes in join operations.
Declare indexes on join columns other than PK/FK.
Define User Views
During the conceptual design stage the different user views required for the database are determined. Using the relations defined in the logical data model, these views must now be defined. Views are often defined with database security in mind, as they can help to define the roles of different types of users.
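A sketch of such a view in SQLite via Python; the CUSTOMERS columns and the idea of hiding a sensitive CARD_NO column from sales staff are illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE CUSTOMERS (
    CUST_ID INTEGER PRIMARY KEY, NAME TEXT, BALANCE REAL, CARD_NO TEXT)""")
conn.execute("INSERT INTO CUSTOMERS VALUES (1, 'Frank', 12.5, 'xxxx-1234')")

# The view exposes only the columns sales staff should see.
conn.execute("""
    CREATE VIEW STAFF_CUSTOMER_VIEW AS
        SELECT CUST_ID, NAME, BALANCE FROM CUSTOMERS
""")
row = conn.execute("SELECT * FROM STAFF_CUSTOMER_VIEW").fetchone()
print(row)   # (1, 'Frank', 12.5) -- CARD_NO never appears in this view
```

Granting staff access to the view instead of the base table is how views double as a security mechanism.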
Estimate Data Storage Requirements
Security
Data must be protected from access by unauthorized users
Must provide for the following:
Physical security
Password security
Access rights
Audit trails
Data encryption
Diskless workstations
Determine database security for users
During physical database design, security requirements must be implemented. Database privileges for users will need to be established. For example, privileges may include selecting rows from specified tables or views, being able to modify or delete data in specified tables, etc.
Security in ORACLE
The SQL commands GRANT and REVOKE are used to authorize or withdraw privileges on specific user accounts. For example, the following two SQL statements grant the account with the username 'Craig' the ability to select rows from the DVD table and the ability to create tables:
GRANT SELECT ON DVD TO Craig;
GRANT CREATE TABLE TO Craig;
Security in ORACLE (cont..)
Removing these privileges can be done using the following SQL statements:
REVOKE SELECT ON DVD FROM Craig;
REVOKE CREATE TABLE FROM Craig;
Security in ORACLE (cont..)
A role is simply a collection of privileges referred to under a single name. The major benefit of roles is that a DBA can add or revoke privileges from a role at any time. These changes will then automatically apply to all the users who have been assigned that role.
Security in ORACLE (cont..)
For example, in the DVD rental store, the sales staff need to perform SELECT and UPDATE operations on the CUSTOMERS table. The SQL command CREATE ROLE is used to create the role STAFF_CUSTOMER_ROLE:
CREATE ROLE STAFF_CUSTOMER_ROLE;
Once created, privileges can then be granted on selected database objects to the new role.
Security in ORACLE (cont..)
For example:
GRANT SELECT ON CUSTOMERS TO STAFF_CUSTOMER_ROLE;
GRANT UPDATE ON CUSTOMERS TO STAFF_CUSTOMER_ROLE;
The last stage then involves granting the role to individual user accounts, e.g. Frank:
GRANT STAFF_CUSTOMER_ROLE TO Frank;
Summary
Conceptual database design is where the conceptual representation of the database is created by producing a data model which identifies the relevant entities and relationships within the system.
Summary (cont..)
Logical database design is the second stage in the Database Life Cycle, where relations are designed based on each entity and its relationships within the conceptual model.
Summary (cont..)
Physical database design is where the logical data model is mapped onto the physical database tables to be implemented in the chosen DBMS. The ultimate goal must be to ensure that data storage is used effectively, to ensure integrity and security, and to improve efficiency in terms of query response time.
Summary (cont..)
Selecting a suitable file organization is important for fast data retrieval and efficient use of storage space. Indexes are crucial in speeding up data access; they facilitate searching, sorting, using aggregate functions and even join operations.
Database Principles: Fundamentals of Design, Implementations
and Management
CHAPTER 2: DATA MODELS
Database Principles 2nd Ed., Coronel, Morris, Rob & Crockett
© 2013 Cengage Learning EMEA
20/03/2017
1
In this chapter, you will learn:
Why data models are important
About the basic data-modeling building blocks
What business rules are and how they influence database design
How the major data models evolved
How data models can be classified by level of abstraction
The Importance of Data Models
Data models
Relatively simple representations, usually graphical, of complex
real-world data structures
Facilitate interaction among the designer, the applications
programmer, and the end user
End-users have different views and needs for data
Data model organizes data for various users
Data Model Basic Building Blocks
Entity - anything about which data are to be collected and
stored
Attribute - a characteristic of an entity
Relationship - describes an association among entities
One-to-many (1:*) relationship
Many-to-many (*:*) relationship
One-to-one (1:1) relationship
Constraint - a restriction placed on the data
Business Rules
A business rule is a brief, precise, and unambiguous description of a policy, procedure, or principle within a
specific organization
Apply to any organization that stores and uses data to generate
information
Description of operations that help to create and enforce actions
within that organization’s environment
Must be rendered in writing
Must be kept up to date
Sometimes are external to the organization
Must be easy to understand and widely disseminated
Describe characteristics of the data as viewed by the company
Discovering Business Rules
Sources of Business Rules:
Company managers
Policy makers
Department managers
Written documentation
Procedures
Standards
Operations manuals
Direct interviews with end users
Translating Business Rules into Data Model Components
Standardize company’s view of data
Constitute a communications tool between users and designers
Allow designer to understand the nature, role, and scope of data
Allow designer to understand business processes
Allow designer to develop appropriate relationship participation
rules and constraints
Promote creation of an accurate data model
Generally, nouns translate into entities
Verbs translate into relationships among entities
Relationships are bi-directional
The Evolution of Data Models
The Evolution of Data Models (cont..)
Hierarchical
Network
Relational
Entity relationship
Object oriented (OO)
The Hierarchical Model
Developed in the 1960s to manage large amounts of data for
complex manufacturing projects
Basic logical structure is represented by an upside-down “tree”
The hierarchical structure contains levels, or segments
Depicts a set of one-to-many (1:*) relationships between a
parent and its children segments
Each parent can have many children
each child has only one parent
The Hierarchical Model (cont…)
Advantages
Many of the hierarchical data model’s features formed the
foundation for current data models
Its database application advantages are replicated, though in a
different form, in current database environments
Generated a large installed (mainframe) base, created a pool of
programmers who developed numerous tried-and-true business
applications
The Hierarchical Model (cont..)
Disadvantages
Complex to implement
Difficult to manage
Lacks structural independence
Implementation limitations
Lack of standards
The Network Model
Created to
Represent complex data relationships more effectively than the
hierarchical model
Improve database performance
Impose a database standard
While the Network model is not used today, the definitions of
standard database concepts are still used by modern data models
such as:
Schema
Conceptual organization of entire database as viewed by the
database administrator
The Network Model (cont..)
Subschema
Defines database portion “seen” by the application programs
that actually produce the desired information from data
contained within the database
Data Management Language (DML)
Defines the environment in which data can be managed and is
used to work with the data in the database
Schema Data Definition Language (DDL)
Enables database administrator to define schema components
The Network Model (cont..)
Disadvantages
Too cumbersome
The lack of ad hoc query capability put heavy pressure on
programmers
Any structural change in the database could produce havoc in
all application programs that drew data from the database
Many database old-timers can recall the interminable
information delays
The Relational Model
Developed by Codd (IBM) in 1970
Considered ingenious but impractical in 1970
Conceptually simple
Computers lacked power to implement the relational model
Today, microcomputers can run sophisticated relational database software, called a Relational
Database Management System (RDBMS), e.g. Oracle (originally mainframe relational software)
Performs same basic functions provided by hierarchical and
network DBMS systems, in addition to a host of other functions
Most important advantage of the RDBMS is its ability to hide
the complexities of the relational model from the user
The Relational Model (cont..)
Table
Matrix consisting of a series of row/column intersections
Related to each other through sharing a common entity
characteristic
Tables, also called relations, are related to each other through the sharing of a common field
Relational diagram
Is a representation of relational database’s entities, attributes
within those entities, and relationships between those entities
The Relational Model (cont..)
Relational Table
Stores a collection of related entities
Resembles a file
Relational table is purely a logical structure
How data are physically stored in the database is of no concern
to the user or the designer
This property became the source of a real database revolution
The Relational Model (cont..)
Another reason for the relational model's rise to dominance is its powerful and flexible query language
Structured Query Language (SQL) allows the user to specify what must be done without specifying how it must be done
A SQL-based relational database application involves three parts:
User interface
A set of tables stored in the database
SQL engine
The Entity Relationship Model
Widely accepted and adapted graphical tool for data modeling
Introduced by Peter Chen in 1976
It was the graphical representation of entities and their
relationships in a database structure.
More recently the class diagram component of the Unified
Modeling Language (UML) has been used to produce entity
relationship models.
The Entity Relationship Model (cont..)
Entity relationship diagram (ERD)
Uses graphic representations to model database components
Entity is mapped to a relational table
Entity instance (or occurrence) is a row in table
Each entity is described by a set of attributes that describe
particular characteristics of the entity
Entity set is collection of like entities
Connectivity labels the types of relationships (1:1, 1:M, M:M)
The Entity Relationship Model (cont..)
Fig 2.4 The basic Crow’s foot ERD
Data Models: A Summary
Each new data model capitalized on the shortcomings of
previous models
Common characteristics that data models must have in order to
be widely accepted:
Conceptual simplicity without compromising the semantic
completeness of the database
Represent the real world as closely as possible
Representation of real-world transformations (behavior) must
comply with consistency and integrity characteristics of any
data model
Degrees of Data Abstraction
Way of classifying data models
Many processes begin at high level of abstraction and proceed
to an ever-increasing level of detail
Designing a usable database follows the same basic process
Degrees of Data Abstraction (cont..)
In the early 1970s, the American National Standards Institute
(ANSI) Standards Planning and Requirements Committee
(SPARC)
Defined a framework for data modeling based on degrees of
data abstraction:
External
Conceptual
Internal
The External Model
End users’ view of the data environment
Requires that the modeler subdivide set of requirements and
constraints into functional modules that can be examined within
the framework of their external models
Advantages:
Easy to identify specific data required to support each business
unit’s operations
Facilitates designer’s job by providing feedback about the
model’s adequacy
Creation of external models helps to ensure security constraints
in the database design
Simplifies application program development
The External Model (cont..)
Fig 2.9 External Models for Tiny College
The Conceptual Model
Represents a global view of the entire database
Representation of data as viewed by the entire organization
The conceptual model is the basis for the identification and high-level description of the main data objects, avoiding details
The most widely used conceptual model is the entity relationship (ER) model
The Conceptual Model (cont..)
Fig 2.10 The Conceptual Model for Tiny College
The Conceptual Model (cont..)
First, the CM provides a relatively easily understood macro-level view of the data environment
Second, the CM is independent of both software and hardware
Does not depend on the DBMS software used to implement the
model
Does not depend on the hardware used in the implementation of
the model
Changes in either hardware or DBMS software have no effect on
the database design at the conceptual level
The Internal Model
Is the representation of the database as “seen” by the DBMS
The internal model should map the conceptual model to the
DBMS
The internal schema depicts a specific representation of an
internal model
Fig 2.11 An Internal Model for Tiny College
The Physical Model
Operates at lowest level of abstraction, describing the way data
are saved on storage media such as disks or tapes
Software and hardware dependent
Requires that database designers have a detailed knowledge of
the hardware and software used to implement database design
Summary
A data model is a (relatively) simple abstraction of a complex
real-world data environment
Basic data modeling components are:
Entities
Attributes
Relationships
Constraints
Summary (cont..)
Hierarchical model
Depicts a set of one-to-many (1:*) relationships between a
parent and its children segments
Network data model
Uses sets to represent 1:* relationships between record types
Relational model
Current database implementation standard
ER model is a popular graphical tool for data modeling that
complements the relational model
Summary (cont..)
Object is basic modeling structure of object oriented data model
The relational model has adopted many object-oriented
extensions to become the extended relational data model
(ERDM)
Data modeling requirements are a function of different data
views (global vs. local) and level of data abstraction
NoSQL databases are a new generation of databases that do not
use the relational model and are geared to support the very
specific needs of Big Data organizations
The Object Oriented Model
Modeled both data and their relationships in a single structure
known as an object
Object-oriented data model (OODM) is the basis for the object-
oriented database management system (OODBMS)
OODM is said to be a semantic data model
The Object Oriented Model (cont..)
Object described by its factual content
Like relational model’s entity
Includes information about relationships between facts within
object, and relationships with other objects
Unlike relational model’s entity
Subsequent OODM development allowed an object to also
contain all operations
Object becomes basic building block for autonomous structures
The Object Oriented Model (cont..)
Object is an abstraction of a real-world entity
Attributes describe the properties of an object
Objects that share similar characteristics are grouped in classes
Classes are organized in a class hierarchy
Inheritance is the ability of an object within the class hierarchy
to inherit the attributes and methods of classes above it
The Object Oriented Model (cont..)
Fig 2.5 A comparison of the OO model and the ER model
Other Models
Extended Relational Data Model (ERDM)
Semantic data model developed in response to increasing
complexity of applications
DBMS based on the ERDM often described as an
object/relational database management system (O/RDBMS)
Primarily geared to business applications
Emerging Data Models: Big Data and NoSQL
Big Data refers to a movement to find new and better ways to
manage large amounts of Web-generated data and derive
business insight from it, while simultaneously providing high
performance and scalability at a reasonable cost.
Emerging Data Models: Big Data and NoSQL (cont..)
The relational approach does not always match the needs of organizations with Big Data challenges.
It is not always possible to fit unstructured, social media data into the conventional relational structure of rows and columns.
Adding millions of rows of multi-format (structured and unstructured) data on a daily basis will inevitably lead to the need for more storage, processing power, and sophisticated data analysis tools that may not be available in the relational environment.
The type of high-volume implementation required in the RDBMS environment for the Big Data problem comes with a hefty price tag for expanding hardware, storage, and software licenses.
Data analysis based on OLAP tools has proven to be very successful in relational environments with highly structured data.
Database Models and the Internet
Internet drastically changed role and scope of database market
OODM and ERDM-O/RDM have taken a backseat to
development of databases that interface with Internet
Dominance of Web has resulted in growing need to manage
unstructured information
NoSQL Databases
NoSQL refers to a new generation of databases that address the specific challenges of the Big Data era.
They have the following general characteristics:
Not based on the relational model, hence the name NoSQL.
Support distributed database architectures.
Provide high scalability, high availability, and fault tolerance.
Support very large amounts of sparse data.
Geared toward performance rather than transaction consistency.
NoSQL Databases (cont..)
The key-value data model is based on a structure composed of
two data elements: a key and a value, in which every key has a
corresponding value or set of values.
The key-value data model is also referred to as the attribute-
value or associative data model.
NoSQL Databases (cont…)
NoSQL Databases (cont..)
The data type of the “value” column is generally a long string to
accommodate the variety of actual data types of the values
placed in the column.
To add a new entity attribute in the relational model, you need
to modify the table definition.
To add a new attribute in the key-value store, you add a row to
the key-value store, which is why it is said to be “schema-less.”
NoSQL databases do not store or enforce relationships among
entities.
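As a rough sketch of the schema-less idea above (all names and values here are invented for illustration): adding a new entity attribute is just appending another (key, value) row, with every value held as a string, and no table definition ever changes.

```python
# Hypothetical key-value rows for one entity; every value is a string,
# mirroring the "long string" value column described in the slides.
kv_store = [
    ("customer:42:name", "Alice"),
    ("customer:42:city", "Leeds"),
]

# Adding a new attribute needs no schema change: just append a row.
kv_store.append(("customer:42:loyalty_tier", "gold"))

# Reassembling the entity means scanning the store for its keys,
# since the store itself records no relationships among entities.
customer_42 = {
    key.split(":")[-1]: value
    for key, value in kv_store
    if key.startswith("customer:42:")
}
print(customer_42)  # {'name': 'Alice', 'city': 'Leeds', 'loyalty_tier': 'gold'}
```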
NoSQL Databases (cont..)
NoSQL databases use their own native application programming
interface (API) with simple data access commands, such as put,
read, and delete.
Indexing and searches can be difficult. Because the “value”
column in the key-value data model could contain many
different data types, it is often difficult to create indexes on the
data. At the same time, searches can become very complex.
Lecture7_ch07.ppt
Database Principles: Fundamentals of Design, Implementations
and Management
CHAPTER 7 Normalizing Database Designs
*
Objectives
In this chapter, you will learn:
What normalization is and what role it plays in the database design process
About the normal forms 1NF, 2NF, 3NF, BCNF, and 4NF
How normal forms can be transformed from lower normal forms to higher normal forms
That normalization and ER modeling are used concurrently to produce a good database design
That some situations require denormalization to generate information efficiently
*
Database Tables and Normalization
Normalization
Process for evaluating and correcting table structures to minimize data redundancies
Reduces data anomalies
Works through a series of stages called normal forms:
First normal form (1NF)
Second normal form (2NF)
Third normal form (3NF)
*
Database Tables and Normalization (cont..)
Normalization (cont..)
2NF is better than 1NF; 3NF is better than 2NF
For most business database design purposes, 3NF is as high as needed in normalization
Highest level of normalization is not always most desirable
Denormalization produces a lower normal form
Price paid for increased performance is greater data redundancy
*
The Need for Normalization
Example: company that manages building projects
Charges its clients by billing hours spent on each contract
Hourly billing rate is dependent on employee's position
Periodically, report is generated that contains information such as displayed in Table 5.1
The Need for Normalization
*
The Need for Normalization
*
*
The Need for Normalization (cont..)
Structure of data set in Figure 7.1 does not handle data very well
Table structure appears to work; report generated with ease
Unfortunately, report may yield different results depending on what data anomaly has occurred
Relational database environment suited to help designer avoid data integrity problems
*
The Normalization Process
Each table represents a single subject
No data item will be unnecessarily stored in more than one table
All attributes in a table are dependent on the primary key
Each table void of insertion, update, deletion anomalies
Void = devoid of (French: dépourvu de)
*
The Normalization Process (cont..)
*
*
The Normalization Process (cont..)
Objective of normalization is to ensure all tables are in at least 3NF
Higher forms not likely to be encountered in business environment
Normalization works one relation at a time
Progressively breaks table into new set of relations based on identified dependencies
*
Conversion to First Normal Form
Repeating group
Derives its name from the fact that a group of multiple entries of same type can exist for any single key attribute occurrence
Relational table must not contain repeating groups
Normalizing table structure will reduce data redundancies
Normalization is a three-step procedure
*
Conversion to First Normal Form (cont.)
Step 1: Eliminate the Repeating Groups
Present data in tabular format, where each cell has single value and there are no repeating groups
Eliminate nulls: each repeating group attribute contains an appropriate data value
Step 2: Identify the Primary Key
Primary key must uniquely identify attribute value
New key must be composed
Conversion to First Normal Form (cont..)
*
Conversion to First Normal Form (cont..)
*
Conversion to First Normal Form (cont..)
Step 3: Identify All Dependencies
Dependencies can be depicted with the help of a diagram
Dependency diagram: depicts all dependencies found within given table structure
Helpful in getting bird's-eye view of all relationships among table's attributes
Makes it less likely that an important dependency will be overlooked
*
Conversion to First Normal Form (cont..)
*
*
Conversion to First Normal Form (cont.)
First normal form describes tabular format in which:
All key attributes are defined
There are no repeating groups in the table
All attributes are dependent on primary key
All relational tables satisfy 1NF requirements
Some tables contain partial dependencies
Dependencies based on part of the primary key
Sometimes used for performance reasons, but should be used with caution
Still subject to data redundancies
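The conversion steps above can be sketched in code. This is a hypothetical illustration with invented project and employee values; it flattens a repeating group of employee entries so that every cell holds a single value, keyed by the composite (proj_num, emp_num).

```python
# Hypothetical project data with a repeating group of employee entries
# (invented values; only the structure mirrors the discussion above).
projects = [
    {"proj_num": 15, "proj_name": "Evergreen",
     "employees": [(103, "June Arbough", 23.8),
                   (101, "John News", 19.4)]},
]

# Steps 1-2: eliminate the repeating group and form the composite key
# (proj_num, emp_num), so each row has single-valued cells only.
rows_1nf = [
    {"proj_num": p["proj_num"], "proj_name": p["proj_name"],
     "emp_num": emp_num, "emp_name": emp_name, "hours": hours}
    for p in projects
    for emp_num, emp_name, hours in p["employees"]
]
for row in rows_1nf:
    print(row)
```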
Conversion to Second Normal Form
Relational database design can be improved by converting the database into second normal form (2NF)
Two steps
*
*
Conversion to Second Normal Form (cont..)
Step 1: Write Each Key Component on a Separate Line
Write each key component on separate line, then write original (composite) key on last line
Each component will become key in new table
Step 2: Assign Corresponding Dependent Attributes
Determine those attributes that are dependent on other attributes
At this point, most anomalies have been eliminated
Conversion to Second Normal Form (cont..)
*
*
Conversion to Second Normal Form (cont..)
Table is in second normal form (2NF) when:
It is in 1NF and
It includes no partial dependencies:
No attribute is dependent on only a portion of the primary key
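The two 2NF steps can be sketched as follows, again with invented sample data: proj_name depends only on proj_num, a partial dependency on the composite key (proj_num, emp_num), so it is moved to its own table.

```python
# Rows with composite key (proj_num, emp_num); proj_name depends only
# on proj_num -- a partial dependency (invented sample data).
rows_1nf = [
    {"proj_num": 15, "proj_name": "Evergreen", "emp_num": 103, "hours": 23.8},
    {"proj_num": 15, "proj_name": "Evergreen", "emp_num": 101, "hours": 19.4},
]

# Steps 1-2: give each key component its own table and move the
# attributes that depend on only that component.
project = {r["proj_num"]: {"proj_name": r["proj_name"]} for r in rows_1nf}
assignment = [
    {"proj_num": r["proj_num"], "emp_num": r["emp_num"], "hours": r["hours"]}
    for r in rows_1nf
]
print(project)     # proj_name now stored once per project
print(assignment)  # only fully dependent attributes remain
```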
*
Conversion to Third Normal Form
Data anomalies created are easily eliminated by completing three steps
Step 1: Identify Each New Determinant
For every transitive dependency, write its determinant as PK for new table
Determinant: any attribute whose value determines other values within a row
Conversion to Third Normal Form (cont..)
Step 2: Identify the Dependent Attributes
Identify attributes dependent on each determinant identified in Step 1 and identify dependency
Name table to reflect its contents and function
*
*
Conversion to Third Normal Form (cont.)
Step 3: Remove the Dependent Attributes from Transitive Dependencies
Eliminate all dependent attributes in transitive relationship(s) from each of the tables
Draw new dependency diagram to show all tables defined in Steps 1–3
Check new tables as well as tables modified in Step 3 to make sure that:
Each table has a determinant and that
No table contains inappropriate dependencies
Conversion to Third Normal Form (cont..)
*
*
Conversion to Third Normal Form (cont.)
A table is in third normal form (3NF) when both of the following are true:
It is in 2NF
It contains no transitive dependencies
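The three 3NF steps can be sketched with invented employee data: chg_hour depends on job_class, which depends on the key emp_num, so chg_hour reaches the key only transitively and is moved to a new JOB table keyed by the determinant job_class.

```python
# EMPLOYEE rows where chg_hour depends on job_class, and job_class
# depends on the key emp_num: a transitive dependency (invented data).
employees = [
    {"emp_num": 103, "job_class": "Engineer", "chg_hour": 84.5},
    {"emp_num": 101, "job_class": "Designer", "chg_hour": 105.0},
    {"emp_num": 115, "job_class": "Engineer", "chg_hour": 84.5},
]

# Steps 1-3: the determinant job_class becomes the PK of a new JOB
# table; the dependent attribute chg_hour moves there and leaves
# EMPLOYEE, eliminating the transitive dependency.
job = {e["job_class"]: {"chg_hour": e["chg_hour"]} for e in employees}
employee_3nf = [{"emp_num": e["emp_num"], "job_class": e["job_class"]}
                for e in employees]
print(job)          # chg_hour stored once per job class
print(employee_3nf)
```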
*
Improving the Design
Table structures are cleaned up to eliminate troublesome initial partial and transitive dependencies
Normalization cannot, by itself, be relied on to make good designs
It is valuable because its use helps eliminate data redundancies
*
Improving the Design (cont..)
Issues to address in order to produce a good normalized set of tables:
Evaluate PK Assignments
Evaluate Naming Conventions
Refine Attribute Atomicity
Identify New Attributes
Identify New Relationships
Refine Primary Keys as Required for Data Granularity
Maintain Historical Accuracy
Evaluate Using Derived Attributes
Improving the Design (cont..)
*
Improving the Design (cont..)
*
Improving the Design (cont..)
*
*
Surrogate Key Considerations
When primary key is considered to be unsuitable, designers use surrogate keys
Data entries in Table 7.3 are inappropriate because they duplicate existing records
Yet there has been no violation of either entity integrity or referential integrity
*
Higher-Level Normal Forms
Tables in 3NF perform suitably in business transactional databases
Higher order normal forms useful on occasion
Two special cases of 3NF:
Boyce-Codd normal form (BCNF)
Fourth normal form (4NF)
*
The Boyce-Codd Normal Form (BCNF)
Every determinant in table is a candidate key
Has same characteristics as primary key, but for some reason, not chosen to be primary key
When table contains only one candidate key, the 3NF and the BCNF are equivalent
BCNF can be violated only when table contains more than one candidate key
*
The Boyce-Codd Normal Form (BCNF) (cont..)
Most designers consider the BCNF as a special case of 3NF
Table is in 3NF when it is in 2NF and there are no transitive dependencies
Table can be in 3NF and fail to meet BCNF:
No partial dependencies, nor does it contain transitive dependencies
A nonkey attribute is the determinant of a key attribute
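The violation described above can be sketched with invented enrolment data: the composite key is (stu_id, staff_id), yet the nonkey attribute class_code determines the key component staff_id. The decomposition moves that dependency into its own table.

```python
# Hypothetical enrolment rows with composite PK (stu_id, staff_id);
# the nonkey class_code determines staff_id -- a nonkey attribute that
# is the determinant of a key attribute, i.e. a BCNF violation.
rows = [
    {"stu_id": 125, "staff_id": 25, "class_code": "21334", "grade": "A"},
    {"stu_id": 126, "staff_id": 25, "class_code": "21334", "grade": "B"},
]

# Decompose: class_code becomes the PK of a new table holding staff_id;
# the original table keeps (stu_id, class_code) as its key.
class_staff = {r["class_code"]: r["staff_id"] for r in rows}
enrolment = [{"stu_id": r["stu_id"], "class_code": r["class_code"],
              "grade": r["grade"]} for r in rows]
print(class_staff)  # staff_id stored once per class
print(enrolment)
```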
The Boyce-Codd Normal Form (BCNF) (cont...)
*
The Boyce-Codd Normal Form (BCNF) (cont..)
*
The Boyce-Codd Normal Form (BCNF) (cont..)
*
*
Summary
Normalization is used to minimize data redundancies
First three normal forms (1NF, 2NF, and 3NF) are most commonly encountered
Table is in 1NF when:
All key attributes are defined
All remaining attributes are dependent on primary key
*
Summary (continued)
Table is in 2NF when it is in 1NF and contains no partial dependencies
Table is in 3NF when it is in 2NF and contains no transitive dependencies
Table that is not in 3NF may be split into new tables until all of the tables meet 3NF requirements
Normalization is an important part, but only part, of the design process
*
Summary (continued)
Table in 3NF may contain multivalued dependencies
Numerous null values or redundant data
Convert 3NF table to 4NF by:
Splitting table to remove multivalued dependencies
Tables are sometimes denormalized to yield less I/O, which increases processing speed
Additional Slides
Please have a look at the following slides
*
*
Fourth Normal Form (4NF)
Table is in fourth normal form (4NF) when both of the following are true:
It is in 3NF
No multiple sets of multivalued dependencies
4NF is largely academic if tables conform to following two rules:
All attributes dependent on primary key, independent of each other
No row contains two or more multivalued facts about an entity
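The 4NF split can be sketched with invented data: one table holds two independent multivalued facts about an employee (say, volunteer organisation and service assignment), and splitting it into one table per multivalued dependency removes the spurious row combinations.

```python
# A table holding two independent multivalued facts about an employee
# (organisation and service assignment) -- invented illustrative data.
rows = [
    ("E10", "Red Cross", "Scouts"),
    ("E10", "Red Cross", "Choir"),
]

# Split into two tables, one per multivalued dependency; duplicated
# facts collapse and each table records one fact type only.
member_of = sorted({(emp, org) for emp, org, _ in rows})
assigned_to = sorted({(emp, svc) for emp, _, svc in rows})
print(member_of)
print(assigned_to)
```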
Fourth Normal Form (4NF) (continued)
*
*
Fourth Normal Form (4NF)
Fourth Normal Form (4NF)
*
*
Normalization and Database Design
Normalization should be part of the design process
Make sure that proposed entities meet required normal form before table structures are created
Many real-world databases have been improperly designed or burdened with anomalies
You may be asked to redesign and modify existing databases
*
Normalization and Database Design (cont.)
ER diagram
Identify relevant entities, their attributes, and their relationships
Identify additional entities and attributes
Normalization procedures
Focus on characteristics of specific entities
Micro view of entities within ER diagram
Difficult to separate normalization process from ER modeling process
Two techniques should be used concurrently
Figure 7.13 in your book
*
Normalization and Database Design (cont.)
Figure 7.13 in your book
*
Normalization and Database Design (cont.)
Figure 7.14 in your book
Figure 7.14 in your book
*
Figure 7.15 in your book
Normalization and Database Design (cont.)
Figure 7.15 in your book
Normalization and Database Design (continued)
*
Normalization and Database Design (continued)
*
Normalization and Database Design (continued)
*
Lecture9-Chap10- Database Development Process.pptx
Database Principles: Fundamentals of Design, Implementations
and Management
Lecture9-CHAPTER 10 : Database Development Process
In this chapter, you will learn:
That successful database design must reflect the information
system of which the database is a part
That successful information systems are developed within a
framework known as the Systems Development Life Cycle
(SDLC)
That within the information system, the most successful
databases are subject to frequent evaluation and revision within
a framework known as the Database Life Cycle (DBLC)
How to conduct evaluation and revision within the SDLC and
DBLC frameworks
2
In this chapter, you will learn (cont..):
About database design strategies: top-down vs. bottom-up
design and centralized vs. decentralized design
Common threats to the security of the data and what security
measures could be put in place
The importance of the database administration in an
organization
The technical and managerial roles of the database
administrator (DBA)
3
The Information System
Provides for data collection, storage, and retrieval
Composed of people, hardware, software, database(s),
application programs, and procedures
Systems analysis
Is the process that establishes the need for and extent of an
information system
Systems development
Is the process of creating information system
4
The Information System (cont..)
Applications
Transform data into information that forms the basis for
decision making
Usually produce the following:
Formal report
Tabulations
Graphic displays
Composed of following two parts:
Data
Code by which data are transformed into information
5
The Information System (cont..)
6
The Information System (cont..)
Information system performance depends on triad of factors:
Database design and implementation
Application design and implementation
Administrative procedures
Database development
Is the process of database design and implementation
The primary objective is to create complete, normalized, non-
redundant (to the extent possible), and fully integrated
conceptual, logical, and physical database models
7
The Systems Development Life Cycle (SDLC)
Traces history (life cycle) of information system
Provides “big picture” within which database design and
application development can be mapped out and evaluated
Divided into following five phases:
Planning
Analysis
Detailed systems design
Implementation
Maintenance
Iterative rather than sequential process
8
The Systems Development Life Cycle (SDLC) (cont..)
9
Planning
Yields a general overview of the company and its objectives
Is an initial assessment made of information-flow-and-extent
requirements
Must begin to study and evaluate alternate solutions
Technical aspects of hardware and software requirements
System cost
10
Analysis
The problems defined during planning phase are examined in
greater detail during analysis
Thorough audit of user requirements
The existing hardware and software systems are studied
Goal is a better understanding of:
The system's functional areas
The actual and potential problems
The opportunities
11
Analysis (cont..)
Includes the creation of logical system design
Must specify appropriate conceptual data model, inputs,
processes, and expected output requirements
Might use tools such as data flow diagrams (DFDs), hierarchical
input process output (HIPO) diagrams, and entity relationship
(ER) diagrams
Yields functional descriptions of system’s components
(modules) for each process within database environment
12
Detailed Systems Design
The designer completes the design of the system’s processes
Includes all necessary technical specifications
The steps are laid out for conversion from old to new system
The training principles and methodologies are also planned
Submitted for management approval
13
Implementation
Hardware, DBMS software, and application programs are
installed,
and the database design is implemented
The system enters a cycle of coding, testing, and debugging that
continues until it is ready to be delivered
The actual database is created, and the system is customized by the
creation of tables, views, and user authorizations
14
Maintenance
Maintenance activities can be grouped into three types:
Corrective maintenance in response to systems errors
Adaptive maintenance due to changes in business environment
Perfective maintenance to enhance system
Computer-assisted systems engineering (CASE)
Make it possible to produce better systems within reasonable
amount of time and at reasonable cost
CASE-produced applications are structured, documented,
standardized
15
16
The Database Life Cycle (DBLC)
Six phases:
Database initial study
Database design
Implementation and loading
Testing and evaluation
Operation
Maintenance and evolution
The Database Life Cycle (DBLC)
17
The Database Initial Study
Overall purpose:
Analyze company situation
Define problems and constraints
Define objectives
Define scope and boundaries
Fig 10.4 in the next slide depicts the interactive and iterative
processes required to complete first phase of DBLC successfully
18
The Database Initial Study (cont..)
Fig 10.4 in your book
19
Analyze the Company Situation
Analysis–To break up any whole into its parts so as to find out
their nature, function, and so on
Company situation
General conditions in which company operates, its
organizational structure, and its mission
Analyze company situation
Discover what company’s operational components are, how they
function, and how they interact
20
Define Problems and Constraints
Managerial view of company’s operation is often different from
that of end users
The Database Designer must continue to carefully probe to
generate additional information that will help define problems
within larger framework of company operations
Finding precise answers is important
Defining problems does not always lead to perfect solution
21
Define Objectives
Designer must ensure that database system objectives
correspond to those envisioned by end user(s)
Designer must begin to address following questions:
What is proposed system’s initial objective?
Will system interface with other existing or future systems in
the company?
Will system share data with other systems or users?
22
Define Scope and Boundaries
Scope
Defines extent of design according to operational requirements
Helps define required data structures, type and number of
entities, and physical size of database
Boundaries
Limits external to system
Often imposed by existing hardware and software
23
Database Design
Necessary to concentrate on data
Characteristics required to build database model
Two views of data within system:
Business view of
data as information source
Designer’s view of
data structure, its access, and activities required to transform
data into information
24
Database Design (cont..)
Fig 10.5 in your book
25
Database Design (cont..)
Loosely related to analysis and design of larger system
The Systems analysts or systems programmers are in charge of
designing other system components
Their activities create procedures that will help transform data
within database into useful information
Does not constitute sequential process
Iterative process that provides continuous feedback designed to
trace previous steps
26
Database Design (cont..)
27
I. Conceptual Design Overview
Data modeling used to create an abstract database structure
that represents real-world objects in most realistic way possible
Must embody clear understanding of business and its functional
areas
Ensure that all data needed are in model, and that all data in the
model are needed
28
I. Conceptual Design Overview (cont..)
Requires four steps
Data analysis and requirements
Discover data element characteristics
Obtains characteristics from different sources
Take into account business rules
Derived from description of operations
Entity relationship modeling and normalization
Designer enforces standards in design documentation
Use of diagrams and symbols, documentation writing style,
layout, other conventions
29
30
I. Conceptual Design Overview (cont..)
3. Data model verification
Verified against proposed system processes
Revision of original design
Careful reevaluation of entities
Detailed examination of attributes describing entities
Define design’s major components as modules:
Module: information system component that handles specific
function
31
I. Conceptual Design Overview (cont..)
Data model verification (cont…)
Verification process
Select central (most important) entity
Defined in terms of its participation in most of model’s
relationships
Identify module or subsystem to which central entity belongs
and define boundaries and scope
Place central entity within module’s framework
32
I. Conceptual Design Overview(cont..)
Distributed database design
Portions of the database may reside in different physical
locations
Processes accessing the database vary from one location to
another
The Designer must also develop data distribution and allocation
strategies
II. DBMS Software Selection
Critical to information system’s smooth operation
Common factors affecting purchasing decisions:
Cost
DBMS features and tools
Underlying model
Portability
DBMS hardware requirements
Advantages and disadvantages should be carefully studied
33
III. Logical Design Overview
Used to translate conceptual design into internal model for
selected database management system
Logical design is software-dependent
Requires that all objects in model be mapped to specific
constructs used by selected database software
Definition of attribute domains, design of required tables,
access restriction formats
Tables must correspond to entities in conceptual design
Translates software-independent conceptual model into
software-dependent logical model
34
III. Logical Design Overview (cont..)
The logical design stage consists of the following phases:
Creating the logical data model.
Validating the logical data model using normalization.
Assigning and validating integrity constraints.
Merging logical models constructed for different parts of the
database together.
Reviewing the logical data model with the users.
35
IV. Physical Design Overview
Is the Process of selecting data storage and data access
characteristics of database
Storage characteristics are function of device types supported
by hardware, type of data access methods supported by system,
and DBMS
Particularly important in older hierarchical and network models
Becomes more complex when data are distributed at different
locations
36
IV. Physical Design Overview (cont..)
Physical database design can be broken down into a number of
stages:
Analyze data volume and database usage.
Translate each relation identified in the logical data model into
tables.
Determine a suitable file organization.
Define indexes.
Define user views.
Estimate data storage requirements.
Determine database security for users.
37
Implementation and Loading
New database implementation requires creation of special
storage-related constructs to house end-user tables
38
Performance
Is one of the most important factors in certain database
implementations
Not all DBMSs have performance-monitoring and fine-tuning
tools embedded in their software
Performance evaluation is rendered more difficult as there is no
standard measurement for database performance
39
Backup and Recovery
Database can be subject to data loss through unintended data
deletion and power outages
Data backup and recovery procedures
Create safety valve
Allow database administrator to ensure availability of consistent
data
Integrity
Enforced through proper use of primary and foreign key rules
40
Company Standards
May partially define database standards
Database administrator must implement and enforce such
standards
Database Security
Data must be protected from access by unauthorized users
Establish security goals
- What are we trying to protect the database from?
- What security related problems are we trying to prevent?
The most common security goals relate to the integrity,
confidentiality and the availability of data.
41
Data Security Measures
Physical security allows only authorized personnel physical
access to specific areas.
User authentication is a way of identifying the user and
verifying that the user is allowed to access some restricted data
or application.
achieved through the use of passwords and access rights.
Audit trails are usually provided by the DBMS to check for
access violations.
42
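The password-based user authentication mentioned above can be sketched with salted, key-stretched hashing. This is a minimal illustration (all names are invented); a production system would typically rely on a dedicated password-hashing library rather than rolling its own.

```python
import hashlib
import hmac
import os

# Minimal sketch: store a random salt plus a PBKDF2 digest instead of
# the plaintext password, and compare digests in constant time.
def hash_password(password, salt=None):
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify(password, salt, digest):
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_password("s3cret")
print(verify("s3cret", salt, digest))  # True
print(verify("wrong", salt, digest))   # False
```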
Data Security Measures (cont..)
Data encryption
Can be used to render data useless to unauthorised users.
ORACLE DBMS has a Transparent Data Encryption
User-defined policies and procedures
Backup and recovery strategies should be in place in the event
of a disaster occurring
Antivirus software
Firewalls are systems comprising hardware devices or software
applications which act as gatekeepers to an organisation's network.
For more details on security measures read the slides after the
chapter summary
43
Testing and Evaluation
This phase occurs in parallel with applications programming
Programmers use database tools to prototype applications
during coding of the programs
If the DB implementation fails to meet some of system’s
evaluation criteria, several options may be considered to
enhance the system:
Fine-tune specific system and DBMS configuration parameters
Modify physical design
Modify logical design
Upgrade or change DBMS software and/or hardware platform
44
Operation
Once the database has passed the evaluation stage, it is
considered operational
The beginning of the operational phase starts the process of
system maintenance and evolution
45
Maintenance and Evolution
Required periodic maintenance:
Preventive maintenance (backup)
Corrective maintenance (recovery)
Adaptive maintenance
Assignment of access permissions and their maintenance for
new and old users
Generation of database access statistics
Periodic security audits
Periodic system-usage summaries
46
Parallel Activities in the DBLC and the SDLC
47
Summary
Information system is designed to facilitate transformation of
data into information and to manage both data and information
SDLC traces history (life cycle) of an application within the
information system
DBLC describes history of database within the information
system
Database design and implementation process moves through
series of well-defined stages
Conceptual portion of design may be subject to several
variations, based on two design philosophies
48
Summary (cont..)
Threats to database security include the loss of integrity,
confidentiality and availability of data.
The database administrator (DBA) is responsible for managing
the corporate database.
The development of the data administration strategy is closely
related to the company’s mission and objectives.
49
Threats to Security
Threats are any set of circumstances that have the potential to
cause loss, misuse or harm to the system and/or its data.
Threats can cause:
The loss of the integrity of data through unauthorized
modification.
For example a person gaining unauthorized access to a bank
account and removing some money from the account.
50
Threats to Security
The loss of availability of the data.
For example, an adversary prevents the database system from being
operational, which stops authorized users of the data from
accessing it.
The loss of confidentiality of the data (also referred to as the
privacy of data).
This could be caused by a person gaining access to private
information such as a password or a bank account balance.
51
Examples of Threats
Theft and fraud of data.
Human error which causes accidental loss of data.
Electronic infections
Viruses
Email Viruses
Worms
Trojan Horses
52
Examples of Threats (cont..)
The occurrence of natural disasters such as hurricanes, fires or
floods.
Unauthorized access and modification of data.
Employee sabotage is concerned with the deliberate acts of
malice against the organization.
Poor database administration.
53
Examples of Threats (cont..)
54
Database Design Strategies
Two classical approaches to database design:
Top-down design
Identifies data sets
Defines data elements for each of those sets
Bottom-up design
Identifies data elements (items)
Groups them together in data sets
55
Database Design Strategies Top-down vs. bottom-up design
sequencing
56
Centralized vs. Decentralized Design
Database design may be based on two very different design
philosophies:
Centralized design
Productive when data component is composed of relatively
small number of objects and procedures
Decentralized design
Used when data component of system has considerable number
of entities and complex relations on which very complex
operations are performed
57
Centralized vs. Decentralized Design Centralized Design
58
Decentralized Design
59
Centralized vs. Decentralized Design (cont..)
Aggregation process
Requires designer to create single model in which various
aggregation problems must be addressed:
Synonyms and homonyms
Entity and entity subtypes
Conflicting object definitions
60
Centralized vs. Decentralized Design Summary of aggregation
problems
61
Database Administration
Data management is a complex job
Led to the development of the database administration function.
The person responsible for the control of the centralized and
shared database is the database administrator (DBA).
62
DBA Activities
Database planning, including the definition of standards,
procedures and enforcement.
Database requirements gathering and conceptual design.
Database logical design and transaction design.
63
DBA Activities (cont..)
Database physical design and implementation.
Database testing and debugging.
Database operations and maintenance, including installation,
conversion and migration.
Database training and support.
64
The DBA in the Organisation
65
DBA Skills
66
The Managerial Role of the DBA
67
The Managerial Role of the DBA (cont..)
End-User Support
Gathering user requirements
Building end-user confidence.
Resolving conflicts and problems.
Finding solutions to information needs.
Ensuring quality and integrity of applications and data.
Managing the training and support of DBMS users.
68
The Managerial Role of the DBA (cont..)
Policies, Procedures and Standards
Policies are general statements of direction or action that
communicate and support DBA goals.
Standards are more detailed and specific than policies and
describe the minimum requirements of a given DBA activity.
Procedures are written instructions that describe a series of
steps to be followed during the performance of a given activity.
69
The Managerial Role of the DBA (cont..)
Data Security, Privacy and Integrity
Protecting the security and privacy of the data in the database is
a function of authorization management.
Authorization management defines procedures to protect and
guarantee database security and integrity.
Includes: user access management, view definition, DBMS
access control and DBMS usage monitoring.
70
The Managerial Role of the DBA (cont..)
Data Backup and Recovery
Many DBA departments have created a position staffed by the
database security officer (DSO).
The DSO’s activities are often classified as disaster
management.
Disaster management includes all of the DBA activities
designed to secure data availability following a physical
disaster or a database integrity failure.
Disaster management includes all planning, organizing and
testing of database contingency plans and recovery procedures.
71
The Managerial Role of the DBA (cont..)
Data Distribution and Use
The DBA is responsible for ensuring that the data are
distributed to the right people, at the right time and in the right
format.
72
The Technical Role of the DBA
Evaluating, selecting and installing the DBMS and related
utilities.
Designing and implementing databases and applications.
Testing and evaluating databases and applications.
Operating the DBMS, utilities and applications.
Training and supporting users.
Maintaining the DBMS, utilities and applications.
73
Evaluating, Selecting and Installing the DBMS and Utilities
(DBA)
Covers the selection of the database management system, utility
software and supporting hardware for use in the organization.
Must be based primarily on the organization’s needs
The DBA would be wise to develop a checklist of desired
DBMS features.
74
Designing and Implementing Databases and Applications (DBA)
Covers data modelling and design services to the end-user
community
Determine and enforce standards and procedures to be used.
DBA then provides the necessary assistance and support during
the design of the database at the conceptual, logical and
physical levels
75
Testing and Evaluating Databases and Applications (DBA)
The DBA must also provide testing and evaluation services for
all of the database and end-user applications.
Those services are the logical extension of the design,
development and implementation services.
Testing procedures and standards must already be in place
before any application program can be approved for use in the
company.
76
Operating the DBMS, Utilities and Applications (DBA)
DBMS operations can be divided into four main areas:
System support.
Performance monitoring and tuning.
Backup and recovery.
Security auditing and monitoring.
77
Training and Supporting Users (DBA)
Training people to use the DBMS and its tools is included in the
DBA’s technical activities.
The DBA also provides or secures technical training in the use
of the DBMS and its utilities for the applications programmers.
78
Maintaining the DBMS, Utilities and Applications (DBA)
The maintenance activities of the DBA are an extension of the
operational activities.
Maintenance activities are dedicated to the preservation of the
DBMS environment.
79
Chap1 .pptx
Database Principles: Fundamentals of Design, Implementations
and Management
CHAPTER 1
THE DATABASE APPROACH
In this chapter, you will learn:
The difference between data and information
What a database is, what the different types of databases are,
and why they are valuable assets for decision making
The importance of database design
How modern databases evolved from file systems
About flaws in file system data management
What the database system’s main components are and how a
database system differs from a file system
The main functions of a database management system (DBMS)
The role of Open Source Database Systems
The importance of Data Governance and Data Quality
2
Data vs. Information
Data:
Raw facts; building blocks of information
Unprocessed information
Information:
Data processed to reveal meaning
Accurate, relevant, and timely information is the key to good
decision making
Good decision making is the key to survival in a global
environment
3
Transforming Raw Data into Information
Fig 1.1 p6 Initial survey screen
4
Transforming Raw Data into Information (cont..)
Fig 1.1 Information in graphic format
5
Data Quality and Data Governance
Data Quality can be examined at a number of different levels
including:
Accuracy: Is the data accurate, and does it come from a verifiable
source?
Relevance: Is the data relevant to the organisation?
Completeness: Is the required data being stored?
Timeliness: Is the data updated frequently enough to meet the
business requirements?
Uniqueness: Is the data unique, with no redundancy in the
database?
Unambiguous: Is the meaning of the data clear?
6
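The completeness and uniqueness dimensions above lend themselves to automated checks. A minimal sketch in Python; the record layout, field names, and sample data are illustrative, not from the slides:

```python
def check_quality(records, required_fields, key_field):
    """Run simple completeness and uniqueness checks on a list of record dicts."""
    issues = []
    seen_keys = set()
    for i, rec in enumerate(records):
        # Completeness: every required field must be present and non-empty.
        for field in required_fields:
            if not rec.get(field):
                issues.append((i, f"missing {field}"))
        # Uniqueness: the key field must not repeat across records.
        key = rec.get(key_field)
        if key in seen_keys:
            issues.append((i, f"duplicate {key_field}={key}"))
        seen_keys.add(key)
    return issues

customers = [
    {"id": 1, "name": "Ann", "email": "ann@example.com"},
    {"id": 2, "name": "", "email": "bob@example.com"},   # incomplete record
    {"id": 1, "name": "Cy", "email": "cy@example.com"},  # duplicate key
]
problems = check_quality(customers, ["id", "name", "email"], "id")
```

A data governance strategy would formalise checks like these into scheduled procedures rather than ad-hoc scripts.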
Data Quality and Data Governance (cont…)
Data governance is the term used to describe a strategy or
methodology defined by an organisation to safeguard data
quality.
Each organisation produces its own data governance strategy
which will involve the development of a series of policies and
procedures for managing availability, usability, quality,
integrity, and security of data within the organisation.
Introducing the Database and the DBMS
Database—shared, integrated computer structure that stores:
End user data (raw facts)
Metadata (data about data)
DBMS (database management system):
Collection of programs that manages database structure and
controls access to data
Possible to share data among multiple applications or users
Makes data management more efficient and effective
The DBMS hides much of the database’s internal complexity
from the application programs and users.
The application program might be written by a programmer
using a programming language such as COBOL, Visual Basic,
C++, or Java, or it might be created through a DBMS utility
program.
8
Role and Advantages of the DBMS (cont.)
A DBMS provides advantages such as :
Improved data sharing. Users have better access to more and
better-managed data
Better data integration. Promotes integrated view of
organization’s operations
Minimised data inconsistency. Probability of data inconsistency
is greatly reduced
Improved data access. Possible to produce quick answers to ad
hoc queries
9
Role and Advantages of the DBMS (cont..)
10
Types of Databases
Single-user:
Supports only one user at a time
Desktop:
Single-user database running on a personal computer
Multi-user:
Supports multiple users at the same time
Workgroup:
Multi-user database that supports a small group of users or a
single department
Enterprise:
Multi-user database that supports a large group of users or an
entire organization
11
Types of Databases (cont..)
Can be classified by location:
Centralized:
Supports data located at a single site
Distributed:
Supports data distributed across several sites
Can be classified by use:
Transactional (or production):
Supports a company’s day-to-day operations
Data warehouse:
Stores data used to generate information required to make
tactical or strategic decisions
Often used to store historical data
Structure is quite different
12
Why Database Design is Important
Database design refers to the activities that focus on the design
of the database structure that will be used to store and manage
end-user data.
Defines the database’s expected use
Different approach needed for different types of databases
Avoid redundant data
13
Historical Roots: Files and Data Processing
Managing data with file systems is obsolete
Understanding file system characteristics makes database design
easier to understand
Awareness of problems with file systems helps prevent similar
problems in DBMS
Knowledge of file systems is helpful if you plan to convert an
obsolete file system to a DBMS
14
Historical Roots: Files and Data Processing (cont..)
Manual File systems:
Collection of file folders kept in file cabinet
Organization within folders based on data’s expected use
(ideally logically related)
System adequate for small amounts of data with few reporting
requirements
Finding and using data in growing collections of file folders
became time-consuming and cumbersome
15
Historical Roots: Files and Data Processing (cont..)
Computerised file systems:
Conversion from manual to computer system:
Could be technically complex, requiring hiring of data
processing (DP) specialists
Resulted in numerous “home-grown” systems being created
Initially, computer files were similar in design to manual files
(see Figure 1.3)
16
Historical Roots: Files and Data Processing cont..)
17
Historical Roots: Files and Data Processing (cont..)
Fig 1.3
18
Historical Roots: Files and Data Processing (cont..)
DP specialist wrote programs for reports:
Monthly summaries of types and amounts of insurance sold by
agents
Monthly reports about which customers should be contacted for
renewal
Reports that analyzed ratios of insurance types sold by agent
Customer contact letters summarizing coverage
Other departments requested databases be written for them
SALES database created for sales department
AGENT database created for personnel department (see Fig 1.4
next)
19
Historical Roots: Files and Data Processing (cont…)
20
Historical Roots: Files and Data Processing(cont..)
As number of databases increased, small file system evolved
Each file used its own application programs
Each file was owned by individual or department who
commissioned its creation
21
Historical Roots: Files and Data Processing (cont)
22
Example of Early Database Design (cont…)
As system grew, demand for DP’s (Data Specialists)
programming skills grew
Additional programmers hired
DP specialist evolved into DP manager, supervising a DP
department
Primary activity of department (and DP manager) remained
programming
23
Problems with File System Data Management
Every task requires extensive programming in a third-generation
language (3GL)
Programmer must specify task and how it must be done
Modern databases use fourth-generation languages (4GL)
Allow users to specify what must be done without specifying
how it is to be done
24
Problems with File System Data Management
Lengthy development times.
Difficulty in getting quick answers.
Complex System Administration
Lack of security and limited data sharing
Extensive Programming
25
Structural and Data Dependence
Structural dependence (SD)
A file system exhibits SD; that is, access to a file depends on
its structure
Data independence
Changes in the data storage characteristics can be made without
affecting the application program’s ability to access the data
The practical significance of data dependence is the difference
between the:
Logical data format
How the human being views the data
And the Physical data format
How the computer “sees” the data
26
Field Definitions and Naming Conventions
Flexible record definition anticipates reporting requirements by
breaking up fields into their component parts
27
Data Redundancy
Data redundancy results in data inconsistency
Different and conflicting versions of the same data appear in
different places
Errors more likely to occur when complex entries are made in
several different files and/or recur frequently in one or more
files
A Data anomaly develops when required changes in the
redundant data are not made successfully
Types of data anomalies:
Update anomalies
Occur when changes must be made to existing records
Insertion anomalies
Occur when entering new records
Deletion anomalies
Occur when deleting records
28
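The update anomaly can be made concrete with a small file-system-style data set in which the agent's phone number is stored redundantly in every customer record. The field names and values below are illustrative, not from the text:

```python
# Each customer row repeats the agent's phone number (redundant data).
customers = [
    {"cust": "Smith", "agent": "Jones", "agent_phone": "555-0100"},
    {"cust": "Brown", "agent": "Jones", "agent_phone": "555-0100"},
    {"cust": "Olows", "agent": "Jones", "agent_phone": "555-0100"},
]

# Update anomaly: the phone number changes, but the change is applied
# to only one of the redundant copies.
customers[0]["agent_phone"] = "555-0199"

# The data is now inconsistent: conflicting phone numbers for Jones.
phones = {row["agent_phone"] for row in customers if row["agent"] == "Jones"}
```

In a relational design the phone number would live once in an AGENT table, so one update changes every view of it.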
Database Systems
Problems inherent in file systems make using a database system
very desirable
File system
Many separate and unrelated files
Database
Logically related data stored in a single logical data repository
29
Database Systems
30
The Database System Environment
Database system is composed of five main parts:
Hardware
Software
Operating system software
DBMS software
Application programs and utility software
People
Procedures
Data
31
The Database System Environment (cont…)
32
DBMS Functions
DBMS performs functions that guarantee integrity and
consistency of data
Data dictionary management
defines data elements and their relationships
Data storage management
stores data and related data entry forms, report definitions, etc.
Data transformation and presentation
translates logical requests into commands to physically locate
and retrieve the requested data
Security management
enforces user security and data privacy within database
33
DBMS Functions (cont…)
Multiuser access control
uses sophisticated algorithms to ensure multiple users can
access the database concurrently without compromising the
integrity of the database
Backup and recovery management
provides backup and data recovery procedures
Data integrity management
promotes and enforces integrity rules
Database access languages and application programming
interfaces
provide data access through a query language
Database communication interfaces
allow database to accept end-user requests via multiple,
different network environments
34
DBMS Functions (continued)
35
DBMS Functions (cont…)
36
Summary
Data are raw facts. Information is the result of processing data
to reveal its meaning.
To implement and manage a database, use a DBMS.
Database design defines the database structure.
A well-designed database facilitates data management and
generates accurate and valuable information.
A poorly designed database can lead to bad decision making,
and bad decision making can lead to the failure of an
organization.
Databases were preceded by file systems.
Limitations of file system data management:
requires extensive programming
system administration complex and difficult
making changes to existing structures is difficult
security features are likely to be inadequate
independent files tend to contain redundant data
DBMS’s were developed to address file systems’ inherent
weaknesses
37
Types of Databases (cont..)
Open Source
Open Source software is free to acquire and use.
However, there will be costs involved in the development and
on-going support of the software.
LAMP describes the most popular open source software stack,
namely: Linux, the Apache Web server, the MySQL DBMS,
and the Perl/PHP/Python development languages.
38
R_Ch06- Data Modelling Advanced Concepts.ppt
Database Principles: Fundamentals of Design, Implementations
and Management
Lecture 6 - CHAPTER 6 : Data Modelling
Advanced Concepts
*
Objectives
In this chapter, you will learn:
About the extended entity relationship (EER) model’s main constructs
How entity clusters are used to represent multiple entities and relationships
The characteristics of good primary keys and how to select them
How to use flexible solutions for special data modeling cases
What issues to check for when developing data models based on EER diagrams
*
The Extended Entity Relationship Model
Result of adding more semantic constructs to the original entity relationship (ER) model
A diagram using this model is called an EER diagram (EERD)
Entity Supertypes and Subtypes
Entity supertype
Generic entity type that is related to one or more entity subtypes
Contains common characteristics
Entity subtypes
Contain the unique characteristics of each entity subtype
*
*
Entity Supertypes and Subtypes (cont..)
*
Specialization Hierarchy
Depicts an arrangement of higher-level entity supertypes and lower-level entity subtypes
Relationships are described in terms of “IS-A” relationships
A subtype exists only within the context of a supertype
Every subtype has only one supertype to which it is directly related
Can have many levels of supertype/subtype relationships
Figure 6.2 in your book as well
*
Specialization Hierarchy (cont..)
Figure 6.2 in your book as well
Specialization Hierarchy (cont..)
Supports attribute inheritance
Defines a special supertype attribute known as the subtype discriminator
Defines disjoint/overlapping constraints and complete/partial constraints
*
*
Inheritance
Enables an entity subtype to inherit the attributes and relationships of the supertype
All entity subtypes inherit their primary key attribute from their supertype
At the implementation level, the supertype and its subtype(s) maintain a 1:1 relationship
Entity subtypes inherit all relationships in which the supertype entity participates
Lower-level subtypes inherit all attributes and relationships from all upper-level supertypes
Inheritance (cont..)
*
Inheritance (cont..)
*
*
Natural Keys and Primary Keys
A natural key is a real-world identifier used to uniquely identify real-world objects
Familiar to end users and forms part of their day-to-day business vocabulary
Generally the data modeler uses the natural identifier as the primary key of the entity being modeled
May instead use a composite primary key or a surrogate key
*
Primary Key Guidelines
A primary key is an attribute, or a combination of attributes, that uniquely identifies entity instances in an entity set
Its main function is to uniquely identify an entity instance or row within a table
It guarantees entity integrity; it does not “describe” the entity
Primary keys and foreign keys implement relationships among entities
Behind the scenes, hidden from the user
Primary Key Guidelines (cont..)
*
Primary Key Guidelines (cont..)
*
*
Entity Integrity: Selecting Primary Keys
The primary key is the most important characteristic of an entity
A single attribute or some combination of attributes
The primary key’s function is to guarantee entity integrity
Primary keys and foreign keys work together to implement relationships
Properly selecting the primary key has a direct bearing on efficiency and effectiveness
*
When to Use Composite Primary Keys
Composite primary keys are useful in two cases:
As identifiers of composite entities, where each primary key combination is allowed only once in an M:N relationship
As identifiers of weak entities, where the weak entity has a strong identifying relationship with the parent entity
Automatically provides the benefit of ensuring that there cannot be duplicate values
Figure 6.7 in your book
*
When to Use Composite Primary Keys (cont..)
Figure 6.7 in your book
*
When to Use Composite Primary Keys (cont..)
When used as identifiers of weak entities, normally used to represent:
A real-world object that is existence-dependent on another real-world object
A real-world object that is represented in the data model as two separate entities in a strong identifying relationship
The dependent entity exists only when it is related to the parent entity
*
When To Use Surrogate Primary Keys
Especially helpful when there is:
No natural key
A selected candidate key with embedded semantic contents
A selected candidate key that is too long or cumbersome
If you use a surrogate key:
Ensure that the candidate key of the entity in question performs properly
Use “unique index” and “not null” constraints
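The "unique index plus not null" advice can be sketched with Python's built-in `sqlite3` module (SQLite and the table/column names are my illustrative choices, not from the slides): the surrogate key is an auto-numbered integer, while the natural candidate key keeps a UNIQUE NOT NULL constraint so it still behaves like a key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Surrogate primary key plus a UNIQUE NOT NULL natural candidate key.
conn.execute("""
    CREATE TABLE venue (
        venue_id   INTEGER PRIMARY KEY,   -- surrogate key, auto-numbered
        venue_name TEXT NOT NULL UNIQUE   -- natural candidate key, still enforced
    )""")
conn.execute("INSERT INTO venue (venue_name) VALUES ('Ballroom A')")
try:
    # A second row with the same natural key value must be rejected.
    conn.execute("INSERT INTO venue (venue_name) VALUES ('Ballroom A')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Without the UNIQUE constraint, the surrogate key alone would happily accept duplicate venue names.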
When To Use Surrogate Primary Keys (cont..)
*
*
Design Cases: Learning Flexible Database Design
Data modeling and design requires skills acquired through experience
Experience is acquired through practice
Four special design cases that highlight:
The importance of flexible design
Proper identification of primary keys
Placement of foreign keys
*
Design Case #1: Implementing 1:1 Relationships
Foreign keys work with primary keys to properly implement relationships in the relational model
Put the primary key of the “one” side (parent entity) on the “many” side (dependent entity) as a foreign key
Primary key: parent entity
Foreign key: dependent entity
*
Design Case #1: Implementing 1:1 Relationships (cont..)
In a 1:1 relationship there are two options:
Place a foreign key in both entities (not recommended)
Place a foreign key in one of the entities
The primary key of one of the two entities appears as the foreign key of the other
Design Case #1: Implementing
1:1 Relationships (continued)
*
Figure 6.9 in your book
*
Design Case #1: Implementing
1:1 Relationships (cont..)
Figure 6.9 in your book
*
Design Case #2: Maintaining History of Time-Variant Data
Normally, existing attribute values are replaced with the new value without regard to previous values
Time-variant data:
Values change over time
Must keep a history of data changes
Keeping a history of time-variant data is equivalent to having a multivalued attribute in your entity
Must create a new entity in a 1:M relationship with the original entity
The new entity contains the new value and the date of change
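The history-entity pattern above can be sketched with `sqlite3`; the schema (an EMPLOYEE entity with a SALARY_HIST history entity in a 1:M relationship) is an illustrative example, not taken from the book's figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Original entity plus a history entity; each history row stores the
# new value and the date of change, per the design case above.
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE salary_hist (
        emp_id      INTEGER REFERENCES employee,
        change_date TEXT,
        salary      REAL
    );
""")
conn.execute("INSERT INTO employee VALUES (1, 'Ann')")
# Instead of overwriting the salary, append one history row per change.
for date, pay in [("2023-01-01", 40000), ("2024-01-01", 43000)]:
    conn.execute("INSERT INTO salary_hist VALUES (1, ?, ?)", (date, pay))

# The full salary history for employee 1 is preserved, oldest first.
history = conn.execute(
    "SELECT change_date, salary FROM salary_hist WHERE emp_id = 1 "
    "ORDER BY change_date").fetchall()
```

A single `salary` column on EMPLOYEE would lose every previous value on each update; the 1:M history table keeps them all.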
Figure 6.10 in your book
*
Design Case #2: Maintaining
History of Time-Variant Data (cont..)
Figure 6.10 in your book
Figure 6.11 in your book
*
Design Case #2: Maintaining
History of Time-Variant Data (cont..)
Figure 6.11 in your book
*
Design Case #3: Fan Traps
A design trap occurs when a relationship is improperly or incompletely identified
Represented in a way not consistent with the real world
The most common design trap is known as a fan trap
A fan trap occurs when one entity is in two 1:M relationships to other entities
Produces an association among the other entities that is not expressed in the model
Figure 6.12 in your book
*
Design Case #3: Fan Traps (cont..)
Figure 6.12 in your book
*
Design Case #4: Redundant Relationships
Redundancy is seldom a good thing in the database environment
Occurs when there are multiple relationship paths between related entities
The main concern is that redundant relationships remain consistent across the model
Some designs use redundant relationships to simplify the design
Figure 6.13 in your book
*
Design Case #4:
Redundant Relationships (cont..)
Figure 6.13 in your book
Figure 6.14 in your book.
*
Design Case #4:
Redundant Relationships (cont..)
Figure 6.14 in your book.
*
Data Modeling Checklist
Data modeling translates a specific real-world environment into a data model
Represents real-world data, users, processes, interactions
The EERM (Extended Entity Relationship Model) enables the designer to add more semantic content to the model
The data modeling checklist helps ensure data modeling tasks are successfully performed
Based on concepts and tools learned since Chapter 3
Data Modeling Checklist
*
Data Modeling Checklist (cont..)
*
*
Summary
The extended entity relationship (EER) model adds semantics to the ER model
Adds semantics via entity supertypes, subtypes, and clusters
An entity supertype is a generic entity type related to one or more entity subtypes
Specialization hierarchy
Depicts the arrangement of and relationships between entity supertypes and entity subtypes
Inheritance means an entity subtype inherits the attributes and relationships of the supertype
*
Summary (cont..)
The subtype discriminator determines which entity subtype the supertype occurrence is related to:
Partial or total completeness
Specialization vs. generalization
An entity cluster is a “virtual” entity type
Represents multiple entities and relationships in the ERD
Formed by combining multiple interrelated entities and relationships into a single object
*
Summary (cont..)
Natural keys are identifiers that exist in the real world
Sometimes make good primary keys
Characteristics of primary keys:
Must have unique values
Should be nonintelligent
Must not change over time
Preferably numeric or composed of a single attribute
*
Summary (cont..)
Composite keys are useful to represent:
M:N relationships
Weak (strong-identifying) entities
Surrogate primary keys are useful when no suitable natural key makes a good primary key
In a 1:1 relationship, place the PK of the mandatory entity:
As FK in the optional entity
As FK in the entity that causes the least number of nulls
As FK where the role is played
*
Summary (cont..)
Time-variant data
Data whose values change over time
Requires keeping a history of changes
To maintain a history of time-variant data:
Create an entity containing the new value, the date of change, and other time-relevant data
The entity maintains a 1:M relationship with the entity for which history is maintained
*
Summary (cont..)
A fan trap occurs when one entity is in two 1:M relationships to other entities and there is an association among the other entities not expressed in the model
Redundant relationships occur when there are multiple relationship paths between related entities
The main concern is that they remain consistent across the model
The data modeling checklist provides a way to check that the ERD meets minimum requirements
ADDITIONAL SLIDES
Please find additional slides to have a look at..
*
*
Subtype Discriminator
An attribute in the supertype entity
Determines to which entity subtype each supertype occurrence is related
The default comparison condition for the subtype discriminator attribute is an equality comparison
The subtype discriminator may be based on other comparison conditions
*
Disjoint and Overlapping Constraints
Disjoint subtypes
Also known as non-overlapping subtypes
Subtypes that contain unique subsets of the supertype entity set
Overlapping subtypes
Subtypes that contain nonunique subsets of the supertype entity set
Figure 6.4 Same as in your book
*
Disjoint and Overlapping Constraints (cont..)
Figure 6.4 Same as in your book
Disjoint and Overlapping Constraints (cont..)
*
*
Completeness Constraint
Specifies whether an entity supertype occurrence must be a member of at least one subtype
Can be partial or total
Partial completeness
Symbolized by a circle over a single line
Some supertype occurrences are not members of any subtype
Total completeness
Symbolized by a circle over a double line
Every supertype occurrence must be a member of at least one subtype
Table 6.2 same as in your book..
*
Completeness Constraint (cont..)
Table 6.2 same as in your book..
*
Specialization and Generalization
Specialization
Identifies more specific entity subtypes from a higher-level entity supertype
Top-down process of identifying lower-level, more specific entity subtypes from a higher-level entity supertype
Based on grouping the unique characteristics and relationships of the subtypes
*
Specialization and Generalization (cont..)
Generalization
Identifies a more generic entity supertype from lower-level entity subtypes
Bottom-up process of identifying a higher-level, more generic entity supertype from lower-level entity subtypes
Based on grouping the common characteristics and relationships of the subtypes
Composition and Aggregation
Aggregation
A larger entity can be composed of smaller entities
Composition
A special case of aggregation: when the parent entity instance is deleted, all child entity instances are automatically deleted
*
Composition and Aggregation (cont..)
*
Using Aggregation and Composition
An aggregation construct is used when an entity is composed of (or is formed by) a collection of other entities, but the entities are independent of each other; the relationship can be classified as a ‘has_a’ relationship type
A composition construct is used when two entities are associated in an aggregation association with a strong identifying relationship; deleting the parent deletes the child instances
*
Aggregation and Composition
*
*
Entity Clustering
A “virtual” entity type used to represent multiple entities and relationships in an ERD
Considered “virtual” or “abstract” because it is not actually an entity in the final ERD
A temporary entity used to represent multiple entities and relationships
Eliminates undesirable consequences
Avoid displaying attributes when entity clusters are used
Figure 6.6 in your book
*
Entity Clustering (cont..)
Figure 6.6 in your book
*
*
DB-Lecture3_ch03.ppt
Database Principles: Fundamentals of Design, Implementations
and Management
CHAPTER 3
Relational Model Characteristics
Objectives
In this chapter, you will learn:
That the relational database model offers a logical view of data
About the relational model’s basic component: relations
That relations are logical constructs composed of rows (tuples) and columns (attributes)
That relations are implemented as tables in a relational DBMS
About relational database operators, the data dictionary, and the system catalog
How data redundancy is handled in the relational database model
Why indexing is important
*
A Logical View of Data
Relational model
Enables the programmer to view data logically rather than physically
Table
Has structural and data independence
Resembles a file conceptually
The relational database model is easier to understand than its hierarchical and network predecessor models
A table is also called a relation because the relational model’s creator, Codd, used the term relation as a synonym for table
*
Tables and Their Characteristics
The logical view of a relational database is based on the relation
A relation is thought of as a table
Think of a table as a persistent relation:
A relation whose contents can be permanently saved for future use
Table: a two-dimensional structure composed of rows and columns
Persistent representation of a logical relation
Contains a group of related entities = an entity set
*
Properties of a Relation
*
Example Relation / Table
*
Attributes and Domains
*
Each attribute is a named column within the relational table
and draws its values from a domain.
The domain of values for an attribute should contain only
atomic values and any one value should not be divisible into
components.
No attributes with more than one value are allowed.
Degree and Cardinality
Degree and cardinality are two important properties of the relational model.
The degree of a relation is the number of its attributes (columns) and the cardinality of a relation is the number of its tuples (rows); a relation with N columns and M rows is of degree N and cardinality M.
The product of a relation’s degree and cardinality is the number of attribute values it contains.
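The degree/cardinality arithmetic is easy to check directly. A small Python sketch, modelling a relation as a heading tuple plus a list of row tuples (the sample data is illustrative):

```python
# A relation as a heading (attribute names) plus a list of tuples (rows).
heading = ("stud_id", "surname", "major")
rows = [
    (101, "Poole",  "CS"),
    (102, "Rahim",  "IS"),
    (103, "Okafor", "CS"),
    (104, "Diaz",   "CS"),
]

degree = len(heading)            # number of attributes (columns)
cardinality = len(rows)          # number of tuples (rows)
n_values = degree * cardinality  # total attribute values in the relation
```

Here the relation has degree 3, cardinality 4, and therefore 12 attribute values.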
*
Relational Schema
A relational schema is a textual representation of the database
tables, where each table is described by its name followed by
the list of its attributes in parentheses.
Keys
A key consists of one or more attributes that determine other attributes
The primary key (PK) is an attribute (or a combination of attributes) that uniquely identifies any given entity (row)
A key’s role is based on determination
If you know the value of attribute A, you can look up (determine) the value of attribute B
*
Keys (cont..)
Relational Database Keys (cont..)
Composite key
Composed of more than one attribute
Key attribute
Any attribute that is part of a key
Superkey
Any key that uniquely identifies each row
Candidate key
A superkey without redundancies and without unnecessary attributes
Ex: Stud_ID, Stud_lastname
*
Keys (cont..)
Nulls:
No data entry
Not permitted in a primary key
Should be avoided in other attributes
Can represent:
An unknown attribute value
A known, but missing, attribute value
A “not applicable” condition
Can create problems when functions such as COUNT, AVERAGE, and SUM are used
Can create logical problems when relational tables are linked
Controlled redundancy:
Makes the relational database work
Tables within the database share common attributes that enable the tables to be linked together
Multiple occurrences of values in a table are not redundant when they are required to make the relationship work
Redundancy exists only when there is unnecessary duplication of attribute values
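The warning about nulls and aggregate functions can be demonstrated with `sqlite3` (the table and data are my illustrative choices): `COUNT(*)` counts all rows, but `COUNT(col)` and `AVG(col)` silently skip NULLs, which can surprise you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price REAL)")
conn.executemany("INSERT INTO product VALUES (?, ?)",
                 [("pen", 2.0), ("pad", 4.0), ("tbd", None)])  # NULL price

# COUNT(*) counts rows; COUNT(price) and AVG(price) ignore the NULL,
# so the average is over two rows, not three.
row_count, price_count, avg_price = conn.execute(
    "SELECT COUNT(*), COUNT(price), AVG(price) FROM product").fetchone()
```

Three rows, but only two prices counted, giving an average of 3.0 rather than 2.0; this is exactly the kind of subtle problem the slide flags.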
*
Keys (cont..)
*
Keys (cont..)
*
Keys (cont..)
Foreign key (FK)
An attribute whose values match primary key values in the related table
Referential integrity
The FK contains a value that refers to an existing valid tuple (row) in another relation
Secondary key
A key used strictly for data retrieval purposes
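Referential integrity enforcement can be seen in action with `sqlite3` (table names are illustrative; note that SQLite enforces foreign keys only when the `foreign_keys` pragma is switched on):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when asked
conn.executescript("""
    CREATE TABLE agent (agent_code INTEGER PRIMARY KEY);
    CREATE TABLE customer (
        cust_id    INTEGER PRIMARY KEY,
        agent_code INTEGER REFERENCES agent(agent_code)
    );
""")
conn.execute("INSERT INTO agent VALUES (501)")
conn.execute("INSERT INTO customer VALUES (1, 501)")  # refers to a valid tuple
try:
    # FK value 999 refers to no existing agent row: referential
    # integrity is violated and the DBMS rejects the insert.
    conn.execute("INSERT INTO customer VALUES (2, 999)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
```

This is the RDBMS enforcing an integrity rule automatically, as discussed in the Integrity Rules slides that follow.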
*
Integrity Rules
Many RDBMSs enforce integrity rules automatically
It is safer to ensure that your application design conforms to the entity and referential integrity rules summarized in the next slide
Designers use flags to avoid nulls
Flags indicate the absence of some value
For example, the code -99 could be used as the AGENT_CODE entry for the 4th row of the CUSTOMER table to indicate that customer Paul Olowsky does not yet have an agent assigned
*
Integrity Rules
*
Integrity Rules
*
The Data Dictionary and System CatalogData dictionary
Provides detailed accounting of all tables found within the
user/designer-created database
Contains (at least) all the attribute names and characteristics for
each table in the system
Contains metadata: data about data
Sometimes described as “the database designer’s database”
because it records the design decisions about tables and their
structures
*
*
A Sample Data Dictionary
The Data Dictionary and System Catalog (cont..)
System catalog
Contains metadata
A detailed system data dictionary that describes all objects within the database
The terms “system catalog” and “data dictionary” are often used interchangeably
Can be queried just like any user/designer-created table
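As a concrete illustration of querying a catalog with ordinary SQL, SQLite keeps its catalog in a table named `sqlite_master` (the CUSTOMER/INVOICE tables below are illustrative examples):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE invoice  (inv_id  INTEGER PRIMARY KEY, total REAL)")

# The system catalog holds metadata about every database object and is
# queried just like a user-created table.
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```

Other DBMSs expose the same idea under different names (e.g., the standard `INFORMATION_SCHEMA` views).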
*
Relationships within the Relational Database
1:M relationship
The relational modeling ideal
Should be the norm in any relational database design
1:1 relationship
Should be rare in any relational database design
M:N relationships
Cannot be implemented as such in the relational model
M:N relationships can be changed into two 1:M relationships
*
The 1:M RelationshipRelational database normFound in any
database environment
*
*
The 1:M Relationship (cont…)
The 1:1 Relationship
One entity related to only one other entity, and vice versa
Sometimes means that entity components were not defined properly
Could indicate that two entities actually belong in the same table
As rare as 1:1 relationships should be, certain conditions absolutely require their use
*
*
The 1:1 Relationship (cont…)
*
The 1:1 Relationship (cont…)
The M:N Relationship
Can be implemented by breaking it up to produce a set of 1:M relationships
Avoid the problems inherent to the M:N relationship by creating a composite entity or bridge entity
The composite entity includes, as foreign keys, at least the primary keys of the tables that are to be linked
*
Implementation of a Composite Entity
Yields the required M:N to 1:M conversion
The composite entity table must contain at least the primary keys of the original tables
The linking table contains multiple occurrences of the foreign key values
Additional attributes may be assigned as needed
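The composite-entity conversion can be sketched with `sqlite3`. The classic STUDENT/CLASS/ENROLL schema below is an illustrative example: ENROLL is the bridge table whose primary key combines the two borrowed foreign keys, so each student/class pairing can occur only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
# ENROLL is the composite (bridge) entity: its PK is the combination
# of the two foreign keys, converting the M:N into two 1:M relationships.
conn.executescript("""
    CREATE TABLE student (stud_id  INTEGER PRIMARY KEY);
    CREATE TABLE class   (class_id INTEGER PRIMARY KEY);
    CREATE TABLE enroll (
        stud_id  INTEGER REFERENCES student,
        class_id INTEGER REFERENCES class,
        grade    TEXT,                     -- extra attribute on the bridge
        PRIMARY KEY (stud_id, class_id)
    );
""")
conn.execute("INSERT INTO student VALUES (1)")
conn.execute("INSERT INTO class VALUES (10)")
conn.execute("INSERT INTO enroll VALUES (1, 10, 'A')")
try:
    # The same student/class combination a second time violates the
    # composite primary key and is rejected.
    conn.execute("INSERT INTO enroll VALUES (1, 10, 'B')")
    pair_unique = False
except sqlite3.IntegrityError:
    pair_unique = True
```

The bridge table can also carry its own attributes (here, `grade`), which have no natural home in either original table.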
*
The M:N Relationship (cont..)
*
The M:N Relationship (cont..)
*
Figure 3.16 in the book
*
Figure 3.17 in your book
*
Data Redundancy RevisitedData redundancy leads to data
anomaliesSuch anomalies can destroy the effectiveness of the
database
Foreign keysControl data redundancies by using common
attributes shared by tablesCrucial to exercising data redundancy
control
Sometimes, data redundancy is necessary
*
Data Redundancy Revisited (cont…)
*
Data Redundancy Revisited (cont..)
*
*
Data Redundancy Revisited (cont…)
IndexesOrderly arrangement to logically access rows in a
tableIndex key Index’s reference pointPoints to data location
identified by the keyUnique indexIndex in which the index key
can have only one pointer value (row) associated with itEach
index is associated with only one table
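A short sketch of the difference between a plain index and a unique index, using SQLite (table and index names are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PAINTER (PAINTER_NUM INTEGER PRIMARY KEY, "
             "PAINTER_LNAME TEXT)")
# Non-unique index: many rows may share the same index key value.
conn.execute("CREATE INDEX PAINTER_NDX ON PAINTER (PAINTER_LNAME)")
conn.execute("INSERT INTO PAINTER VALUES (123, 'Ross'), (126, 'Ross')")

conn.execute("CREATE TABLE EMP (EMP_NUM INTEGER PRIMARY KEY, EMP_EMAIL TEXT)")
# Unique index: each index key may point to only one row.
conn.execute("CREATE UNIQUE INDEX EMP_EMAIL_NDX ON EMP (EMP_EMAIL)")
conn.execute("INSERT INTO EMP VALUES (1, 'a@x.com')")
try:
    conn.execute("INSERT INTO EMP VALUES (2, 'a@x.com')")  # duplicate key
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False
print(duplicate_allowed)   # False: the unique index rejects the second row
```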
*
*
Indexes (cont..)
Similar to Figure 3.20 of your book and better explained
Codd’s Relational Database RulesIn 1985, Codd published a list
of 12 rules to define a relational database system
The reason was the concern that many vendors were marketing
products as “relational” even though those products did not
meet minimum relational standards
*
SummaryTables (relations) are basic building blocks of a
relational databaseKeys are central to the use of relational
tablesKeys define functional dependenciesSuperkeyCandidate
keyPrimary keySecondary keyForeign keyEach table row must
have a primary key which uniquely identifies all attributes
Tables can be linked by common attributes. Thus, the primary
key of one table can appear as the foreign key in another table
to which it is linked
Good design begins by identifying appropriate entities and
attributes and the relationships among the entities. Those
relationships (1:1, 1:M, M:N) can be represented using ERDs.
*
Chap 5.ppt
Database Principles: Fundamentals of Design, Implementations
and Management
CHAPTER 5 Data Modelling
With Entity Relationship Diagrams
*
ObjectivesIn this chapter, you will learn:The main
characteristics of entity relationship componentsHow
relationships between entities are defined, refined, and
incorporated into the database design processHow ERD
components affect database design and implementationThat
real-world database design often requires the reconciliation of
conflicting goals
*
The Entity Relationship (ER) ModelER model forms the basis
of an ER diagramERD represents conceptual database as viewed
by the end userERDs depict database’s main
components:EntitiesAttributesRelationships
*
EntitiesRefers to entity set and not to single entity
occurrenceCorresponds to a table and not to row in relational
environmentIn Chen and Crow’s Foot models, entity is
represented by a rectangle with an entity’s nameEntity name, a
noun, written in capital letters
*
AttributesCharacteristics of entitiesChen notation: attributes
represented by ovals connected to entity rectangle with a
lineEach oval contains the name of attribute it representsCrow’s
Foot notation: attributes written in attribute box below entity
rectangle
*
*
Attributes (cont..)Required attribute: must have a valueOptional
attribute: may be left emptyDomain: set of possible values for
an attributeAttributes may share a domainIdentifiers: one or
more attributes that uniquely identify each entity
instanceComposite identifier: primary key composed of more
than one attribute
*
*
Attributes (cont..)Composite attribute can be subdividedSimple
attribute cannot be subdividedSingle-value attribute can have
only a single valueMultivalued attributes can have many values
*
Attributes (cont..)M:N relationships and multivalued attributes
should not be implementedCreate several new attributes for
each of the original multivalued attribute’s componentsCreate
new entity composed of original multivalued attribute’s
componentsDerived attribute: value may be calculated from
other attributesNeed not be physically stored within database
*
*
RelationshipsAssociation between entitiesParticipants are
entities that participate in a relationshipRelationships between
entities always operate in both directionsRelationship can be
classified as 1:MRelationship classification is difficult to
establish if only one side of the relationship is known
*
Connectivity and CardinalityConnectivity Describes the
relationship classificationCardinality Expresses minimum and
maximum number of entity occurrences associated with one
occurrence of related entityEstablished by very concise
statements known as business rules
*
*
Existence DependenceExistence dependenceEntity exists in
database only when it is associated with another related entity
occurrenceExistence independenceEntity can exist apart from
one or more related entitiesSometimes such an entity is referred
to as a strong or regular entity
*
Relationship StrengthWeak (non-identifying)
relationshipsExists if PK of related entity does not contain PK
component of parent entity
Strong (identifying) relationshipsExists when PK of related
entity contains PK component of parent entity
*
Weak (Non-Identifying) Relationships
*
Strong (Identifying) Relationships
*
Weak EntitiesWeak entity meets two conditionsExistence-
dependentCannot exist without entity with which it has a
relationshipHas a primary key that is partially or totally
derived from parent entity in relationshipDatabase designer
usually determines whether an entity can be described as weak
based on business rules
*
Strong Entity
Weak Entity
*
Weak Entities (cont..)
*
Relationship ParticipationOptional participationOne entity
occurrence does not require corresponding entity occurrence in
particular relationshipMandatory participationOne entity
occurrence requires corresponding entity occurrence in
particular relationship
*
Relationship Participation (cont..)
*
Relationship Participation (cont..)
*
Relationship DegreeIndicates number of entities or participants
associated with a relationshipUnary relationshipAssociation is
maintained within single entity Binary relationship Two entities
are associatedTernary relationship Three entities are associated
*
Relationship Degree (cont..)
*
Relationship Degree (cont..)
*
Recursive RelationshipsRelationship can exist between
occurrences of the same entity setNaturally found within unary
relationship
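A recursive relationship can be sketched as a table whose foreign key references its own primary key (an assumed EMPLOYEE-manages-EMPLOYEE example, run through SQLite; names and data are illustrative):

```python
import sqlite3

# Unary (recursive) relationship: an EMPLOYEE row may reference
# another row of the same table as its manager.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_LNAME TEXT NOT NULL,
    EMP_MGR   INTEGER REFERENCES EMPLOYEE (EMP_NUM)  -- FK into same table
);
INSERT INTO EMPLOYEE VALUES (101, 'News', NULL);     -- top manager
INSERT INTO EMPLOYEE VALUES (102, 'Senior', 101);
INSERT INTO EMPLOYEE VALUES (103, 'Arbough', 102);
""")
# A self-join resolves the recursive relationship.
pairs = conn.execute("""
    SELECT e.EMP_LNAME, m.EMP_LNAME
    FROM EMPLOYEE e JOIN EMPLOYEE m ON e.EMP_MGR = m.EMP_NUM
""").fetchall()
print(pairs)
```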
*
Recursive Relationships (cont..)
*
Recursive Relationships (cont..)
*
Associative (Composite) EntitiesAlso known as bridge
entitiesUsed to implement M:N relationshipsComposed of
primary keys of each of the entities to be connectedMay also
contain additional attributes that play no role in connective
process
*
Composite Entities (cont..)
*
Composite Entities (cont..)
*
Developing an ER DiagramDatabase design is iterative rather
than linear or sequential processIterative process Based on
repetition of processes and proceduresBuilding an ERD usually
involves the following activities:Create detailed narrative of
organization’s description of operationsIdentify business rules
based on description of operationsIdentify main entities and
relationships from business rulesDevelop initial ERDIdentify
attributes and primary keys that adequately describe
entitiesRevise and review ERD
*
Developing an ER Diagram (cont..)
*
Developing an ER Diagram (cont..)
*
Developing an ER Diagram (cont..)
*
Developing an ER Diagram (cont..)
*
Developing an ER Diagram (cont..)
*
Developing an ER Diagram (cont..)
*
Developing an ER Diagram (cont..)
*
Developing an ER Diagram (cont..)
*
Developing an ER Diagram (cont..)
*
Developing an ER Diagram (cont..)
*
Developing an ER Diagram (cont..)
*
Database Design Challenges:
Conflicting GoalsDatabase designers must make design
compromisesConflicting goals: design standards, processing
speed, information requirementsImportant to meet logical
requirements and design conventionsDesign of little value
unless it delivers all specified query and reporting
requirementsSome design and implementation problems do not
yield “clean” solutions
*
Database Design Challenges: Conflicting Goals (cont.)
*
SummaryEntity relationship (ER) model Uses ERD to represent
conceptual database as viewed by end userERM’s main
components:EntitiesRelationshipsAttributesIncludes
connectivity and cardinality notationsMultiplicities are based on
business rulesIn ERM, M:N relationship is valid at conceptual
levelERDs may be based on many different ERMsDatabase
designers are often forced to make design compromises

DBF-Lecture11-Chapter12.ppt
Database Principles Fundam.docx

*
Transaction Properties (cont..)Durability Once transactions are
committed, they cannot be undoneSerializabilityConcurrent
execution of several transactions yields consistent
resultsMultiuser databases are subject to multiple concurrent
transactions
*
Transaction Management with SQLANSI (American National
Standards Institute) has defined standards that govern SQL
database transactionsTransaction support is provided by two SQL
statements: COMMIT and ROLLBACKTransaction sequence must
continue until:COMMIT statement is reachedROLLBACK statement is
reachedEnd of program is reachedProgram is abnormally terminated
*
The Transaction LogA DBMS uses a transaction log to store:A
record for the beginning of transactionFor each transaction
component: Type of operation being performed (update, delete,
insert)Names of objects affected by transaction“Before” and
“after” values for updated fieldsPointers to previous and next
transaction log entries for the same transactionEnding (COMMIT)
of the transaction
Table 12.1 in your book
*
The Transaction Log
Table 12.1 in your book
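The COMMIT/ROLLBACK behaviour can be sketched with Python's sqlite3 module (the ACCT table and amounts are invented for illustration, not the book's data):

```python
import sqlite3

# Sketch of COMMIT vs ROLLBACK. isolation_level=None lets us manage
# transactions manually with explicit BEGIN/COMMIT/ROLLBACK.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE ACCT (ACCT_NUM TEXT PRIMARY KEY, BALANCE REAL)")
conn.execute("INSERT INTO ACCT VALUES ('0908110', 100.0)")

conn.execute("BEGIN")
conn.execute("UPDATE ACCT SET BALANCE = BALANCE - 30 "
             "WHERE ACCT_NUM = '0908110'")
conn.execute("COMMIT")       # changes are now permanent

conn.execute("BEGIN")
conn.execute("UPDATE ACCT SET BALANCE = BALANCE - 999 "
             "WHERE ACCT_NUM = '0908110'")
conn.execute("ROLLBACK")     # database restored to last consistent state

balance = conn.execute("SELECT BALANCE FROM ACCT").fetchone()[0]
print(balance)   # 70.0: the committed update survives, the rolled-back one does not
```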
*
Concurrency ControlIs the coordination of simultaneous
transaction execution in a multiprocessing databaseObjective is
to ensure serializability of transactions in a multiuser
environmentSimultaneous execution of transactions over a shared
database can create several data integrity and consistency
problemsLost updatesUncommitted dataInconsistent retrievals
*
Lost UpdatesLost update problem:Two concurrent transactions
update same data elementOne of the updates is lostOverwritten by
the other transaction
Lost Updates
*
Lost Updates (cont..)
*
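The lost update problem can be simulated in a few lines of plain Python (the quantity-on-hand figures are illustrative):

```python
# Minimal simulation of the lost update problem. Two "transactions"
# each read PROD_QOH, compute a new value, and write it back;
# interleaving the reads makes the first write disappear.
PROD_QOH = 35

t1_read = PROD_QOH          # T1 reads 35
t2_read = PROD_QOH          # T2 reads 35 before T1 writes
PROD_QOH = t1_read + 100    # T1 writes 135 (purchase of 100 units)
PROD_QOH = t2_read - 30     # T2 writes 5, overwriting T1's update

print(PROD_QOH)   # 5 instead of the correct 105: T1's update is lost
```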
*
Uncommitted Data Uncommitted data phenomenon:Two transactions
executed concurrentlyFirst transaction rolled back after second
already accessed uncommitted data
Uncommitted Data
*
Uncommitted Data (cont..)
*
*
Inconsistent RetrievalsInconsistent retrievals:First transaction
accesses dataSecond transaction alters the dataFirst transaction
accesses the data againTransaction might read some data before
they are changed and other data after they are changedYields
inconsistent results
*
*
*
The SchedulerSpecial DBMS program Purpose is to establish order
of operations within which concurrent transactions are
executedInterleaves execution of database operations:Ensures
serializabilityEnsures isolationSerializable
scheduleInterleaved execution of transactions yields same
results as serial execution
The Scheduler (cont..)
Bases its actions on concurrency control algorithmsEnsures
computer’s central processing unit (CPU) is used
efficientlyFacilitates data isolation to ensure that two
transactions do not update same data element at same time
*
*
Database Recovery Management Database recoveryRestores database
from given state, usually inconsistent, to previously consistent
stateBased on atomic transaction propertyAll portions of
transaction treated as single logical unit of workAll operations
applied and completed to produce consistent database
If transaction operation cannot be completed, transaction must
be aborted, and any changes to database must be rolled back
(undone)
Transaction RecoveryMakes use of deferred-write and
write-through techniquesDeferred write Transaction operations do
not immediately update physical databaseOnly transaction log is
updatedDatabase is physically updated only after transaction
reaches its commit point using transaction log information
*
*
Transaction Recovery (cont..)Write-through techniqueDatabase is
immediately updated by transaction operations during
transaction’s execution, even before transaction reaches its
commit pointRecovery processIdentify last checkpointIf
transaction was committed before checkpointDo nothingIf
transaction committed after last checkpointDBMS redoes the
transaction using “after” valuesIf transaction had ROLLBACK or
was left activeDo nothing because no updates were made
Transaction Recovery (cont..)
*
*
SummaryTransaction: sequence of database operations that access
databaseLogical unit of workNo portion of transaction can exist
by itselfFive main properties: atomicity, consistency,
isolation, durability, and serializabilityCOMMIT saves changes
to diskROLLBACK restores previous database stateSQL transactions
are formed by several SQL statements or database requests
*
Summary (cont..)
Transaction log keeps track of all transactions that modify
databaseConcurrency control coordinates simultaneous execution
of transactionsScheduler establishes order in which concurrent
transaction operations are executedLock guarantees unique access
to a data item by transactionTwo types of locks: binary locks
and shared/exclusive locks
*
Summary (cont..)
Serializability of schedules is guaranteed through the use of
two-phase lockingDeadlock: when two or more transactions wait
indefinitely for each other to release lockThree deadlock
control techniques: prevention, detection, and avoidanceTime
stamping methods assign unique time stamp to each transaction
Schedules execution of conflicting transactions in time stamp
order
*
Summary (cont..)
Optimistic methods assume the majority of database transactions
do not conflictTransactions are executed concurrently, using
private copies of the dataDatabase recovery
restores database from given state to previous consistent state
CHAPTER 12: Transaction Management and Concurrency Control
ADDITIONAL SLIDES pages 635 to 644 in your Book..
*
*
Two-Phase Locking to Ensure Serializability (cont..)Governed by
the following rules:Two transactions cannot have conflicting
locksNo unlock operation can precede a lock operation in the
same transactionNo data are affected until all locks are
obtained—that is, until transaction is in its locked point
*
Concurrency Control with Locking MethodsLock Guarantees
exclusive use of a data item to a current transactionRequired to
prevent another transaction from reading inconsistent dataLock
managerResponsible for assigning and policing the locks used
by transactions
*
Lock GranularityIndicates level of lock useLocking can take
place at following levels: DatabaseTablePageRowField (attribute)
*
Lock Granularity (cont..)
Database-level lockEntire database is lockedTable-level
lockEntire table is lockedPage-level lockEntire disk page is
locked
Row-level lock Allows concurrent transactions to access
different rows of same tableEven if rows are located on same
page
Field-level lock Allows concurrent transactions to access same
row as long as they require the use of different fields
(attributes) within the row
Fig 12.3 in your book
*
Fig 12.3 in your book
Fig 12.4 in your book
*
Fig 12.4 in your book
Fig. 12.5 in your book
*
Lock Granularity (cont..)
Fig. 12.5 in your book
Fig. 12.6 in your book
*
Lock Granularity (cont..)
Fig. 12.6 in your book
*
Lock TypesBinary lockTwo states: locked (1) or unlocked
(0)Exclusive lock Access is specifically reserved for
transaction that locked objectMust be used when potential for
conflict existsShared lock Concurrent transactions are granted
read access on basis of a common lock
Table 12.10 in your book
*
Table 12.10 in your book
*
Two-Phase Locking to Ensure SerializabilityDefines how
transactions acquire and relinquish locksGuarantees
serializability, but does not prevent deadlocks
Growing phaseTransaction acquires all required locks without
unlocking any dataShrinking phaseTransaction releases all locks
and cannot obtain any new lock
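The shared/exclusive compatibility rules can be sketched as a small compatibility table (a hypothetical lock-manager check written for illustration, not any DBMS's actual implementation):

```python
# Shared (read) locks are compatible with other shared locks;
# an exclusive (write) lock conflicts with everything.
COMPATIBLE = {
    ("shared", "shared"):       True,
    ("shared", "exclusive"):    False,
    ("exclusive", "shared"):    False,
    ("exclusive", "exclusive"): False,
}

def can_grant(requested, held_locks):
    """Grant `requested` only if it is compatible with every held lock."""
    return all(COMPATIBLE[(requested, h)] for h in held_locks)

print(can_grant("shared", ["shared", "shared"]))   # True: concurrent reads
print(can_grant("exclusive", ["shared"]))          # False: writer must wait
```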
Deadlocks (cont..)
*
*
DeadlocksCondition that occurs when two transactions wait for
each other to unlock dataPossible only if one of the
transactions wants to obtain an exclusive lock on a data itemNo
deadlock condition can exist among shared locks
*
Table 12.11 in your book
Deadlocks (cont..)
*
Deadlocks (cont..)Three techniques to control
deadlock:Prevention Detection Avoidance
Choice of deadlock control method depends on database
environmentLow probability of deadlock, detection
recommendedHigh probability, prevention recommended
*
Concurrency Control
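Deadlock detection is commonly described in terms of a wait-for graph: the DBMS records which transaction waits for which, and a cycle in that graph means deadlock. A minimal sketch under that assumption (function and transaction names are illustrative):

```python
# Edge Ti -> Tj means "Ti waits for a lock held by Tj".
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle (DFS)."""
    visiting, done = set(), set()

    def dfs(txn):
        if txn in visiting:        # back edge: cycle found
            return True
        if txn in done:
            return False
        visiting.add(txn)
        for blocker in wait_for.get(txn, ()):
            if dfs(blocker):
                return True
        visiting.discard(txn)
        done.add(txn)
        return False

    return any(dfs(t) for t in wait_for)

# T1 waits for T2 and T2 waits for T1: the classic deadlock.
print(has_deadlock({"T1": ["T2"], "T2": ["T1"]}))   # True
print(has_deadlock({"T1": ["T2"], "T2": []}))       # False
```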
with Time Stamping Methods
Assigns global unique time stamp to each transactionProduces
explicit order in which transactions are submitted to
DBMSUniqueness Ensures that no equal time stamp values can
existMonotonicityEnsures that time stamp values always increase
*
Wait/Die and Wound/Wait SchemesWait/die Older transaction waits
and younger is rolled back and rescheduledWound/wait Older
transaction rolls back younger transaction and reschedules it
Wait/Die and Wound/Wait Schemes (cont..)
*
*
Concurrency Control with Optimistic Methods
Optimistic approach Based on assumption that majority of
database operations do not conflictDoes not require locking or
time stamping techniquesTransaction is executed without
restrictions until it is committedPhases: read, validation, and
write
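The wait/die decision can be sketched as a single comparison of time stamps (a simplified model for illustration; a real DBMS also reschedules the rolled-back transaction with its original time stamp):

```python
# Wait/die: when `requester` asks for a lock held by `holder`,
# an older requester (smaller time stamp) is allowed to wait,
# while a younger requester "dies" (is rolled back and rescheduled).
def wait_die(requester_ts, holder_ts):
    if requester_ts < holder_ts:   # requester is older
        return "wait"
    return "die"                   # requester is younger: roll back

print(wait_die(1, 5))   # wait: the older transaction may wait
print(wait_die(5, 1))   # die: the younger transaction is rolled back
```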
Database Principles: Fundamentals of Design, Implementations
and Management
Lecture7- CHAPTER 8: Beginning Structured Query Language
Presented by Rabia Cherouk
*
ObjectivesIn this chapter, you will learn:The basic commands and
functions of SQLHow to use SQL for data administration (to
create tables, indexes, and views)How to use SQL for data
manipulation (to add, modify, delete, and retrieve data)How to
use SQL to query a database for useful information
*
Introduction to SQLSQL functions fit into two broad
categories:Data definition language (DDL)Create database
objects, such as tables, indexes, and viewsDefine access rights
to those database objectsData manipulation language (DML)SQL is
relatively easy to learnBasic command set has vocabulary of less
than 100 wordsNon-procedural languageAmerican National Standards
Institute (ANSI) prescribes a standard SQL and standards are
accepted by ISO (International Organisation for
Standardisation). Several SQL dialects exist
Introduction to SQL (cont..)
*
Introduction to SQL (cont..)
*
*
Data Definition CommandsThe database modelIn this chapter, a
simple database with these tables is used to illustrate
commands:CUSTOMERINVOICELINEPRODUCTVENDORFocus on PRODUCT and
VENDOR tables
*
The Database Model
Figure 8.1 in your book
The Database Model (cont..)
*
*
Creating the DatabaseTwo tasks must be completed: 1/ Create
database structure 2/ Create tables that will hold end-user
dataFirst task:RDBMS creates physical files that will hold
databaseDiffers substantially from one RDBMS to another
*
The Database SchemaAuthentication Process through which DBMS
verifies that only registered users are able to access
databaseLog on to RDBMS using user ID and password created by
database administratorSchemaIs a group of database objects—such
as tables and indexes—that are related to each other. Usually a
schema belongs to a single user or application. A single
database can hold multiple schemas belonging to different users
or applications.
*
Data TypesData type selection is usually dictated by nature of
data and by intended usePay close attention to expected use of
attributes for sorting and data retrieval purposesSupported data
types:Number(L,D), Integer, Smallint, Decimal(L,D)Char(L),
Varchar(L), Varchar2(L)Date, Time, TimestampReal, Double,
FloatInterval day to hourMany other types
Data Types (cont..)
*
*
Creating Table StructuresUse one line per column (attribute)
definitionUse spaces to line up attribute characteristics and
constraintsTable and attribute names are capitalizedNOT NULL
specification UNIQUE specification Primary key attributes
contain both a NOT NULL and a UNIQUE specificationRDBMS will
automatically enforce referential integrity for foreign keys
*
Creating Table Structures (cont..)Command sequence ends with
semicolon
Example:
CREATE TABLE EMP_2 (
EMP_NUM CHAR(3) NOT NULL UNIQUE,
EMP_LNAME VARCHAR(15) NOT NULL,
EMP_FNAME VARCHAR(15) NOT NULL,
EMP_INITIAL CHAR(1),
EMP_HIRE DATE NOT NULL,
JOB_CODE CHAR(3) NOT NULL,
PRIMARY KEY (EMP_NUM),
FOREIGN KEY (JOB_CODE) REFERENCES JOB);
*
SQL ConstraintsNOT NULL constraint Ensures that column does not
accept nullsUNIQUE constraint Ensures that all values in column
are uniqueDEFAULT constraint Assigns value to attribute when a
new row is added to tableCHECK constraint Validates data when
attribute value is entered
*
SQL IndexesWhen primary key is declared, DBMS automatically
creates unique indexOften need additional indexesUsing CREATE
INDEX command, SQL indexes can be created on basis of any
selected attributeComposite indexIndex based on two or more
attributesOften used to prevent data duplication
SQL Indexes (cont..)
*
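The chapter's EMP_2 definition can be executed as-is in SQLite. The JOB parent table below is an assumption added so the foreign key has a target; the example also demonstrates the automatic referential-integrity enforcement mentioned above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite needs this enabled
conn.executescript("""
CREATE TABLE JOB (
    JOB_CODE        CHAR(3) PRIMARY KEY,
    JOB_DESCRIPTION VARCHAR(25));
CREATE TABLE EMP_2 (
    EMP_NUM     CHAR(3) NOT NULL UNIQUE,
    EMP_LNAME   VARCHAR(15) NOT NULL,
    EMP_FNAME   VARCHAR(15) NOT NULL,
    EMP_INITIAL CHAR(1),
    EMP_HIRE    DATE NOT NULL,
    JOB_CODE    CHAR(3) NOT NULL,
    PRIMARY KEY (EMP_NUM),
    FOREIGN KEY (JOB_CODE) REFERENCES JOB);
""")
conn.execute("INSERT INTO JOB VALUES ('502', 'Programmer')")
conn.execute("INSERT INTO EMP_2 "
             "VALUES ('101', 'Smith', 'John', 'K', '2020-01-15', '502')")
try:
    # '999' does not exist in JOB, so the RDBMS rejects the row.
    conn.execute("INSERT INTO EMP_2 "
                 "VALUES ('102', 'Doe', 'Jane', NULL, '2021-03-01', '999')")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
print(fk_enforced)   # True
```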
Data Manipulation CommandsAdding table rowsSaving table
changesListing table rowsUpdating table rowsRestoring table
contentsDeleting table rowsInserting table rows with a select
subquery
*
*
*
Data Manipulation
CommandsINSERTSELECTCOMMITUPDATEROLLBACK DELETE
*
Adding Table RowsINSERT Used to enter data into tableSyntax:
INSERT INTO tablename VALUES (value1, value2, … , valueN);
*
Adding Table Rows (cont..)When entering values, notice that:Row
contents are entered between parenthesesCharacter and date
values are entered between apostrophesNumerical entries are not
enclosed in apostrophesAttribute entries are separated by
commasA value is required for each columnUse NULL for unknown
values
*
Saving Table ChangesChanges made to table contents are not
physically saved on disk until:Database is closedProgram is
closedCOMMIT command is usedSyntax:COMMIT [WORK];Will
permanently save any changes made to any table in the database
*
Listing Table RowsSELECT Used to list contents of tableSyntax:
SELECT columnlist FROM tablename;Columnlist represents one or
more attributes, separated by commasAsterisk can be used as
wildcard character to list all attributes
Listing Table Rows (cont..)
*
*
Updating Table RowsUPDATE Modify data in a tableSyntax: UPDATE
tablename SET columnname = expression [, columnname =
expression] [WHERE conditionlist];If more than one attribute is
to be updated in row, separate corrections with commas
*
Restoring Table ContentsROLLBACK Used to restore database to its
previous conditionOnly applicable if COMMIT command has not been
used to permanently store changes in
databaseSyntax:ROLLBACK;COMMIT and ROLLBACK only work with
manipulation commands that are used to add, modify, or delete
table rows
*
Deleting Table RowsDELETE Deletes a table rowSyntax: DELETE FROM
tablename [WHERE conditionlist];WHERE condition is optionalIf
WHERE condition is not specified, all rows from specified table
will be deleted
*
Inserting Table Rows with a SELECT SubqueryINSERTInserts
multiple rows from another table (source)Uses SELECT
subquerySubquery: query that is embedded (or nested) inside
another querySubquery is executed firstSyntax: INSERT INTO
tablename SELECT columnlist FROM tablename;
*
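The INSERT/COMMIT/UPDATE/ROLLBACK/DELETE cycle can be sketched with sqlite3 (the PRODUCT row and values are invented for illustration):

```python
import sqlite3

# isolation_level=None gives manual BEGIN/COMMIT/ROLLBACK control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE PRODUCT "
             "(P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT, P_PRICE REAL)")

conn.execute("BEGIN")
conn.execute("INSERT INTO PRODUCT VALUES ('11QER/31', 'Power painter', 109.99)")
conn.execute("COMMIT")                             # saved permanently

conn.execute("BEGIN")
conn.execute("UPDATE PRODUCT SET P_PRICE = 0.0 WHERE P_CODE = '11QER/31'")
conn.execute("ROLLBACK")                           # undo the uncommitted UPDATE

# WHERE limits the DELETE; this condition matches no rows.
conn.execute("DELETE FROM PRODUCT WHERE P_CODE = 'NO-SUCH'")

price = conn.execute("SELECT P_PRICE FROM PRODUCT "
                     "WHERE P_CODE = '11QER/31'").fetchone()[0]
print(price)   # 109.99: the committed INSERT survives, the UPDATE does not
```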
SELECT Queries
Fine-tune the SELECT command by adding restrictions to the search criteria using:
- Conditional restrictions
- Arithmetic operators
- Logical operators
- Special operators

Selecting Rows with Conditional Restrictions
- Select partial table contents by placing restrictions on the rows to be included in the output
- Add conditional restrictions to the SELECT statement using the WHERE clause
- Syntax: SELECT columnlist FROM tablelist [WHERE conditionlist];
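A minimal example of a conditional restriction, assuming the illustrative PRODUCT table used earlier:

```sql
-- Only rows satisfying the WHERE condition appear in the output
SELECT P_CODE, P_DESCRIPT, P_PRICE
  FROM PRODUCT
 WHERE P_PRICE < 10;
```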
Arithmetic Operators: The Rules of Precedence
1. Perform operations within parentheses
2. Perform power operations
3. Perform multiplications and divisions
4. Perform additions and subtractions
(See Table 8.7 in your book.)

Logical Operators: AND, OR, and NOT
- Searching data often involves multiple conditions
- The logical operators AND, OR, and NOT can be combined
- Parentheses are placed to enforce precedence order; conditions in parentheses are always evaluated first
- Boolean algebra is the mathematical field dedicated to the use of logical operators
- NOT negates the result of a conditional expression
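A sketch of how parentheses enforce the evaluation order when combining logical operators (vendor codes are illustrative):

```sql
-- Without the parentheses, AND would bind more tightly than OR and
-- the price test would apply only to the second vendor condition
SELECT P_CODE, P_DESCRIPT, P_PRICE, V_CODE
  FROM PRODUCT
 WHERE (V_CODE = 21344 OR V_CODE = 24288)
   AND P_PRICE > 5.00;
```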
Special Operators
- BETWEEN: checks whether an attribute value is within a range
- IS NULL: checks whether an attribute value is null
- LIKE: checks whether an attribute value matches a given string pattern
- IN: checks whether an attribute value matches any value within a value list
- EXISTS: checks whether a subquery returns any rows

Advanced Data Definition Commands
- All changes in table structure are made with the ALTER command
- Three options: ADD (adds a column), MODIFY (changes column characteristics), DROP (deletes a column)
- ALTER can also be used to add or remove table constraints

Changing a Column's Data Type
- ALTER can be used to change a column's data type
- Some RDBMSs do not permit changes to data types unless the column is empty

Changing a Column's Data Characteristics
- Use ALTER to change data characteristics
- Changes to a column's characteristics are permitted if they do not alter the existing data type

Adding and Dropping a Column
- Use ALTER to add a column; do not include the NOT NULL clause for the new column
- Use ALTER to drop a column; some RDBMSs impose restrictions on the deletion of an attribute
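The special operators and the ALTER options can be sketched as follows (values and column names are illustrative; MODIFY is Oracle-style syntax and some RDBMSs use ALTER COLUMN instead):

```sql
-- Special operators
SELECT * FROM PRODUCT WHERE P_PRICE BETWEEN 5.00 AND 15.00;
SELECT * FROM PRODUCT WHERE V_CODE IS NULL;
SELECT * FROM PRODUCT WHERE P_DESCRIPT LIKE 'Claw%';
SELECT * FROM PRODUCT WHERE V_CODE IN (21344, 24288);

-- ALTER: add, modify, then drop a column
ALTER TABLE PRODUCT ADD P_SALECODE CHAR(1);
ALTER TABLE PRODUCT MODIFY P_DESCRIPT VARCHAR(50);
ALTER TABLE PRODUCT DROP COLUMN P_SALECODE;
```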
Summary
- SQL commands can be divided into two overall categories: data definition language (DDL) commands and data manipulation language (DML) commands
- The ANSI standard data types are supported by all RDBMS vendors, though in different ways
- Basic data definition commands allow you to create tables, indexes, and views

Summary (cont..)
- DML commands allow you to add, modify, and delete rows from tables
- The basic DML commands are SELECT, INSERT, UPDATE, DELETE, COMMIT, and ROLLBACK
- The SELECT statement is the main data retrieval command in SQL

Summary (cont..)
- The WHERE clause can be used with SELECT, UPDATE, and DELETE statements
- Aggregate functions are special functions that perform arithmetic computations over a set of rows
- The ORDER BY clause is used to sort the output of a SELECT statement; it can sort by one or more columns, in ascending or descending order

Summary (cont..)
- Join the output of multiple tables with the SELECT statement
- A join is performed every time you specify two or more tables in the FROM clause
- If no join condition is specified, the DBMS performs a Cartesian product
- A natural join uses a join condition to match only rows with equal values in the specified columns
- Right outer join and left outer join select rows with no matching values in the other related table

Advanced Data Updates
- The UPDATE command updates only data in existing rows
- If a relationship exists between the entries and the existing columns, values can be assigned to those slots
- Arithmetic operators are useful in data updates
- In Oracle, the ROLLBACK command undoes uncommitted changes, such as those made by recent UPDATE statements

Copying Parts of Tables
- SQL permits copying the contents of selected table columns, so data need not be re-entered manually into the newly created table(s)
- First create the table structure, then add rows to the new table using table rows from another table
Adding Primary and Foreign Key Designations
- When a table is copied, the integrity rules do not copy, so primary and foreign keys must be defined manually on the new table
- Use the ALTER TABLE command
- Syntax: ALTER TABLE tablename ADD PRIMARY KEY (fieldname);
- For a foreign key, use FOREIGN KEY in place of PRIMARY KEY

Deleting a Table from the Database
- DROP deletes a table from the database
- Syntax: DROP TABLE tablename;
- A table can be dropped only if it is not the "one" side of any relationship; otherwise the RDBMS generates a foreign key integrity violation error message

Advanced SELECT Queries
- Logical operators work well in the query environment
- SQL provides useful functions that count, find minimum and maximum values, calculate averages, and so on
- SQL allows the user to limit queries to entries having no duplicates, or entries whose duplicates may be grouped

Ordering a Listing
- The ORDER BY clause is useful when the listing order is important
- Syntax: SELECT columnlist FROM tablelist [WHERE conditionlist] [ORDER BY columnlist [ASC | DESC]];
- Ascending order is the default
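A sketch of re-establishing keys on a copied table; the VENDOR table is an illustrative assumption, and the foreign key form without a column list (which references the target's primary key) is Oracle-style syntax:

```sql
-- Re-establish integrity rules lost when the table was copied
ALTER TABLE PRODUCT ADD PRIMARY KEY (P_CODE);
ALTER TABLE PRODUCT ADD FOREIGN KEY (V_CODE) REFERENCES VENDOR;
```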
Listing Unique Values
- The DISTINCT clause produces a list of only those values that are different from one another
- Example: SELECT DISTINCT V_CODE FROM PRODUCT;
- Access places nulls at the top of the list; Oracle places them at the bottom
- The placement of nulls does not affect the list contents
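A minimal sketch of ORDER BY and DISTINCT together, using the illustrative PRODUCT table:

```sql
-- Sort by price, highest first
SELECT P_CODE, P_DESCRIPT, P_PRICE
  FROM PRODUCT
 ORDER BY P_PRICE DESC;

-- List each vendor code only once
SELECT DISTINCT V_CODE FROM PRODUCT;
```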
Aggregate Functions
- COUNT tallies the number of non-null values of an attribute; it takes one parameter, usually a column name
- MAX and MIN find the highest (lowest) value in a table; compute the MAX value in an inner query and compare it to each value returned by the outer query
- SUM computes the total sum for any specified attribute
- The AVG function format is similar to MIN and MAX

Aggregate Functions (cont..)
(Figure 8.21: COUNT function output examples)
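The aggregate functions can be sketched as follows; P_QOH (quantity on hand) is an illustrative column, not from the slides:

```sql
SELECT COUNT(V_CODE) FROM PRODUCT;                    -- counts non-null values only
SELECT MIN(P_PRICE), MAX(P_PRICE) FROM PRODUCT;       -- lowest and highest price
SELECT SUM(P_PRICE * P_QOH) AS TOTVALUE FROM PRODUCT; -- total inventory value
SELECT AVG(P_PRICE) FROM PRODUCT;                     -- average price
```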
Aggregate Functions (cont..)
(Figure 8.22: MIN and MAX output examples; Figure 8.23: the total value of all items in the PRODUCT table; Figure 8.24: AVG function output examples)

Grouping Data
- Frequency distributions are created with the GROUP BY clause within the SELECT statement
- Syntax: SELECT columnlist FROM tablelist [WHERE conditionlist] [GROUP BY columnlist] [HAVING conditionlist] [ORDER BY columnlist [ASC | DESC]];
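A sketch of the GROUP BY / HAVING syntax above, again using the illustrative PRODUCT table:

```sql
-- Average price per vendor, keeping only vendors whose average exceeds 5.00
SELECT V_CODE, AVG(P_PRICE)
  FROM PRODUCT
 GROUP BY V_CODE
HAVING AVG(P_PRICE) > 5.00;
```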
Grouping Data (cont..)
(Figure 8.25: GROUP BY clause output examples; Figure 8.27: an application of the HAVING clause)

Virtual Tables: Creating a View
- A view is a virtual table based on a SELECT query; create a view with the CREATE VIEW command
- Special characteristics of a relational view: the name of a view can be used anywhere a table name is expected; the view is dynamically updated; it restricts users to only the specified columns and rows; views may be used as the basis for reports
(Figure 8.28: creating a virtual table using the CREATE VIEW command)
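A minimal sketch of a view; the view name and the 50.00 cutoff are illustrative:

```sql
-- A virtual table restricted to selected columns and rows
CREATE VIEW PRICEGT50 AS
  SELECT P_CODE, P_DESCRIPT, P_PRICE
    FROM PRODUCT
   WHERE P_PRICE > 50.00;

-- The view name can then be used anywhere a table name is expected
SELECT * FROM PRICEGT50;
```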
Joining Database Tables
- The ability to combine (join) tables on common attributes is the most important distinction between a relational database and other databases
- A join is performed when data are retrieved from more than one table at a time, typically through an equality comparison between the foreign key of one table and the primary key of the related table
- Join tables by listing them in the FROM clause of the SELECT statement; without a join condition, the DBMS creates the Cartesian product of the tables

Joining Tables with an Alias
- An alias identifies the source table from which the data are taken
- Any legal table name can be used as an alias; add the alias after the table name in the FROM clause: FROM tablename alias

Recursive Joins
- An alias is especially useful when a table must be joined to itself (a recursive query); use aliases to differentiate the table from itself

Outer Joins
- Two types of outer join: left outer join and right outer join
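The join and alias ideas above can be sketched as follows; the VENDOR table, V_NAME, and the EMP columns are illustrative assumptions:

```sql
-- Join on the common attribute, using aliases for brevity
SELECT P.P_CODE, P.P_DESCRIPT, V.V_NAME
  FROM PRODUCT P, VENDOR V
 WHERE P.V_CODE = V.V_CODE;

-- Recursive (self) join: aliases differentiate the two copies of EMP
SELECT E.EMP_NAME, M.EMP_NAME AS MANAGER
  FROM EMP E, EMP M
 WHERE E.EMP_MGR = M.EMP_NUM;
```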
Grouping Data (cont..)
(Figure 8.26: incorrect and correct use of the GROUP BY clause)

Lect10-Conceptual, Logical and Physical.ppt
Database Principles: Fundamentals of Design, Implementations and Management
Lecture 10 - CHAPTER 11: CONCEPTUAL, LOGICAL AND PHYSICAL DATABASE DESIGN
In this chapter, you will learn:
- About the three stages of database design: conceptual, logical, and physical
- How to design a conceptual model to represent the business and its key functional areas
- How the conceptual model can be transformed into a logically equivalent set of relations
- How to translate the logical data model into a set of specific DBMS table specifications
- About different types of file organization
- How indexes can be applied to improve data access and retrieval
- How to estimate data storage requirements

Database Design
- It is necessary to focus on the data and to concentrate on the data characteristics required to build the database model
- At this point there are two views of the data within the system: the business view of data as an information source, and the designer's view of the data structure, its access, and the activities required to transform the data into information

Database Design (cont..)
To complete the design phase, remember these points:
- The process of database design is loosely related to the analysis and design of the larger system; the data component is only one element of that larger system
- Systems analysts or systems programmers are in charge of designing the other system components; their activities create procedures that help transform the data within the database into useful information
- Database design is not a sequential process; it is an iterative process that provides continuous feedback designed to trace previous steps

The Three Stages of Database Design
I. Conceptual Design (CD)
- In conceptual design, data modeling is used to create an abstract database structure that represents real-world objects in the most realistic way possible
- The conceptual design must embody a clear understanding of the business and its functional areas
- Ensure that all data needed are in the model, and that all data in the model are needed
- Requires four steps: data analysis and requirements; entity relationship modeling and normalization; data model verification; distributed database design

I. Conceptual Design (cont..)
Data Analysis and Requirements
- The first step is to discover the data element characteristics, obtaining them from different sources
- Must take into account the business rules, derived from the description of operations: a document that provides a precise, detailed, up-to-date, and thoroughly reviewed description of the activities that define the organization's operating environment
I. Conceptual Design (cont..)
Entity Relationship (ER) Modeling and Normalization
- The designer must communicate and enforce appropriate standards to be used in the documentation of the design: the use of diagrams and symbols, documentation writing style, layout, and other conventions to be followed during documentation
(See Fig. 11.2 in your book.)
I. Conceptual Design (cont..)
Entity Relationship (ER) Modeling and Normalization (cont..)
- Data dictionary: defines all objects (entities, attributes, relations, views, and so on)
- Used in tandem with the normalization process to help eliminate data anomalies and redundancy problems
I. Conceptual Design (cont..)
The Data Model Verification
- The ER model must be verified against the proposed system processes to confirm that the intended processes can be supported by the database model
- A revision of the original design starts with a careful re-evaluation of the entities, followed by a detailed examination of the attributes that describe those entities
- Define the design's major components as modules: a module is an information system component that handles a specific function
I. Conceptual Design (cont..)
Data Model Verification (cont..)
- The verification process starts with selecting the central (most important) entity, defined in terms of its participation in most of the model's relationships
- The next step is to identify the module or subsystem to which the central entity belongs and to define that module's boundaries and scope
- Once the module is identified, the central entity is placed within the module's framework

I. Conceptual Design (cont..)
Distributed Database Design
- Portions of the database may reside in different physical locations
- The designer must also develop data distribution and allocation strategies

II. DBMS Software Selection
- The selection of the DBMS software is critical to an information system's smooth operation; advantages and disadvantages should be studied carefully
- Common factors that may affect the purchasing decision: cost; DBMS features and tools; the underlying model (hierarchical, network, relational, etc.); portability; DBMS requirements
III. Logical Design
- Used to translate the conceptual design into the internal model for the selected database management system; logical design is therefore software-dependent
- Requires that all objects in the model be mapped to the specific constructs used by the selected database software

III. Logical Design (cont..)
The logical design stage consists of the following phases:
- Creating the logical data model
- Validating the logical data model using normalization
- Assigning and validating integrity constraints
- Merging the logical models constructed for different parts of the database
- Reviewing the logical data model with the user
III. Logical Design (cont..)
Review the Complete Logical Model with the User
- Review the completed logical model with the users to ensure that all the data requirements have been modelled
- Ensure that all the transactions are supported within the different user views
- This stage is very important, as any problems need to be solved before beginning the physical database design stage

IV. Physical Database Design
- Physical database design requires the definition of the specific storage and access methods that will be used by the database
- Involves the translation of the logical model into a set of specific DBMS specifications for storing and accessing data
- The ultimate goal is to ensure that data storage is effective, to ensure integrity and security, and efficient in terms of query response time
IV. Physical Design (cont..)
- The process of selecting the data storage and data access characteristics of the database
- The storage characteristics are a function of the device types supported by the hardware, the types of data access methods supported by the system, and the DBMS
- Particularly important in the older hierarchical and network models; becomes more complex when data are distributed at different locations

IV. Physical Database Design (cont..)
The following information needs to have been collected:
- A set of normalized relations devised from the ER model and the normalization process
- An estimate of the volume of data that will be stored in each database table, and the usage statistics
- An estimate of the physical storage requirements for each field (attribute) within the database
- The physical storage characteristics of the DBMS being used
Stages of Physical Database Design
- Analysing data volume and database usage
- Translating each relation identified in the logical data model into a table
- Determining a suitable file organization
- Defining indexes
- Defining user views
- Estimating data storage requirements
- Determining database security for users

Analysing Data Volume and Database Usage
The steps required to carry out this phase are:
- Identifying the most frequent and critical transactions
- Analysing the critical transactions to determine which relations in the database participate in them
Analysing Data Volume and Database Usage (cont..)
- Data volume and data usage statistics are usually shown on a simplified version of the ERD
- This diagram is known as a composite usage map or a transaction usage map

Translate Logical Relations into Tables
- Identify the primary and any foreign keys for each table
- Identify those attributes which are not allowed to contain NULL values and those which should be UNIQUE; the primary key attribute(s) can be excluded here, as the PRIMARY KEY constraint automatically imposes the NOT NULL and UNIQUE constraints

Translate Logical Relations into Tables (cont..)
For each relation you should:
- Identify each attribute name and its domain from the data dictionary
- Note any attributes which require DEFAULT values to be inserted whenever new rows are added to the database
- Determine any attributes that require a CHECK constraint in order to validate the value of the attribute
Determine Suitable File Organisation
- Selecting the most suitable file organization is very important to ensure that the data are stored efficiently and can be retrieved as quickly as possible
- To do this, the DBMS must know where each record is stored and how it can be identified
- Consider the future growth of the database and whether the type of file organization provides some protection against data loss
Determine Suitable File Organisation (cont..)
There are three categories of file organization:
- Files containing randomly ordered records, known as heap files
- Files sorted on one or more fields, such as file organizations based on indexes
- Files hashed on one or more fields, known as hash files

Sequential File Organizations
- Records are stored in a sequence based on the value of one or more fields, which is often the primary key
- To locate a specific record, the file must be searched and every record read in turn until the required record is found
Heap File Organizations
- Records are unordered and inserted into the file as they arrive
- Only used when a large quantity of data needs to be inserted into a table for the first time

Indexed File Organizations
- Records can be stored in a sorted or unsorted sequence, and an index is created to locate specific records quickly
Determine Suitable File Organisation (cont..)
Types of Indexes
- Primary index: placed on unique fields such as the primary key
- Secondary index: can be placed on any field in the file that is unordered
- Multi-level index: used where one index becomes too large, so it is split into a number of separate indexes in order to reduce the search
Determine Suitable File Organisation (cont..)
B-trees
- Balanced trees, or B-trees, are used to maintain an ordered set of indexes or data, allowing efficient select, delete, and insert operations
- A special kind of B-tree is the B+-tree, where all keys reside in the leaves; this tree is most often used to represent indexes, which act as a 'road map' so that each index entry can be located quickly
Determine Suitable File Organisation (cont..)
(Figure slide: conceptual design of the DVD rental store)
Determine Suitable File Organisation (cont..)
Bitmap Indexes
- Bitmap indexes are usually applied to attributes which are sparse in their given domain
- A two-dimensional array is constructed, with one row for each row in the table being indexed and one column for each distinct value of the indexed attribute
- The array therefore represents each distinct value within the index multiplied by the number of rows in the table
Determine Suitable File Organisation (cont..)
Bitmap indexes are usually used when:
- A column in the table has low cardinality
- The table is large and is not often used for data manipulation activities
- Specific SQL queries reference a number of low-cardinality values in their WHERE clauses

Join Index
- Can be applied to columns from two or more tables whose values come from the same domain
- Often referred to as a bitmap join index; it is a way of saving space by reducing the volume of data that must be joined
- The bitmap join stores the ROWIDs of corresponding rows in a separate table
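As a sketch of creating a bitmap index in Oracle-style SQL (the index name and the GENRE column are illustrative assumptions; the DVD table is from the chapter's rental-store example):

```sql
-- Bitmap index on a low-cardinality column of a large, rarely updated table
CREATE BITMAP INDEX DVD_GENRE_BIX ON DVD (GENRE);
```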
Determine Suitable File Organisation (cont..)
Hashed File Organizations
- Uses a hashing algorithm to map a primary key value onto a specific record address in the file
- Records are stored in a random order throughout the file; often referred to as random or direct files

Define Indexes
- In SQL, indexes are created using the CREATE INDEX statement
- For example, to create a primary index on the DVD_ID primary key field of the DVD table: CREATE UNIQUE INDEX DVDINDEX ON DVD(DVD_ID);

Define Indexes (cont..)
As a general rule, indexes are likely to be used:
- When an indexed column appears by itself in the search criteria of a WHERE or HAVING clause
- When an indexed column appears by itself in a GROUP BY or ORDER BY clause
- When a MAX or MIN function is applied to an indexed column
- When the data sparsity on the indexed column is high

Guidelines for Creating Indexes
- Create indexes for each single attribute used in a WHERE, HAVING, ORDER BY, or GROUP BY clause
- Do not use indexes on small tables or tables with low sparsity
- Declare primary and foreign keys so the query optimizer within the specific DBMS can use the indexes in join operations
- Declare indexes on join columns other than PK/FK

Define User Views
- During the conceptual design stage the different user views required for the database are determined; using the relations defined in the logical data model, these views must now be defined
- Views are often defined with database security in mind, as they can help to define the roles of different types of users
Estimate Data Storage Requirements

Security
Data must be protected from access by unauthorized users. Provisions include:
- Physical security
- Password security
- Access rights
- Audit trails
- Data encryption
- Diskless workstations

Determine Database Security for Users
- During physical database design the security requirements must be implemented
- Database privileges for users will need to be established; for example, privileges may include selecting rows from specified tables or views, or being able to modify or delete data in specified tables

Security in ORACLE
The SQL commands GRANT and REVOKE are used to authorize or withdraw privileges on specific user accounts. For example, the following two SQL statements grant the account with the username 'Craig' the ability to select rows from the DVD table and the ability to create tables:
  GRANT SELECT ON DVD TO Craig;
  GRANT CREATE TABLE TO Craig;
  • 73.
Security in ORACLE (cont..)
Removing these privileges can be done using the following SQL statements:
  REVOKE SELECT ON DVD FROM Craig;
  REVOKE CREATE TABLE FROM Craig;

Security in ORACLE (cont..)
- A role is simply a collection of privileges referred to under a single name
- The major benefit of roles is that a DBA can add or revoke privileges from a role at any time; the changes then automatically apply to all the users who have been assigned that role
  • 74.
Security in ORACLE (cont..)
For example, in the DVD rental store, the sales staff need to perform SELECT and UPDATE operations on the CUSTOMER table. The SQL command CREATE ROLE is used to create the role STAFF_CUSTOMER_ROLE:
  CREATE ROLE STAFF_CUSTOMER_ROLE;
Once created, privileges can then be granted on selected database objects to the new role. For example:
  GRANT SELECT ON CUSTOMERS TO STAFF_CUSTOMER_ROLE;
  GRANT UPDATE ON CUSTOMERS TO STAFF_CUSTOMER_ROLE;
The last stage then involves granting the role to individual user accounts, e.g. Frank:
  GRANT STAFF_CUSTOMER_ROLE TO Frank;
  • 75.
Summary
- Conceptual database design is where the conceptual representation of the database is created by producing a data model which identifies the relevant entities and relationships within the system

Summary (cont..)
- Logical database design is the second stage in the Database Life Cycle, where relations are designed based on each entity and its relationships within the conceptual model
  • 76.
Summary (cont..)
- Physical database design is where the logical data model is mapped onto the physical database tables to be implemented in the chosen DBMS
- The ultimate goal is to ensure that data storage is used effectively, to ensure integrity and security, and to improve efficiency in terms of query response time

Summary (cont..)
- Selecting a suitable file organization is important for fast data retrieval and efficient use of storage space
- Indexes are crucial in speeding up data access; they facilitate searching, sorting, using aggregate functions, and even join operations

Chap 2.pptx
Database Principles: Fundamentals of Design, Implementations and Management
CHAPTER 2: DATA MODELS
  • 77.
Database Principles 2nd Ed., Coronel, Morris, Rob & Crockett © 2013 Cengage Learning EMEA, 20/03/2017

In this chapter, you will learn:
- Why data models are important
- About the basic data-modeling building blocks
- What business rules are and how they influence database design
- How the major data models evolved
- How data models can be classified by level of abstraction

The Importance of Data Models
- Data models are relatively simple representations, usually graphical, of complex real-world data structures
- They facilitate interaction among the designer, the applications programmer, and the end user
- End users have different views of and needs for data; the data model organizes the data for the various users
  • 78.
Data Model Basic Building Blocks
- Entity: anything about which data are to be collected and stored
- Attribute: a characteristic of an entity
- Relationship: describes an association among entities; one-to-many (1:*), many-to-many (*:*), or one-to-one (1:1)
- Constraint: a restriction placed on the data

Business Rules
- A business rule is a brief, precise, and unambiguous description of a policy, procedure, or principle within a specific organization
- Business rules apply to any organization that stores and uses data to generate information
- They are descriptions of operations that help to create and enforce actions within that organization's environment
- They must be rendered in writing, kept up to date, easy to understand, and widely disseminated; sometimes they are external to the organization
  • 79.
- Business rules describe characteristics of the data as viewed by the company

Discovering Business Rules
Sources of business rules:
- Company managers, policy makers, and department managers
- Written documentation: procedures, standards, operations manuals
- Direct interviews with end users

Translating Business Rules into Data Model Components
- Business rules standardize the company's view of data and constitute a communications tool between users and designers
- They allow the designer to understand the nature, role, and scope of the data, to understand the business processes, and to develop appropriate relationship participation rules and constraints
- They promote the creation of an accurate data model
- Generally, nouns translate into entities and verbs translate into relationships among entities; relationships are bi-directional

The Evolution of Data Models
- Hierarchical
- Network
- Relational
- Entity relationship
- Object oriented (OO)
The Hierarchical Model
- Developed in the 1960s to manage large amounts of data for complex manufacturing projects
- The basic logical structure is represented by an upside-down "tree"; the hierarchical structure contains levels, or segments
- Depicts a set of one-to-many (1:*) relationships between a parent and its child segments: each parent can have many children, but each child has only one parent

The Hierarchical Model (cont..)
Advantages:
- Many of the hierarchical data model's features formed the foundation for current data models
- Its database application advantages are replicated, albeit in a different form, in current database environments
- It generated a large installed (mainframe) base and created a pool of programmers who developed numerous tried-and-true business applications

The Hierarchical Model (cont..)
Disadvantages:
- Complex to implement and difficult to manage
- Lacks structural independence
- Implementation limitations and lack of standards

The Network Model
Created to:
- Represent complex data relationships more effectively than the hierarchical model
- Improve database performance
- Impose a database standard
While the network model is not used today, the standard database concepts it defined are still used by modern data models, such as:
- Schema: the conceptual organization of the entire database as viewed by the database administrator
The Network Model (cont..)
Subschema: Defines the database portion "seen" by the application programs that actually produce the desired information from the data contained within the database.
Data Manipulation Language (DML): Defines the environment in which data can be managed and is used to work with the data in the database.
Schema Data Definition Language (DDL): Enables the database administrator to define the schema components.

The Network Model (cont..)
Disadvantages: Too cumbersome. The lack of ad hoc query capability put heavy pressure on programmers. Any structural change in the database could produce havoc in all application programs that drew data from the database.
Many database old-timers can recall the interminable information delays.

The Relational Model
Developed by Codd (IBM) in 1970, and considered ingenious but impractical at the time: although conceptually simple, computers lacked the power to implement the relational model. Today, even microcomputers can run sophisticated relational database software called a Relational Database Management System (RDBMS), e.g. Oracle, originally mainframe relational software. An RDBMS performs the same basic functions provided by hierarchical and network DBMS systems, in addition to a host of other functions. The most important advantage of the RDBMS is its ability to hide the complexities of the relational model from the user.

The Relational Model (cont..)
Table: A matrix consisting of a series of row/column intersections.
Tables, also called relations, are related to each other through the sharing of a common field (a common entity characteristic).
Relational diagram: A representation of the relational database's entities, the attributes within those entities, and the relationships between those entities.

The Relational Model (cont..)
Relational table: Stores a collection of related entities and resembles a file. A relational table is purely a logical structure: how the data are physically stored in the database is of no concern to the user or the designer. This property became the source of a real database revolution.
The Relational Model (cont..)
Another reason for the relational model's rise to dominance is its powerful and flexible query language. Structured Query Language (SQL) allows the user to specify what must be done without specifying how it must be done. An SQL-based relational database application involves three parts: a user interface, a set of tables stored in the database, and an SQL engine.
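The declarative nature of SQL described above (specify what, not how) can be shown with a minimal sketch using Python's built-in sqlite3 module. The agent table and its rows are invented for illustration; they are not from the book.

```python
import sqlite3

# In-memory database; the table and its contents are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agent (agent_code INTEGER PRIMARY KEY, agent_name TEXT)")
conn.executemany("INSERT INTO agent VALUES (?, ?)",
                 [(501, "Alex"), (502, "Leah"), (503, "Marie")])

# Declarative: the statement says WHAT rows are wanted; the SQL engine
# decides HOW to fetch them (scan vs. index, ordering strategy, ...).
rows = conn.execute(
    "SELECT agent_name FROM agent WHERE agent_code > 501 ORDER BY agent_name"
).fetchall()
print(rows)  # [('Leah',), ('Marie',)]
```

The three parts the slide lists are all visible here: the script plays the role of the user interface, the agent table holds the data, and SQLite's engine executes the query.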
The Entity Relationship Model
A widely accepted and adapted graphical tool for data modeling, introduced by Peter Chen in 1976 as a graphical representation of entities and their relationships in a database structure. More recently, the class diagram component of the Unified Modeling Language (UML) has been used to produce entity relationship models.

The Entity Relationship Model (cont..)
Entity relationship diagram (ERD): Uses graphic representations to model database components. An entity is mapped to a relational table; an entity instance (or occurrence) is a row in the table. Each entity is described by a set of attributes that describe particular characteristics of the entity. An entity set is a collection of like entities. Connectivity labels the types of relationships (1:1, 1:M, M:M).
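A hedged sketch of how an ERD maps to relational tables: each entity becomes a table, each entity instance becomes a row, and 1:M connectivity becomes a foreign key on the "many" side. The professor/class names are illustrative assumptions (in the spirit of the Tiny College figures), not the book's exact schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# 1:M relationship: one PROFESSOR teaches many CLASSes, so CLASS
# carries a foreign key referencing PROFESSOR's primary key.
conn.execute("CREATE TABLE professor (prof_id INTEGER PRIMARY KEY, prof_name TEXT)")
conn.execute("""CREATE TABLE class (
                    class_id INTEGER PRIMARY KEY,
                    class_name TEXT,
                    prof_id INTEGER REFERENCES professor(prof_id))""")

conn.execute("INSERT INTO professor VALUES (1, 'Chen')")
conn.executemany("INSERT INTO class VALUES (?, ?, ?)",
                 [(10, "DB Principles", 1), (11, "Data Modeling", 1)])

# Each row is one entity instance; the relationship is navigated by a join.
taught = conn.execute("""SELECT p.prof_name, c.class_name
                         FROM professor p
                         JOIN class c ON c.prof_id = p.prof_id
                         ORDER BY c.class_id""").fetchall()
print(taught)  # [('Chen', 'DB Principles'), ('Chen', 'Data Modeling')]
```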
The Entity Relationship Model (cont..)
Fig 2.4 The basic Crow's foot ERD

Data Models: A Summary
Each new data model capitalized on the shortcomings of previous models. Common characteristics that data models must have in order to be widely accepted: conceptual simplicity without compromising the semantic completeness of the database; representing the real world as closely as possible; and representation of real-world transformations (behavior) that complies with the consistency and integrity characteristics of any data model.
Data Models: A Summary (cont..)

Degrees of Data Abstraction
A way of classifying data models. Many processes begin at a high level of abstraction and proceed to an ever-increasing level of detail; designing a usable database follows the same basic process.

Degrees of Data Abstraction (cont..)
In the early 1970s, the American National Standards Institute
(ANSI) Standards Planning and Requirements Committee (SPARC) defined a framework for data modeling based on degrees of data abstraction: external, conceptual, and internal.

The External Model
The end users' view of the data environment. Requires that the modeler subdivide the set of requirements and constraints into functional modules that can be examined within the framework of their external models. Advantages: It is easy to identify the specific data required to support each business unit's operations, and it facilitates the designer's job by providing feedback about the
model's adequacy. The creation of external models also helps to ensure security constraints in the database design and simplifies application program development.

The External Model (cont..)
Fig 2.9 External Models for Tiny College
The Conceptual Model
Represents a global view of the entire database: a representation of the data as viewed by the entire organization. The conceptual model is the basis for the identification and high-level description of the main data objects, avoiding details. The most widely used conceptual model is the entity relationship (ER) model.

The Conceptual Model (cont..)
Fig 2.10 The Conceptual Model for Tiny College
First, the conceptual model provides a relatively easily understood macro-level view of the data environment. Second, the conceptual model is independent of both software and hardware:
It does not depend on the DBMS software used to implement the model, nor on the hardware used in the implementation of the model. Changes in either hardware or DBMS software have no effect on the database design at the conceptual level.

The Internal Model
The representation of the database as "seen" by the DBMS. The internal model maps the conceptual model to the DBMS. The internal schema depicts a specific representation of an internal model.
Fig 2.11 An Internal Model for Tiny College
The Physical Model
Operates at the lowest level of abstraction, describing the way data are saved on storage media such as disks or tapes. It is software and hardware dependent, and it requires that database designers have a detailed knowledge of the hardware and software used to implement the database design.

Summary
A data model is a (relatively) simple abstraction of a complex real-world data environment. The basic data modeling components are: entities,
attributes, relationships, and constraints.

Summary (cont..)
Hierarchical model: Depicts a set of one-to-many (1:*) relationships between a parent and its children segments.
Network data model: Uses sets to represent 1:* relationships between record types.
Relational model: The current database implementation standard. The ER model is a popular graphical tool for data modeling that complements the relational model.

Summary (cont..)
The object is the basic modeling structure of the object oriented data model. The relational model has adopted many object-oriented extensions to become the extended relational data model (ERDM).
Data modeling requirements are a function of different data views (global vs. local) and the level of data abstraction. NoSQL databases are a new generation of databases that do not use the relational model and are geared to support the very specific needs of Big Data organizations. Additional slides are next.

The Object Oriented Model
Models both data and their relationships in a single structure known as an object. The object-oriented data model (OODM) is the basis for the object-oriented database management system (OODBMS); the OODM is said to be a semantic data model.

The Object Oriented Model (cont..)
An object is described by its factual content (like the relational model's entity), but it also includes information about relationships between the facts within the object and relationships with other objects,
unlike the relational model's entity. Subsequent OODM development allowed an object to also contain all of its operations, so the object becomes the basic building block for autonomous structures.

The Object Oriented Model (cont..)
An object is an abstraction of a real-world entity. Attributes describe the properties of an object. Objects that share similar characteristics are grouped in classes, and classes are organized in a class hierarchy. Inheritance is the ability of an object within the class hierarchy to inherit the attributes and methods of the classes above it.

The Object Oriented Model (cont..)
Fig 2.5 A comparison of the OO model and the ER model
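The class-hierarchy and inheritance ideas above can be sketched in a few lines of Python; the Person/Employee/Manager hierarchy is an invented example, not from the book.

```python
# Objects bundle attributes (facts) with methods (operations); classes
# group similar objects, and a class hierarchy lets subclasses inherit
# attributes and methods from the classes above them.
class Person:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"Person {self.name}"

class Employee(Person):              # inherits name and describe()
    def __init__(self, name, salary):
        super().__init__(name)
        self.salary = salary

class Manager(Employee):             # inherits through the whole chain
    def describe(self):              # overrides the inherited operation
        return f"Manager {self.name}"

m = Manager("Dana", 90000)
print(m.describe())  # Manager Dana   (overridden method)
print(m.salary)      # 90000          (attribute inherited from Employee)
```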
Other Models
Extended Relational Data Model (ERDM): A semantic data model developed in response to the increasing complexity of applications. A DBMS based on the ERDM is often described as an object/relational database management system (O/RDBMS), primarily geared to business applications.

Emerging Data Models: Big Data and NoSQL
Big Data refers to a movement to find new and better ways to manage large amounts of Web-generated data and derive business insight from it, while simultaneously providing high performance and scalability at a reasonable cost.

Emerging Data Models: Big Data and NoSQL (cont..)
The relational approach does not always match the needs of organizations with Big Data challenges. It is not always possible to fit unstructured, social media data into the conventional relational structure of rows and columns. Adding millions of rows of multi-format (structured and
nonstructured) data on a daily basis will inevitably lead to the need for more storage, processing power, and sophisticated data analysis tools that may not be available in the relational environment. The type of high-volume implementation required in the RDBMS environment for the Big Data problem comes with a hefty price tag for expanding hardware, storage, and software licenses. Data analysis based on OLAP tools has proven to be very successful in relational environments with highly structured data.

Database Models and the Internet
The Internet drastically changed the role and scope of the database market. The OODM and the ERDM/O-RDM have taken a backseat to the development of databases that interface with the Internet. The dominance of the Web has resulted in a growing need to manage unstructured information.

NoSQL Databases
NoSQL refers to a new generation of databases that address the specific challenges of the Big Data era. They have the following general characteristics:
Not based on the relational model, hence the name NoSQL.
Supports distributed database architectures.
Provides high scalability, high availability, and fault tolerance.
Supports very large amounts of sparse data.
Geared toward performance rather than transaction consistency.

NoSQL Databases (cont..)
The key-value data model is based on a structure composed of two data elements: a key and a value, in which every key has a corresponding value or set of values. The key-value data model is also referred to as the attribute-value or associative data model.

NoSQL Databases (cont..)
The data type of the "value" column is generally a long string, to accommodate the variety of actual data types of the values placed in the column. To add a new entity attribute in the relational model, you need to modify the table definition; to add a new attribute in a key-value store, you simply add a row, which is why the model is said to be "schema-less." NoSQL databases do not store or enforce relationships among entities.
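A toy key-value store in Python illustrates the model just described: every key maps to a value, a new "attribute" is just another key (schema-less), and no relationships are stored or enforced. The KeyValueStore class and its put/read/delete names mirror the generic access commands such stores expose; the implementation itself is an assumption for illustration only.

```python
# Minimal associative (key-value) store sketch: no schema, no relationships.
class KeyValueStore:
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def read(self, key):
        # Missing keys simply return None; nothing is enforced.
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)

store = KeyValueStore()
store.put("customer:101:name", "Smith")
# A brand-new attribute needs no table alteration -- it is just a new key.
store.put("customer:101:tweet", "free-form, unstructured text")

print(store.read("customer:101:name"))   # Smith
store.delete("customer:101:tweet")
print(store.read("customer:101:tweet"))  # None
```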
NoSQL Databases (cont..)
NoSQL databases use their own native application programming interface (API) with simple data access commands, such as put, read, and delete. Indexing and searches can be difficult: because the "value" column in the key-value data model could contain many different data types, it is often difficult to create indexes on the data, and searches can become very complex.

Lecture7_ch07.ppt
Database Principles: Fundamentals of Design, Implementations and Management
CHAPTER 7: Normalizing Database Designs

Objectives
In this chapter, you will learn:
What normalization is and what role it plays in the database design process
About the normal forms 1NF, 2NF, 3NF, BCNF, and 4NF
How normal forms can be transformed from lower normal forms to higher normal forms
That normalization and ER modeling are used concurrently to produce a good database
design
That some situations require denormalization to generate information efficiently

Database Tables and Normalization
Normalization: A process for evaluating and correcting table structures to minimize data redundancies, thereby reducing data anomalies. It works through a series of stages called normal forms: first normal form (1NF), second normal form (2NF), and third normal form (3NF).

Database Tables and Normalization (cont..)
2NF is better than 1NF, and 3NF is better than 2NF. For most business database design purposes, 3NF is as high as normalization needs to go; the highest level of normalization is not always the most desirable. Denormalization produces a lower normal form: the price paid for increased performance is greater data redundancy.

The Need for Normalization
Example: a company that manages building projects. It charges its clients by billing the hours spent on each contract, and the hourly billing rate depends on the employee's position. Periodically, a report is generated that contains information such as that displayed in Table 5.1.
The Need for Normalization (cont..)
The structure of the data set in Figure 7.1 does not handle data very well. The table structure appears to work, and the report is generated with ease. Unfortunately, the report may yield different results depending on what data anomaly has occurred. The relational database environment is suited to helping the designer avoid data integrity problems.

The Normalization Process
Each table represents a single subject. No data item will be unnecessarily stored in more than one table. All attributes in a table are dependent on the primary key. Each table is void of insertion, update, and deletion anomalies. (Void = devoid of.)
The Normalization Process (cont..)
The objective of normalization is to ensure that all tables are in at least 3NF; higher forms are not likely to be encountered in the business environment. Normalization works one relation at a time, progressively breaking a table into a new set of relations based on the identified dependencies.

Conversion to First Normal Form
Repeating group: Derives its name from the fact that a group of multiple entries of the same type can exist for any single key attribute occurrence. A relational table must not contain repeating groups; normalizing the table structure will reduce these data redundancies. The conversion is a three-step procedure.

Conversion to First Normal Form (cont..)
Step 1: Eliminate the Repeating Groups. Present the data in tabular format, where each cell holds a single value and there are no repeating groups. Eliminate nulls: make sure each repeating-group attribute contains an appropriate data value.
Step 2: Identify the Primary Key. The primary key must uniquely identify each attribute value; a new (composite) key may need to be composed.
Conversion to First Normal Form (cont..)
Step 3: Identify All Dependencies. Dependencies can be depicted with the help of a diagram. A dependency diagram depicts all dependencies found within a given table structure. It is helpful in getting a bird's-eye view of all relationships among a table's attributes, and it makes it less likely that an important dependency will be overlooked.

Conversion to First Normal Form (cont..)
First normal form describes a tabular format in which: all key attributes are defined; there are no repeating groups in the table; and all attributes are dependent on the primary key. All relational tables satisfy the 1NF requirements. Some tables contain partial dependencies (dependencies based on only part of the primary key). These are sometimes kept for performance reasons, but they should be used with caution because they leave the table subject to data redundancies.

Conversion to Second Normal Form
A relational database design can be improved by converting the database into second normal form (2NF), in two steps.
Conversion to Second Normal Form (cont..)
Step 1: Write Each Key Component on a Separate Line. Write each key component on a separate line, then write the original (composite) key on the last line. Each component will become the key of a new table.
Step 2: Assign Corresponding Dependent Attributes. Determine which attributes are dependent on each key component. At this point, most anomalies have been eliminated.
Conversion to Second Normal Form (cont..)
A table is in second normal form (2NF) when it is in 1NF and it includes no partial dependencies: no attribute is dependent on only a portion of the primary key.

Conversion to Third Normal Form
The remaining data anomalies are easily eliminated by completing three steps.
Step 1: Identify Each New Determinant. For every transitive dependency, write its determinant as the PK of a new table. A determinant is any attribute whose value determines other values within a row.
Step 2: Identify the Dependent Attributes. Identify the attributes dependent on each determinant identified in Step 1, and identify the dependency. Name each table to reflect its contents and function.
Step 3: Remove the Dependent Attributes from Transitive Dependencies. Eliminate all dependent attributes in the transitive relationship(s) from each of the tables. Draw a new dependency diagram to show all tables defined in Steps 1-3. Check the new tables, as well as the tables modified in Step 3, to make sure that each table has a determinant and that no table contains inappropriate dependencies.
Conversion to Third Normal Form (cont..)
A table is in third normal form (3NF) when both of the following are true: it is in 2NF, and it contains no transitive dependencies.

Improving the Design
Table structures are cleaned up to eliminate the troublesome initial partial and transitive dependencies. Normalization cannot, by itself, be relied on to produce good designs; it is valuable because its use helps eliminate data redundancies.

Improving the Design (cont..)
Issues to address in order to produce a good normalized set of tables: evaluate PK assignments; evaluate naming conventions; refine attribute atomicity; identify new attributes; identify new relationships; refine primary keys as required for data granularity; maintain historical accuracy; evaluate the use of derived attributes.
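The 1NF-to-3NF conversion steps above can be sketched with plain Python structures. The column names (proj_num, emp_num, job_class, chg_hour) are assumptions loosely modeled on the chapter's project-billing example, not the book's exact tables.

```python
# Unnormalized: one row per project with a repeating group of assignments.
unnormalized = [
    {"proj_num": 15, "proj_name": "Evergreen",
     "assignments": [(103, "June", "Elect. Engineer", 84.50),
                     (101, "John", "Programmer", 35.75)]},
]

# 1NF: eliminate the repeating group; the PK becomes (proj_num, emp_num).
first_nf = [
    {"proj_num": p["proj_num"], "proj_name": p["proj_name"],
     "emp_num": e, "emp_name": n, "job_class": j, "chg_hour": r}
    for p in unnormalized for (e, n, j, r) in p["assignments"]
]

# 2NF: remove partial dependencies -- proj_name depends only on proj_num,
# and employee facts depend only on emp_num.
project  = {r["proj_num"]: r["proj_name"] for r in first_nf}
employee = {r["emp_num"]: (r["emp_name"], r["job_class"]) for r in first_nf}
assign   = [(r["proj_num"], r["emp_num"]) for r in first_nf]

# 3NF: remove the transitive dependency emp_num -> job_class -> chg_hour
# by making the determinant job_class the PK of its own table.
job = {r["job_class"]: r["chg_hour"] for r in first_nf}

print(project)  # {15: 'Evergreen'}
print(job)      # {'Elect. Engineer': 84.5, 'Programmer': 35.75}
```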
    * The Boyce-Codd NormalForm (BCNF)Every determinant in table is a candidate keyHas same characteristics as primary key, but for some reason, not chosen to be primary keyWhen table contains only one candidate key, the 3NF and the BCNF are equivalentBCNF can be violated only when table contains more than one candidate key * The Boyce-Codd Normal Form (BCNF) (cont..)Most designers consider the BCNF as special case of 3NFTable is in 3NF when it is in 2NF and there are no transitive dependenciesTable can be in 3NF and fails to meet BCNFNo partial dependencies, nor does it contain transitive dependenciesA nonkey attribute is the determinant of a key attribute The Boyce-Codd Normal Form (BCNF) (cont...) * The Boyce-Codd Normal Form (BCNF) (cont..) *
The Boyce-Codd Normal Form (BCNF)
A table is in BCNF when every determinant in the table is a candidate key, i.e. an attribute (or attribute combination) that has the same characteristics as the primary key but, for some reason, was not chosen to be the primary key. When a table contains only one candidate key, 3NF and BCNF are equivalent; BCNF can be violated only when the table contains more than one candidate key.

The Boyce-Codd Normal Form (BCNF) (cont..)
Most designers consider BCNF a special case of 3NF. A table is in 3NF when it is in 2NF and there are no transitive dependencies, yet a table can be in 3NF and still fail to meet BCNF: it contains no partial dependencies and no transitive dependencies, but a nonkey attribute is the determinant of a key attribute.
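A determinant is any attribute whose value determines other values within a row, and BCNF asks that every determinant be a candidate key. The helper below checks whether a functional dependency X -> Y actually holds in sample rows (no two rows agree on X while disagreeing on Y). The fd_holds function and the sample data are hypothetical illustrations, not from the chapter.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False     # same lhs values, different rhs values
    return True

# Suppose the PK is (stu_id, staff_id) in this invented sample.
rows = [
    {"stu_id": 1, "staff_id": 10, "class_code": "A"},
    {"stu_id": 1, "staff_id": 11, "class_code": "B"},
    {"stu_id": 2, "staff_id": 10, "class_code": "A"},
]

# Here the nonkey attribute class_code determines the key attribute
# staff_id: the table can be in 3NF yet violate BCNF, because the
# determinant class_code is not a candidate key.
print(fd_holds(rows, ["class_code"], ["staff_id"]))  # True
print(fd_holds(rows, ["stu_id"], ["class_code"]))    # False
```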
    * * Fourth Normal Form(4NF)Table is in fourth normal form (4NF) when both of the following are true:It is in 3NF No multiple sets of multivalued dependencies4NF is largely academic if tables conform to following two rules:All attributes dependent on primary key, independent of each otherNo row contains two or more multivalued facts about an entity Fourth Normal Form (4NF) (continued) * * Fourth Normal Form (4NF) Fourth Normal Form (4NF) * * Normalization and Database DesignNormalization should be part of the design processMake sure that proposed entities meet
Fourth Normal Form (4NF)
A table is in fourth normal form (4NF) when both of the following are true: it is in 3NF, and it has no multiple sets of multivalued dependencies. 4NF is largely academic if the tables conform to the following two rules: all attributes are dependent on the primary key but independent of each other, and no row contains two or more multivalued facts about an entity.

Normalization and Database Design
Normalization should be part of the design process. Make sure that proposed entities meet
the required normal form before table structures are created. Many real-world databases have been improperly designed or burdened with anomalies, so you may be asked to redesign and modify existing databases.

Normalization and Database Design (cont.)
ER diagram: Identify the relevant entities, their attributes, and their relationships, and identify additional entities and attributes.
Normalization procedures: Focus on the characteristics of specific entities, a micro view of the entities within the ER diagram. It is difficult to separate the normalization process from the ER modeling process, so the two techniques should be used concurrently.
Figure 7.13 in your book
Figure 7.14 in your book
Figure 7.15 in your book
Database Principles: Fundamentals of Design, Implementations and Management
Lecture 9 - CHAPTER 10: Database Development Process

In this chapter, you will learn:
That successful database design must reflect the information system of which the database is a part
That successful information systems are developed within a framework known as the Systems Development Life Cycle (SDLC)
That within the information system, the most successful databases are subject to frequent evaluation and revision within a framework known as the Database Life Cycle (DBLC)
How to conduct evaluation and revision within the SDLC and DBLC frameworks

In this chapter, you will learn (cont..):
About database design strategies: top-down vs. bottom-up design and centralized vs. decentralized design
Common threats to the security of the data and what security measures could be put in place
The importance of database administration in an organization
The technical and managerial roles of the database administrator (DBA)
The Information System
Provides for data collection, storage, and retrieval. It is composed of people, hardware, software, database(s), application programs, and procedures. Systems analysis is the process that establishes the need for and the extent of an information system; systems development is the process of creating an information system.

The Information System (cont..)
Applications transform data into information that forms the basis for decision making. They usually produce formal reports, tabulations, and graphic displays, and they are composed of two parts: the data, and the code by which the data are transformed into information.
The Information System (cont..)
Information system performance depends on a triad of factors: database design and implementation, application design and implementation, and administrative procedures. Database development is the process of database design and implementation; its primary objective is to create complete, normalized, non-redundant (to the extent possible), and fully integrated conceptual, logical, and physical database models.

The Systems Development Life Cycle (SDLC)
Traces the history (life cycle) of an information system and provides the "big picture" within which database design and application development can be mapped out and evaluated. It is divided into the following five phases: planning, analysis, detailed systems design, implementation, and maintenance. It is an iterative rather than a sequential process.
The Systems Development Life Cycle (SDLC) (cont..)

Planning
Yields a general overview of the company and its objectives. An initial assessment is made of information-flow and information-extent requirements, and the study and evaluation of alternate solutions must begin, including the technical aspects of hardware and software requirements and system cost.

Analysis
The problems defined during the planning phase are examined in greater detail during analysis, with a thorough audit of user requirements. The existing hardware and software systems are studied. The goal is a better understanding of the system's functional areas, the actual and potential problems, and the opportunities.
Analysis (cont..)
Includes the creation of the logical system design, which must specify an appropriate conceptual data model, inputs, processes, and expected output requirements. It might use tools such as data flow diagrams (DFDs), hierarchical input process output (HIPO) diagrams, and entity relationship (ER) diagrams. It yields functional descriptions of the system's components (modules) for each process within the database environment.

Detailed Systems Design
The designer completes the design of the system's processes, including all necessary technical specifications. The steps are laid out for conversion from the old to the new system, and the training principles and methodologies are also planned. The design is submitted for management approval.

Implementation
Hardware, DBMS software, and application programs are
installed, and the database design is implemented. The system enters a cycle of coding, testing, and debugging that continues until it is ready to be delivered. The actual database is created, and the system is customized by the creation of tables and views and by user authorizations.

Maintenance
Maintenance activities can be grouped into three types: corrective maintenance, in response to system errors; adaptive maintenance, due to changes in the business environment; and perfective maintenance, to enhance the system. Computer-assisted systems engineering (CASE) tools make it possible to produce better systems within a reasonable amount of time and at a reasonable cost; CASE-produced applications are structured, documented, and standardized.

The Database Life Cycle (DBLC)
Six phases: Database initial study,
Database design, Implementation and loading, Testing and evaluation, Operation, and Maintenance and evolution.

The Database Initial Study
Overall purpose: analyze the company situation; define problems and constraints; define objectives; define the scope and boundaries. Fig 10.4 (next slide) depicts the interactive and iterative processes required to complete the first phase of the DBLC successfully.

The Database Initial Study (cont..)
Fig 10.4 in your book
  • 123.
Analyze the Company Situation Analysis–To break up any whole into its parts so as to find out their nature, function, and so on Company situation General conditions in which company operates, its organizational structure, and its mission Analyze company situation Discover what company’s operational components are, how they function, and how they interact 20 Define Problems and Constraints Managerial view of company’s operation is often different from that of end users The Database Designer must continue to carefully probe to generate additional information that will help define problems within larger framework of company operations Finding precise answers is important Defining problems does not always lead to perfect solution 21 Define Objectives Designer must ensure that database system objectives correspond to those envisioned by end user(s) Designer must begin to address following questions:
What is the proposed system’s initial objective? Will system interface with other existing or future systems in the company? Will system share data with other systems or users? 22 Define Scope and Boundaries Scope Defines extent of design according to operational requirements Helps define required data structures, type and number of entities, and physical size of database Boundaries Limits external to system Often imposed by existing hardware and software 23 Database Design Necessary to concentrate on data Characteristics required to build database model Two views of data within system: Business view of data as information source Designer’s view of data structure, its access, and activities required to transform data into information 24
Database Design (cont..) Fig 10.5 in your book 25 Database Design (cont..) Loosely related to analysis and design of larger system The systems analysts or systems programmers are in charge of designing other system components Their activities create procedures that will help transform data within database into useful information Does not constitute sequential process Iterative process that provides continuous feedback designed to trace previous steps 26 Database Design (cont..) 27 I. Conceptual Design Overview
Data modeling used to create an abstract database structure that represents real-world objects in most realistic way possible Must embody clear understanding of business and its functional areas Ensure that all data needed are in model, and that all data in the model are needed 28 I. Conceptual Design Overview (cont..) Requires four steps: 1. Data analysis and requirements Discover data element characteristics Obtains characteristics from different sources Take into account business rules Derived from description of operations 2. Entity relationship modeling and normalization Designer enforces standards in design documentation Use of diagrams and symbols, documentation writing style, layout, other conventions 29 30 I. Conceptual Design Overview (cont..) 3. Data model verification Verified against proposed system processes Revision of original design Careful reevaluation of entities
Detailed examination of attributes describing entities Define design’s major components as modules: Module: information system component that handles specific function 31 I. Conceptual Design Overview (cont..) Data model verification (cont…) Verification process Select central (most important) entity Defined in terms of its participation in most of model’s relationships Identify module or subsystem to which central entity belongs and define boundaries and scope Place central entity within module’s framework 32 I. Conceptual Design Overview (cont..) 4. Distributed database design Portions of the database may reside in different physical locations Processes accessing the database vary from one location to another The Designer must also develop data distribution and allocation strategies II. DBMS Software Selection Critical to information system’s smooth operation Common factors affecting purchasing decisions: Cost DBMS features and tools Underlying model
Portability DBMS hardware requirements Advantages and disadvantages should be carefully studied 33 III. Logical Design Overview Used to translate conceptual design into internal model for selected database management system Logical design is software-dependent Requires that all objects in model be mapped to specific constructs used by selected database software Definition of attribute domains, design of required tables, access restriction formats Tables must correspond to entities in conceptual design Translates software-independent conceptual model into software-dependent logical model 34 III. Logical Design Overview (cont..) The logical design stage consists of the following phases: Creating the logical data model. Validating the logical data model using normalization. Assigning and validating integrity constraints. Merging logical models constructed for different parts of the
database together. Reviewing the logical data model with the users. 35 IV. Physical Design Overview Is the process of selecting data storage and data access characteristics of database Storage characteristics are function of device types supported by hardware, type of data access methods supported by system, and DBMS Particularly important in older hierarchical and network models Becomes more complex when data are distributed at different locations 36 IV. Physical Design Overview (cont..) Physical database design can be broken down into a number of stages: Analyze data volume and database usage. Translate each relation identified in the logical data model into tables. Determine a suitable file organization. Define indexes. Define user views. Estimate data storage requirements. Determine database security for users.
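The logical and physical design steps above can be sketched together with SQLite: the logical step maps entities to tables with attribute domains and PK/FK constraints, and the physical step defines an index and a user view. All table, column, and view names here are illustrative assumptions, not taken from the book.

```python
import sqlite3

# Logical design: translate entities into tables, with domains and
# integrity constraints. Physical design: add an index and a user view.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Logical design: one table per entity; the FK implements a 1:M relationship.
conn.execute("""
    CREATE TABLE DEPARTMENT (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES DEPARTMENT(dept_id)
    )""")

# Physical design: index a frequent access path and define a user view.
conn.execute("CREATE INDEX idx_emp_dept ON EMPLOYEE(dept_id)")
conn.execute("""
    CREATE VIEW V_EMP AS
    SELECT e.emp_name, d.dept_name
    FROM EMPLOYEE e JOIN DEPARTMENT d ON e.dept_id = d.dept_id
""")

conn.execute("INSERT INTO DEPARTMENT VALUES (10, 'Sales')")
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Alice', 10)")
print(conn.execute("SELECT * FROM V_EMP").fetchall())  # [('Alice', 'Sales')]
```

In a full multi-user DBMS the same stage would also cover storage parameters and user authorizations (GRANT/REVOKE), which SQLite does not support.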
37 Implementation and Loading New database implementation requires creation of special storage-related constructs to house end-user tables 38 Performance Is one of the most important factors in certain database implementations Not all DBMSs have performance-monitoring and fine-tuning tools embedded in their software Performance evaluation is rendered more difficult as there is no standard measurement for database performance 39 Backup and Recovery Database can be subject to data loss through unintended data deletion and power outages Data backup and recovery procedures Create safety valve Allow database administrator to ensure availability of consistent
data Integrity Enforced through proper use of primary and foreign key rules 40 Company Standards May partially define database standards Database administrator must implement and enforce such standards Database Security Data must be protected from access by unauthorized users Establish security goals - What are we trying to protect the database from? - What security related problems are we trying to prevent? The most common security goals relate to the integrity, confidentiality and the availability of data. 41 Data Security Measures Physical security allows only authorized personnel physical access to specific areas. User authentication is a way of identifying the user and verifying that the user is allowed to access some restricted data or application. Achieved through the use of passwords and access rights. Audit trails are usually provided by the DBMS to check for access violations.
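The password-based authentication just described can be sketched in a few lines; the salted PBKDF2 hashing shown here is one standard technique, not the method any particular DBMS uses, and the function names are illustrative.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    # Store a random salt plus a slow key-derivation digest, never the password.
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify(password, salt, digest):
    # Recompute the digest and compare in constant time.
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)

salt, stored = hash_password("s3cret")
print(verify("s3cret", salt, stored))  # True
print(verify("wrong", salt, stored))   # False
```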
42 Data Security Measures (cont..) Data encryption Can be used to render data useless to unauthorised users. ORACLE DBMS has a Transparent Data Encryption feature User-defined policies and procedures Backup and recovery strategies should be in place in the event of a disaster occurring Antivirus software Firewalls are systems comprising hardware devices or software applications which act as gatekeepers to an organisation’s network. For more details on security measures read the slides after the chapter summary 43 Testing and Evaluation This phase occurs in parallel with applications programming Programmers use database tools to prototype applications during coding of the programs If the DB implementation fails to meet some of system’s evaluation criteria, several options may be considered to enhance the system: Fine-tune specific system and DBMS configuration parameters Modify physical design
Modify logical design Upgrade or change DBMS software and/or hardware platform 44 Operation Once the database has passed the evaluation stage, it is considered operational The beginning of the operational phase starts the process of system maintenance and evolution 45 Maintenance and Evolution Required periodic maintenance: Preventive maintenance (backup) Corrective maintenance (recovery) Adaptive maintenance Assignment of access permissions and their maintenance for new and old users Generation of database access statistics Periodic security audits Periodic system-usage summaries 46 Parallel Activities in the DBLC and the SDLC
47 Summary Information system is designed to facilitate transformation of data into information and to manage both data and information SDLC traces history (life cycle) of an application within the information system DBLC describes history of database within the information system Database design and implementation process moves through series of well-defined stages Conceptual portion of design may be subject to several variations, based on two design philosophies 48 Summary (cont..) Threats to database security include the loss of integrity, confidentiality and availability of data. The database administrator (DBA) is responsible for managing the corporate database. The development of the data administration strategy is closely related to the company’s mission and objectives. 49
Threats to Security Threats are any set of circumstances that have the potential to cause loss, misuse or harm to the system and/or its data. Threats can cause: The loss of the integrity of data through unauthorized modification. For example a person gaining unauthorized access to a bank account and removing some money from the account. 50 Threats to Security The loss of availability of the data. For example some adversary prevents the database system from being operational, which stops authorized users of the data from accessing it. The loss of confidentiality of the data (also referred to as the privacy of data). This could be caused by a person gaining access to private information such as a password or a bank account balance. 51 Examples of Threats Theft and fraud of data. Human error which causes accidental loss of data.
Electronic infections Viruses Email Viruses Worms Trojan Horses 52 Examples of Threats (cont..) The occurrence of natural disasters such as hurricanes, fires or floods. Unauthorized access and modification of data. Employee sabotage is concerned with the deliberate acts of malice against the organization. Poor database administration. 53 Examples of Threats (cont..) 54 Database Design Strategies Two classical approaches to database design: Top-down design
Identifies data sets Defines data elements for each of those sets Bottom-up design Identifies data elements (items) Groups them together in data sets 55 Database Design Strategies Top-down vs. bottom-up design sequencing 56 Centralized vs. Decentralized Design Database design may be based on two very different design philosophies: Centralized design Productive when data component is composed of relatively small number of objects and procedures Decentralized design Used when data component of system has considerable number of entities and complex relations on which very complex operations are performed 57
Centralized vs. Decentralized Design Centralized Design 58 Decentralized Design 59 Centralized vs. Decentralized Design (cont..) Aggregation process Requires designer to create single model in which various aggregation problems must be addressed: Synonyms and homonyms Entity and entity subtypes Conflicting object definitions 60 Centralized vs. Decentralized Design Summary of aggregation problems 61
Database Administration Data management is a complex job Led to the development of the database administration function. The person responsible for the control of the centralized and shared database is the database administrator (DBA). 62 DBA Activities Database planning, including the definition of standards, procedures and enforcement. Database requirements gathering and conceptual design. Database logical design and transaction design. 63 DBA Activities (cont..) Database physical design and implementation. Database testing and debugging. Database operations and maintenance, including installation, conversion and migration. Database training and support. 64
The DBA in the Organisation 65 DBA Skills 66 The Managerial Role of the DBA 67 The Managerial Role of the DBA (cont..) End-User Support Gathering user requirements. Building end-user confidence. Resolving conflicts and problems. Finding solutions to information needs. Ensuring quality and integrity of applications and data. Managing the training and support of DBMS users. 68
The Managerial Role of the DBA (cont..) Policies, Procedures and Standards Policies are general statements of direction or action that communicate and support DBA goals. Standards are more detailed and specific than policies and describe the minimum requirements of a given DBA activity. Procedures are written instructions that describe a series of steps to be followed during the performance of a given activity. 69 The Managerial Role of the DBA (cont..) Data Security, Privacy and Integrity Protecting the security and privacy of the data in the database is a function of authorization management. Authorization management defines procedures to protect and guarantee database security and integrity. Includes: user access management, view definition, DBMS access control and DBMS usage monitoring. 70 The Managerial Role of the DBA (cont..) Data Backup and Recovery Many DBA departments have created a position staffed by the database security officer (DSO).
The DSO’s activities are often classified as disaster management. Disaster management includes all of the DBA activities designed to secure data availability following a physical disaster or a database integrity failure. Disaster management includes all planning, organizing and testing of database contingency plans and recovery procedures. 71 The Managerial Role of the DBA (cont..) Data Distribution and Use The DBA is responsible for ensuring that the data are distributed to the right people, at the right time and in the right format. 72 The Technical Role of the DBA Evaluating, selecting and installing the DBMS and related utilities. Designing and implementing databases and applications. Testing and evaluating databases and applications. Operating the DBMS, utilities and applications. Training and supporting users. Maintaining the DBMS, utilities and applications.
73 Evaluating, Selecting and Installing the DBMS and Utilities (DBA) Covers the selection of the database management system, utility software and supporting hardware for use in the organization. Must be based primarily on the organization’s needs The DBA would be wise to develop a checklist of desired DBMS features. 74 Designing and Implementing Databases and Applications (DBA) Covers data modelling and design services to the end-user community Determine and enforce standards and procedures to be used. The DBA then provides the necessary assistance and support during the design of the database at the conceptual, logical and physical levels 75 Testing and Evaluating Databases and Applications (DBA) The DBA must also provide testing and evaluation services for all of the database and end-user applications. Those services are the logical extension of the design, development and implementation services.
Testing procedures and standards must already be in place before any application program can be approved for use in the company. 76 Operating the DBMS, Utilities and Applications (DBA) DBMS operations can be divided into four main areas: System support. Performance monitoring and tuning. Backup and recovery. Security auditing and monitoring. 77 Training and Supporting Users (DBA) Training people to use the DBMS and its tools is included in the DBA’s technical activities. The DBA also provides or secures technical training in the use of the DBMS and its utilities for the applications programmers. 78 Maintaining the DBMS, Utilities and Applications (DBA)
The maintenance activities of the DBA are an extension of the operational activities. Maintenance activities are dedicated to the preservation of the DBMS environment. 79 Chap1.pptx Database Principles: Fundamentals of Design, Implementations and Management CHAPTER 1 THE DATABASE APPROACH In this chapter, you will learn: The difference between data and information What a database is, what the different types of databases are, and why they are valuable assets for decision making The importance of database design How modern databases evolved from file systems About flaws in file system data management What the database system’s main components are and how a database system differs from a file system The main functions of a database management system (DBMS) The role of Open Source Database Systems The importance of Data Governance and Data Quality
2 Data vs. Information Data: Raw facts; building blocks of information Unprocessed information Information: Data processed to reveal meaning Accurate, relevant, and timely information is the key to good decision making Good decision making is the key to survival in a global environment 3 Transforming Raw Data into Information Fig 1.1 p6 Initial survey screen 4 Transforming Raw Data into Information (cont..) Fig 1.1 Information in graphic format 5
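The data-versus-information distinction above can be shown in miniature: raw facts only become information once they are processed to reveal meaning. The survey values here are made up for illustration.

```python
from collections import Counter

# Raw facts (data): individual, unprocessed survey responses.
responses = ["yes", "no", "yes", "yes", "no"]

# Processing the data reveals meaning (information): a summary by answer.
summary = Counter(responses)
print(summary["yes"], summary["no"])  # 3 2
```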
Data Quality and Data Governance Data Quality can be examined at a number of different levels including: Accuracy: Is the data accurate and does it come from a verifiable source? Relevance: Is the data relevant to the organisation? Completeness: Is the required data being stored? Timeliness: Is the data updated frequently in order to meet the business requirements? Uniqueness: Is the data unique and there is no redundancy in the database? Unambiguous: Is the meaning of the data clear? 6 Data Quality and Data Governance (cont…) Data governance is the term used to describe a strategy or methodology defined by an organisation to safeguard data quality. Each organisation produces its own data governance strategy which will involve the development of a series of policies and procedures for managing availability, usability, quality, integrity, and security of data within the organisation. Introducing the Database and the DBMS Database—shared, integrated computer structure that stores: End user data (raw facts)
Metadata (data about data) DBMS (database management system): Collection of programs that manages database structure and controls access to data Possible to share data among multiple applications or users Makes data management more efficient and effective The DBMS hides much of the database’s internal complexity from the application programs and users. The application program might be written by a programmer using a programming language such as COBOL, Visual Basic, C++, or Java or it might be created through a DBMS utility program. 8 Role and Advantages of the DBMS (cont.) A DBMS provides advantages such as: Improved data sharing. Users have better access to more and better-managed data Better data integration. Promotes integrated view of organization’s operations Minimised data inconsistency. Probability of data inconsistency is greatly reduced Improved data access. Possible to produce quick answers to ad hoc queries 9
Role and Advantages of the DBMS (cont..) 10 Types of Databases Single-user: Supports only one user at a time Desktop: Single-user database running on a personal computer Multi-user: Supports multiple users at the same time Workgroup: Multi-user database that supports a small group of users or a single department Enterprise: Multi-user database that supports a large group of users or an entire organization 11 Types of Databases (cont..) Can be classified by location: Centralized: Supports data located at a single site Distributed: Supports data distributed across several sites Can be classified by use:
Transactional (or production): Supports a company’s day-to-day operations Data warehouse: Stores data used to generate information required to make tactical or strategic decisions Often used to store historical data Structure is quite different 12 Why Database Design is Important Database design refers to the activities that focus on the design of the database structure that will be used to store and manage end-user data. Defines the database’s expected use Different approach needed for different types of databases Avoid redundant data 13 Historical Roots: Files and Data Processing Managing data with file systems is obsolete Understanding file system characteristics makes database design easier to understand Awareness of problems with file systems helps prevent similar problems in DBMS
Knowledge of file systems is helpful if you plan to convert an obsolete file system to a DBMS 14 Historical Roots: Files and Data Processing (cont..) Manual file systems: Collection of file folders kept in file cabinet Organization within folders based on data’s expected use (ideally logically related) System adequate for small amounts of data with few reporting requirements Finding and using data in growing collections of file folders became time-consuming and cumbersome 15 Historical Roots: Files and Data Processing (cont..) Computerised file systems: Conversion from manual to computer system: Could be technically complex, requiring hiring of data processing (DP) specialists Resulted in numerous “home-grown” systems being created Initially, computer files were similar in design to manual files (see Figure 1.3) 16
Historical Roots: Files and Data Processing (cont..) 17 Historical Roots: Files and Data Processing (cont..) Fig 1.3 18 Historical Roots: Files and Data Processing (cont..) DP specialist wrote programs for reports: Monthly summaries of types and amounts of insurance sold by agents Monthly reports about which customers should be contacted for renewal Reports that analyzed ratios of insurance types sold by agent Customer contact letters summarizing coverage Other departments requested databases be written for them SALES database created for sales department AGENT database created for personnel department (see Fig 1.4 next) 19
Historical Roots: Files and Data Processing (cont…) 20 Historical Roots: Files and Data Processing (cont..) As number of databases increased, small file system evolved Each file used its own application programs Each file was owned by individual or department who commissioned its creation 21 Historical Roots: Files and Data Processing (cont) 22 Example of Early Database Design (cont…) As system grew, demand for DP specialists’ programming skills grew Additional programmers hired DP specialist evolved into DP manager, supervising a DP department Primary activity of department (and DP manager) remained
programming 23 Problems with File System Data Management Every task requires extensive programming in a third-generation language (3GL) Programmer must specify task and how it must be done Modern databases use fourth-generation languages (4GL) Allow users to specify what must be done without specifying how it is to be done 24 Problems with File System Data Management Lengthy development times. Difficulty in getting quick answers. Complex System Administration Lack of security and limited data sharing Extensive Programming 25
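The 3GL-versus-4GL contrast above can be made concrete: a procedural loop spells out how to compute a result step by step, while a declarative SQL query states only what is wanted. The SALE table and values are illustrative, and SQLite stands in for the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALE (agent TEXT, amount REAL)")
conn.executemany("INSERT INTO SALE VALUES (?, ?)",
                 [("Smith", 100.0), ("Jones", 50.0), ("Smith", 25.0)])

# 3GL style: the program specifies HOW the total is computed, step by step.
total_3gl = 0.0
for agent, amount in conn.execute("SELECT agent, amount FROM SALE"):
    if agent == "Smith":
        total_3gl += amount

# 4GL style: SQL states WHAT is wanted; the DBMS decides how to do it.
total_4gl = conn.execute(
    "SELECT SUM(amount) FROM SALE WHERE agent = 'Smith'").fetchone()[0]

print(total_3gl, total_4gl)  # 125.0 125.0
```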
Structural and Data Dependence Structural dependence (SD) A file system exhibits SD; that is, access to a file depends on its structure Data independence Changes in the data storage characteristics do not affect the application program’s ability to access the data The practical significance of data dependence is the difference between the: Logical data format How the human being views the data And the Physical data format How the computer “sees” the data 26 Field Definitions and Naming Conventions Flexible record definition anticipates reporting requirements by breaking up fields into their component parts 27 Data Redundancy Data redundancy results in data inconsistency Different and conflicting versions of the same data appear in different places Errors more likely to occur when complex entries are made in
several different files and/or recur frequently in one or more files A data anomaly develops when required changes in the redundant data are not made successfully Types of data anomalies: Update anomalies Occur when changes must be made to existing records Insertion anomalies Occur when entering new records Deletion anomalies Occur when deleting records 28 Database Systems Problems inherent in file systems make using a database system very desirable File system Many separate and unrelated files Database Logically related data stored in a single logical data repository 29 Database Systems
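An update anomaly of the kind described above can be simulated directly: when the same fact is stored redundantly and only one copy is changed, the data become inconsistent. The records and values are made up for illustration.

```python
# Redundant storage: the agent's phone number is repeated in every customer
# record, as in the file-system designs discussed above.
customers = [
    {"cust": "Alice", "agent": "Smith", "agent_phone": "555-0100"},
    {"cust": "Bob",   "agent": "Smith", "agent_phone": "555-0100"},
]

# Update anomaly: the phone number changes, but only one record is corrected.
customers[0]["agent_phone"] = "555-0199"

# The same fact now has two conflicting versions -- data inconsistency.
phones = {c["agent_phone"] for c in customers if c["agent"] == "Smith"}
print(sorted(phones))  # ['555-0100', '555-0199']
```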
30 The Database System Environment Database system is composed of five main parts: Hardware Software Operating system software DBMS software Application programs and utility software People Procedures Data 31 The Database System Environment (cont…) 32 DBMS Functions DBMS performs functions that guarantee integrity and consistency of data Data dictionary management defines data elements and their relationships Data storage management
stores data and related data entry forms, report definitions, etc. Data transformation and presentation translates logical requests into commands to physically locate and retrieve the requested data Security management enforces user security and data privacy within database 33 DBMS Functions (cont…) Multiuser access control uses sophisticated algorithms to ensure multiple users can access the database concurrently without compromising the integrity of the database Backup and recovery management provides backup and data recovery procedures Data integrity management promotes and enforces integrity rules Database access languages and application programming interfaces provide data access through a query language Database communication interfaces allow database to accept end-user requests via multiple, different network environments 34
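The data dictionary management function above can be observed in miniature: every DBMS keeps a catalog of data elements and their definitions that can itself be queried. SQLite exposes this through its `sqlite_master` catalog table; the AGENT table and index names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AGENT (agent_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_agent_name ON AGENT(name)")

# The data dictionary (catalog) records every object the DBMS manages.
rows = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name").fetchall()
print(rows)  # [('table', 'AGENT'), ('index', 'idx_agent_name')]
```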
DBMS Functions (continued) 35 DBMS Functions (cont…) 36 Summary Data are raw facts. Information is the result of processing data to reveal its meaning. To implement and manage a database, use a DBMS. Database design defines the database structure. A well-designed database facilitates data management and generates accurate and valuable information. A poorly designed database can lead to bad decision making, and bad decision making can lead to the failure of an organization. Databases were preceded by file systems. Limitations of file system data management: requires extensive programming system administration complex and difficult making changes to existing structures is difficult security features are likely to be inadequate independent files tend to contain redundant data DBMSs were developed to address file systems’ inherent weaknesses
37 Types of Databases (cont..) Open Source Open Source software is free to acquire and use. However, there will be costs involved in the development and on-going support of the software. LAMP is used to define the most popular open source software, namely Linux, the Apache Web server, the MySQL DBMS, and the Perl/PHP/Python development languages. 38 R_Ch06- Data Modelling Advanced Concepts.ppt Database Principles: Fundamentals of Design, Implementations and Management Lecture 6 - CHAPTER 6 : Data Modelling Advanced Concepts
* Objectives In this chapter, you will learn: About the extended entity relationship (EER) model’s main constructs How entity clusters are used to represent multiple entities and relationships The characteristics of good primary keys and how to select them How to use flexible solutions for special data modeling cases What issues to check for when developing data models based on EER diagrams * The Extended Entity Relationship Model Result of adding more semantic constructs to original entity relationship (ER) model Diagram using this model is called an EER diagram (EERD) Entity Supertypes and Subtypes Entity supertype Generic entity type that is related to one or more entity subtypes Contains common characteristics Entity subtypes Contains unique characteristics of each entity subtype * * Entity Supertypes and Subtypes (cont..) * Specialization Hierarchy Depicts an arrangement of higher-level
entity supertypes and lower-level entity subtypes Relationships are described in terms of “IS-A” relationships Subtype exists only within context of supertype Every subtype has only one supertype to which it is directly related Can have many levels of supertype/subtype relationships Figure 6.2 in your book as well * Specialization Hierarchy (cont..) Figure 6.2 in your book as well Specialization Hierarchy (cont..) Support attribute inheritance Define special supertype attribute known as subtype discriminator Define disjoint/overlapping constraints and complete/partial constraints * * Inheritance Enables entity subtype to inherit attributes and relationships of supertype All entity subtypes inherit their primary key attribute from their supertype At implementation level, supertype and its subtype(s) maintain a 1:1 relationship Entity subtypes inherit all relationships in which supertype entity participates Lower-level subtypes inherit all attributes and relationships from all upper-level supertypes
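The inheritance rules just listed can be sketched at the implementation level: the subtype's primary key is also a foreign key to the supertype, which yields the 1:1 supertype/subtype relationship described above. The EMPLOYEE/PILOT names and attributes are illustrative, not taken from the book's figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Supertype: common attributes, plus a subtype discriminator.
conn.execute("""
    CREATE TABLE EMPLOYEE (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        emp_type TEXT                -- subtype discriminator
    )""")

# Subtype: inherits its PK from the supertype (PK is also the FK).
conn.execute("""
    CREATE TABLE PILOT (
        emp_id  INTEGER PRIMARY KEY REFERENCES EMPLOYEE(emp_id),
        licence TEXT NOT NULL
    )""")

conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Alice', 'P')")
conn.execute("INSERT INTO PILOT VALUES (1, 'ATP')")

# A subtype row without a matching supertype row is rejected.
try:
    conn.execute("INSERT INTO PILOT VALUES (99, 'ATP')")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False
print(orphan_allowed)  # False
```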
Inheritance (cont..) * Inheritance (cont..) * * Natural Keys and Primary Keys Natural key is a real-world identifier used to uniquely identify real-world objects Familiar to end users and forms part of their day-to-day business vocabulary Generally data modeler uses natural identifier as primary key of entity being modeled May instead use composite primary key or surrogate key * Primary Key Guidelines A primary key is an attribute or combination of attributes that uniquely identifies entity instances in an entity set Could also be combination of attributes Main function is to uniquely identify an entity instance or row within a table Guarantee entity integrity, not to “describe” the entity Primary keys and foreign keys implement relationships among entities Behind the scenes, hidden from user Primary Key Guidelines (cont..)
* Primary Key Guidelines (cont..) * * Entity Integrity: Selecting Primary Keys Primary key most important characteristic of an entity Single attribute or some combination of attributes Primary key’s function is to guarantee entity integrity Primary keys and foreign keys work together to implement relationships Properly selecting primary key has direct bearing on efficiency and effectiveness * When to Use Composite Primary Keys Composite primary keys are useful in two cases: As identifiers of composite entities Where each primary key combination allowed once in M:N relationship As identifiers of weak entities Where weak entity has a strong identifying relationship with the parent entity Automatically provides benefit of ensuring that there cannot be duplicate values
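The composite-entity case above can be sketched directly: declaring the pair of foreign keys as the composite primary key of the bridge table means each combination is allowed only once in the M:N relationship. The ENROLL table and its columns are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Bridge (composite) entity for an M:N relationship between students and
# classes; the composite PK forbids duplicate (stu_id, class_id) pairs.
conn.execute("""
    CREATE TABLE ENROLL (
        stu_id   INTEGER,
        class_id INTEGER,
        grade    TEXT,
        PRIMARY KEY (stu_id, class_id)
    )""")

conn.execute("INSERT INTO ENROLL VALUES (1, 101, 'A')")
try:
    conn.execute("INSERT INTO ENROLL VALUES (1, 101, 'B')")  # duplicate pair
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False
print(duplicate_allowed)  # False
```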
When to Use Composite Primary Keys (cont..)
Figure 6.7 in your book
- When used as identifiers of weak entities, composite primary keys normally represent:
  - A real-world object that is existence-dependent on another real-world object
  - A real-world object that is represented in the data model as two separate entities in a strong identifying relationship
- The dependent entity exists only when it is related to the parent entity

When to Use Surrogate Primary Keys
Especially helpful when there is:
- No natural key
- A selected candidate key with embedded semantic contents
- A selected candidate key that is too long or cumbersome
If you use a surrogate key:
- Ensure that the candidate key of the entity in question still performs properly
- Use "unique index" and "not null" constraints on it
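The surrogate-key guideline above can be sketched in SQL. This is a minimal, hypothetical example (the `customer` table and `email` column are made up for illustration, not from the book): the surrogate key carries no semantic content, while the natural candidate key keeps its "unique index" and "not null" constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Surrogate key: an auto-numbered integer with no semantic content. The
# natural candidate key (a hypothetical email column) keeps "unique" and
# "not null" constraints so it still identifies each row on its own.
cur.execute("""
    CREATE TABLE customer (
        cust_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        email   TEXT NOT NULL UNIQUE,               -- natural candidate key
        name    TEXT NOT NULL
    )
""")
cur.execute("INSERT INTO customer (email, name) VALUES (?, ?)",
            ("pat@example.com", "Pat"))

# A duplicate candidate-key value is rejected even though email is not the PK.
duplicate_rejected = False
try:
    cur.execute("INSERT INTO customer (email, name) VALUES (?, ?)",
                ("pat@example.com", "Patricia"))
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The design point: end users search by the familiar natural key, while joins and foreign keys use the stable, nonintelligent surrogate.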
Design Cases: Learning Flexible Database Design
- Data modeling and design require skills acquired through experience, and experience is acquired through practice
- Four special design cases highlight:
  - The importance of flexible design
  - Proper identification of primary keys
  - Placement of foreign keys

Design Case #1: Implementing 1:1 Relationships
- Foreign keys work with primary keys to properly implement relationships in the relational model
- General rule: put the primary key of the "one" side (parent entity) on the "many" side (dependent entity) as a foreign key
  - Primary key: parent entity
  - Foreign key: dependent entity
- In a 1:1 relationship there are two options:
  - Place a foreign key in both entities (not recommended)
  - Place a foreign key in one of the entities; the primary key of one of the two entities appears as the foreign key of the other
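A hedged sketch of the second (recommended) option: the FK goes in only one of the two entities, and a uniqueness constraint keeps the relationship at 1:1 rather than 1:M. The department-manager scenario and all names here are illustrative assumptions, not the book's example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

# Hypothetical 1:1 case: each department is managed by exactly one employee.
# The FK is placed in only ONE of the two entities (DEPARTMENT); the UNIQUE
# constraint turns the usual 1:M foreign key into a 1:1 relationship.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE department (
        dept_id    INTEGER PRIMARY KEY,
        dept_name  TEXT NOT NULL,
        mgr_emp_id INTEGER UNIQUE REFERENCES employee(emp_id)
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'Kim')")
conn.execute("INSERT INTO department VALUES (10, 'Sales', 1)")

one_to_one_enforced = False
try:
    # A second department with the same manager would make it 1:M; rejected.
    conn.execute("INSERT INTO department VALUES (11, 'Support', 1)")
except sqlite3.IntegrityError:
    one_to_one_enforced = True
```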
Design Case #1: Implementing 1:1 Relationships (cont..)
Figure 6.9 in your book

Design Case #2: Maintaining History of Time-Variant Data
- Normally, existing attribute values are replaced with new values without regard to the previous values
- Time-variant data: values change over time, and a history of the changes must be kept
- Keeping a history of time-variant data is equivalent to having a multivalued attribute in your entity
- You must create a new entity in a 1:M relationship with the original entity
- The new entity contains the new value and the date of change
Figure 6.10 and Figure 6.11 in your book
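The history pattern above can be sketched as a child table in a 1:M relationship with the original entity. The salary scenario and all table/column names are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sketch of the history pattern: an employee's salary is time-variant, so
# rather than overwriting the value we keep a child table in a 1:M
# relationship with EMPLOYEE, storing each new value with its date of change.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE salary_hist (
        emp_id      INTEGER REFERENCES employee(emp_id),
        change_date TEXT NOT NULL,
        salary      REAL NOT NULL,
        PRIMARY KEY (emp_id, change_date)
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'Lee')")
conn.executemany("INSERT INTO salary_hist VALUES (1, ?, ?)",
                 [("2023-01-01", 50000), ("2024-01-01", 54000)])

# The current value is simply the row with the latest date of change.
current_salary = conn.execute("""
    SELECT salary FROM salary_hist
    WHERE emp_id = 1 ORDER BY change_date DESC LIMIT 1
""").fetchone()[0]
```

Earlier values stay queryable, which is exactly what a plain overwriting UPDATE would destroy.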
Design Case #2: Maintaining History of Time-Variant Data (cont..)
Figure 6.11 in your book

Design Case #3: Fan Traps
- A design trap occurs when a relationship is improperly or incompletely identified and is therefore represented in a way not consistent with the real world
- The most common design trap is known as a fan trap
- A fan trap occurs when one entity is in two 1:M relationships to other entities, producing an association among the other entities that is not expressed in the model
Figure 6.12 in your book

Design Case #4: Redundant Relationships
- Redundancy is seldom a good thing in the database environment
- Redundant relationships occur when there are multiple relationship paths between related entities
- The main concern is that redundant relationships remain consistent across the model
- Some designs use redundant relationships to simplify the design
Design Case #4: Redundant Relationships (cont..)
Figure 6.13 and Figure 6.14 in your book

Data Modeling Checklist
- Data modeling translates a specific real-world environment into a data model that represents the real-world data, users, processes, and interactions
- The EERM (Extended Entity Relationship Model) enables the designer to add more semantic content to the model
- The data modeling checklist helps ensure that data modeling tasks are successfully performed
- Based on concepts and tools learned since Chapter 3
Data Modeling Checklist (cont..)

Summary
- The extended entity relationship (EER) model adds semantics to the ER model via entity supertypes, subtypes, and clusters
- An entity supertype is a generic entity type related to one or more entity subtypes
- A specialization hierarchy depicts the arrangement and relationships between entity supertypes and entity subtypes
- Inheritance means an entity subtype inherits the attributes and relationships of its supertype

Summary (cont..)
- The subtype discriminator determines which entity subtype each supertype occurrence is related to
- Completeness may be partial or total; specialization contrasts with generalization
- An entity cluster is a "virtual" entity type that represents multiple entities and relationships in the ERD, formed by combining multiple interrelated entities and relationships into a single object

Summary (cont..)
- Natural keys are identifiers that exist in the real world and sometimes make good primary keys
- Characteristics of primary keys: must have unique values; should be nonintelligent; must not change over time; preferably numeric and composed of a single attribute

Summary (cont..)
- Composite keys are useful to represent M:N relationships and weak (strong-identifying) entities
- Surrogate primary keys are useful when no suitable natural key is available to serve as the primary key
- In a 1:1 relationship, place the PK of the mandatory entity as the FK in the optional entity, as the FK in the entity that causes the fewest nulls, or as the FK where the role is played
- Time-variant data: data whose values change over time, requiring a history of changes
- To maintain a history of time-variant data, create an entity containing the new value, the date of change, and other time-relevant data; that entity maintains a 1:M relationship with the entity for which history is kept
- A fan trap occurs when one entity is in two 1:M relationships to other entities and there is an association among the other entities not expressed in the model
- Redundant relationships occur when there are multiple relationship paths between related entities; the main concern is that they remain consistent across the model
- The data modeling checklist provides a way to check that the ERD meets minimum requirements

ADDITIONAL SLIDES
Please find additional slides to have a look at.

Subtype Discriminator
- An attribute in the supertype entity that determines to which entity subtype each supertype occurrence is related
- The default comparison condition for the subtype discriminator attribute is an equality comparison
- The subtype discriminator may be based on another comparison condition

Disjoint and Overlapping Constraints
- Disjoint subtypes (also known as non-overlapping subtypes): subtypes that contain unique subsets of the supertype entity set
- Overlapping subtypes: subtypes that contain nonunique subsets of the supertype entity set
Figure 6.4, same as in your book
Completeness Constraint
- Specifies whether an entity supertype occurrence must be a member of at least one subtype; can be partial or total
- Partial completeness: symbolized by a circle over a single line; some supertype occurrences are not members of any subtype
- Total completeness: symbolized by a circle over a double line; every supertype occurrence must be a member of at least one subtype
Table 6.2, same as in your book

Specialization and Generalization
- Specialization identifies more specific entity subtypes from a higher-level entity supertype
  - A top-down process based on grouping the unique characteristics and relationships of the subtypes
- Generalization identifies a more generic entity supertype from lower-level entity subtypes
  - A bottom-up process based on grouping the common characteristics and relationships of the subtypes
Composition and Aggregation
- Aggregation: a larger entity can be composed of smaller entities
- Composition: a special case of aggregation in which, when the parent entity instance is deleted, all child entity instances are automatically deleted

Using Aggregation and Composition
- An aggregation construct is used when an entity is composed of (or is formed by) a collection of other entities, but the entities are independent of each other; the relationship can be classified as a "has_a" relationship type
- A composition construct is used when two entities are associated in an aggregation association with a strong identifying relationship; deleting the parent deletes the child instances
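The delete-the-parent-deletes-the-children semantics of composition maps directly onto a cascading foreign key. A minimal sketch, assuming a hypothetical invoice/line pair of tables (this also shows a composite primary key partially derived from the parent):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Composition sketch: an INVOICE owns its LINE rows. ON DELETE CASCADE makes
# the child instances disappear automatically when the parent is deleted,
# matching the composition semantics above.
conn.execute("CREATE TABLE invoice (inv_num INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE line (
        inv_num  INTEGER REFERENCES invoice(inv_num) ON DELETE CASCADE,
        line_num INTEGER,
        amount   REAL,
        PRIMARY KEY (inv_num, line_num)  -- PK partially derived from parent
    )
""")
conn.execute("INSERT INTO invoice VALUES (1)")
conn.executemany("INSERT INTO line VALUES (?, ?, ?)",
                 [(1, 1, 9.99), (1, 2, 4.50)])

conn.execute("DELETE FROM invoice WHERE inv_num = 1")
remaining_lines = conn.execute("SELECT COUNT(*) FROM line").fetchone()[0]
# remaining_lines is 0: the children were removed with their parent.
```

With plain aggregation you would omit ON DELETE CASCADE, since the parts can outlive the whole.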
Entity Clustering
- A "virtual" entity type is used to represent multiple entities and relationships in the ERD
- Considered "virtual" or "abstract" because it is not actually an entity in the final ERD
- A temporary entity used to represent multiple entities and relationships
- Used to eliminate undesirable consequences; avoid displaying attributes when entity clusters are used
Figure 6.6 in your book

Entity Clustering (cont..)
Figure 6.6 in your book
DB-Lecture3_ch03.ppt
Database Principles: Fundamentals of Design, Implementations and Management
CHAPTER 3: Relational Model Characteristics

Objectives
In this chapter, you will learn:
- That the relational database model offers a logical view of data
- About the relational model's basic component: relations
- That relations are logical constructs composed of rows (tuples) and columns (attributes)
- That relations are implemented as tables in a relational DBMS
- About relational database operators, the data dictionary, and the system catalog
- How data redundancy is handled in the relational database model
- Why indexing is important

A Logical View of Data
- The relational model enables the programmer to view data logically rather than physically
- A table has structural and data independence and conceptually resembles a file
- The relational database model is easier to understand than its hierarchical and network predecessors
- A table is also called a relation because the relational model's creator, Codd, used the term relation as a synonym for table

Tables and Their Characteristics
- The logical view of a relational database is based on the relation, which is thought of as a table
- Think of a table as a persistent relation: a relation whose contents can be permanently saved for future use
- Table: a two-dimensional structure composed of rows and columns; a persistent representation of a logical relation
- Contains a group of related entities, i.e. an entity set

Properties of a Relation
Example Relation / Table

Attributes and Domains
- Each attribute is a named column within the relational table and draws its values from a domain
- The domain of values for an attribute should contain only atomic values; no single value should be divisible into components
- No attributes with more than one value are allowed

Degree and Cardinality
- Degree and cardinality are two important properties of the relational model
- The degree of a relation is the number of its attributes (columns); the cardinality of a relation is the number of its tuples (rows)
- A relation with N columns and M rows is of degree N and cardinality M
- The product of a relation's degree and cardinality is the number of attribute values it contains

Relational Schema
- A relational schema is a textual representation of the database tables, where each table is described by its name followed by the list of its attributes in parentheses
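The degree/cardinality arithmetic above can be checked with a tiny sketch; the relation here is a made-up student table held as a list of dicts.

```python
# Degree = number of attributes (columns); cardinality = number of tuples
# (rows). A small sketch over a relation held as a list of dicts.
relation = [
    {"stu_num": 1, "last_name": "Smith", "gpa": 3.1},
    {"stu_num": 2, "last_name": "Jones", "gpa": 3.8},
]

degree = len(relation[0])           # 3 attributes
cardinality = len(relation)         # 2 tuples
value_count = degree * cardinality  # 6 attribute values in the relation
```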
Keys
- A key consists of one or more attributes that determine other attributes
- A primary key (PK) is an attribute (or a combination of attributes) that uniquely identifies any given entity (row)
- A key's role is based on determination: if you know the value of attribute A, you can look up (determine) the value of attribute B

Keys (cont..)

Relational Database Keys (cont..)
- Composite key: composed of more than one attribute
- Key attribute: any attribute that is part of a key
- Superkey: any key that uniquely identifies each row
- Candidate key: a superkey without redundancies and without unnecessary attributes (e.g. Stud_ID, Stud_lastname)

Keys (cont..)
- Nulls: no data entry; not permitted in a primary key and should be avoided in other attributes
- A null can represent an unknown attribute value, a known but missing attribute value, or a "not applicable" condition
- Nulls can create problems when functions such as COUNT, AVERAGE, and SUM are used, and can create
logical problems when relational tables are linked
- Controlled redundancy makes the relational database work: tables within the database share common attributes that enable the tables to be linked together
- Multiple occurrences of values in a table are not redundant when they are required to make the relationship work
- Redundancy exists only when there is unnecessary duplication of attribute values

Keys (cont..)
- Foreign key (FK): an attribute whose values match primary key values in the related table
- Referential integrity: the FK contains a value that refers to an existing valid tuple (row) in another relation
- Secondary key: a key used strictly for data retrieval purposes
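The COUNT/AVERAGE/SUM problem with nulls can be seen directly: SQL aggregate functions silently skip NULLs, so per-column counts and averages quietly ignore the rows with missing values. The table and data here are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Why nulls complicate COUNT, AVG, and SUM: aggregates skip NULLs, so
# COUNT(column) and AVG(column) ignore rows whose value is missing.
conn.execute("CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, bonus REAL)")
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [(1, 100.0), (2, None), (3, 200.0)])

total_rows, bonus_count, bonus_avg = conn.execute(
    "SELECT COUNT(*), COUNT(bonus), AVG(bonus) FROM emp").fetchone()
# total_rows is 3, but bonus_count is 2 and bonus_avg is 150.0:
# the NULL row is simply not averaged in.
```

Whether "ignore the row" is the right interpretation of a missing bonus is a business decision, which is exactly why the slides recommend avoiding nulls where possible.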
Integrity Rules
- Many RDBMSs enforce integrity rules automatically
- It is safer to ensure that your application design conforms to the entity and referential integrity rules summarized in the next slide
- Designers use flags to avoid nulls; flags indicate the absence of some value
- For example, the code -99 could be used as the AGENT_CODE entry for the 4th row of the CUSTOMER table to indicate that customer Paul Olowsky does not yet have an agent assigned

The Data Dictionary and System Catalog
- Data dictionary: provides a detailed accounting of all tables found within the user/designer-created database
- Contains (at least) all the attribute names and characteristics for each table in the system
- Contains metadata: data about data
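Referential integrity enforcement can be sketched as follows. The column names echo the slide's CUSTOMER/AGENT example, but the schema and data values are assumptions; note that SQLite, unlike most RDBMSs, enforces foreign keys only after an explicit PRAGMA.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# Referential integrity: every non-null AGENT_CODE in CUSTOMER must refer to
# an existing AGENT row.
conn.execute("CREATE TABLE agent (agent_code INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE customer (
        cus_code   INTEGER PRIMARY KEY,
        agent_code INTEGER REFERENCES agent(agent_code)
    )
""")
conn.execute("INSERT INTO agent VALUES (501)")
conn.execute("INSERT INTO customer VALUES (10010, 501)")   # valid reference
conn.execute("INSERT INTO customer VALUES (10011, NULL)")  # no agent yet

ri_enforced = False
try:
    conn.execute("INSERT INTO customer VALUES (10012, 999)")  # no such agent
except sqlite3.IntegrityError:
    ri_enforced = True
```

A NULL FK is accepted (the customer simply has no agent yet), which is the situation the -99 flag convention is designed to avoid.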
- Sometimes described as "the database designer's database" because it records the design decisions about tables and their structures

A Sample Data Dictionary

The Data Dictionary and System Catalog (cont..)
- System catalog: contains metadata; a detailed system data dictionary that describes all objects within the database
- The terms "system catalog" and "data dictionary" are often used interchangeably
- The catalog can be queried just like any user/designer-created table

Relationships within the Relational Database
- 1:M relationship: the relational modeling ideal; should be the norm in any relational database design
- 1:1 relationship: should be rare in any relational database design
- M:N relationships: cannot be implemented as such in the relational model; they can be changed into two 1:M relationships
The 1:M Relationship
- The relational database norm; found in any database environment

The 1:M Relationship (cont..)

The 1:1 Relationship
- One entity is related to only one other entity, and vice versa
- Sometimes means that entity components were not defined properly
- Could indicate that two entities actually belong in the same table
- As rare as 1:1 relationships should be, certain conditions absolutely require their use

The 1:1 Relationship (cont..)
The 1:1 Relationship (cont..)

The M:N Relationship
- Can be implemented by breaking it up to produce a set of 1:M relationships
- Avoid the problems inherent to the M:N relationship by creating a composite entity (bridge entity)
- The composite entity includes, as foreign keys, at least the primary keys of the tables that are to be linked

Implementation of a Composite Entity
- Yields the required M:N to 1:M conversion
- The composite entity table must contain at least the primary keys of the original tables
- The linking table contains multiple occurrences of the foreign key values
- Additional attributes may be assigned as needed

The M:N Relationship (cont..)
Figure 3.16 in the book
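The M:N-to-1:M conversion above can be sketched with a bridge table. The student/class scenario is the classic textbook case, but the specific table and column names here are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# M:N sketch: STUDENT and CLASS are linked through a composite (bridge)
# entity ENROLL whose composite PK holds the PKs of both original tables,
# turning the M:N into two 1:M relationships.
conn.execute("CREATE TABLE student (stu_num INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE class (class_code TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE enroll (
        stu_num    INTEGER REFERENCES student(stu_num),
        class_code TEXT    REFERENCES class(class_code),
        grade      TEXT,                  -- additional attribute as needed
        PRIMARY KEY (stu_num, class_code) -- each pairing allowed only once
    )
""")
conn.execute("INSERT INTO student VALUES (1, 'Ann')")
conn.executemany("INSERT INTO class VALUES (?)", [("DB101",), ("DB102",)])
conn.executemany("INSERT INTO enroll VALUES (?, ?, ?)",
                 [(1, "DB101", "A"), (1, "DB102", "B")])

enrollment_count = conn.execute("SELECT COUNT(*) FROM enroll").fetchone()[0]
```

The composite PK on `enroll` both identifies the bridge rows and guarantees a student cannot be enrolled in the same class twice.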
Figure 3.17 in your book

Data Redundancy Revisited
- Data redundancy leads to data anomalies, and such anomalies can destroy the effectiveness of the database
- Foreign keys control data redundancies by using common attributes shared by tables; they are crucial to exercising data redundancy control
- Sometimes, data redundancy is necessary

Data Redundancy Revisited (cont..)
Data Redundancy Revisited (cont..)

Indexes
- An orderly arrangement used to logically access rows in a table
- Index key: the index's reference point; it points to the data location identified by the key
- Unique index: an index in which the index key can have only one pointer value (row) associated with it
- Each index is associated with only one table

Indexes (cont..)
Similar to Figure 3.20 of your book, and better explained

Codd's Relational Database Rules
- In 1985, Codd published a list of 12 rules to define a relational database system
- The reason was the concern that many vendors were marketing products as "relational" even though those products did not meet minimum relational standards
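A small sketch of a unique index, assuming a hypothetical PRODUCT table: besides enforcing one row per key value, the index gives the engine a fast access path, which the query plan makes visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Unique index sketch: fast lookup by key, and at most one row per key value.
conn.execute("CREATE TABLE product (p_code TEXT, p_descript TEXT)")
conn.execute("CREATE UNIQUE INDEX p_code_ix ON product (p_code)")
conn.execute("INSERT INTO product VALUES ('P01', 'Hammer')")

# EXPLAIN QUERY PLAN shows the lookup using the index instead of a full scan
# (the exact wording of the plan text varies between SQLite versions).
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM product WHERE p_code = 'P01'"
).fetchall()
plan_text = plan[0][-1]
```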
Summary
- Tables (relations) are the basic building blocks of a relational database
- Keys are central to the use of relational tables; keys define functional dependencies: superkey, candidate key, primary key, secondary key, foreign key
- Each table row must have a primary key, which uniquely identifies all attributes
- Tables can be linked by common attributes; thus the primary key of one table can appear as the foreign key in another table to which it is linked
- Good design begins by identifying appropriate entities and attributes and the relationships among the entities; those relationships (1:1, 1:M, M:N) can be represented using ERDs
Chap 5.ppt
Database Principles: Fundamentals of Design, Implementations and Management
CHAPTER 5: Data Modelling With Entity Relationship Diagrams

Objectives
In this chapter, you will learn:
- The main characteristics of entity relationship components
- How relationships between entities are defined, refined, and incorporated into the database design process
- How ERD components affect database design and implementation
- That
real-world database design often requires the reconciliation of conflicting goals

The Entity Relationship (ER) Model
- The ER model forms the basis of an ER diagram
- An ERD represents the conceptual database as viewed by the end user
- ERDs depict the database's main components: entities, attributes, and relationships

Entities
- "Entity" refers to the entity set, not to a single entity occurrence; it corresponds to a table, not to a row, in the relational environment
- In the Chen and Crow's Foot models, an entity is represented by a rectangle containing the entity's name
- The entity name, a noun, is written in capital letters

Attributes
- Characteristics of entities
- Chen notation: attributes are represented by ovals connected to the entity rectangle with a line; each oval contains the name of the attribute it represents
- Crow's Foot notation: attributes are written in an attribute box below the entity
rectangle

Attributes (cont..)
- Required attribute: must have a value; optional attribute: may be left empty
- Domain: the set of possible values for an attribute; attributes may share a domain
- Identifiers: one or more attributes that uniquely identify each entity instance
- Composite identifier: a primary key composed of more than one attribute

Attributes (cont..)
- A composite attribute can be subdivided; a simple
attribute cannot be subdivided
- A single-valued attribute can have only a single value; multivalued attributes can have many values

Attributes (cont..)
- M:N relationships and multivalued attributes should not be implemented; instead:
  - Create several new attributes, one for each component of the original multivalued attribute, or
  - Create a new entity composed of the original multivalued attribute's components
- Derived attribute: a value that may be calculated from other attributes; it need not be physically stored within the database

Relationships
- An association between entities; participants are the entities that participate in a relationship
- Relationships between entities always operate in both directions
- A relationship can be classified by its connectivity, for example as 1:M
- The relationship classification is difficult to establish if only one side of the relationship is known
Connectivity and Cardinality
- Connectivity describes the relationship classification
- Cardinality expresses the minimum and maximum number of entity occurrences associated with one occurrence of the related entity
- Established by very concise statements known as business rules

Existence Dependence
- Existence dependence: the entity exists in the database only when it is associated with another related entity occurrence
- Existence independence: the entity can exist apart from one or more related entities; such an entity is sometimes referred to as a strong or regular entity
Relationship Strength
- Weak (non-identifying) relationships exist if the PK of the related entity does not contain a PK component of the parent entity
- Strong (identifying) relationships exist when the PK of the related entity contains a PK component of the parent entity

Weak (Non-Identifying) Relationships

Strong (Identifying) Relationships

Weak Entities
- A weak entity meets two conditions:
  - It is existence-dependent: it cannot exist without the entity with which it has a relationship
  - It has a primary key that is partially or totally derived from the parent entity in the relationship
- The database designer usually determines whether an entity can be described as weak based on the business rules
Strong Entity / Weak Entity

Weak Entities (cont..)

Relationship Participation
- Optional participation: one entity occurrence does not require a corresponding entity occurrence in a particular relationship
- Mandatory participation: one entity occurrence requires a corresponding entity occurrence in a particular relationship

Relationship Participation (cont..)
Relationship Participation (cont..)

Relationship Degree
- Indicates the number of entities or participants associated with a relationship
- Unary relationship: the association is maintained within a single entity
- Binary relationship: two entities are associated
- Ternary relationship: three entities are associated

Relationship Degree (cont..)
Recursive Relationships
- A relationship can exist between occurrences of the same entity set
- Naturally found within a unary relationship

Recursive Relationships (cont..)

Associative (Composite) Entities
- Also known as bridge entities; used to implement M:N relationships
- Composed of the primary keys of each of the entities to be connected
- May also contain additional attributes that play no role in the connective process
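A recursive (unary) relationship can be sketched as a self-referencing foreign key: the FK points back at the same table. The employee-manager scenario and all names are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Recursive (unary) relationship sketch: an employee is managed by another
# employee, so the FK references the employee table itself.
conn.execute("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        mgr_id INTEGER REFERENCES employee(emp_id)  -- self-reference
    )
""")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Dana", None), (2, "Eli", 1), (3, "Fay", 1)])

# A self-join lists each employee alongside their manager's name.
pairs = conn.execute("""
    SELECT e.name, m.name
    FROM employee e JOIN employee m ON e.mgr_id = m.emp_id
    ORDER BY e.name
""").fetchall()
# pairs is [('Eli', 'Dana'), ('Fay', 'Dana')]; Dana, with a NULL mgr_id,
# has no manager and is excluded by the inner join.
```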
Composite Entities (cont..)

Developing an ER Diagram
- Database design is an iterative rather than a linear or sequential process; an iterative process is based on repetition of processes and procedures
- Building an ERD usually involves the following activities:
  - Create a detailed narrative of the organization's description of operations
  - Identify business rules based on the description of operations
  - Identify the main entities and relationships from the business rules
  - Develop an initial ERD
  - Identify the attributes and primary keys that adequately describe the entities
  - Revise and review the ERD
Developing an ER Diagram (cont..)
Developing an ER Diagram (cont..)

Database Design Challenges: Conflicting Goals
- Database designers must make design compromises among conflicting goals: design standards, processing speed, and information requirements
- It is important to meet logical requirements and design conventions
- A design is of little value unless it delivers all specified query and reporting requirements
- Some design and implementation problems do not yield "clean" solutions

Database Design Challenges: Conflicting Goals (cont.)
Summary
- The entity relationship (ER) model uses the ERD to represent the conceptual database as viewed by the end user
- The ERM's main components are entities, relationships, and attributes; it includes connectivity and cardinality notations
- Multiplicities are based on business rules
- In the ERM, the M:N relationship is valid at the conceptual level
- ERDs may be based on many different ERMs
- Database designers are often forced to make design compromises