More and more sensitive data is stored on computers. As systems grow in sophistication and in number of users, the data becomes more vulnerable to unwanted access or corruption. Computer security starts with a well-designed set of controls. This lecture will look at security, distinguish it from integrity, and examine some of the threats to security and integrity and how we can either avoid the problems or resolve them if they arise.
Security has two sides: A) We want to stop people accessing data when they shouldn’t. B) We want to ensure that users can do what they need to do. For example, we need to ensure that all of you can use ORACLE SQL*Plus and SQL*Forms. Use of passwords: “A password should be like a toothbrush. Use it every day; change it regularly; and DON’T share it with anyone” (O’Reilly and Associates). It is very important not to mix security up with data integrity. Sometimes the two terms are used synonymously, BUT they are not the same thing!
Once you can access data we want to ensure that you don’t do anything wrong.
Clearly there can be many kinds of failure which could threaten the database.
HARDWARE FAILURES
1. DISK: This type of failure is potentially the worst, because data can be physically lost or corrupted. It’s the most time-consuming failure to recover from. As disk drives are complex in structure and utilise several mechanical parts, it is imperative that you plan for such a failure.
2. CPU: Failures of this type are becoming increasingly rare. This is due largely to the advances in technology and an overall increase in quality. It would be rare for this type of failure to cause the physical loss of any data.
3. NETWORK: We are all aware of the huge growth in computer networks. Failures of this type can be difficult to trace and solve for the following reasons: large number of physical connections, poor diagnostic aids, and third-party involvement, e.g. BT and other telecommunication carriers.
SOFTWARE FAILURES
1. SYSTEM AND DATABASE: Third-party vendor software is usually very robust due to extensive testing and quality control.
2. PROGRAM: Program bugs are the most common form of system failure. They can remain undetected for long periods of time.
N.B. ALL SOFTWARE FAILURES CAN RESULT IN LOSS AND/OR CORRUPTION OF DATA.
The most important security features of a DBMS are:
Views: Restrict a user’s view of the database (e.g. to a subset of a table). Data not included in the view cannot be accessed by a program using that view. Views are not adequate on their own (they don’t guard against unauthorised access to procedures, or against ‘experimentation’), so more sophisticated techniques are required.
Authorisation rules: Controls which restrict access to data and the actions which may be performed on data. 1) Objects: for example, at the object level, order records can be read or updated. 2) Subjects: organisational entities that can access the database, e.g. accountants. 3) Constraints (at the program level), i.e. business constraints, column constraints, key constraints. We will discuss these in more detail in a later slide.
User-defined procedures: Processes and procedures over and above those provided by the DBMS. These could be achieved by a) programming, e.g. asking a user their mother’s maiden name, or b) operating system features.
Encryption techniques: Coding or scrambling of data to make it unreadable without decoding. Encryption can be DBMS supplied or user coded, BUT the encryption procedures must be adequately secured, and encryption consumes significant computing resources. E.g. move characters to the left. Encryption is often used by banks in electronic transfer of money.
Authentication techniques are also used, e.g. retina scans, fingerprinting, smart cards.
This example shows how different people within an organisation have the authority to do different things.
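As a rough illustration of how a view restricts what a program can see, here is a minimal sketch using Python's built-in sqlite3 module rather than ORACLE. The customer table, its columns and the customer_public view are hypothetical names invented for this example. SQLite has no GRANT statement, so the authorisation step that would stop a user reading the base table directly is not shown.

```python
import sqlite3

def make_demo_db():
    """Build an in-memory database with a base table and a restricted view."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical base table holding a sensitive column (credit_limit).
    conn.execute(
        "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, credit_limit INTEGER)"
    )
    conn.execute("INSERT INTO customer VALUES (1, 'Smith', 200)")
    # The view exposes only the non-sensitive columns.
    conn.execute("CREATE VIEW customer_public AS SELECT cust_id, name FROM customer")
    conn.commit()
    return conn

def columns_visible(conn, relation):
    """Return the column names a program sees when it queries the given relation."""
    cursor = conn.execute(f"SELECT * FROM {relation}")
    return [col[0] for col in cursor.description]
```

A program that only knows about customer_public cannot name the credit_limit column at all, which is the sense in which data not included in the view cannot be accessed through it.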
A further example – showing different levels of access to a system.
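The authorisation table for order records on the corresponding slide can be read as a simple lookup structure. The sketch below encodes that table in Python; the names ORDER_RECORD_RULES and is_authorised are invented for illustration, and storing passwords in plain text is done here only to mirror the slide, not as a recommended practice.

```python
# Authorisation table for one object (order records), keyed by subject.
# Each subject has a password and the set of actions it may perform.
ORDER_RECORD_RULES = {
    "Salesperson": {"password": "Batman", "actions": {"Read"}},
    "Order Entry": {"password": "Joker", "actions": {"Read", "Insert", "Modify"}},
    "Accounting": {"password": "Julie", "actions": {"Read", "Modify", "Delete"}},
}

def is_authorised(subject, password, action):
    """Return True only if the subject supplies the right password and may perform the action."""
    rule = ORDER_RECORD_RULES.get(subject)
    if rule is None or rule["password"] != password:
        return False
    return action in rule["actions"]
```

For example, is_authorised("Salesperson", "Batman", "Modify") is refused, while the same request from Order Entry with its own password is allowed.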
Constraints are applied to try to ensure that the integrity of data is maintained.
For example, if you had one column, e.g. credit limit = 200, then another column such as withdrawals would depend on the value of that column.
Entity Constraints: For example, the primary key should always be not null. If you didn’t have a value in the primary key, you would have nothing to uniquely identify the row! Referential Constraints: E.g. you cannot delete a primary key value in one table if it is still referenced in another table! Some integrity rules can be implemented within ORACLE when the tables are created. These rules are then either enforced when rows are inserted into the database or used to generate validation rules within Forms. Look at some of the SQL scripts, and at the next slide, to see if you can spot the constraints.
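The three kinds of constraint can be tried out with a small sketch. This uses Python's sqlite3 module as a stand-in for ORACLE, so the syntax details differ from the course scripts. The customer and orders tables and the helper is_rejected are assumptions made up for the example; the credit limit check follows the business constraint example given earlier.

```python
import sqlite3

def build_db():
    """Create two hypothetical tables carrying entity, business and referential constraints."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
    conn.executescript("""
        CREATE TABLE customer (
            cust_id      TEXT PRIMARY KEY NOT NULL,              -- entity constraint
            credit_limit INTEGER NOT NULL,
            withdrawals  INTEGER NOT NULL DEFAULT 0,
            CHECK (withdrawals <= credit_limit)                  -- business constraint
        );
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            cust_id  TEXT NOT NULL REFERENCES customer(cust_id)  -- referential constraint
        );
    """)
    return conn

def is_rejected(conn, sql):
    """Run a statement and report whether the DBMS refused it on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False
```

Because the rules live in the table definitions, every application that touches these tables is subject to them without any extra programming.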
Should go through and identify all the different types of constraint on this example.
These are the benefits of using constraints.
We’ve looked at the threats to security and how they can be countered, and we’ve looked at some of the problems of integrity and how we can use constraints to avoid them. However, concurrency control is also an important feature when looking at integrity.
If a system is being used concurrently then, for example, updates may all happen at the same time. If this is not handled correctly then there is the potential for data to be inaccurate. So this is an important problem that needs to be dealt with properly.
Who would be unhappy in this situation?
The example you’ve just looked at is the problem of lost updates. The problems listed on this slide are really variations on a theme; you should ensure that you know what each of them is.
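A lost update can be reproduced with a few lines of ordinary code. The sketch below is a deterministic simulation, not real concurrent code: the two "transactions" and the amounts (+50 and -30) are made up for illustration, and the interleaving is written out by hand.

```python
def interleaved_without_locking(balance):
    """Two transactions read the same balance before either writes it back."""
    t1_read = balance        # T1 reads the balance
    t2_read = balance        # T2 reads the same value before T1 has written
    balance = t1_read + 50   # T1 writes its result
    balance = t2_read - 30   # T2 overwrites it, so T1's update is lost
    return balance

def run_serially(balance):
    """The same two transactions run one after the other."""
    balance = balance + 50   # T1 completes first
    balance = balance - 30   # T2 then sees T1's result
    return balance
```

Starting from 100, the serial run gives 120, while the interleaved run gives 70 because the deposit made by T1 has disappeared.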
These problems are tackled using locking. With locking, any data retrieved by a user for updating must be locked, or denied to other users, until the update is complete. It’s like taking a book out of the library: once you’ve borrowed it, no one else can read it until you have returned it. A shared lock allows other transactions to read a record but not to update it. A transaction places a shared lock on a record when it only needs to read the record. Placing a shared lock on the record prevents another user from placing an exclusive lock on it. An exclusive lock prevents another transaction from reading (and therefore updating) a record. An exclusive lock means that no other lock of any type can be placed on the record. Locking solves the problems of erroneous updates, but may lead to another problem. This problem is called deadlock.
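The rules for shared and exclusive locks can be sketched as a small lock table. This is an illustrative toy, not how any particular DBMS implements locking: the class LockTable and its methods are invented names, and a refused request simply returns False instead of making the transaction wait.

```python
SHARED, EXCLUSIVE = "S", "X"

class LockTable:
    """Toy lock table: many shared holders, or one exclusive holder, per record."""

    def __init__(self):
        self._locks = {}  # record -> [mode, set of holding transactions]

    def request(self, txn, record, mode):
        """Try to lock a record; return True if the lock is granted."""
        entry = self._locks.get(record)
        if entry is None:
            self._locks[record] = [mode, {txn}]
            return True
        held_mode, holders = entry
        if holders == {txn}:
            # Sole holder: keep the lock, upgrading to exclusive if asked.
            if mode == EXCLUSIVE:
                entry[0] = EXCLUSIVE
            return True
        if mode == SHARED and held_mode == SHARED:
            holders.add(txn)  # shared locks are compatible with each other
            return True
        return False  # any combination involving an exclusive lock conflicts

    def release(self, txn, record):
        """Give up a lock; drop the entry when the last holder releases."""
        entry = self._locks.get(record)
        if entry and txn in entry[1]:
            entry[1].discard(txn)
            if not entry[1]:
                del self._locks[record]
```

Two readers can hold the same record at once, but a writer has to wait until both have released it, and while the writer holds the record nobody else can read it.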
Deadlock occurs when two or more transactions have each locked a resource that another needs, so that they all end up waiting indefinitely for one another. This is also called a deadly embrace. It’s a bit like two people walking through a door at the same time, and neither giving way to the other. There are two ways to deal with this. A) By locking records in advance (so in this example User 1 would have to lock X and Y before starting, and User 2 would just have to wait until User 1 had finished). Locking records in advance prevents deadlock. However, it is often difficult to predict in advance which records a transaction will need to update, because many transactions call others, etc. So locking in advance isn’t really very useful as a way of preventing deadlocks. B) The most common approach is to let deadlocks occur, but to have a mechanism to detect and break them. The most common way of doing this is for the DBMS to keep a matrix of users and records to detect deadlock, and then to back one of the transactions out. ORACLE, for example, aborts the youngest transaction. Any partially made changes are removed, and the transaction can restart when resources become free again.
Another important point which links with security is recovery. We cannot always cater for every situation where data can potentially be lost, so if data is lost we need to be able to recover and get the database back to a consistent state. Some examples of types of failures are given.
Inevitably databases are damaged or lost because of some kind of failure of the type that we have already discussed, e.g. hardware failure, human error, program errors, viruses and natural catastrophes. Most organisations rely heavily on their databases. For this reason a DBMS must provide mechanisms for restoring a database quickly and accurately after loss or damage: backup facilities to create duplicate copies of the database; journaling facilities to provide backup copies or an audit trail of changes made; checkpoint facilities for the DBMS to periodically stop and synchronize all its files and journals; and recovery facilities to restore the database to a consistent state and restart the processing of transactions. Before discussing these in any depth, let’s have a look at the processes involved in updating a database and how these facilities might be incorporated into that.
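Detect-and-break can be sketched with a "wait-for" map in place of the matrix of users and records. This is a simplification chosen for the example: it assumes each transaction waits for at most one other, and the function names and start times are hypothetical. The victim rule follows the note above that the youngest transaction is aborted.

```python
def find_cycle(wait_for):
    """Return the transactions forming a cycle in the wait-for map, or [] if there is none.

    wait_for maps each waiting transaction to the transaction it is waiting on.
    """
    for start in wait_for:
        path = []
        node = start
        # Follow the chain of waits until it ends or revisits a transaction.
        while node in wait_for and node not in path:
            path.append(node)
            node = wait_for[node]
        if node in path:
            return path[path.index(node):]
    return []

def choose_victim(cycle, start_times):
    """Pick the youngest transaction (latest start time) in the cycle to back out."""
    return max(cycle, key=lambda txn: start_times[txn])
```

In the User 1 / User 2 example, User 1 waits for User 2 and User 2 waits for User 1, so find_cycle returns both; if User 2 started later, it is the one chosen to be backed out.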
Transactions are the basic unit of recovery for any database system. Transactions normally conform to four properties (ACID): Atomicity: a transaction commits in its entirety or not at all. Consistency: a transaction must transform the database from one consistent state to another. Isolation: transactions execute independently of one another, i.e. one transaction should not see the partial effects of another. Durability: changes made by a committed transaction are permanent. In recovery, the purpose is to enforce atomicity and durability, i.e. a transaction always either completes all its changes or none, and the changes made by a committed transaction are permanent. That is, on recovery, the recovery manager must ensure that either all or none of a transaction’s effects are permanently recorded. The next slide shows an example of problems that can occur when making database updates.
This slide shows the operations that are involved when trying to update a record in the database. In general terms the system has to find the data it wants to update, which involves the read operations; then, once changes are made, these need to be recorded, which is what the steps in the write operation do. What we want to get across is that updating a record and saving the changes takes a number of steps. If the system fails during this process we may not know which stage of the read/write operation we had reached. Therefore we need ways to recover from such an event.
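Atomicity is easiest to see with a transfer that fails halfway. The sketch below uses Python's sqlite3 module; the account table, the starting balances and the fail_midway flag are assumptions made up for the demonstration, and the "failure" is just a raised exception, not a real crash.

```python
import sqlite3

def open_bank():
    """Create a hypothetical two-account database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move money between accounts as one transaction: both updates or neither."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        if fail_midway:
            raise RuntimeError("simulated failure between the two writes")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
        conn.commit()    # both changes become permanent together
        return True
    except RuntimeError:
        conn.rollback()  # the partial debit is undone
        return False

def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

After a failed transfer the balances are exactly what they were before it started; only a transfer that reaches commit changes them.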
Database buffers occupy an area of main memory from which data is transferred to and from secondary storage. This slide should be used to explain how data storage is managed. Essentially, data is only made permanent once it has been saved in secondary storage – data is flushed to secondary storage from the buffers. Flushing can be triggered by a specific event such as the ‘commit’ command or is forced when the buffer becomes full. If failure occurs between writing to the buffers and flushing to secondary storage, the recovery manager must be able to recover from this. Later slides will show how this might be done. Main point of this slide is to make students understand the stages of data storage.
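The stages of data storage can be mimicked with two dictionaries, one standing for the buffer in main memory and one for secondary storage. This is a deliberately crude model: the class BufferedStore, its capacity parameter and the crash method are invented for illustration, and a real DBMS manages buffers in disk blocks, not individual values.

```python
class BufferedStore:
    """Toy model of a database buffer in front of secondary storage."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.buffer = {}  # main memory: lost on failure
        self.disk = {}    # secondary storage: survives a failure

    def write(self, key, value):
        """Write to the buffer; a full buffer forces a flush."""
        self.buffer[key] = value
        if len(self.buffer) >= self.capacity:
            self.flush()

    def commit(self):
        """A commit triggers a flush so the changes become permanent."""
        self.flush()

    def flush(self):
        """Copy the buffer contents to secondary storage and empty the buffer."""
        self.disk.update(self.buffer)
        self.buffer.clear()

    def crash(self):
        """Simulate a failure: whatever was only in main memory is gone."""
        self.buffer.clear()
```

A value written but not yet flushed disappears in a crash, which is exactly the window the recovery manager has to cover.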
This diagram shows an overview of the database update process. The process can be seen as a series of changes to the database state. The initial database state is modified by an update creating a new state and so on. Every so often the database is backed up and usually stored in a secure location - often off-site. In order to understand how recovery works you need to appreciate what happens when a transaction is processed.
This is one of the main backup facilities. Typically a backup is made at least once a day. It should be stored off-site if possible, and would be used to restore the whole database if necessary.
Backups are great, but they do not help to recover work done since the last backup. For example, if a backup is done nightly then a whole day’s work could potentially be lost if the system fails just before the next scheduled backup. So to aid in getting the database as up to date as possible, databases also keep logs of all transactions carried out since a certain point in time. There are normally two types of information in the log, which are discussed on the following slides.
An after image stores the data after it has been updated. This means that this update can be done again without having to re-run the query.
Before images store the data before it is updated. This means we can revert back to the original data if an erroneous update has been made.
This is a sample log file. Most of it is self-explanatory; the last two columns represent pointers to the previous and next log record for each transaction.
The type of recovery procedure used depends on all kinds of things. For example, it depends on the nature of the data loss, the type of data (whether it’s critical or not), the type of back-up available and the sophistication of the DBMS recovery facilities. However, there are basically four types of recovery strategy that can be used and we’ll look at each of these in turn.
This recovery procedure involves keeping two copies of the database and updating both simultaneously. When there is a problem with one, the system switches to the other. This is particularly useful when recovery is required in seconds, for critical data. It’s also very good if you have a disk failure, because your duplicate will be on another disk. However, it offers no protection against power failure and it is very expensive because you need double the storage capacity.
The kinds of database failure that might require rollback recovery are: human error, input of invalid data, hardware failure, and communications failure. Recovery from these is best achieved through rolling back to the pre-transaction position. This would normally be performed by the DBMS. Incorrect data can also be a cause of a problem, i.e. data which has been accepted as valid but is actually incorrect. We can recover from this by: a) rollback, if possible; b) entering compensating transactions; c) restarting from the most recent checkpoint. N.B. Checkpoint: Every so often the DBMS pauses and accepts no new transactions. The transactions that are in progress are completed and all journals are brought up to date. The DBMS then writes a checkpoint record to enable restarting the system from a safe point. This should be performed automatically every few hours.
ROLLBACK: Update 2 terminated abnormally, leaving DB STATE 3 inconsistent. We need to be able to return to STATE 2 by applying the before images. Thus we would roll back by changing the database to STATE 2 with the before images of the records updated in TRANS2. Backward recovery reverses the changes made.
Here at the top we have a database with changes. We apply the before images, i.e. images of the data before it was changed, and roll the database back to that point.
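Rollback with before images can be sketched using a dictionary as the database and a list as the log. The record layout (txn, key, before) and the function names are assumptions for this example only; a real log holds far more, as the sample log file suggests.

```python
def apply_update(db, log, txn_id, key, new_value):
    """Record the before image in the log, then change the database."""
    log.append({"txn": txn_id, "key": key, "before": db.get(key)})
    db[key] = new_value

def rollback(db, log, txn_id):
    """Undo a transaction by restoring its before images, most recent first."""
    for entry in reversed(log):
        if entry["txn"] != txn_id:
            continue
        if entry["before"] is None:
            db.pop(entry["key"], None)  # the record did not exist before the transaction
        else:
            db[entry["key"]] = entry["before"]
```

The before images are applied in reverse order so that a record updated twice by the same transaction ends up with its original value, not an intermediate one.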
Roll forward is also called forward recovery and involves re-creating a database using a prior consistent database. Advantages of Forward recovery are: Reprocessing is avoided Transactions are restored in the correct sequence.
ROLLFORWARD: If STATE 4 of the database was destroyed and we needed to recover it, we would take the last database backup (STATE 2) and then apply the after image records created by update transactions 2 & 3. This would restore the database to STATE 4. Thus it starts with an earlier copy of the database and applies the after images (the results of good transactions), so the backup is moved forward.
In rollforward recovery we have a database without any changes, apply the after images, and so roll forward to add the transactions.
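Roll forward is the mirror image: start from the backup and replay after images in log order. As before, this is only a sketch with a dictionary standing in for the database; the log record layout (txn, key, after) and the set of committed transactions passed in are assumptions for the example.

```python
def roll_forward(backup, log, committed):
    """Rebuild the database from a backup by applying after images of good transactions."""
    db = dict(backup)      # start from a copy of the last backup
    for entry in log:      # the log is in the original update order
        if entry["txn"] in committed:
            db[entry["key"]] = entry["after"]
    return db
```

No transaction is re-run here; the stored results are simply copied back in sequence, which is why reprocessing is avoided.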
This is very similar to forward recovery except that instead of using after images it uses update transactions. The DBMS does not need to create an after image journal and there are no special restart procedures. However it takes a long time as every transaction is reprocessed in sequence. Any new processing has to wait until the recovery is complete.
REPROCESSING TRANSACTIONS: If the DB is destroyed at STATE 4, this takes the last DB backup (at STATE 2) and then reprocesses update transactions 2 & 3 to return the DB to STATE 4.
This table shows the types of procedures one might use for certain problems.
Database Security, Integrity and Recovery
Database Security and Integrity <ul><li>Definitions </li></ul><ul><li>Threats to security </li></ul><ul><li>Threats to integrity </li></ul><ul><li>Resolution of Problems </li></ul>
Database Security <ul><li>SECURITY </li></ul><ul><li>Protecting the database from unauthorised users </li></ul><ul><li>Ensures that users are allowed to do the things they are trying to do </li></ul>
Database Security <ul><li>INTEGRITY </li></ul><ul><li>Protecting the database from authorised users </li></ul><ul><li>Ensures that what users are trying to do is correct </li></ul>
Database Security <ul><li>TYPES OF SYSTEM FAILURES </li></ul><ul><li>1. HARDWARE </li></ul><ul><li>DISK , CPU , NETWORK </li></ul><ul><li>2. SOFTWARE </li></ul><ul><li>SYSTEM, DATABASE, PROGRAM </li></ul>
Database Security <ul><li>Important security features include: </li></ul><ul><ul><li>Views </li></ul></ul><ul><ul><li>Authorisation & controls </li></ul></ul><ul><ul><li>User defined procedures </li></ul></ul><ul><ul><li>Encryption procedures </li></ul></ul>
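Authorisation Rules <ul><li>An example: a person who can supply a particular password may be authorised to read any record, but cannot modify any of those records. </li></ul><ul><li>Authorisation Table for subjects i.e. Salesperson </li></ul><table><tr><th></th><th>Customer Records</th><th>Order Records</th></tr><tr><th>Read</th><td>Y</td><td>Y</td></tr><tr><th>Insert</th><td>Y</td><td>Y</td></tr><tr><th>Modify</th><td>Y</td><td>N</td></tr><tr><th>Delete</th><td>N</td><td>N</td></tr></table>
Authorisation Rules <ul><ul><li>Authorisation Table for Objects i.e. Order Records </li></ul></ul><table><tr><th></th><th>Salesperson</th><th>Order Entry</th><th>Accounting</th></tr><tr><th>Password</th><td>(Batman)</td><td>(Joker)</td><td>(Julie)</td></tr><tr><th>Read</th><td>Y</td><td>Y</td><td>Y</td></tr><tr><th>Insert</th><td>N</td><td>Y</td><td>N</td></tr><tr><th>Modify</th><td>N</td><td>Y</td><td>Y</td></tr><tr><th>Delete</th><td>N</td><td>N</td><td>Y</td></tr></table>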
Database Integrity <ul><li>CONSTRAINTS </li></ul><ul><ul><li>Can be classed in 3 different ways: </li></ul></ul><ul><li>1. Business constraints </li></ul><ul><li>2. Entity constraints </li></ul><ul><li>3. Referential constraints </li></ul>
Database Integrity <ul><li>BUSINESS CONSTRAINTS </li></ul><ul><li>A value in one column may be constrained by the value of another or by some calculation or formula. </li></ul>
Database Integrity <ul><li>ENTITY CONSTRAINTS </li></ul><ul><li>Individual columns of a table may be constrained e.g. not null </li></ul><ul><li>REFERENTIAL CONSTRAINTS </li></ul><ul><li>Sometimes referred to as key constraints, e.g. </li></ul><ul><li>Table 2 depends on Table 1 </li></ul>
Database Integrity <ul><li>BENEFITS OF USING CONSTRAINTS </li></ul><ul><ul><li>Guaranteed integrity and consistency </li></ul></ul><ul><ul><li>Defined as part of table definition </li></ul></ul><ul><ul><li>Applies across all applications </li></ul></ul><ul><ul><li>Cannot be circumvented </li></ul></ul><ul><ul><li>Application development productivity </li></ul></ul><ul><ul><li>Requires no special programming </li></ul></ul><ul><ul><li>Easy to specify and maintain (reduced coding) </li></ul></ul><ul><ul><li>Defined once only </li></ul></ul>
Database Integrity <ul><li>CONCURRENCY CONTROL </li></ul><ul><li>WHAT IS IT? </li></ul><ul><li>The co-ordination of simultaneous requests, for the same data, from multiple users </li></ul>
Database Integrity <ul><li>CONCURRENCY CONTROL </li></ul><ul><li>WHY IS IT IMPORTANT? </li></ul><ul><li>Simultaneous execution of transactions over a shared database may create several data integrity and consistency problems </li></ul>
Database Integrity <ul><li>The three main integrity problems are: </li></ul><ul><ul><li>Lost updates </li></ul></ul><ul><ul><li>Uncommitted data </li></ul></ul><ul><ul><li>Inconsistent retrievals </li></ul></ul>
Database Integrity <ul><li>LOCKING </li></ul><ul><li>Two kinds of Locks: </li></ul><ul><li>1. Shared Locks (allows read only access) </li></ul><ul><li>2. Exclusive Locks (prevents reading of a record) </li></ul>
Database Integrity <table><tr><th>Time</th><th>User 1</th><th>User 2</th></tr><tr><td>1</td><td>Lock record X</td><td>Lock record Y</td></tr><tr><td>2</td><td>Request record Y (wait for Y)</td><td>Request record X (wait for X)</td></tr></table><ul><li>DEADLOCK </li></ul>
Database Recovery <ul><li>The process of restoring the database to a correct state in the event of a failure, e.g. </li></ul><ul><ul><li>System Crashes </li></ul></ul><ul><ul><li>Media Failures </li></ul></ul><ul><ul><li>Application Software Errors </li></ul></ul><ul><ul><li>Natural Physical Disasters </li></ul></ul><ul><ul><li>Carelessness </li></ul></ul><ul><ul><li>Sabotage </li></ul></ul>
Transactions <ul><li>Basic unit of recovery </li></ul><ul><li>Properties of Transaction </li></ul><ul><ul><li>Atomicity </li></ul></ul><ul><ul><li>Consistency </li></ul></ul><ul><ul><li>Isolation </li></ul></ul><ul><ul><li>Durability </li></ul></ul><ul><li>Purpose of recovery manager is to enforce Atomicity and Durability </li></ul>
Staff Salary Update Example <ul><li>Read Operations : </li></ul><ul><li>Find address of the disk block that contains record with primary key x </li></ul><ul><li>transfer block into a DB buffer in main memory </li></ul><ul><li>copy salary data from DB buffer into variable salary </li></ul><ul><li>Write Operations: </li></ul><ul><li>as steps 1 & 2 above </li></ul><ul><li>copy salary data from variable salary into the DB buffer </li></ul><ul><li>write DB buffer back to disk </li></ul>
Storing Data [Diagram: data moves between the database buffer in main memory and secondary storage. Buffer contents are flushed to secondary storage, where they become ‘permanent’, on commit or when the buffer is full.]
Back-up Facilities <ul><li>DBMS provides a mechanism for taking backup copies of the database and log file at regular intervals. </li></ul><ul><ul><li>A dump or copy or backup file contains all or part of the database </li></ul></ul><ul><ul><li>backups taken without having to stop the system </li></ul></ul>
Journal Facilities <ul><li>REDO LOGS </li></ul><ul><li>This is the main logging file. The file contains two different types of logging records. </li></ul><ul><ul><li>AFTER IMAGES </li></ul></ul><ul><ul><li>BEFORE IMAGES </li></ul></ul>
Journal Facilities <ul><li>REDO LOGS - AFTER IMAGES </li></ul><ul><li>After any column of any row on any table in the database is changed, the new values are written not only to the database but also to the redo log. The complete row is written to the log. If a row is deleted then notification is also put on to the redo log. After images are used in roll forward recovery. </li></ul>
Journal Facilities <ul><li>REDO LOGS - BEFORE IMAGES </li></ul><ul><li>Before a row is updated the data is copied to the redo log. It is not a simple copy from the database because a separate area of the database maintains the immediate pre-update version of each row updated in the database. The extra area is called the ROLLBACK SEGMENT. The redo log takes before image copies from the rollback segment in the database. </li></ul>
Duplicate Databases <ul><li>Requires 2 copies of the database </li></ul><ul><ul><li>Advantages </li></ul></ul><ul><ul><ul><li>Fast Recovery (seconds) </li></ul></ul></ul><ul><ul><ul><li>Good for disk failures </li></ul></ul></ul><ul><ul><li>Disadvantages </li></ul></ul><ul><ul><ul><li>No protection against power failure </li></ul></ul></ul><ul><ul><ul><li>Expensive </li></ul></ul></ul>
Rollback Recovery <ul><li>Changes made to the database are undone (Backward Recovery) </li></ul><ul><li>Rollback enables the updating to be undone to a predetermined point in the database processing that provides a consistent database state. </li></ul>
Roll Forward Recovery <ul><li>This recovery technique updates an out-of-date database up to the current processing position. </li></ul><ul><li>If the data is inconsistent then the database may need to roll back to the previous consistent state. </li></ul>
Summary <ul><li>This lecture has looked at security and recovery procedures </li></ul><ul><li>Ensuring that these two are administered correctly cuts out the majority of problems with database administration </li></ul>
Further Reading <ul><li>Security </li></ul><ul><ul><li>Connolly & Begg, chapter 19 </li></ul></ul><ul><li>Concurrency Control </li></ul><ul><ul><li>Connolly & Begg, chapter 20? </li></ul></ul><ul><li>Integrity and Recovery </li></ul><ul><ul><li>Connolly & Begg, chapters 18 and 19? </li></ul></ul><ul><li>Next session </li></ul><ul><ul><li>Advanced Relational Theory </li></ul></ul>