EDUCATION HOLE PRESENTS
DATABASE MANAGEMENT SYSTEM
Unit-V

Contents

Concurrency Control Techniques
    Concurrency control
    Locking Techniques for concurrency control
        Pessimistic Locking
        Optimistic Locking
    Lock Problems
        Deadlock
        Live lock
    Basic Time stamping
Time stamping protocols for concurrency control
    Time-stamp ordering protocol
    Thomas' Write rule
Validation based protocol
Multiple Granularities
    Gray, et al.: Granularity of Locks
Multi version schemes
Recovery with concurrent transactions

Concurrency Control Techniques

Concurrency control

Concurrency control is a database management system (DBMS) concept used to address the conflicts that arise when multiple users simultaneously access or alter data. Applied to a DBMS, concurrency control coordinates simultaneous transactions while preserving data integrity. In short, concurrency control governs multi-user access to the database.
Locking Techniques for concurrency control

Pessimistic Locking

This concurrency control strategy keeps an entity locked for the entire time it exists in the database's memory, limiting or preventing other users from altering it. Two types of lock fall under pessimistic locking: the write lock and the read lock. With a write lock, everyone but the holder of the lock is prevented from reading, updating, or deleting the entity. With a read lock, other users can read the entity, but no one except the lock holder can update or delete it.

Optimistic Locking

This strategy can be used when collisions between simultaneous transactions are expected to be infrequent. In contrast with pessimistic locking, optimistic locking does not try to prevent collisions from occurring; instead, it aims to detect and resolve them on the occasions when they do occur.

Pessimistic locking guarantees that database changes are made safely. However, it becomes less viable as the number of simultaneous users or the number of entities involved in a transaction increases, because the potential for having to wait on locks grows with both. Optimistic locking alleviates the waiting problem, but users may then experience collisions when attempting to update the database.

Lock Problems

Deadlock

When dealing with locks, two problems can arise, the first of which is deadlock. Deadlock refers to a situation where two or more processes are each waiting for another to release a resource, or where several processes wait for resources in a circular chain. Deadlock is a common problem in multiprocessing, where many processes share a mutually exclusive resource.
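The circular-wait condition can be sketched with a toy waits-for graph; the transaction and lock names below are illustrative, not from any particular system:

```python
# Toy waits-for graph: T1 holds lock A and wants B, T2 holds B and wants A.
# A cycle through the starting transaction means deadlock.

holds = {"T1": {"A"}, "T2": {"B"}}     # who currently holds which locks
wants = {"T1": "B", "T2": "A"}         # which lock each txn is waiting for

def waits_for(t):
    """Transactions that t is waiting on: the holders of the lock it wants."""
    return [u for u, locks in holds.items() if wants.get(t) in locks and u != t]

def deadlocked(start):
    """Walk the waits-for edges; returning to `start` means a cycle."""
    frontier, seen = list(waits_for(start)), set()
    while frontier:
        t = frontier.pop()
        if t == start:
            return True
        if t not in seen:
            seen.add(t)
            frontier.extend(waits_for(t))
    return False

print(deadlocked("T1"))                # True: T1 -> T2 -> T1
```

This is how a DBMS deadlock detector works in miniature: it builds the waits-for graph from the lock table and looks for cycles, then picks a victim to roll back.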
Some computers, usually those intended for the time-sharing and/or real-time markets, are equipped with a hardware lock ("hard lock") that guarantees exclusive access to processes, forcing serialization. Deadlocks are particularly troublesome because there is no general solution for avoiding them. A fitting analogy: you go to unlock your car door at the exact moment your passenger pulls the handle, leaving the door still locked. If the passenger is impatient and keeps trying the handle each time you press unlock, neither action can ever be satisfied, and you are stuck in an endless cycle: a deadlock.

Live lock

Livelock is a special case of resource starvation. It is similar to deadlock, except that the states of the processes involved constantly change with regard to one another while never progressing. (The general definition of starvation states only that a specific process is not progressing.) For example, the system may keep selecting the same transaction for rollback, so that the transaction never finishes executing. Another livelock situation can arise while the system is deciding which transaction gets a lock and which waits in a conflict situation. An everyday illustration: several drivers arrive at a four-way stop, and none is quite sure who should proceed next. If no one makes a firm decision to go, and all the cars keep creeping into the intersection afraid that someone else will hit them, a kind of livelock occurs.

Basic Time stamping

Basic time stamping is a concurrency control mechanism that eliminates deadlock: since it uses no locks, deadlock cannot occur. Under this method a unique timestamp, usually recording when the transaction started, is assigned to each transaction. This effectively gives each transaction an age, and hence an ordering. Each data item carries both a read-timestamp and a write-timestamp, updated each time the item is read or written, respectively.

Problems arise in this system when a transaction tries to read a data item that has been written by a younger transaction. This is called a late read: the data item has changed since the reading transaction began, and the solution is to roll the transaction back and acquire a new timestamp. The symmetric problem occurs when a transaction tries to write a data item that has already been read by a younger transaction.
This is called a late write: the data item has been read by another transaction since the start of the transaction that is altering it. The solution is the same as for a late read: the transaction's timestamp must be rolled back and a new one acquired.

Adhering to the rules of basic time stamping allows transactions to be serialized, so a chronological schedule of transactions can be created. Time stamping may not be practical for large databases with high transaction volumes, however, since a large amount of storage would have to be dedicated to the timestamps alone.

Time stamping protocols for concurrency control

The most commonly used concurrency protocol is the time-stamp based protocol, which uses either the system time or a logical counter as the time-stamp. Lock-based protocols manage the order between conflicting pairs of transactions at execution time, whereas time-stamp based protocols start working as soon as a transaction is created.

Every transaction has a time-stamp associated with it, and ordering is determined by transaction age. A transaction created at clock time 0002 is older than every transaction that comes after it; a transaction entering the system at 0004, for example, is two seconds younger, and priority may be given to the older one. In addition, every data item carries the latest read-timestamp and write-timestamp, which tell the system when the last read and write operations were made on that item.

Time-stamp ordering protocol

The timestamp-ordering protocol ensures serializability among transactions in their conflicting read and write operations. It is the protocol's responsibility to ensure that each conflicting pair of operations is executed according to the timestamp values of the transactions involved.

  • The time-stamp of transaction Ti is denoted TS(Ti).
  • The read time-stamp of data item X is denoted R-timestamp(X).
  • The write time-stamp of data item X is denoted W-timestamp(X).

The protocol works as follows:

  • If transaction Ti issues a read(X) operation:
      o If TS(Ti) < W-timestamp(X), the operation is rejected and Ti is rolled back.
      o If TS(Ti) >= W-timestamp(X), the operation is executed and R-timestamp(X) is updated.
  • If transaction Ti issues a write(X) operation:
      o If TS(Ti) < R-timestamp(X), the operation is rejected and Ti is rolled back.
      o If TS(Ti) < W-timestamp(X), the operation is rejected and Ti is rolled back.
      o Otherwise, the operation is executed and W-timestamp(X) is updated.

Thomas' Write rule

Thomas' write rule modifies the timestamp-ordering rules for the case TS(Ti) < W-timestamp(X) on a write: instead of rejecting the operation and rolling Ti back, the obsolete write is simply ignored. This makes the resulting schedules view serializable.

Validation based protocol

A validation phase checks whether any of a transaction's updates violate serializability, using certain information the system must keep for this purpose. If serializability is not violated, the transaction is committed and the database is updated from the transaction's local copies; otherwise, the transaction is aborted and restarted later. The protocol has three phases:

1. Read phase: the transaction reads values of committed data items from the database, but applies updates only to local copies (versions) of the data items kept in its workspace.
2. Validation phase: the system checks that serializability will not be violated if the transaction's updates are applied to the database.
3. Write phase: if validation succeeds, the transaction's updates are applied to the database; otherwise, the updates are discarded and the transaction is restarted.

The idea behind optimistic concurrency control is to do all the checks at once, so transaction execution proceeds with a minimum of overhead until the validation phase is reached. If there is little interference among transactions, most will validate successfully. If there is much interference, however, many transactions that execute to completion will have their results discarded and must be restarted, and optimistic techniques do not work well. The techniques are called "optimistic" because they assume that little interference will occur and hence that there is no need for checking during transaction execution.
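The timestamp-ordering checks above, including Thomas' write rule, can be sketched as follows; this is a simplified illustration, and the class and function names are not from any particular DBMS:

```python
# Sketch of the timestamp-ordering checks (illustrative only).

class DataItem:
    def __init__(self):
        self.r_ts = 0   # R-timestamp(X): timestamp of youngest reader so far
        self.w_ts = 0   # W-timestamp(X): timestamp of youngest writer so far

def read(ts, x):
    """Transaction with timestamp ts issues read(X)."""
    if ts < x.w_ts:
        return "reject"                # X was overwritten by a younger txn
    x.r_ts = max(x.r_ts, ts)
    return "ok"

def write(ts, x, thomas=False):
    """Transaction with timestamp ts issues write(X)."""
    if ts < x.r_ts:
        return "rollback"              # a younger txn already read X
    if ts < x.w_ts:
        # Thomas' write rule: ignore the obsolete write instead of rolling back
        return "ignore" if thomas else "rollback"
    x.w_ts = ts
    return "ok"

x = DataItem()
write(10, x)                           # ok: W-timestamp(X) is now 10
print(read(5, x))                      # reject: reader older than last writer
print(write(5, x, thomas=True))        # ignore: obsolete write skipped
```

Note how the ignored write changes nothing: the value written by the transaction with timestamp 10 is the one any later reader should see anyway.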
The optimistic protocol described here uses transaction timestamps and requires that the write sets and read sets of the transactions be kept by the system. (Recall that the write set of a transaction is the set of items it writes, and the read set is the set of items it reads.) In addition, start and end times for some of the three phases must be kept for each transaction. In the validation phase for transaction Ti, the protocol checks that Ti does not interfere with any committed transaction or with any transaction currently in its validation phase. For each such transaction Tj that is either committed or in its validation phase, one of the following conditions must hold:

1. Tj completes its write phase before Ti starts its read phase.
2. Ti starts its write phase after Tj completes its write phase, and the read_set of Ti has no items in common with the write_set of Tj.
3. Neither the read_set nor the write_set of Ti has any items in common with the write_set of Tj, and Tj completes its read phase before Ti completes its read phase.

When validating Ti, condition (1) is checked first for each Tj, since it is the simplest; only if (1) is false is condition (2) checked, and only if (2) is false is condition (3), the most complex to evaluate, checked. If any one of the three conditions holds, there is no interference and Ti validates successfully. If none holds, validation fails and Ti is aborted and restarted later, because interference may have occurred.

Multiple Granularities

Locks vs. Latches:

  • Locks ensure the logical (i.e., transactional) consistency of data. They are implemented via a lock table, are held for a long time (e.g., under 2PL), and are part of the deadlock detection mechanism.
  • Latches are like semaphores. They ensure the physical consistency of data and resources that are not visible at the transactional level (e.g., latching a frame in a buffer pool). They are not subject to 2PL (they are usually held for a very short time), and the deadlock detector does not know about them (and therefore cannot break latch deadlocks, so latch acquisition must be deadlock-free by design).
  • Acquiring a latch is much cheaper than acquiring a lock: tens versus hundreds of instructions in the no-conflict case. Latch control information is always at a fixed place in virtual memory, whereas lock tables are dynamically managed (there can be a lock per tuple!), so the data-structure management on lock/release is more complex.
  • The lock table is a hashed main-memory structure. Lock/unlock must be atomic operations (protected by latches or critical sections), and locking or unlocking an item typically costs several hundred instructions.

Lock Upgrades: suppose T1 has an S lock on P, T2 is waiting to get an X lock on P, and now T3 wants an S lock on P. Do we grant T3 an S lock? No: that would be unfair and would starve T2. Instead:
  • Manage a FCFS queue for each locked object with outstanding requests.
  • All transactions in the queue that are adjacent and compatible form a compatible group. The front group is the granted group, and the group mode is the most restrictive mode among the group's members.
  • Conversions: a transaction often wants to convert a lock, e.g. from S to X for "test and modify" actions. Should conversions go to the back of the queue? No: that is instant deadlock (two S holders each waiting to convert behind the other), so conversions are placed right after the granted group.
  • Non-2PL locking schemes provide different consistency guarantees, and most DBMSs have multiple lock request types in part to support them. DB2, for example, has conditional vs. unconditional lock requests (are you willing to wait?) and instant vs. manual vs. commit-duration lock requests (wait, but plan to hold the lock for different amounts of time).

Gray, et al.: Granularity of Locks

Theme: correctness and performance.

  • Granularity tradeoff: small granularity (e.g., a field of a tuple) means high concurrency but high overhead; large granularity (e.g., a file) means low overhead but low concurrency.
  • Possible granularities: database, areas, files, pages, tuples (records), and fields of tuples.
  • Hierarchical locking is wanted so that "large" transactions can set large locks and "small" transactions can set small locks.
  • Problem: T1 S-locks a record in a file, then T2 X-locks the whole file. How can T2 discover that T1 has locked the record?
  • Solution: "intention" locks. The compatibility matrix over the modes NL, IS, IX, S, SIX, and X (+ = compatible, - = conflict) is:

            NL   IS   IX   S    SIX  X
      NL    +    +    +    +    +    +
      IS    +    +    +    +    +    -
      IX    +    +    +    -    -    -
      S     +    +    -    +    -    -
      SIX   +    +    -    -    -    -
      X     +    -    -    -    -    -

  • IS and IX locks: T1 obtains its S lock on the record in question, but first gets an IS lock on the file. Now T2 cannot get an X lock on the file. However, T3 can still get an IS or S lock on the file (this is the reason for distinguishing IS from IX: if there were only a single I mode, T3 could not get an S lock on the file).
  • For higher concurrency there is one more mode, SIX: intuitively, you read all of the object but lock only some of its subparts. SIX allows concurrent IS locks (IX alone would not); note that it grants S access, so it disallows IX to others.
  • The scheme requires that transactions lock items root-to-leaf in the hierarchy and unlock leaf-to-root.
  • Generalization to a DAG of resources: an X lock must lock all paths to a node, while an S lock must lock at least one.

Multi version schemes

Multiversion concurrency control (MCC or MVCC) is a concurrency control method commonly used by database management systems to provide concurrent access to the database, and in programming languages to implement transactional memory.

If someone reads from a database at the same time as someone else writes to it, the reader may see a half-written or inconsistent piece of data. There are several ways of solving this problem, known as concurrency control methods. The simplest is to make all readers wait until the writer is done, i.e., a lock. This can be very slow, so MVCC takes a different approach: each user connected to the database sees a snapshot of the database at a particular instant in time. Changes made by a writer are not seen by other users of the database until the changes have been completed (in database terms, until the transaction has been committed). When an MVCC database needs to update an item of data, it does not overwrite the old data with new data; instead it marks the old data as obsolete and adds the newer version elsewhere. Thus multiple versions are stored, but only one is the latest.
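A minimal sketch of such a version store, with illustrative class and method names:

```python
# Minimal multi-version store: writers append versions tagged with a
# commit timestamp; readers see the snapshot as of their own timestamp.

class MVStore:
    def __init__(self):
        self.versions = {}                 # key -> list of (commit_ts, value)

    def write(self, key, value, commit_ts):
        # The old version is kept, not overwritten.
        self.versions.setdefault(key, []).append((commit_ts, value))

    def read(self, key, snapshot_ts):
        # Latest version committed at or before the reader's snapshot.
        visible = [(ts, v) for ts, v in self.versions.get(key, [])
                   if ts <= snapshot_ts]
        return max(visible)[1] if visible else None

db = MVStore()
db.write("x", "old", commit_ts=1)
db.write("x", "new", commit_ts=5)
print(db.read("x", snapshot_ts=3))     # old: the ts=5 version is invisible
print(db.read("x", snapshot_ts=9))     # new
```

The reader with snapshot timestamp 3 never blocks on the writer at timestamp 5; it simply does not see that version, which is the essence of snapshot reads under MVCC.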
Keeping the old versions allows readers to access the data that was there when they began reading, even if it was modified or deleted partway through by someone else. It also allows the database to avoid the overhead of filling in holes in memory or disk structures, though it generally requires the system to periodically sweep through and delete the old, obsolete data objects. For a document-oriented database it also allows the system to write entire documents onto contiguous sections of disk: when updated, the whole document can be rewritten, rather than having bits and pieces cut out or maintained in a linked, non-contiguous database structure.

MVCC provides point-in-time consistent views. Read transactions under MVCC typically use a timestamp or transaction ID to determine what state of the database to read, and read those versions of the data. This avoids managing locks for read transactions, because writes are isolated by the old versions being maintained rather than through locks or mutexes. Writes create a newer version, while concurrent reads continue to see the version at their own transaction ID; everything remains consistent because the writes occur at a later transaction ID.

MVCC uses timestamps or increasing transaction IDs to achieve transactional consistency, and it ensures a transaction never has to wait for a database object by maintaining several versions of that object. Each version carries a write timestamp, and a transaction Ti reads the most recent version of an object that precedes the transaction's timestamp TS(Ti). If a transaction Ti wants to write to an object concurrently with another transaction Tk, the timestamp of Ti must precede that of Tk (i.e., TS(Ti) < TS(Tk)) for the write to succeed. Every object also has a read timestamp: if Ti wants to write to object P and TS(Ti) < RTS(P), that is, the object has already been read by a younger transaction, then Ti is aborted and restarted. Otherwise, Ti creates a new version of P and sets the read and write timestamps of P to TS(Ti).

The obvious drawback of this system is the cost of storing multiple versions of each object. On the other hand, reads are never blocked, which can be important for workloads that mostly read values from the database. MVCC is also particularly adept at implementing true snapshot isolation, something other methods of concurrency control frequently do either incompletely or with high performance costs.

Recovery with concurrent transactions

Remote backup is one solution for surviving a disaster. Alternatively, a backup of the whole database can be taken on magnetic tape and stored at a safer place; this backup can later be restored onto a freshly installed database, bringing it to the state it was in at the point of the backup. Grown databases, however, are too large to be backed up frequently. Instead, there are techniques for restoring a database just by replaying the logs, so backing up the logs at a frequent rate is more feasible than backing up the entire database: the database can be backed up once a week, while the logs, being very small, can be backed up every day or even as frequently as every hour.
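Restoring from logs can be sketched as a toy redo/undo pass over a write-ahead log; the log format and item names below are invented for illustration:

```python
# Toy redo/undo recovery: redo every change whose transaction committed
# before the crash, undo the changes of transactions that did not commit.

log = [
    ("T1", "begin"),
    ("T1", "write", "A", 10, 20),      # (txn, op, item, old value, new value)
    ("T2", "begin"),
    ("T2", "write", "B", 5, 7),
    ("T1", "commit"),
    # crash here: T2 never committed
]

db = {"A": 10, "B": 7}                 # on-disk state after the crash

committed = {t for t, *rest in log if rest[0] == "commit"}

for t, *rec in log:                    # redo pass, forward through the log
    if rec[0] == "write" and t in committed:
        _, item, old, new = rec
        db[item] = new

for t, *rec in reversed(log):          # undo pass, backward through the log
    if rec[0] == "write" and t not in committed:
        _, item, old, new = rec
        db[item] = old

print(db)                              # {'A': 20, 'B': 5}
```

After recovery, T1's committed write to A is present and T2's uncommitted write to B is rolled back, regardless of which updates happened to reach the disk before the crash.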
