UNIT III
DATA STORAGE AND QUERY
PROCESSING
Overview of Physical Storage Media – Magnetic
Disks and Flash Storage – RAID – Tertiary
Storage – File Organization – Organization of
Records in Files – Indexing and Hashing: Basic
Concepts – Ordered Indices – B+ tree Index Files
– B+ tree Extensions – Static Hashing – Dynamic
Hashing – Query Processing Overview –
Selection Operation – Sorting – Join Operation.
Overview of Physical Storage Media
Storage media are classified into different
types based on the following:
 Accessing Speed
 Cost per unit of data
 Reliability
Based on volatility, storage can be classified
into two types:
 Volatile storage: Loses the contents when the
power to the device is removed.
E.g.: Main memory and cache.
 Non-Volatile storage: Contents persist even when
the power is switched off.
E.g.: Secondary & Tertiary storage devices.
Storage Device Hierarchy
 Primary Storage
 Secondary Storage
 Tertiary Storage
Primary Storage:
 This category usually provides fast access to data, but has
limited storage capacity.
 It is volatile in nature.
 Example: Cache and main memory
Secondary Storage:
 These devices usually have a large capacity.
 Less cost and slower access to data
 It is non-volatile.
 E.g.: Magnetic disks
Tertiary Storage:
 This is the lowest level of the hierarchy.
 Non-volatile,
 Slow access time
 Example: Magnetic tape, optical storage
Cache Memory
It is the fastest and most costly form of storage.
 It is volatile in nature.
 It is managed by computer system hardware.
 Cache memory lies between the CPU and main memory.
Main Memory
Fast access, but generally too small (and too expensive) to store the
entire database.
Capacities of up to a few gigabytes are widely used currently.
Ex: RAM
Flash Memory
It is present between primary storage and secondary
storage in the storage hierarchy.
 It is non volatile memory.
 Accessing speed is as fast as reading data from main
memory.
 Widely used in embedded devices such as digital
cameras, video games, etc.
Magnetic Disk
 Primary medium for long term storage of data, typically stores
entire database.
 Data must be moved from disk to main memory for access and
written back for storage.
 Much slower access than main memory.
 Capacities range up to 400 gigabytes currently
 Much larger capacity than main and flash memory.
 Disk storage survives power failures and system crashes
 Disk failure can destroy data but is very rare.
Schematic diagram of a magnetic disk
Mechanism
Physically, disks are relatively simple. Each platter has a
flat circular shape.
 Its two surfaces are covered with a magnetic
material, and information is recorded on the
surfaces.
 Platters are made from rigid metal or glass
 When the disk is in use, a drive motor
spins it at a constant high speed.
 There is a read-write head positioned
just above the surface of the platter.
 The disk surface is logically divided
into tracks: typically 50,000 to 100,000
tracks per platter, and 1 to 5 platters
per disk.
 Each track is divided into sectors. A
sector is the smallest unit of information
that can be read from or written to the
disk. Sector size is typically 512 bytes.
 Inner tracks are shorter (about 500
sectors per track), while outer tracks
are longer and contain more sectors
(about 1,000 sectors per track).
 The read-write head stores
information on a sector
magnetically.
 The read-write heads of all the
platters are mounted on a single
assembly called a disk arm.
 The disk platters mounted on a
spindle and the heads mounted on a disk
arm are together known as head-disk
assemblies.
 A cylinder consists of the i-th track
of all the platters.
 A disk-controller interfaces
between the computer system and the
disk drive hardware.
 Disk controllers attach checksums to each sector to
verify that data is read back correctly. The checksum
is computed from the data written to the sector.
 Another task of disk controller is remapping of bad
sectors.
 There are a number of common interfaces for
connecting disks to personal computers: (a) the ATA
interface and (b) the SCSI (Small Computer System
Interface) interface.
Performance Measures of Disks
Access time: The time it takes from when a read or write
request is issued to when data transfer begins.
 Seek time: Time it takes to reposition the arm over the correct
track. Seek times range from 2 to 30 milliseconds. The average
seek time is about one-third to one-half of the worst-case
(maximum) seek time. Average seek times currently range
between 4 and 10 milliseconds.
 Rotational latency: Time it takes for the
sector to be accessed to appear under the head.
It ranges from 4 to 11 milliseconds.
 Data-transfer rate: The rate at which data
can be retrieved from or stored to the disk. It
ranges from 25 to 100 megabytes per second.
 Mean time to failure (MTTF): The average
time the disk is expected to run continuously
without any failure.
 It is the measure of the reliability of the disk
 Typically 3 to 5 years
 Probability of failure of new disks is quite low
 MTTF decreases as disk ages.
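Combining the measures above, the time to service one random single-block request is roughly seek time + rotational latency + transfer time. A small sketch, using assumed figures chosen from the ranges quoted above (6 ms seek, 4 ms latency, a 4 KB block at 50 MB/s):

```python
def access_time_ms(seek_ms, latency_ms, block_kb, transfer_mb_per_s):
    """Total access time in milliseconds for one block:
    seek + rotational latency + transfer."""
    transfer_ms = block_kb / 1024 / transfer_mb_per_s * 1000
    return seek_ms + latency_ms + transfer_ms

# Assumed figures: 6 ms seek, 4 ms latency, 4 KB block, 50 MB/s transfer
t = access_time_ms(6, 4, 4, 50)
print(round(t, 3))
```

With these figures the mechanical positioning (seek plus rotation) dominates: the transfer itself contributes well under a millisecond.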
Optimization of Disk-Block Access
 Techniques used for accessing data from disk:
 Scheduling
 Disk-arm scheduling algorithms order pending accesses to tracks
so that disk-arm movement is minimized.
 A commonly used algorithm is the elevator algorithm.
 Move the disk arm in one direction (from outer to inner tracks or
vice versa), processing the next request in that direction until no more
requests remain in that direction; then reverse direction and repeat.
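The elevator algorithm above can be sketched in a few lines. The head position and track numbers are made-up values, and the arm is assumed to be currently moving toward higher-numbered tracks:

```python
def elevator(head, requests):
    """Service order for pending track requests under the elevator (SCAN)
    algorithm, assuming the arm is moving toward higher track numbers:
    serve requests at or beyond the head in ascending order, then
    reverse and serve the remaining requests in descending order."""
    up = sorted(t for t in requests if t >= head)
    down = sorted((t for t in requests if t < head), reverse=True)
    return up + down

# Head at track 50, pending requests scattered over the disk:
print(elevator(50, [82, 17, 43, 61, 95, 9]))  # [61, 82, 95, 43, 17, 9]
```

Note how the arm sweeps once to track 95 and once back to track 9, instead of zig-zagging between distant requests.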
 File organization:
 Optimize block access time by organizing the blocks to
correspond to how data will be accessed.
E.g.: store related information on the same or nearby
cylinders.
A sequential file may become fragmented, i.e., its blocks
become scattered all over the disk. Sequential access to a
fragmented file results in increased disk arm movement.
Non-volatile write buffers (NV-RAM):
Speed up disk writes by writing blocks to a non-volatile
RAM buffer immediately; the contents of NV-RAM are
not lost on power failure.
Log disk
A disk devoted to writing a sequential log of block
updates, used exactly like non-volatile RAM. Writing to
the log disk is very fast since no seeks are required.
A journaling file system writes data in a safe order to NV-
RAM or a log disk.
RAID
 Redundant Array of Independent Disks
 Multiple secondary disks are connected together to
increase the performance, data redundancy or both.
 Need:
 To increase the performance
 Increased reliability
 To give greater throughput
 Data can be restored in case of disk failure
RAID – Level 0
Data is striped across multiple drives.
 Data is broken down into blocks, and these blocks are
stored across all the disks, implementing a striped array.
 There is no duplication of data at this level, so once a
block is lost there is no way to recover it.
 It has good performance.
RAID – Level 1
Data in drive 1 is mirrored to drive 2. This
offers 100% redundancy, as the array will
continue to work even if either disk fails.
 uses mirroring techniques
 All data in the drive is duplicated to another
drive.
 It provides 100% redundancy in case of a
failure.
 Advantage: Fault Tolerance
RAID 10, also known as RAID 1+0, is a RAID configuration that
combines disk mirroring and disk striping to protect data. It
requires a minimum of four disks and stripes data across
mirrored pairs.
RAID – Level 2
 Uses mirroring and also stores error-correcting
codes (ECC) for its data, striped across different disks.
 Each data bit in a word is recorded on a separate disk,
and the ECC codes of the data words are stored on a
different set of disks.
 Due to its complex structure and high cost, RAID 2 is
not commercially available.
RAID – Level 3
One dedicated drive is
used to store the parity
information and in case of
any drive failure the
parity is restored using
this extra drive.
 It consists of byte-level striping with dedicated
parity. In this level, the parity information is
computed for each disk section and written to a
dedicated parity drive.
 Parity is a technique that checks whether data has
been lost or written over when it is moved from
one place in storage to another.
 In the case of disk failure, the parity disk is
accessed and data is reconstructed from the
remaining devices.
 Once the failed disk is replaced, the missing
data can be restored on the new disk.
RAID – Level 4
This level is very similar to
RAID 3, except that RAID 4
uses block-level striping
rather than byte-level striping.
 It consists of block-level striping with a
dedicated parity disk.
RAID – Level 5
Parity information is written
to a different disk in the array
for each stripe. In case of
single disk failure data can be
recovered with the help of
distributed parity without
affecting the operation and
other read write operations.
 RAID 5 writes whole data blocks onto
different disks, but the parity bits generated for
data block stripe are distributed among all the
data disks rather than storing them on a
different dedicated disk.
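The parity used by RAID 5 (and the dedicated parity of RAID 3/4) is simply the bitwise XOR of the data blocks in a stripe: XOR-ing the surviving blocks with the parity reconstructs any single lost block. A toy demonstration with made-up 4-byte blocks:

```python
def xor_blocks(*blocks):
    """Bytewise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

# One stripe of three data blocks (toy values) plus its parity block:
d0, d1, d2 = b"\x0f\x0f\x0f\x0f", b"\xf0\xf0\xf0\xf0", b"\x55\x55\x55\x55"
parity = xor_blocks(d0, d1, d2)

# Simulate losing d1 and rebuilding it from the surviving blocks + parity:
rebuilt = xor_blocks(d0, d2, parity)
print(rebuilt == d1)  # True
```

The same arithmetic works for any single failed disk in the stripe, which is why one parity block per stripe tolerates exactly one disk failure.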
RAID – Level 6
This level is an enhanced version of RAID 5 that adds
the extra benefit of dual parity. It uses block-level
striping with dual distributed parity.
 RAID 6 is an extension of Level 5.
 In this level, two independent parities are generated
and stored in distributed fashion among multiple disks.
 Two parities provide additional fault tolerance.
 This level requires at least four disk drives to
implement RAID.
 The factors to be taken into account in
choosing a RAID level are:
 Performance requirements in terms of number of
I/O operation.
 Performance when a disk has failed.
 Performance during rebuild.
File Organization
 A method of arranging records in a file when the file is
stored on disk.
 A file is organized logically as a sequence of records.
 Record is a sequence of fields.
 Each file is also logically partitioned into fixed-length
storage units called blocks, which are the units of both
storage allocation and data transfer.
What is File Organization?
In simple terms, storing the records of a file in a certain order is
called file organization. File structure refers to the format of the
label and data blocks and of any logical control record.
Types of File Organizations
• Sequential File Organization
• Heap File Organization
• Hash File Organization
• B+ Tree File Organization
• Clustered File Organization
Sequential File Organization –
• The simplest method of file organization is the sequential
method, in which records are stored one after another in a
sequential manner. There are two ways to implement this
method:
1. Pile File Method
2. Sorted File Method
Pile File Method – This method is quite simple: records are
stored in sequence, one after another, in the order in which
they are inserted into the table.
Insertion of new record –
Let R1, R3, R5 and R4 be four records in the sequence (a record
here is simply a row in a table). Suppose a new record R2 has
to be inserted; it is simply placed at the end of the file.
• Sorted File Method –
As the name suggests, whenever a new record has to be
inserted, it is always inserted in a sorted (ascending or
descending) manner. Sorting of records may be based on the
primary key or any other key.
Insertion of new record –
Let us assume a preexisting sorted sequence of four records
R1, R3, R7 and R8. Suppose a new record R2 has to be
inserted; it is first appended at the end of the file, and the
sequence is then re-sorted.
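The two insertion behaviours can be contrasted in a few lines. The record names follow the examples above; representing records as strings is a simplification:

```python
import bisect

# Pile file: new records are appended, preserving arrival order.
pile = ["R1", "R3", "R5", "R4"]
pile.append("R2")

# Sorted file: new records are placed so that key order is preserved.
sorted_file = ["R1", "R3", "R7", "R8"]
bisect.insort(sorted_file, "R2")

print(pile)         # ['R1', 'R3', 'R5', 'R4', 'R2']
print(sorted_file)  # ['R1', 'R2', 'R3', 'R7', 'R8']
```

`bisect.insort` does the "insert then keep the sequence sorted" step in one call; a real DBMS would instead append and periodically re-sort or merge, as the text describes.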
Heap File Organization
 Heap File Organization works with data blocks. In this
method records are inserted at the end of the file, into the
data blocks. No Sorting or Ordering is required in this method.
 If a data block is full, the new record is stored in some other
block, Here the other data block need not be the very next
data block, but it can be any block in the memory. It is the
responsibility of DBMS to store and manage the new records.
• Insertion of new record –
Suppose we have four records R1, R5, R6 and R4 in the heap,
and a new record R2 has to be inserted. Since the last data
block (data block 3) is full, R2 will be inserted into any of the
data blocks selected by the DBMS, say data block 1.
If we want to search, delete or
update data in heap file
organization, we traverse the
data from the beginning of the
file until we find the requested
record. Thus, if the database is
very large, searching, deleting or
updating a record will take a lot
of time.
Hashing
In a database management system, when we want to
retrieve a particular piece of data, it is very inefficient to
search through all the index values to reach it. In this
situation, the hashing technique comes into the picture.
• Hashing is an efficient technique to directly search the
location of desired data on the disk without using index
structure. Data is stored at the data blocks whose
address is generated by using hash function. The memory
location where these records are stored is called as data
block or data bucket.
Hash File Organization:
• Data bucket – Data buckets are the memory locations
where the records are stored. These buckets are also
considered as Unit Of Storage.
• Hash Function – Hash function is a mapping function
that maps all the set of search keys to actual record
address. Generally, hash function uses the primary key to
generate the hash index – address of the data block.
The hash function can be anything from a simple to a
complex mathematical function.
• Hash Index-The prefix of an entire hash value is taken as
a hash index. Every hash index has a depth value to
signify how many bits are used for computing a hash
function.
Static Hashing:
• In static hashing, when a search-key value is provided, the
hash function always computes the same address. For
example, if we want to generate an address for STUDENT_ID =
104 using mod (5) hash function, it always results in the same
bucket address 4. There will not be any changes to the bucket
address here. Hence a number of data buckets in the memory
for this static hashing remain constant throughout.
Insertion – When a new record is inserted into the table, The
hash function h generates a bucket address for the new
record based on its hash key K.
Bucket address = h(K)
Searching – When a record needs to be searched, the same hash function
is used to retrieve its bucket address. For example, to retrieve the whole
record for ID 104, if the hash function is mod (5) on that ID, the bucket
address generated is 4. We then go directly to address 4 and retrieve the
whole record for ID 104. Here the ID acts as the hash key.
Deletion – If we want to delete a record, Using the hash function we will
first fetch the record which is supposed to be deleted. Then we will
remove the records for that address in memory.
Updation – The data record that needs to be updated is first searched
using hash function, and then the data record is updated.
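A minimal sketch of these operations, using the mod (5) hash function and STUDENT_ID 104 from the example above. Bucket representation and record contents are made up; within each bucket we simply scan a small list:

```python
NUM_BUCKETS = 5
buckets = [[] for _ in range(NUM_BUCKETS)]  # one fixed bucket per address

def h(key):
    """Static hash function: the bucket address never changes."""
    return key % NUM_BUCKETS

def insert(key, record):
    buckets[h(key)].append((key, record))

def search(key):
    for k, rec in buckets[h(key)]:   # scan only the one bucket
        if k == key:
            return rec
    return None

def delete(key):
    buckets[h(key)] = [(k, r) for k, r in buckets[h(key)] if k != key]

insert(104, "record for student 104")
print(h(104))       # 4 -> always bucket address 4
print(search(104))  # record for student 104
```

Updation, as described above, is just a `search` followed by rewriting the record in place.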
• Now, suppose we want to insert a new record into the file,
but the data bucket at the address generated by the hash
function is not empty (the data already fills that address).
This is a critical situation to handle, and in static hashing it
is called bucket overflow.
To overcome this situation Some commonly used methods are
discussed below:
Open Hashing – In Open hashing method, next available data
block is used to enter the new record, instead of overwriting
the older one. This method is also called linear probing. For
example, D3 is a new record that needs to be inserted, the
hash function generates the address as 105. But it is already
full. So the system searches next available data bucket, 123
and assigns D3 to it.
Closed hashing – In the closed hashing method, a new data
bucket is allocated with the same address and is linked after the
full data bucket. This method is also known as overflow
chaining. For example, suppose we have to insert a new record
D3 and the static hash function generates the data bucket
address 105, but this bucket is full. In this case a new data
bucket is added at the end of the 105 data bucket and linked to
it, and the new record D3 is inserted into the new bucket.
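A small sketch of open hashing (linear probing) from above: if the home bucket is occupied, the record goes into the next free bucket. The table size and keys below are made-up values, chosen so that two keys collide on the same home bucket:

```python
SIZE = 8
table = [None] * SIZE  # one record slot per bucket (toy capacity of 1)

def insert_probing(key):
    """Linear probing: try the home slot, then each following slot."""
    home = key % SIZE
    for step in range(SIZE):
        probe = (home + step) % SIZE
        if table[probe] is None:
            table[probe] = key
            return probe
    raise RuntimeError("table full")

insert_probing(5)           # home slot 5 is free
taken = insert_probing(13)  # 13 % 8 == 5 is occupied, so it spills over
print(taken)  # 6
```

Quadratic probing would replace `home + step` with `home + step * step`; double hashing would replace the fixed step of 1 with a step computed by a second hash function.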
• Quadratic probing: Quadratic probing is very similar to open
hashing (linear probing). The difference is that where linear
probing moves to the next bucket by a fixed linear step, quadratic
probing uses a quadratic function to determine the new bucket
address.
• Double Hashing : Double Hashing is another method similar
to linear probing. Here the difference is fixed as in linear
probing, but this fixed difference is calculated by using
another hash function. That’s why the name is double
hashing.
Dynamic Hashing –
• The drawback of static hashing is that it does not expand or shrink
dynamically as the size of the database grows or shrinks. In dynamic
hashing, data buckets grow or shrink (are added or removed dynamically) as
the number of records increases or decreases. Dynamic hashing is also known
as extendible hashing. In dynamic hashing, the hash function is made to
produce a large number of values.
For example:
• Consider the following grouping of keys into buckets, depending on the
prefix of their hash address:
The last two bits of the hash addresses of keys 2 and 4 are 00, so they go into bucket B0.
The last two bits of the hash addresses of 5 and 6 are 01, so they go into bucket B1.
The last two bits of the hash addresses of 1 and 3 are 10, so they go into bucket B2.
The last two bits of the hash address of 7 are 11, so it goes into B3.
Insert key 9 with hash address 10001 into the above structure:
• Since key 9 has hash address 10001, whose last two bits are 01, it must go into
bucket B1. But bucket B1 is full, so it gets split.
• The split separates 5 and 9 from 6: the last three bits of the hash addresses of 5
and 9 are 001, so they go into bucket B1, while the last three bits of 6's address are
101, so it goes into a new bucket B5.
• Keys 2 and 4 are still in B0. The records in B0 are pointed to by the 000 and 100
directory entries, because the last two bits of both entries are 00.
• Keys 1 and 3 are still in B2. The records in B2 are pointed to by the 010 and 110
entries, because the last two bits of both entries are 10.
• Key 7 is still in B3. The record in B3 is pointed to by the 111 and 011 entries,
because the last two bits of both entries are 11.
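The bucket choice in this example depends only on the last `depth` bits of a key's hash address, which is easy to check directly. Key 9's hash address 10001 is taken from the example above; the helper name is our own:

```python
def last_bits(hash_address, depth):
    """Return the last `depth` bits of a hash address as a bit string."""
    return format(hash_address & ((1 << depth) - 1), "0{}b".format(depth))

addr_9 = 0b10001              # hash address of key 9 from the example
print(last_bits(addr_9, 2))   # 01  -> bucket B1 before the split
print(last_bits(addr_9, 3))   # 001 -> still B1 after B1 is split
```

Growing `depth` from 2 to 3 is exactly what the bucket split does: existing keys are redistributed according to one more bit of their hash addresses, without rehashing the whole file.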
Cluster File Organization
• In cluster file organization, two or more related
tables/records are stored within the same file, known as
a cluster. These files hold two or more tables in the
same data block, and the key attributes used to map
the tables together are stored only once.
• This lowers the cost of searching for and retrieving
records that would otherwise sit in different files, as
they are now combined and kept in a single cluster.
For example, suppose we have two related tables, Employee
and Department. These tables can be combined using a join
operation and stored together in a cluster file.
If we have to insert, update or delete any record we can directly
do so. Data is sorted based on the primary key or the key with
which searching is done. Cluster key is the key with which
joining of the table is performed.
Types of Cluster File Organization – There are two ways to
implement this method:
• Indexed Clusters –
In indexed clustering, the records are grouped based on the
cluster key and stored together. The Employee and
Department example above is an indexed cluster, where
the records are grouped based on the Department ID.
• Hash Clusters –
This is very similar to an indexed cluster, with the only
difference that instead of storing the records based on the
cluster key, we generate a hash key value and store together
the records with the same hash key value.
Indexing
 An index is a data structure that organizes data
records on the disk to make the retrieval of data
efficient.
Indexes are created using a few database columns.
• The first column is the Search key that contains a
copy of the primary key or candidate key of the
table. These values are stored in sorted order so that
the corresponding data can be accessed quickly.
Note: The data may or may not be stored in sorted
order.
• The second column is the Data
Reference or Pointer which contains a set of
pointers holding the address of the disk block where
that particular key value can be found.
Ordered Indices:
Based on sorted ordering values.
The indices are usually sorted to make
searching faster. The indices which are sorted
are known as ordered indices
Primary Index
• If the index is created on the basis of the primary
key of the table, then it is known as primary
indexing. These primary keys are unique to each
record and contain 1:1 relation between the
records.
• As primary keys are stored in sorted order, the
performance of the searching operation is quite
efficient.
The primary index can be classified into two types:
• Dense index and
• Sparse index.
Dense index
• The dense index contains an index record for every
search key value in the data file. It makes searching
faster.
• In this, the number of records in the index table is the
same as the number of records in the main table.
• It needs more space to store index record itself. The
index records have the search key and a pointer to
the actual record on the disk.
Sparse Index:
 In a sparse index, an index record appears for only
some of the search-key values; each entry points to a block.
 In this, instead of pointing to each record in the main
table, the index points to records in the main table at
intervals.
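A sparse-index lookup can be sketched as: binary-search the index for the last entry not exceeding the key, then scan the one block it points to. The index keys and block contents below are made-up example values:

```python
import bisect

index_keys = [101, 120, 130]  # smallest search key in each block
blocks = [[101, 105, 110], [120, 125], [130, 131, 132]]

def lookup(key):
    """Find the last index entry <= key, then scan that block."""
    i = bisect.bisect_right(index_keys, key) - 1
    if i < 0:
        return False          # key is smaller than every indexed key
    return key in blocks[i]   # sequential scan within one block

print(lookup(125))  # True  -- found in the block that starts at 120
print(lookup(119))  # False -- the block starting at 101 does not hold it
```

Only one block is scanned per lookup, which is the point of the sparse index: the index stays small while the per-lookup scan stays bounded by one block.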
Clustering Index
• A clustered index can be defined as an ordered data
file. Sometimes the index is created on non-primary
key columns which may not be unique for each
record.
• In this case, to identify the record faster, we will
group two or more columns to get the unique value
and create index out of them. This method is called a
clustering index.
• The records which have similar characteristics are
grouped, and indexes are created for these group.
Secondary Index
• In the sparse indexing, as the size of the table grows,
the size of mapping also grows.
• These mappings are usually kept in primary
memory so that address fetches are faster. The
secondary memory is then searched for the actual
data based on the address obtained from the
mapping. If the mapping size grows, fetching the
address itself becomes slower.
• In this case, the sparse index will not be efficient. To
overcome this problem, secondary indexing is
introduced.
In secondary indexing, to reduce the size of
mapping, another level of indexing is introduced.
In this method, the huge range for the columns is
selected initially so that the mapping size of the
first level becomes small. Then each range is
further divided into smaller ranges.
The mapping of the first level is stored in the
primary memory, so that address fetch is faster.
The mapping of the second level and actual data
are stored in the secondary memory (hard disk).
 Single level Indexing:
 The index is usually specified on one field of the
file.
 Types of single level indexing can be primary
indexing, clustering index or secondary indexing.
Example: an index is a table of (Search Key, Pointer to Record) pairs.
Index entries (Search Key → Address): 101, 120, 130.
The entry 101 points to the data block holding the records
(Roll No., Name, Age): (101, Aa, 25), (102, bb, 20).
The entry 130 points to the data block holding:
(130, Xx, 32), (131, Yy, 28), (132, zz, 30).
Multi level Indexing:
 Multilevel index is stored on the disk along with the
actual database files.
 Multi-level Index helps in breaking down the index
into several smaller indices in order to make the
outermost level so small that it can be saved in a single
disk block, which can easily be accommodated
anywhere in the main memory.
 Multi level Indexing – example:
(Diagram: a two-level index. The outermost index holds the keys
2, 35, 55, 85. Each entry points to a block of the inner index
(2, 8, 15, 24; 35, 39, 44, 51; 55, 63, 71, 80), whose entries in
turn point to the data blocks holding keys such as 2, 5; 24, 29;
35, 36; 51, 53; 55, 61; 80, 82.)
B+ Tree
• The B+ tree is a balanced search tree (not a binary tree: a
node may have many children). It follows a multi-level index
format.
• B+ tree is a storage method with a tree-like structure.
• A B+ tree has one root, any number of intermediate nodes,
and leaf nodes.
• In the B+ tree, leaf nodes hold the actual data pointers, and
the leaf level stores the records in sorted order.
• An intermediate node holds only pointers to the nodes below
it; it has no data.
• The B+ tree ensures that all leaf nodes remain at the same
height.
• In the B+ tree, the leaf nodes are linked using a linked list.
Therefore, a B+ tree can support random access as well
as sequential access.
Structure of B+ Tree
In the B+ tree, every leaf node is at equal distance
from the root node.
The B+ tree is of the order n where n is fixed for every
B+ tree.
It contains an internal node and leaf node.
Internal node
• An internal node of the B+ tree can contain at least n/2
record pointers except the root node.
• At most, an internal node of the tree contains n pointers.
Leaf node
• The leaf node of the B+ tree can contain at least n/2 record
pointers and n/2 key values.
• At most, a leaf node contains n record pointers and n key
values.
• Every leaf node of the B+ tree contains one block pointer P
to point to the next leaf node.
Searching a record in B+ Tree
• Suppose we have to search for 55 in the B+ tree structure below.
First we go to the intermediary node, which directs us to the
leaf node that can contain the record for 55.
• In the intermediary node, we take the branch between the
keys 50 and 75. This leads us to the third leaf node, where the
DBMS performs a sequential search to find 55.
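A minimal sketch of that search: at the internal node, pick the branch for the key, then scan the leaf sequentially. The node contents are assumed from the walk-through (a root with keys 50 and 75); a real tree would have more levels:

```python
import bisect

# One internal (root) node with keys [50, 75] and three leaf children.
root = {"keys": [50, 75],
        "children": [[10, 20, 30],          # keys < 50
                     [50, 55, 65, 70],      # 50 <= keys < 75
                     [75, 80, 85]]}         # keys >= 75

def search(tree, key):
    i = bisect.bisect_right(tree["keys"], key)  # choose the branch
    leaf = tree["children"][i]
    return key in leaf                           # sequential scan in leaf

print(search(root, 55))  # True
```

Because every root-to-leaf path has the same length, the number of branch decisions (and hence disk reads) is the same for every key.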
B+ Tree Insertion
• Suppose we want to insert a record 60 in the below structure.
It will go to the 3rd leaf node after 55. It is a balanced tree,
and a leaf node of this tree is already full, so we cannot insert
60 there.
• In this case, we have to split the leaf node, so that it can be
inserted into tree without affecting the fill factor, balance and
order.
• The 3rd leaf node has the values (50, 55, 60, 65, 70) and its
current root node is 50. We will split the leaf node of the tree
in the middle so that its balance is not altered. So we can
group (50, 55) and (60, 65, 70) into 2 leaf nodes.
• If these two are to be leaf nodes, the intermediate node
cannot branch only at 50. It should have 60 added to it, and
then we can have a pointer to the new leaf node.
B+ Tree Deletion
• Suppose we want to delete 60 from the above example.
In this case, we have to remove 60 from the intermediate
node as well as from the 4th leaf node too. If we remove
it from the intermediate node, then the tree will not
satisfy the rule of the B+ tree. So we need to modify it to
have a balanced tree.
• After deleting node 60 from above B+ tree and re-
arranging the nodes, it will show as follows:
Query Processing
 Parsing and Translation:
The system must translate the query into a usable form.
 A more useful internal representation is one based on
the extended relational algebra.
 The parser checks the syntax of the user’s query, verifies
that the relation names appearing in the query are names of
the relations in the database.
 Parsing and Translation - Example:
 select salary from instructor where salary <
75000;
 This query can be translated into either of the
following relational-algebra expressions:
σ_salary<75000 (Π_salary (instructor))
Π_salary (σ_salary<75000 (instructor))
 Optimization:
 During this process the query evaluation plan is prepared
from all the relational algebraic expressions.
 The query cost for all the evaluation plans is calculated.
 Amongst all equivalent evaluation plans the one with
lowest cost is chosen.
 Cost is estimated using statistical information from the
database catalog, such as size of tuples, etc.
 Evaluation:
 The query execution engine takes a query-evaluation plan,
executes that plan, and returns the answers to the query.
Measures of Query Cost
 Many factors contribute to time cost
 Disk accesses, CPU, or even network communication
 Typically disk access is the predominant cost, and is
also relatively easy to estimate. Measured by taking
into account
Number of seeks * average-seek-cost
Number of blocks read * average-block-read-cost
Number of blocks written * average-block-write-cost
Cost to write a block is greater than the cost to read a block
 because data is read back after being written, to ensure that the write was successful.
For simplicity we just use the number of block transfers
from disk and the
number of seeks as the cost measures
tT – time to transfer one block
tS – time for one seek
Cost for b block transfers plus S seeks: b * tT + S * tS
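The cost formula above can be wrapped in a one-line helper. The values of tT and tS below are assumed figures (0.1 ms per block transfer, 4 ms per seek), not values from the text:

```python
def query_cost(b, S, tT=0.1, tS=4.0):
    """Estimated I/O cost in milliseconds:
    b block transfers plus S seeks -> b*tT + S*tS."""
    return b * tT + S * tS

# Linear scan of a 100-block relation: 100 transfers + 1 initial seek
cost = query_cost(100, 1)   # about 14 ms with these assumed figures
```

Note how the single seek (4 ms) costs as much as 40 block transfers, which is why the algorithms below count seeks and transfers separately.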
Algorithm for Selection Operation
 File scan – search algorithms that locate and
retrieve records that fulfill a selection
condition.
 Algorithm A1 (linear search). Scan each file
block and test all records to see whether they
satisfy the selection condition.
 Algorithm A1 (linear search):
 Cost estimate = br block transfers + 1 seek
 br denotes number of blocks containing records from relation r.
 If selection is on a key attribute, can stop on finding record
 cost = (br /2) block transfers + 1 seek
 Linear search can be applied regardless of
selection condition or ordering of records in the file, or
availability of indices
 Algorithm A2 (binary search):
 Applicable if selection is an equality comparison on the attribute
on which file is ordered.
 Assume that the blocks of a relation are stored contiguously.
 Cost estimate (number of disk blocks to be scanned):
cost of locating the first tuple by a binary search on the blocks
 [log2(br)] * (tT + tS)
 If there are multiple records satisfying selection
Add transfer cost of the number of blocks containing
records that satisfy selection condition
Algorithm for JOIN Operation
 Nested Loop Join - To compute the theta join r ⋈θ s:
for each tuple tr in r do begin
for each tuple ts in s do begin
test pair (tr, ts) to see if they satisfy the join condition θ
if they do, add tr • ts to the result.
end
end
 Nested Loop Join:
 r is called the outer relation and s the inner relation of the
join.
Requires no indices and can be used with any kind of join
condition.
Expensive since it examines every pair of tuples in the two
relations.
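The nested-loop pseudocode above translates directly into executable form. Relations are represented here as lists of dicts, and the relation and attribute names are made-up examples:

```python
def nested_loop_join(r, s, theta):
    """Nested-loop join: r is the outer relation, s the inner one.
    Works for any join condition theta, with no index required."""
    result = []
    for tr in r:                      # scan outer relation
        for ts in s:                  # scan inner relation per outer tuple
            if theta(tr, ts):         # test the join condition
                result.append({**tr, **ts})
    return result

depositor = [{"cust": "A", "acct": 1}, {"cust": "B", "acct": 2}]
account = [{"acct": 1, "bal": 500}, {"acct": 3, "bal": 900}]
out = nested_loop_join(depositor, account,
                       lambda tr, ts: tr["acct"] == ts["acct"])
print(out)  # [{'cust': 'A', 'acct': 1, 'bal': 500}]
```

Every pair of tuples is examined, which is exactly why the cost estimates below grow with nr * bs.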
 Nested Loop Join:
 In the worst case, if there is enough memory only to hold one
block of each relation, the estimated cost is nr * bs + br
block transfers, plus nr + br
seeks
If the smaller relation fits entirely in memory, use that as the
inner relation.
Reduces cost to br + bs block transfers and 2 seeks
 Nested Loop Join:
 Assuming worst case memory availability cost estimate is
with depositor as outer relation:
5000 * 400 + 100 = 2,000,100 block transfers,
5000 + 100 = 5100 seeks
with customer as the outer relation
10000 * 100 + 400 = 1,000,400 block transfers and 10,400 seeks
 Block Nested Loop Join
 Variant of nested loop join in which every block of inner
relation is paired with every block of outer relation.
 Merge Join
 Sort both relations on their join attribute (if not already sorted
on the join attributes).
Merge the sorted relations to join them.
 Can be used only for equijoins and natural joins
 the cost of merge join is: br + bs block transfers + [br / bb]+ [bs
/ bb] seeks
+ the cost of sorting if relations are unsorted.
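The merge step above can be sketched as follows, assuming both inputs are already sorted on the join attribute and that the attribute has no duplicates in either relation (duplicates would need a slightly more involved merge). Relation contents are made up:

```python
def merge_join(r, s, key):
    """Merge two relations sorted on `key`, advancing whichever
    side currently has the smaller key (equijoin, no duplicates)."""
    i = j = 0
    result = []
    while i < len(r) and j < len(s):
        if r[i][key] < s[j][key]:
            i += 1
        elif r[i][key] > s[j][key]:
            j += 1
        else:                          # keys match: emit joined tuple
            result.append({**r[i], **s[j]})
            i += 1
            j += 1
    return result

r = [{"id": 1, "a": "x"}, {"id": 3, "a": "y"}]
s = [{"id": 3, "b": "p"}, {"id": 4, "b": "q"}]
print(merge_join(r, s, "id"))  # [{'id': 3, 'a': 'y', 'b': 'p'}]
```

Each relation is scanned exactly once, which is why the transfer cost above is just br + bs once the inputs are sorted.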
 Hash Join:
 A hash function h is used to partition the tuples of
both relations.
 Cost: 3(br + bs) + 4·nh block transfers + 2([br/bb] + [bs/bb])
seeks, where nh is the number of partitions.
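In the simplest in-memory case, the partitioning reduces to building a hash table on one relation and probing it with tuples of the other. A sketch with made-up relations (a real hash join would partition both relations to disk first):

```python
from collections import defaultdict

def hash_join(r, s, key):
    """Equijoin on `key`: build a hash table on s, probe with r."""
    buckets = defaultdict(list)
    for ts in s:                       # build phase
        buckets[ts[key]].append(ts)
    result = []
    for tr in r:                       # probe phase
        for ts in buckets.get(tr[key], []):
            result.append({**tr, **ts})
    return result

r = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}]
s = [{"id": 2, "b": "p"}]
print(hash_join(r, s, "id"))  # [{'id': 2, 'a': 'y', 'b': 'p'}]
```

Building on the smaller relation keeps the hash table small, mirroring the advice for nested-loop joins of putting the smaller relation on the side that stays in memory.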
Query Optimization
 Heuristic Estimation:
 A heuristic is a rule that leads to the least cost in most cases.
 Systems may use heuristics to reduce the number of choices
that must be made in a cost-based fashion.
 Heuristic optimization transforms the query tree by using a
set of rules that typically improve execution performance.
Query Optimization
 Query Tree:
 SELECT schedule, room FROM Student NATURAL
JOIN Enroll NATURAL JOIN Class WHERE
Major='Math'
Query Optimization
 Heuristic Estimation – Rules:
 Perform selection early (reduces the number of tuples)
 Perform projection early (reduces the number of attributes)
 Perform most restrictive selection and join operations
before other similar operations.
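The first rule (perform selection early) can be illustrated with a small sketch modeled on the Student/Enroll query above. The table contents and sizes here are hypothetical, chosen only to make the intermediate-result sizes visible:

```python
# Hypothetical catalog: 100 students (10 Math majors), each enrolled
# in 3 classes, standing in for the Student and Enroll tables.
students = [{"sid": s, "major": "Math" if s % 10 == 0 else "CS"}
            for s in range(100)]
enrolls = [{"sid": s, "class": f"C{c}"} for s in range(100) for c in range(3)]

# Plan A: join first, apply the selection afterwards.
joined_a = [{**st, **en} for st in students for en in enrolls
            if st["sid"] == en["sid"]]
final_a = [t for t in joined_a if t["major"] == "Math"]

# Plan B: push the selection below the join (the heuristic).
math_students = [st for st in students if st["major"] == "Math"]
final_b = [{**st, **en} for st in math_students for en in enrolls
           if st["sid"] == en["sid"]]

# Both plans produce the same 30 rows, but plan A materializes 300
# intermediate tuples while plan B joins only 10 student tuples.
```

The answer is identical either way; only the size of the intermediate result (and hence the cost) differs.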
Query Optimization
 Heuristic Estimation – Steps:
 Scanner and parser generate initial query representation
 Representation is optimized according to heuristic rules
 Query execution plan developed.
Query Optimization
 Cost based Estimation:
 Look at all of the possible ways or scenarios in which a
query can be executed
 Each scenario will be assigned a ‘cost’, which indicates
how efficiently that query can be run
 Pick the scenario that has the least cost and execute the
query using that scenario, because that is the most efficient
way to run the query.
Query Optimization
 Cost based Estimation:
 The scope of query optimization is a single query block; global
query optimization involves multiple query blocks.
 Cost components: access cost to secondary storage, disk
storage cost, computation cost, memory-usage cost, and
communication cost.
Query Optimization
 Cost based Estimation:
 Information stored in DBMS catalog and used by optimizer:
 File size
 File organization
 Number of levels of each multilevel index
 Number of distinct values of an attribute
 Attribute selectivity.
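Using the nested-loop cost formula and the depositor/customer statistics from the earlier slides, the cost-based choice of outer relation can be sketched as follows (function and variable names are illustrative):

```python
def nested_loop_cost(n_outer, b_outer, b_inner):
    """Worst-case cost of tuple-at-a-time nested loop join:
    n_outer * b_inner + b_outer block transfers, n_outer + b_outer seeks."""
    transfers = n_outer * b_inner + b_outer
    seeks = n_outer + b_outer
    return transfers, seeks

# Catalog statistics from the slides: 5,000 depositor tuples in 100
# blocks, 10,000 customer tuples in 400 blocks.
stats = {"depositor": (5000, 100), "customer": (10000, 400)}

plans = {}
for outer, inner in (("depositor", "customer"), ("customer", "depositor")):
    n_r, b_r = stats[outer]
    _, b_s = stats[inner]
    plans[outer] = nested_loop_cost(n_r, b_r, b_s)

# Pick the plan with the fewest block transfers, as a cost-based
# optimizer would.
best_outer = min(plans, key=lambda rel: plans[rel][0])
```

This reproduces the slide figures (2,000,100 vs. 1,000,400 block transfers) and selects customer as the outer relation.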
UNIT III.pptx

  • 1. UNIT III DATA STORAGE AND QUERY PROCESSING
  • 2. Overview of Physical Storage Media – Magnetic Disks and Flash Storage – RAID – Tertiary Storage – File Organization – Organization of Records in Files – Indexing and Hashing: Basic Concepts – Ordered Indices – B+ tree Index Files – B+ tree Extensions – Static Hashing – Dynamic Hashing – Query Processing Overview – Selection Operation – Sorting – Join Operation.
  • 3. Overview of Physical Storage Media
  • 4.
  • 5. Overview of Physical Storage Media Storage media are classified into different types based on the following:  Accessing Speed  Cost per unit of data  Reliability
  • 6. Based on storage volatility they can be classified into 2 types:  Volatile storage: Loses the contents when the power to the device is removed. E.g.: Main memory and cache.  Non-Volatile storage: Contents persist even when the power is switched off. E.g.: Secondary & Tertiary storage devices.
  • 8.  Primary Storage  Secondary Storage  Tertiary Storage
  • 9. Primary Storage:  This category usually provides fast access to data, but has limited storage capacity.  It is volatile in nature.  Example: Cache and main memory Secondary Storage:  These devices usually have a large capacity.  Less cost and slower access to data  It is non-volatile.  E.g.: Magnetic disks
  • 10. Tertiary Storage:  This is in the lowest level of hierarchy.  Non-volatile,  Slow access time  Example: Magnetic tape, optical storage
  • 11. Cache Memory It is the fastest and most costly form of storage.  It is volatile in nature.  It is managed by computer system hardware.  Cache memory, lies in between CPU and the Main memory
  • 12. Main Memory Fast access, generally two small to store the entire database. Too expensive. Capacities of upto a few giga bytes widely used currently. Ex: RAM
  • 13. Flash Memory It is present between primary storage and secondary storage in the storage hierarchy.  It is non volatile memory.  Accessing speed is as fast as reading data from main memory.  Widely used in embedded devices such as digital cameras, Video games, etc
  • 14. Magnetic Disk  Primary medium for long term storage of data, typically stores entire database.  Data must be moved from disk to main memory for access and written back for storage.  Much slower access than main memory.  Capacities ranges upto 400 Gigabytes currently  Much larger capacity than main and flash memory.  Disk storage survives power failures and system crashes  Disk failure can destroy data but is very rare.
  • 15. Schematic diagram of a magnetic disk
  • 16. Mechanism Physically, disks are relatively simple. Each platter has a flat circular shape.  Its two surfaces are covered with a magnetic material, and information is recorded on the surfaces.  Platters are made from rigid metal or glass
  • 17.  When the disk is in use, a drive motor spins it at a constant high speed.  There is a read-write head positioned just above the surface of the platter.  The disk surface is logically divided into tracks over 50,000 to 100,000 tracks per platter and 1 to 5 platters per disk.  Each track is divided into sectors. A sector is a smallest unit of information that can be read from or written to the disk. Sector size typically 512 bytes.
  • 18.  Inner tracks are of smaller length (500 sectors per track) and outer tracks contains more sectors than inner tracks (1000 sectors per track).  The read-write head stores information on a sector magnetically.  The read-write head of all the tracks are mounted on a single assembly called a disk arm.
  • 19.  The disk platters mounted on a spindle and heads mounted on a disk arm are together known as head - disk assemblies  Cylinder is consisting of the track of all the platters  A disk-controller interfaces between the computer system and the disk drive hardware.
  • 20.  Disk controllers attach checksums to each sector to verify that data is read back correctly. The checksum is computed from the data written to the sector.  Another task of disk controller is remapping of bad sectors.  There are number of common interfaces for connecting disks to personal computers - (a) ATA interface and (b) SCSI (small computer system Interconnected) interface.
  • 21. Performance Measures of Disks Access time: The time it takes from when a read or write request is issued to when data transfer begins.  Seek time: Time it takes to reposition the arm over the correct track. The seek time ranges from 2 to 30 milliseconds. The average seek time is one-third the worst case seek time and one half the maximum seek time. Average seek time currently ranges between 4 to 10 milliseconds.
  • 22.  Rotational latency: Time it takes for the sector to be accessed to appear under the head. It ranges from 4 to 11 milliseconds.  Data-transfer rate: The rate at which data can be retrieved from or stored to the disk. It ranges from 25 to 100 megabytes per second.
  • 23.  Mean time to failure (MTTF): The average time, the disk is expected to run continuously without any failure.  It is the measure of the reliability of the disk  Typically 3 to 5 years  Probability of failure of new disks is quite low  MTTF decreases as disk ages.
  • 24. Optimization of Disk-Block Access  Techniques used for accessing data from disk:  Scheduling  Disk arm scheduling algorithm order pending accesses to tracks so that disk arm movement is minimized.  Commonly used algorithm is elevator algorithm.  Move disk arm in one direction (From outer to inner tracks or vice versa), processing next request in that direction, till no more requests in that direction, then reverse direction and repeat.
  • 25.  File organization:  Optimize block access time by organizing the blocks to correspond to how data will be accessed. Eg: store related information’s on the same or nearby cylinders. Sequential file may become fragmented, that is its blocks become scattered all over the disk. Sequential access to a fragmented file results in increased disk arm movement.
  • 26. Non-volatile write buffers: (NV-RAM) To speed up disk writes by writing blocks to a non- volatile RAM buffer immediately, the contents to NV-RAM are not lost in power failure. Log disk A disk devoted to writing a sequential log to block updates used exactly like non-volatile RAM. Write to log disk is very fast since no seeks are required Journaling file system write data in safe order to NV- RAM or log disk.
  • 27. RAID  Redundant Array of Independent Disks  Multiple secondary disks are connected together to increase the performance, data redundancy or both.  Need:  To increase the performance  Increased reliability  To give greater throughput  Data are restored
  • 28. RAID – Level 0 Data is stripped into multiple drives
  • 29. RAID – Level 0  Data is broken down into blocks and these blocks are stored across all the disks.  Thus striped array of disks is implemented.  There is no duplication of data in this level so once a block is lost then there is no way recover it.  It has good performance.
  • 30. RAID – Level 1 Mirroring of data in drive 1 to drive 2. It offers 100% redundancy as array will continue to work even if either disk fails.
  • 31. RAID – Level 1  uses mirroring techniques  All data in the drive is duplicated to another drive.  It provides 100% redundancy in case of a failure.  Advantage: Fault Tolerance
  • 32. RAID 10, also known as RAID 1+0, is a RAID configuration that combines disk mirroring and disk striping to protect data. It requires a minimum of four disks and stripes data across mirrored pairs.
  • 34. RAID – Level 2  Use of mirroring as well as stores Error Correcting codes for its data striped on different disks.  Each data bit in a word is recorded on a separate disk and ECC codes of the data words are stored on a different set disks.  Due to its complex structure and high cost, RAID 2 is not commercially available.
  • 35. RAID – Level 3 One dedicated drive is used to store the parity information and in case of any drive failure the parity is restored using this extra drive.
  • 36. RAID – Level 3  It consists of byte level stripping with dedicated parity. In this level, the parity information is stored for each disk section and written to dedicated parity drive.  Parity is a technique that checks whether data has been lost or written over when it is moved from one place in storage to another.
  • 37. RAID – Level 3  In the case of disk failure, the parity disk is accessed and data is reconstructed from the remaining devices.  Once the failed disk is replaced, the missing data can be restored on the new disk.
  • 38. RAID – Level 4 This level is very much similar to RAID 3 apart from the feature where RAID 4 uses block level stripping rather than byte level.
  • 39. RAID – Level 4  It consists of block level stripping with a parity disk.
  • 40. RAID – Level 5 Parity information is written to a different disk in the array for each stripe. In case of single disk failure data can be recovered with the help of distributed parity without affecting the operation and other read write operations.
  • 41. RAID – Level 5  RAID 5 writes whole data blocks onto different disks, but the parity bits generated for data block stripe are distributed among all the data disks rather than storing them on a different dedicated disk.
  • 42. RAID – Level 6 This level is an enhanced version of RAID 5 adding extra benefit of dual parity. This level uses block level stripping with DUAL distributed parity
  • 43. RAID – Level 6  RAID 6 is a extension of Level 5.  In this level, two independent parities are generated and stored in distributed fashion among multiple disks.  Two parities provide additional fault tolerance.  This level requires at least four disk drives to implement RAID.
  • 44.  The factors to be taken into account in choosing a RAID level are:  Performance requirements in terms of number of I/O operation.  Performance when a disk has failed.  Performance during rebuild.
  • 45. File Organization  A method of arranging records in a file when the file is stored on disk.  A file is organized logically as a sequence of records.  Record is a sequence of fields.  Each file is also logically partitioned into fixed-length storage units called blocks, which are the units of both storage allocation and data transfer.
  • 46. What is File Organization? In simple terms, Storing the files in certain order is called file Organization. File Structure refers to the format of the label and data blocks and of any logical control record. Types of File Organizations • Sequential File Organization • Heap File Organization • Hash File Organization • B+ Tree File Organization • Clustered File Organization
  • 47. Sequential File Organization – • The easiest method for file Organization is Sequential method. In this method the file are stored one after another in a sequential manner. There are two ways to implement this method: 1. Pile File Method 2. Sorted File Method Pile File Method – This method is quite simple, in which we store the records in a sequence i.e one after other in the order in which they are inserted into the tables.
  • 48. Insertion of new record – Let the R1, R3 and so on upto R5 and R4 be four records in the sequence. Here, records are nothing but a row in any table. Suppose a new record R2 has to be inserted in the sequence, then it is simply placed at the end of the file.
  • 49. • Sorted File Method – In this method, As the name itself suggest whenever a new record has to be inserted, it is always inserted in a sorted (ascending or descending) manner. Sorting of records may be based on any primary key or any other key. Insertion of new record – Let us assume that there is a preexisting sorted sequence of four records R1, R3, and so on upto R7 and R8. Suppose a new record R2 has to be inserted in the sequence, then it will be inserted at the end of the file and then it will sort the sequence .
  • 50. Heap File Organization  Heap File Organization works with data blocks. In this method records are inserted at the end of the file, into the data blocks. No Sorting or Ordering is required in this method.  If a data block is full, the new record is stored in some other block, Here the other data block need not be the very next data block, but it can be any block in the memory. It is the responsibility of DBMS to store and manage the new records.
  • 51. • Insertion of new record – Suppose we have four records in the heap R1, R5, R6, R4 and R5 and suppose a new record R2 has to be inserted in the heap then, since the last data block i.e data block 3 is full it will be inserted in any of the data blocks selected by the DBMS, lets say data block 1. If we want to search, delete or update data in heap file Organization the we will traverse the data from the beginning of the file till we get the requested record. Thus if the database is very huge, searching, deleting or updating the record will take a lot of time.
  • 52. Hashing In a database management system, When we want to retrieve a particular data, It becomes very inefficient to search all the index values and reach the desired data. In this situation, Hashing technique comes into picture. • Hashing is an efficient technique to directly search the location of desired data on the disk without using index structure. Data is stored at the data blocks whose address is generated by using hash function. The memory location where these records are stored is called as data block or data bucket.
  • 53. Hash File Organization: • Data bucket – Data buckets are the memory locations where the records are stored. These buckets are also considered as Unit Of Storage. • Hash Function – Hash function is a mapping function that maps all the set of search keys to actual record address. Generally, hash function uses the primary key to generate the hash index – address of the data block. Hash function can be simple mathematical function to any complex mathematical function. • Hash Index-The prefix of an entire hash value is taken as a hash index. Every hash index has a depth value to signify how many bits are used for computing a hash function.
  • 54. Static Hashing: • In static hashing, when a search-key value is provided, the hash function always computes the same address. For example, if we want to generate an address for STUDENT_ID = 104 using mod (5) hash function, it always results in the same bucket address 4. There will not be any changes to the bucket address here. Hence a number of data buckets in the memory for this static hashing remain constant throughout. Insertion – When a new record is inserted into the table, The hash function h generates a bucket address for the new record based on its hash key K. Bucket address = h(K)
  • 55. Searching – When a record needs to be searched, The same hash function is used to retrieve the bucket address for the record. For Example, if we want to retrieve the whole record for ID 104, and if the hash function is mod (5) on that ID, the bucket address generated would be 4. Then we will directly got to address 4 and retrieve the whole record for ID 104. Here ID acts as a hash key. Deletion – If we want to delete a record, Using the hash function we will first fetch the record which is supposed to be deleted. Then we will remove the records for that address in memory. Updation – The data record that needs to be updated is first searched using hash function, and then the data record is updated.
  • 56. • Now, If we want to insert some new records into the file But the data bucket address generated by the hash function is not empty or the data already exists in that address. This becomes a critical situation to handle. This situation in the static hashing is called bucket overflow. To overcome this situation Some commonly used methods are discussed below: Open Hashing – In Open hashing method, next available data block is used to enter the new record, instead of overwriting the older one. This method is also called linear probing. For example, D3 is a new record that needs to be inserted, the hash function generates the address as 105. But it is already full. So the system searches next available data bucket, 123 and assigns D3 to it.
  • 57.
  • 58. Closed hashing – In Closed hashing method, a new data bucket is allocated with same address and is linked it after the full data bucket. This method is also known as overflow chaining. For example, we have to insert a new record D3 into the tables. The static hash function generates the data bucket address as 105. But this bucket is full to store the new data. In this case is a new data bucket is added at the end of 105 data bucket and is linked to it. Then new record D3 is inserted into the new bucket.
  • 59. • Quadratic probing : Quadratic probing is very much similar to open hashing or linear probing. Here, The only difference between old and new bucket is linear. Quadratic function is used to determine the new bucket address. • Double Hashing : Double Hashing is another method similar to linear probing. Here the difference is fixed as in linear probing, but this fixed difference is calculated by using another hash function. That’s why the name is double hashing.
  • 60. Dynamic Hashing – • The drawback of static hashing is that it does not expand or shrink dynamically as the size of the database grows or shrinks. In Dynamic hashing, data buckets grows or shrinks (added or removed dynamically) as the records increases or decreases. Dynamic hashing is also known as extended hashing. In dynamic hashing, the hash function is made to produce a large number of values. For example: • Consider the following grouping of keys into buckets, depending on the prefix of their hash address:
  • 61. The last two bits of 2 and 4 are 00. So it will go into bucket B0. The last two bits of 5 and 6 are 01, so it will go into bucket B1. The last two bits of 1 and 3 are 10, so it will go into bucket B2. The last two bits of 7 are 11, so it will go into B3.
  • 62. Insert key 9 with hash address 10001 into the above structure: • Since key 9 has hash address 10001, it must go into the first bucket. But bucket B1 is full, so it will get split. • The splitting will separate 5, 9 from 6 since last three bits of 5, 9 are 001, so it will go into bucket B1, and the last three bits of 6 are 101, so it will go into bucket B5. • Keys 2 and 4 are still in B0. The record in B0 pointed by the 000 and 100 entry because last two bits of both the entry are 00. • Keys 1 and 3 are still in B2. The record in B2 pointed by the 010 and 110 entry because last two bits of both the entry are 10. • Key 7 are still in B3. The record in B3 pointed by the 111 and 011 entry because last two bits of both the entry are 11.
  • 63. Cluster File Organization • In cluster file organization, two or more related tables/records are stored within same file known as clusters. These files will have two or more tables in the same data block and the key attributes which are used to map these table together are stored only once. • Thus it lowers the cost of searching and retrieving various records in different files as they are now combined and kept in a single cluster. For example we have two tables or relation Employee and Department. These table are related to each other.
  • 64. These tables can therefore be combined using a join operation and stored together in a cluster file.
  • 65. If we have to insert, update or delete any record, we can do so directly. Data is sorted based on the primary key or the key with which searching is done. The cluster key is the key on which the join of the tables is performed. Types of Cluster File Organization – There are two ways to implement this method: • Indexed Clusters – In indexed clustering the records are grouped based on the cluster key and stored together. The Employee–Department example above is an indexed cluster, where the records are grouped based on the Department ID. • Hash Clusters – This is very similar to an indexed cluster, with the only difference that instead of storing the records based on the cluster key, we generate a hash key value and store together the records with the same hash key value.
  • 66. Indexing  An index is a data structure that organizes data records on the disk to make the retrieval of data efficient.
  • 67. Indexes are created using a few database columns. • The first column is the Search key that contains a copy of the primary key or candidate key of the table. These values are stored in sorted order so that the corresponding data can be accessed quickly. Note: The data may or may not be stored in sorted order. • The second column is the Data Reference or Pointer which contains a set of pointers holding the address of the disk block where that particular key value can be found.
  • 69. Ordered Indices: These are based on the sorted order of search-key values. The indices are usually sorted to make searching faster; indices sorted in this way are known as ordered indices.
  • 70. Primary Index • If the index is created on the basis of the primary key of the table, it is known as a primary index. Primary keys are unique to each record and have a 1:1 relation with the records. • As primary keys are stored in sorted order, the searching operation is quite efficient. The primary index can be classified into two types: • Dense index and • Sparse index.
  • 71. Dense index • A dense index contains an index record for every search-key value in the data file, which makes searching faster. • The number of records in the index table is the same as the number of records in the main table. • It needs more space to store the index records themselves. Each index record holds the search-key value and a pointer to the actual record on the disk.
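A dense index can be sketched as a dictionary with one entry per search-key value, each entry holding a "pointer" (here just a record offset) into the data file. The record layout and key values are illustrative, not from any particular system.

```python
# Data file, sorted on the search key (roll number); tuples are
# (roll_no, name, age). Values are made up for illustration.
records = [
    (101, "Aa", 25), (102, "bb", 20), (103, "cc", 22), (104, "dd", 21),
]

# Dense index: EVERY search-key value appears in the index,
# mapped to the offset of its record in the data file.
dense_index = {rec[0]: offset for offset, rec in enumerate(records)}

def lookup(key):
    """Fetch a record via the dense index; None if the key is absent."""
    offset = dense_index.get(key)
    return records[offset] if offset is not None else None
```

Note that the index has exactly as many entries as the data file has records, which is what makes a dense index fast to probe but costly in space.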
  • 73. Sparse Index:  An index record appears for only some of the search-key values in the data file. Each index entry points to a block.  Instead of pointing to each record in the main table, the index points to records in the main table at intervals.
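In contrast to a dense index, a sparse index keeps one entry per block: to find a key, we binary-search for the last index entry less than or equal to the key, then scan that block sequentially. The block contents below loosely mirror the Roll No./Name/Age example a few slides later; they are illustrative only.

```python
import bisect

# Blocks of the data file, sorted on the search key (roll number).
blocks = [
    [(101, "Aa", 25), (102, "bb", 20)],                    # anchor key 101
    [(120, "pp", 23), (121, "qq", 26)],                    # anchor key 120
    [(130, "Xx", 32), (131, "Yy", 28), (132, "zz", 30)],   # anchor key 130
]
# Sparse index: one entry per block, holding that block's first key.
sparse_keys = [blk[0][0] for blk in blocks]  # [101, 120, 130]

def lookup(key):
    """Binary-search the sparse index, then scan one block."""
    i = bisect.bisect_right(sparse_keys, key) - 1
    if i < 0:          # key is smaller than every anchor
        return None
    for rec in blocks[i]:
        if rec[0] == key:
            return rec
    return None
```

The index here has 3 entries for 7 records; the price of the smaller index is the sequential scan inside the chosen block.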
  • 74. Clustering Index • A clustered index can be defined as an ordered data file. Sometimes the index is created on non-primary-key columns, which may not be unique for each record. • In this case, to identify the records faster, we group two or more columns to get a unique value and create an index out of them. This method is called a clustering index. • Records that have similar characteristics are grouped together, and indexes are created for these groups.
  • 77. Secondary Index • In sparse indexing, as the size of the table grows, the size of the mapping also grows. • These mappings are usually kept in primary memory so that address fetches are fast; secondary memory is then searched for the actual data based on the address obtained from the mapping. If the mapping grows too large, fetching the address itself becomes slow. • In this case, a sparse index is no longer efficient. To overcome this problem, secondary indexing is introduced.
  • 78. In secondary indexing, to reduce the size of mapping, another level of indexing is introduced. In this method, the huge range for the columns is selected initially so that the mapping size of the first level becomes small. Then each range is further divided into smaller ranges. The mapping of the first level is stored in the primary memory, so that address fetch is faster. The mapping of the second level and actual data are stored in the secondary memory (hard disk).
  • 80.  Single level Indexing:  The index is usually specified on one field of the file.  Single level indexing can be a primary index, a clustering index or a secondary index. Each index entry has the form: Search Key | Pointer to Record
  • 81.  Single level Indexing (example):  Index (Search Key → Address): 101, 120, 130.  Data block anchored at 101 (Roll No., Name, Age): (101, Aa, 25), (102, bb, 20).  Data block anchored at 130 (Roll No., Name, Age): (130, Xx, 32), (131, Yy, 28), (132, zz, 30).
  • 82. Multi level Indexing:  Multilevel index is stored on the disk along with the actual database files.  Multi-level Index helps in breaking down the index into several smaller indices in order to make the outermost level so small that it can be saved in a single disk block, which can easily be accommodated anywhere in the main memory.
  • 83.  Multi level Indexing: [Diagram: outer index block (2, 35, 55, 85) → inner index blocks (2, 8, 15, 24), (35, 39, 44, 51), (55, 63, 71, 80) → data blocks (2, 5), (24, 29), (35, 36), (51, 53), (55, 61), (80, 82)]
  • 84. B+ Tree • The B+ tree is a balanced search tree (a generalization of the binary search tree in which a node can have many children). It follows a multi-level index format. • The B+ tree is a storage method with a tree-like structure. • A B+ tree has one root, any number of intermediate nodes and leaf nodes. • In the B+ tree, the leaf nodes hold the actual data pointers, and the records are kept in sorted order at the leaf level. • An intermediate node holds only pointers toward the leaf level; it has no data. • The B+ tree ensures that all leaf nodes remain at the same height. • The leaf nodes are connected by a linked list. Therefore, a B+ tree can support random access as well as sequential access.
  • 85. Structure of B+ Tree In the B+ tree, every leaf node is at equal distance from the root node. The B+ tree is of the order n where n is fixed for every B+ tree. It contains an internal node and leaf node.
  • 86. Internal node • An internal node of the B+ tree can contain at least n/2 record pointers, except the root node. • At most, an internal node of the tree contains n pointers. Leaf node • A leaf node of the B+ tree can contain at least n/2 record pointers and n/2 key values. • At most, a leaf node contains n record pointers and n key values. • Every leaf node of the B+ tree contains one block pointer P that points to the next leaf node.
  • 87. Searching a record in B+ Tree • Suppose we have to search for 55 in the B+ tree structure below. First, we look in the intermediary node that directs us to the leaf node that can contain the record for 55. • So, in the intermediary node, we follow the branch between the 50 and 75 keys. At the end, we are redirected to the third leaf node, where the DBMS performs a sequential search to find 55.
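The descend-then-scan search described above can be sketched over a small, hand-built tree loosely modelled on the example (an internal node with separator keys 50 and 75 directing the search for 55 into the leaf holding 50, 55, 60, 65, 70). The node classes are simplified; a real implementation would also link the leaves and store record pointers.

```python
import bisect

class Leaf:
    def __init__(self, keys):
        self.keys = keys          # sorted key values in this leaf

class Internal:
    def __init__(self, keys, children):
        self.keys = keys          # separator keys
        self.children = children  # len(children) == len(keys) + 1

def search(node, key):
    """Descend from the root to a leaf, then scan the leaf for `key`."""
    while isinstance(node, Internal):
        # follow the child whose key range covers `key`
        node = node.children[bisect.bisect_right(node.keys, key)]
    return key in node.keys       # sequential search within the leaf

root = Internal([50, 75], [
    Leaf([10, 20, 30]),
    Leaf([50, 55, 60, 65, 70]),
    Leaf([75, 80, 90]),
])
```

Searching for 55 follows the branch between the separators 50 and 75 and finds it in the middle leaf; every search touches exactly one node per level, which is why all leaves sitting at the same depth matters.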
  • 88. B+ Tree Insertion • Suppose we want to insert a record 60 into the structure below. It should go into the 3rd leaf node, after 55. This is a balanced tree, and that leaf node is already full, so we cannot insert 60 there. • In this case, we have to split the leaf node so that 60 can be inserted into the tree without affecting the fill factor, balance and order.
  • 89. • The 3rd leaf node would have the values (50, 55, 60, 65, 70), and its current branching key is 50. We split the leaf node in the middle so that the tree's balance is not altered, grouping (50, 55) and (60, 65, 70) into 2 leaf nodes. • If these two are to be leaf nodes, the intermediate node cannot branch from 50 alone. It must have 60 added to it, and then we can have a pointer to the new leaf node.
  • 90. B+ Tree Deletion • Suppose we want to delete 60 from the above example. In this case, we have to remove 60 from the intermediate node as well as from the 4th leaf node. If we simply removed it from the intermediate node, the tree would no longer satisfy the rules of the B+ tree, so we must modify it to keep a balanced tree. • After deleting node 60 from the above B+ tree and re-arranging the nodes, it will appear as follows:
  • 92. Query Processing  Parsing and Translation: The system must translate the query into a usable form.  A more useful internal representation is one based on the extended relational algebra.  The parser checks the syntax of the user’s query, verifies that the relation names appearing in the query are names of the relations in the database.
  • 93. Query Processing  Parsing and Translation - Example:  select salary from instructor where salary < 75000;  This query can be translated into either of the following relational-algebra expressions:  σsalary<75000 (Πsalary (instructor))  Πsalary (σsalary<75000 (instructor))
  • 94. Query Processing  Optimization:  During this process the query evaluation plan is prepared from all the relational algebraic expressions.  The query cost for all the evaluation plans is calculated.  Amongst all equivalent evaluation plans the one with lowest cost is chosen.  Cost is estimated using statistical information from the database catalog, such as size of tuples, etc.
  • 95. Query Processing  Evaluation:  The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.
  • 96. Measures of Query Cost  Many factors contribute to time cost  Disk accesses, CPU, or even network communication  Typically disk access is the predominant cost, and is also relatively easy to estimate. Measured by taking into account Number of seeks * average-seek-cost Number of blocks read * average-block-read-cost Number of blocks written * average-block-write-cost
  • 97. Measures of Query Cost Cost to write a block is greater than cost to read a block  data is read back after being written to ensure that the write was successful. For simplicity we just use the number of block transfers from disk and the number of seeks as the cost measures tT – time to transfer one block tS – time for one seek Cost for b block transfers plus S seeks: b * tT + S * tS
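The cost measure above is just an arithmetic formula, and can be wrapped in a small helper. The timing constants below (0.1 ms per block transfer, 4 ms per seek) are illustrative defaults, not values from the slides.

```python
def disk_cost(b, S, tT=0.1, tS=4.0):
    """Estimated I/O time in milliseconds: b * tT + S * tS.

    b  -- number of block transfers
    S  -- number of seeks
    tT -- time to transfer one block (assumed 0.1 ms)
    tS -- time for one seek (assumed 4 ms)
    """
    return b * tT + S * tS

# e.g. a linear scan of 400 contiguous blocks needs one initial seek:
cost = disk_cost(400, 1)  # 400 * 0.1 + 1 * 4.0 = 44.0 ms
```

Because a seek is typically much more expensive than a transfer, plans that read blocks contiguously (few seeks) beat plans that hop around the disk even when they transfer more blocks.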
  • 98. Algorithm for Selection Operation  File scan – search algorithms that locate and retrieve records that fulfill a selection condition.  Algorithm A1 (linear search). Scan each file block and test all records to see whether they satisfy the selection condition.
  • 99. Algorithm for Selection Operation  Algorithm A1 (linear search):  Cost estimate = br block transfers + 1 seek  br denotes number of blocks containing records from relation r.  If selection is on a key attribute, can stop on finding record  cost = (br /2) block transfers + 1 seek  Linear search can be applied regardless of selection condition or ordering of records in the file, or availability of indices
  • 100. Algorithm for Selection Operation  Algorithm A2 (binary search):  Applicable if the selection is an equality comparison on the attribute on which the file is ordered.  Assume that the blocks of a relation are stored contiguously.  Cost estimate (number of disk blocks to be scanned): cost of locating the first tuple by a binary search on the blocks  ⌈log2(br)⌉ * (tT + tS)
  • 101. Algorithm for Selection Operation  Algorithm A2 (binary search):  If there are multiple records satisfying selection Add transfer cost of the number of blocks containing records that satisfy selection condition
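Algorithm A2 can be sketched in memory by treating lists as "blocks" of a relation stored sorted and contiguously: binary search over the blocks' first keys locates the first candidate block, and matching tuples are then read off sequentially. The block size, data, and the returned cost figure are illustrative.

```python
import bisect
import math

# Sorted relation stored as contiguous "blocks" of 3 tuples each.
blocks = [[2, 5, 7], [7, 7, 9], [12, 15, 18]]
first_keys = [blk[0] for blk in blocks]     # first key of each block

def select_equal(key):
    """Return all tuples equal to `key`, plus the binary-search block count."""
    # last block whose first key is < key could still contain `key`
    i = max(bisect.bisect_left(first_keys, key) - 1, 0)
    result = []
    while i < len(blocks) and blocks[i][0] <= key:
        result.extend(t for t in blocks[i] if t == key)
        i += 1
    # blocks touched by the binary search itself: ceil(log2(br))
    return result, math.ceil(math.log2(len(blocks)))
```

Note how a run of duplicates (the key 7 here) spans a block boundary, which is why the transfer cost of the extra matching blocks must be added on top of the binary-search cost.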
  • 102. Algorithm for JOIN Operation  Nested Loop Join - To compute the theta join r ⋈θ s: for each tuple tr in r do begin for each tuple ts in s do begin test pair (tr, ts) to see if they satisfy the join condition θ; if they do, add tr · ts to the result end end
  • 103. Algorithm for JOIN Operation  Nested Loop Join:  r is called the outer relation and s the inner relation of the join. Requires no indices and can be used with any kind of join condition. Expensive since it examines every pair of tuples in the two relations.
  • 104. Algorithm for JOIN Operation  Nested Loop Join:  In the worst case, if there is enough memory only to hold one block of each relation, the estimated cost is nr * bs + br block transfers, plus nr + br seeks If the smaller relation fits entirely in memory, use that as the inner relation. Reduces cost to br + bs block transfers and 2 seeks
  • 105. Algorithm for JOIN Operation  Nested Loop Join:  Assuming worst case memory availability cost estimate is with depositor as outer relation: 5000 * 400 + 100 = 2,000,100 block transfers, 5000 + 100 = 5100 seeks with customer as the outer relation 10000 * 100 + 400 = 1,000,400 block transfers and 10,400 seeks
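The nested-loop pseudocode above translates almost line for line into Python. The depositor/account sample relations below are illustrative stand-ins for the depositor and customer relations used in the cost example.

```python
def nested_loop_join(r, s, condition):
    """Theta join of r and s under an arbitrary `condition` predicate."""
    result = []
    for tr in r:            # outer relation
        for ts in s:        # inner relation
            if condition(tr, ts):
                result.append(tr + ts)   # concatenate the matching tuples
    return result

# Illustrative relations: (customer_name, account_no) and (account_no, balance)
depositor = [("Ann", "A-101"), ("Bob", "A-102")]
account = [("A-101", 500), ("A-102", 700), ("A-103", 900)]

joined = nested_loop_join(depositor, account,
                          lambda tr, ts: tr[1] == ts[0])  # equijoin on acct no
```

Because `condition` is an arbitrary predicate, this handles any theta join, which is exactly why it cannot exploit sorting or hashing and examines every pair of tuples.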
  • 106. Algorithm for JOIN Operation  Block Nested Loop Join  Variant of nested loop join in which every block of inner relation is paired with every block of outer relation.
  • 107. Algorithm for JOIN Operation  Merge Join  Sort both relations on their join attribute (if not already sorted on the join attributes), then merge the sorted relations to join them.  Can be used only for equijoins and natural joins.  The cost of merge join is: br + bs block transfers + ⌈br / bb⌉ + ⌈bs / bb⌉ seeks + the cost of sorting, if the relations are unsorted.
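A merge join can be sketched as a two-cursor merge over the sorted relations. This in-memory sketch (a simplification of the block-level algorithm) handles duplicate join-key values by joining the run of equal keys on each side pairwise; `key_r` and `key_s` are assumed accessor functions extracting the join attribute from a tuple.

```python
def merge_join(r, s, key_r, key_s):
    """Equijoin via sort-merge; handles duplicate join-key values."""
    r = sorted(r, key=key_r)   # sort phase (skipped if already sorted)
    s = sorted(s, key=key_s)
    result, i, j = [], 0, 0
    while i < len(r) and j < len(s):          # merge phase
        kr, ks = key_r(r[i]), key_s(s[j])
        if kr < ks:
            i += 1
        elif kr > ks:
            j += 1
        else:
            # collect the run of equal keys on each side, join pairwise
            i2 = i
            while i2 < len(r) and key_r(r[i2]) == kr:
                i2 += 1
            j2 = j
            while j2 < len(s) and key_s(s[j2]) == kr:
                j2 += 1
            for tr in r[i:i2]:
                for ts in s[j:j2]:
                    result.append(tr + ts)
            i, j = i2, j2
    return result

emp = [(2, "b"), (1, "a"), (2, "c")]       # (dept_id, name), illustrative
dept = [(2, "Sales"), (1, "HR")]           # (dept_id, dept_name)
out = merge_join(emp, dept, lambda t: t[0], lambda t: t[0])
```

Each relation is scanned once during the merge, which is where the br + bs transfer cost in the formula above comes from.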
  • 108. Algorithm for JOIN Operation  Hash Join:  A hash function h is used to partition the tuples of both relations.  Cost: 3(br + bs) + 4 * nh block transfers + 2 (⌈br/bb⌉ + ⌈bs/bb⌉) seeks, where nh is the number of partitions.
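The partitioning idea can be sketched in memory: a hash function on the join attribute splits both relations into partitions, and tuples can only match tuples in the corresponding partition, which is then joined with a build-and-probe hash table. The partition count and data are illustrative; a real hash join writes partitions to disk, which is where the 3(br + bs) transfer term comes from.

```python
from collections import defaultdict

def hash_join(r, s, key_r, key_s, n_parts=4):
    """Equijoin via hash partitioning on the join attribute."""
    parts_r, parts_s = defaultdict(list), defaultdict(list)
    for tr in r:                                # partition phase
        parts_r[hash(key_r(tr)) % n_parts].append(tr)
    for ts in s:
        parts_s[hash(key_s(ts)) % n_parts].append(ts)
    result = []
    for p in range(n_parts):                    # join matching partitions
        build = defaultdict(list)
        for tr in parts_r[p]:                   # build phase
            build[key_r(tr)].append(tr)
        for ts in parts_s[p]:                   # probe phase
            for tr in build[key_s(ts)]:
                result.append(tr + ts)
    return result

r = [(1, "a"), (2, "b")]                        # illustrative relations
s = [(2, "x"), (3, "y"), (1, "z")]
out = hash_join(r, s, lambda t: t[0], lambda t: t[0])
```

Like merge join, this works only for equijoins, since the hash function can only detect equality of join-attribute values.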
  • 109. Query Optimization  Heuristic Estimation:  A heuristic is a rule that leads to the least cost in most cases.  Systems may use heuristics to reduce the number of choices that must be made in a cost-based fashion.  Heuristic optimization transforms the query tree by using a set of rules that typically improve execution performance.
  • 110. Query Optimization  Query Tree:  SELECT schedule, room FROM Student NATURAL JOIN Enroll NATURAL JOIN Class WHERE Major='Math'
  • 112. Query Optimization  Heuristic Estimation – Rules:  Perform selection early (reduces the number of tuples)  Perform projection early (reduces the number of attributes)  Perform most restrictive selection and join operations before other similar operations.
  • 113. Query Optimization  Heuristic Estimation – Steps:  Scanner and parser generate initial query representation  Representation is optimized according to heuristic rules  Query execution plan developed.
  • 114. Query Optimization  Cost based Estimation:  Look at all of the possible ways or scenarios in which a query can be executed  Each scenario will be assigned a ‘cost’, which indicates how efficiently that query can be run  Pick the scenario that has the least cost and execute the query using that scenario, because that is most efficient way to run the query.
  • 115. Query Optimization  Cost based Estimation:  Scope of query optimization is a query block. Global query optimization involves multiple query blocks.  Cost Components: Access cost to secondary storage, Disk Storage cost, Computation cost, memory usage cost and Communication cost
  • 116. Query Optimization  Cost based Estimation:  Information stored in DBMS catalog and used by optimizer:  File Size  Organization  Number of levels of each multilevel index  Number of distinct values of an attribute  Attribute selectivity.