5. Overview of Physical Storage Media
Storage media are classified into different
types based on the following:
Access speed
Cost per unit of data
Reliability
6. Based on volatility, storage can be classified
into two types:
Volatile storage: Loses the contents when the
power to the device is removed.
E.g.: Main memory and cache.
Non-Volatile storage: Contents persist even when
the power is switched off.
E.g.: Secondary & Tertiary storage devices.
9. Primary Storage:
This category usually provides fast access to data, but has
limited storage capacity.
It is volatile in nature.
Example: Cache and main memory
Secondary Storage:
These devices usually have a large capacity.
Lower cost and slower access to data
It is non-volatile.
E.g.: Magnetic disks
10. Tertiary Storage:
This is in the lowest level of hierarchy.
Non-volatile,
Slow access time
Example: Magnetic tape, optical storage
11. Cache Memory
It is the fastest and most costly form of storage.
It is volatile in nature.
It is managed by computer system hardware.
Cache memory lies between the CPU and main memory.
12. Main Memory
Fast access, but generally too small to store the entire database.
Relatively expensive.
Capacities of up to a few gigabytes are widely used currently.
Ex: RAM
13. Flash Memory
It is present between primary storage and secondary
storage in the storage hierarchy.
It is non-volatile memory.
Reading data from flash memory is roughly as fast as reading it
from main memory.
Widely used in embedded devices such as digital
cameras, video games, etc.
14. Magnetic Disk
Primary medium for long term storage of data, typically stores
entire database.
Data must be moved from disk to main memory for access and
written back for storage.
Much slower access than main memory.
Capacities currently range up to about 400 gigabytes.
Much larger capacity than main and flash memory.
Disk storage survives power failures and system crashes.
Disk failure can destroy data, but such failures are rare.
16. Mechanism
Physically, disks are relatively simple. Each platter has a
flat circular shape.
Its two surfaces are covered with a magnetic
material, and information is recorded on the
surfaces.
Platters are made from rigid metal or glass
17. When the disk is in use, a drive motor
spins it at a constant high speed.
There is a read-write head positioned
just above the surface of the platter.
The disk surface is logically divided
into tracks; there are typically 50,000 to 100,000
tracks per platter and 1 to 5 platters per
disk.
Each track is divided into sectors. A
sector is the smallest unit of information
that can be read from or written to the
disk. Sector size is typically 512 bytes.
18. Inner tracks are shorter and hold
fewer sectors (around 500 sectors per
track), while outer tracks hold more
(around 1,000 sectors per track).
The read-write head stores
information on a sector
magnetically.
The read-write heads of all the
platters are mounted on a single
assembly called a disk arm.
19. The disk platters mounted on a
spindle and the heads mounted on a disk
arm are together known as a head-disk
assembly.
A cylinder consists of the
corresponding track of all the platters.
A disk controller interfaces
between the computer system and the
disk drive hardware.
20. Disk controllers attach checksums to each sector to
verify that data is read back correctly. The checksum
is computed from the data written to the sector.
Another task of the disk controller is remapping of bad
sectors.
There are a number of common interfaces for
connecting disks to personal computers: (a) the ATA
interface and (b) the SCSI (Small Computer System
Interface) interface.
21. Performance Measures of Disks
Access time: The time it takes from when a read or write
request is issued to when data transfer begins.
Seek time: Time it takes to reposition the arm over the correct
track. The seek time ranges from 2 to 30 milliseconds. The
average seek time is typically about one-third of the worst-case
(maximum) seek time. Average seek times currently range
between 4 and 10 milliseconds.
22. Rotational latency: Time it takes for the
sector to be accessed to appear under the head.
It ranges from 4 to 11 milliseconds.
Data-transfer rate: The rate at which data
can be retrieved from or stored to the disk. It
ranges from 25 to 100 megabytes per second.
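As a worked example (all figures assumed for illustration): with an
average seek time of 6 ms and a 7,200 rpm disk, one rotation takes
60,000/7,200 ≈ 8.33 ms, so the average rotational latency is about half
of that, 4.17 ms; the average access time before data transfer begins
is therefore roughly 6 + 4.17 ≈ 10 ms.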
23. Mean time to failure (MTTF): The average
time the disk is expected to run continuously
without any failure.
It is a measure of the reliability of the disk.
Typically 3 to 5 years.
The probability of failure of new disks is quite low.
MTTF decreases as the disk ages.
24. Optimization of Disk-Block Access
Techniques used for accessing data from disk:
Scheduling
Disk-arm scheduling algorithms order pending accesses to tracks
so that disk arm movement is minimized.
A commonly used algorithm is the elevator algorithm:
Move the disk arm in one direction (from outer to inner tracks or
vice versa), processing the next request in that direction until no more
requests remain in that direction; then reverse direction and repeat
(see the sketch below).
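A minimal sketch of the elevator algorithm in Python (the track numbers
and starting head position are illustrative, not from the slides):

def elevator(head, requests, direction=+1):
    pending = sorted(requests)
    order = []
    while pending:
        # all pending tracks that lie in the current direction of arm movement
        inline = [t for t in pending if (t - head) * direction >= 0]
        if not inline:
            direction = -direction    # nothing left this way: reverse
            continue
        inline.sort(key=lambda t: abs(t - head))
        nxt = inline[0]               # closest request in the current direction
        order.append(nxt)
        pending.remove(nxt)
        head = nxt
    return order

print(elevator(head=50, requests=[95, 10, 60, 30, 80]))
# -> [60, 80, 95, 30, 10]: sweep outward first, then reverse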
25. File organization:
Optimize block access time by organizing the blocks to
correspond to how data will be accessed.
E.g.: store related information on the same or nearby
cylinders.
A sequential file may become fragmented, that is, its blocks
become scattered all over the disk. Sequential access to a
fragmented file results in increased disk arm movement.
26. Non-volatile write buffers: (NV-RAM)
Speed up disk writes by writing blocks to a non-volatile
RAM buffer immediately; the contents of NV-RAM are
not lost on power failure.
Log disk
A disk devoted to writing a sequential log of block
updates; used exactly like non-volatile RAM. Writing to the
log disk is very fast since no seeks are required.
Journaling file systems write data in a safe order to NV-RAM
or a log disk.
27. RAID
Redundant Array of Independent Disks
Multiple secondary disks are connected together to
increase the performance, data redundancy or both.
Need:
To increase performance
Increased reliability
To give greater throughput
To allow data to be recovered after a disk failure
28. RAID – Level 0
Data is striped across multiple drives
29. RAID – Level 0
Data is broken down into blocks and these blocks are
stored across all the disks.
Thus a striped array of disks is implemented.
There is no duplication of data in this level, so once a
block is lost there is no way to recover it.
It has good performance.
30. RAID – Level 1
Mirroring of data in drive 1 to drive 2. It
offers 100% redundancy, as the array will
continue to work even if either disk fails.
31. RAID – Level 1
Uses mirroring techniques.
All data in the drive is duplicated to another
drive.
It provides 100% redundancy in case of a
failure.
Advantage: Fault Tolerance
32. RAID 10, also known as RAID 1+0, is a RAID configuration that
combines disk mirroring and disk striping to protect data. It
requires a minimum of four disks and stripes data across
mirrored pairs.
34. RAID – Level 2
Uses mirroring as well as storing error-correcting
codes (ECC) for its data, striped across different disks.
Each data bit in a word is recorded on a separate disk,
and the ECC codes of the data words are stored on a
different set of disks.
Due to its complex structure and high cost, RAID 2 is
not commercially available.
35. RAID – Level 3
One dedicated drive is
used to store the parity
information; in case of
any drive failure, the
data is restored using
this extra drive.
36. RAID – Level 3
It consists of byte-level striping with dedicated
parity. In this level, the parity information is
stored for each disk section and written to the
dedicated parity drive.
Parity is a technique that checks whether data has
been lost or written over when it is moved from
one place in storage to another.
37. RAID – Level 3
In the case of disk failure, the parity disk is
accessed and data is reconstructed from the
remaining devices.
Once the failed disk is replaced, the missing
data can be restored on the new disk.
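To make the parity mechanism concrete, here is a minimal sketch in
Python (the block contents are illustrative): the parity block is the
bitwise XOR of the data blocks, so any single lost block can be rebuilt
by XOR-ing the surviving blocks with the parity.

from functools import reduce

def xor_blocks(*blocks):
    # bitwise XOR of equal-length blocks, byte by byte
    return bytes(reduce(lambda a, b: a ^ b, group) for group in zip(*blocks))

d1, d2, d3 = b"\x0f\xf0", b"\x33\xcc", b"\xaa\x55"   # data blocks on three drives
parity = xor_blocks(d1, d2, d3)                      # stored on the parity drive
# the drive holding d2 fails: rebuild it from the survivors and the parity
assert xor_blocks(d1, d3, parity) == d2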
38. RAID – Level 4
This level is very
similar to RAID 3, except
that RAID 4
uses block-level striping
rather than byte-level.
39. RAID – Level 4
It consists of block-level striping with a
dedicated parity disk.
40. RAID – Level 5
Parity information is written
to a different disk in the array
for each stripe. In case of a
single disk failure, data can be
recovered with the help of the
distributed parity without
interrupting other read/write
operations.
41. RAID – Level 5
RAID 5 writes whole data blocks onto
different disks, but the parity bits generated for
a data-block stripe are distributed among all the
data disks rather than being stored on a
separate dedicated disk.
42. RAID – Level 6
This level is an enhanced version of RAID 5, adding
the extra benefit of dual parity. This level uses block-level
striping with dual distributed parity.
43. RAID – Level 6
RAID 6 is an extension of Level 5.
In this level, two independent parities are generated
and stored in distributed fashion among multiple disks.
Two parities provide additional fault tolerance.
This level requires at least four disk drives to
implement RAID.
44. The factors to be taken into account in
choosing a RAID level are:
Performance requirements in terms of the number of
I/O operations.
Performance when a disk has failed.
Performance during rebuild.
45. File Organization
A method of arranging records in a file when the file is
stored on disk.
A file is organized logically as a sequence of records.
A record is a sequence of fields.
Each file is also logically partitioned into fixed-length
storage units called blocks, which are the units of both
storage allocation and data transfer.
46. What is File Organization?
In simple terms, storing the records of a file in a certain order is
called file organization. File structure refers to the format of the
label and data blocks and of any logical control record.
Types of File Organizations
• Sequential File Organization
• Heap File Organization
• Hash File Organization
• B+ Tree File Organization
• Clustered File Organization
47. Sequential File Organization –
• The easiest method of file organization is the sequential method.
In this method the records are stored one after another in a
sequential manner. There are two ways to implement this
method:
1. Pile File Method
2. Sorted File Method
Pile File Method – This method is quite simple: we
store the records in a sequence, i.e. one after another, in the
order in which they are inserted into the tables.
48. Insertion of new record –
Let R1, R3, R5, and R4 be four records in the
sequence (here a record is simply a row in a table).
Suppose a new record R2 has to be inserted in the sequence;
then it is simply placed at the end of the file.
49. • Sorted File Method –
In this method, as the name suggests, whenever a new record
has to be inserted, it is always inserted in a sorted (ascending or
descending) manner. Sorting of records may be based on a
primary key or on any other key.
Insertion of new record –
Let us assume that there is a preexisting sorted sequence of
four records R1, R3, R7, and R8. Suppose a new
record R2 has to be inserted in the sequence; it will be
inserted at the end of the file and then the sequence will be
sorted (a minimal sketch follows).
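A minimal sketch of this insertion in Python, using record keys to stand
in for whole records (an assumption for brevity):

records = [1, 3, 7, 8]   # keys of R1, R3, R7 and R8, already sorted
records.append(2)        # the new record R2 is first placed at the end of the file
records.sort()           # ...and the sequence is then sorted again
print(records)           # -> [1, 2, 3, 7, 8]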
50. Heap File Organization
Heap File Organization works with data blocks. In this
method records are inserted at the end of the file, into the
data blocks. No Sorting or Ordering is required in this method.
If a data block is full, the new record is stored in some other
block. Here the other data block need not be the very next
data block; it can be any block in the memory. It is the
responsibility of the DBMS to store and manage the new records.
51. • Insertion of new record –
Suppose we have four records in the heap: R1, R5, R6, and R4.
Now suppose a new record R2 has to be inserted
into the heap. Since the last data block, data block 3,
is full, R2 will be inserted into any of the data blocks selected
by the DBMS, let's say data block 1 (a sketch follows below).
If we want to search, delete or
update data in heap file
organization, we will traverse
the data from the beginning of the
file till we get the requested
record. Thus if the database is very
huge, searching, deleting or
updating a record will take a lot
of time.
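A minimal sketch of heap-file insertion in Python (the block capacity
and block contents are illustrative):

BLOCK_CAPACITY = 2
data_blocks = [["R1"], ["R5", "R6"], ["R4", "R9"]]   # data block 3 is full

def heap_insert(record):
    for block in data_blocks:        # the DBMS may pick any non-full block
        if len(block) < BLOCK_CAPACITY:
            block.append(record)
            return
    data_blocks.append([record])     # or allocate a fresh block

heap_insert("R2")
print(data_blocks)   # -> [['R1', 'R2'], ['R5', 'R6'], ['R4', 'R9']]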
52. Hashing
In a database management system, when we want to
retrieve particular data, it becomes very inefficient to
search all the index values to reach the desired data. In
this situation, the hashing technique comes into the picture.
• Hashing is an efficient technique to directly search for the
location of desired data on the disk without using an index
structure. Data is stored in data blocks whose
address is generated by using a hash function. The memory
location where these records are stored is called a data
block or data bucket.
53. Hash File Organization:
• Data bucket – Data buckets are the memory locations
where the records are stored. These buckets are also
considered a unit of storage.
• Hash Function – A hash function is a mapping function
that maps the set of all search keys to actual record
addresses. Generally, the hash function uses the primary key to
generate the hash index, the address of the data block.
A hash function can range from a simple mathematical
function to a complex one.
• Hash Index – The prefix of an entire hash value is taken as
a hash index. Every hash index has a depth value to
signify how many bits are used for computing the hash
function.
54. Static Hashing:
• In static hashing, when a search-key value is provided, the
hash function always computes the same address. For
example, if we want to generate an address for STUDENT_ID =
104 using a mod(5) hash function, it always results in the same
bucket address, 4. There will not be any changes to the bucket
address here. Hence the number of data buckets in memory
for static hashing remains constant throughout.
Insertion – When a new record is inserted into the table, the
hash function h generates a bucket address for the new
record based on its hash key K.
Bucket address = h(K)
55. Searching – When a record needs to be searched, the same hash function
is used to retrieve the bucket address for the record. For example, if we
want to retrieve the whole record for ID 104, and the hash function is
mod(5) on that ID, the bucket address generated would be 4. Then we
directly go to address 4 and retrieve the whole record for ID 104. Here ID
acts as a hash key.
Deletion – If we want to delete a record, using the hash function we
first fetch the record that is to be deleted. Then we
remove the record at that address in memory.
Update – The data record that needs to be updated is first searched
using the hash function, and then the data record is updated.
(A minimal sketch of these operations follows.)
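A minimal sketch of these operations with a mod(5) hash function in
Python (the bucket layout is illustrative):

NUM_BUCKETS = 5
buckets = [[] for _ in range(NUM_BUCKETS)]

def h(key):
    return key % NUM_BUCKETS              # bucket address = h(K)

def insert(key, record):
    buckets[h(key)].append((key, record))

def search(key):
    for k, rec in buckets[h(key)]:        # the same function finds the bucket again
        if k == key:
            return rec
    return None

def delete(key):
    b = buckets[h(key)]
    b[:] = [(k, r) for k, r in b if k != key]

insert(104, "record for ID 104")
print(h(104))        # -> 4: always the same bucket address
print(search(104))   # -> 'record for ID 104'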
56. • Now, suppose we want to insert a new record into the file, but
the data bucket address generated by the hash function is not
empty (data already exists at that address). This becomes
a critical situation to handle; in static
hashing it is called bucket overflow.
To overcome this situation, some commonly used methods are
discussed below:
Open Hashing – In the open hashing method, the next available data
block is used to enter the new record, instead of overwriting
the older one. This method is also called linear probing. For
example, suppose D3 is a new record that needs to be inserted and the
hash function generates the address 105, but that bucket is already
full. The system then searches for the next available data bucket, 123,
and assigns D3 to it (a minimal sketch follows).
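A minimal sketch of linear probing in Python (the table size and
addresses are illustrative; here the next free bucket happens to be the
one right after 105):

TABLE_SIZE = 200
table = [None] * TABLE_SIZE

def insert_linear(home_addr, record):
    # probe forward from the home address until an empty bucket is found
    for i in range(TABLE_SIZE):
        slot = (home_addr + i) % TABLE_SIZE
        if table[slot] is None:
            table[slot] = record
            return slot
    raise RuntimeError("all buckets full")

table[105] = "D1"                  # the home bucket of D3 is already occupied
print(insert_linear(105, "D3"))    # -> 106, the next available bucket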
58. Closed Hashing – In the closed hashing method, a new data
bucket is allocated with the same address and is linked after the
full data bucket. This method is also known as overflow
chaining. For example, we have to insert a new record D3 into
the tables. The static hash function generates the data bucket
address 105, but this bucket is full. In this
case a new data bucket is added at the end of bucket 105
and linked to it; the new record D3 is then inserted into
the new bucket (a minimal sketch follows).
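A minimal sketch of overflow chaining in Python (a bucket capacity of 2
is assumed for illustration):

BUCKET_CAPACITY = 2

class Bucket:
    def __init__(self):
        self.records = []
        self.overflow = None           # link to the next overflow bucket

def insert_chained(bucket, record):
    while len(bucket.records) >= BUCKET_CAPACITY:
        if bucket.overflow is None:    # allocate and link a new bucket
            bucket.overflow = Bucket()
        bucket = bucket.overflow
    bucket.records.append(record)

b105 = Bucket()                        # the bucket at address 105
for rec in ["D1", "D2", "D3"]:         # D3 overflows into the chained bucket
    insert_chained(b105, rec)
print(b105.records, b105.overflow.records)   # -> ['D1', 'D2'] ['D3']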
59. • Quadratic Probing: Quadratic probing is very similar to
open hashing (linear probing). The only difference is that the gap
between consecutive probe addresses is not linear; a quadratic
function is used to determine the new bucket address.
• Double Hashing: Double hashing is another method similar
to linear probing. The difference is fixed, as in linear
probing, but this fixed difference is calculated by using
another hash function. That is why it is named double
hashing.
60. Dynamic Hashing –
• The drawback of static hashing is that it does not expand or shrink
dynamically as the size of the database grows or shrinks. In dynamic
hashing, data buckets grow or shrink (are added or removed dynamically) as
the number of records increases or decreases. Dynamic hashing is also known
as extendible hashing. In dynamic hashing, the hash function is made to
produce a large number of values.
For example:
• Consider the following grouping of keys into buckets, depending on the
prefix of their hash address:
61. The last two bits of the hash addresses of 2 and 4 are 00, so they go into bucket B0.
The last two bits of 5 and 6 are 01, so they go into bucket B1.
The last two bits of 1 and 3 are 10, so they go into bucket B2.
The last two bits of 7 are 11, so it goes into B3.
62. Insert key 9 with hash address 10001 into the above structure:
• Since key 9 has hash address 10001 (last two bits 01), it must go into bucket B1.
But bucket B1 is full, so it will be split.
• The split separates 5 and 9 from 6: the last three bits of the hash addresses of
5 and 9 are 001, so they go into bucket B1, while the last three bits of 6 are 101,
so it goes into bucket B5 (see the sketch after this list).
• Keys 2 and 4 are still in B0. The records in B0 are pointed to by the 000 and 100
directory entries, because the last two bits of both entries are 00.
• Keys 1 and 3 are still in B2. The records in B2 are pointed to by the 010 and 110
directory entries, because the last two bits of both entries are 10.
• Key 7 is still in B3. The record in B3 is pointed to by the 111 and 011 directory
entries, because the last two bits of both entries are 11.
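A minimal sketch of this split in Python. The hash addresses below are
assumed values chosen to match the bit patterns stated above; a full
extendible-hashing implementation would also maintain the directory and
local depths.

from collections import defaultdict

def last_bits(addr, nbits):
    # keep only the lowest nbits of a hash address
    return addr & ((1 << nbits) - 1)

# assumed hash addresses, consistent with the slide's bit patterns
hash_addr = {2: 0b000, 4: 0b100, 5: 0b001, 6: 0b101,
             1: 0b010, 3: 0b110, 7: 0b011, 9: 0b10001}

buckets = defaultdict(list)
for key in [2, 4, 5, 6, 1, 3, 7]:            # initial grouping on the last 2 bits
    buckets[last_bits(hash_addr[key], 2)].append(key)
print(dict(buckets))                         # -> {0: [2, 4], 1: [5, 6], 2: [1, 3], 3: [7]}

# inserting 9 overflows bucket 01 (capacity 2), so its keys are regrouped
# on the last 3 bits (the bucket's local depth grows from 2 to 3)
split = defaultdict(list)
for key in buckets[0b01] + [9]:
    split[last_bits(hash_addr[key], 3)].append(key)
print(dict(split))                           # -> {1: [5, 9], 5: [6]}, i.e. bits 001 and 101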
63. Cluster File Organization
• In cluster file organization, two or more related
tables/records are stored within the same file, known as
a cluster. These files have two or more tables in the
same data block, and the key attributes which are used to
map these tables together are stored only once.
• Thus it lowers the cost of searching for and retrieving various
records in different files, as they are now combined and
kept in a single cluster.
For example, suppose we have two tables or relations, Employee
and Department. These tables are related to each other.
64. These tables can therefore be combined using a join operation and
viewed together in a cluster file.
65. If we have to insert, update or delete any record, we can do
so directly. Data is sorted based on the primary key or the key with
which searching is done. The cluster key is the key with which
the joining of the tables is performed.
Types of Cluster File Organization – There are two ways to
implement this method:
• Indexed Clusters –
In indexed clustering the records are grouped based on the
cluster key and stored together. The above-mentioned
example of the Employee and Department relationship is an
example of an indexed cluster, where the records are grouped based on
the Department ID.
• Hash Clusters –
This is very similar to an indexed cluster, with the only
difference being that instead of storing the records based on the cluster
key, we generate a hash key value and store together the records with the
same hash key value.
66. Indexing
An index is a data structure that organizes data
records on the disk to make the retrieval of data
efficient.
67. Indexes are created using a few database columns.
• The first column is the Search key that contains a
copy of the primary key or candidate key of the
table. These values are stored in sorted order so that
the corresponding data can be accessed quickly.
Note: The data may or may not be stored in sorted
order.
• The second column is the Data
Reference or Pointer which contains a set of
pointers holding the address of the disk block where
that particular key value can be found.
69. Ordered Indices:
Based on a sorted ordering of the values.
The indices are usually sorted to make
searching faster. Indices which are sorted
are known as ordered indices.
70. Primary Index
• If the index is created on the basis of the primary
key of the table, then it is known as primary
indexing. These primary keys are unique to each
record, and there is a 1:1 relation between index entries
and records.
• As primary keys are stored in sorted order, the
performance of the searching operation is quite
efficient.
The primary index can be classified into two types:
• Dense index and
• Sparse index.
71. Dense index
• The dense index contains an index record for every
search key value in the data file. It makes searching
faster.
• In this, the number of records in the index table is
the same as the number of records in the main table.
• It needs more space to store the index records
themselves. The index records have the search key and a pointer to
the actual record on the disk.
73. Sparse Index:
In the data file, an index record appears for only a few
items. Each item points to a block.
In this, instead of pointing to each record in the main
table, the index points to records in the main table
at a gap (a lookup sketch follows).
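A minimal sketch of a sparse-index lookup in Python (the blocks and keys
are illustrative):

import bisect

blocks = [
    [(101, "Aa"), (102, "bb"), (103, "cc")],
    [(110, "dd"), (115, "ee")],
    [(130, "Xx"), (131, "Yy"), (132, "zz")],
]
# sparse index: one entry per block, holding the block's first search key
index = [blk[0][0] for blk in blocks]   # [101, 110, 130]

def sparse_lookup(key):
    # find the last index entry <= key, then scan that block
    i = bisect.bisect_right(index, key) - 1
    if i < 0:
        return None
    for k, rec in blocks[i]:
        if k == key:
            return rec
    return None

print(sparse_lookup(131))   # -> 'Yy'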
74. Clustering Index
• A clustered index can be defined as an ordered data
file. Sometimes the index is created on non-primary-key
columns, which may not be unique for each
record.
• In this case, to identify the records faster, we
group two or more columns to get a unique value
and create an index out of them. This method is called a
clustering index.
• The records which have similar characteristics are
grouped, and indexes are created for these groups.
77. Secondary Index
• In the sparse indexing, as the size of the table grows,
the size of mapping also grows.
• These mappings are usually kept in primary
memory so that the address fetch is faster. The
secondary memory then searches the actual data
based on the address obtained from the mapping. If the
mapping size grows, fetching the address itself
becomes slower.
• In this case, the sparse index will not be efficient. To
overcome this problem, secondary indexing is
introduced.
78. In secondary indexing, to reduce the size of
mapping, another level of indexing is introduced.
In this method, the huge range for the columns is
selected initially so that the mapping size of the
first level becomes small. Then each range is
further divided into smaller ranges.
The mapping of the first level is stored in the
primary memory, so that address fetch is faster.
The mapping of the second level and actual data
are stored in the secondary memory (hard disk).
80. Single level Indexing:
The index is usually specified on one field of the
file.
Single-level indexing can be primary
indexing, clustering indexing, or secondary indexing.
Each index entry has the form: <Search Key, Pointer to Record>
81. Single level Indexing – example (a sparse index over blocks of a student file):

Index:
Search Key   Address
101          -> block 1
120          -> block 2
130          -> block 3

Block 1:
Roll No.  Name  Age
101       Aa    25
102       bb    20

Block 3:
Roll No.  Name  Age
130       Xx    32
131       Yy    28
132       zz    30
82. Multi level Indexing:
Multilevel index is stored on the disk along with the
actual database files.
Multi-level Index helps in breaking down the index
into several smaller indices in order to make the
outermost level so small that it can be saved in a single
disk block, which can easily be accommodated
anywhere in the main memory.
84. B+ Tree
• The B+ tree is a balanced search tree (each node may have
many children, so it is not binary). It follows a
multi-level index format.
• A B+ tree is a storage method with a tree-like structure.
• A B+ tree has one root, any number of intermediate nodes,
and leaf nodes.
• In the B+ tree, leaf nodes hold the actual data pointers,
and records are stored in the leaf nodes in
sorted order.
• Intermediate nodes have only pointers to the nodes below them;
they hold no data.
• The B+ tree ensures that all leaf nodes remain at the same
height.
• In the B+ tree, the leaf nodes are linked using a linked list.
Therefore, a B+ tree can support random access as well
as sequential access.
85. Structure of B+ Tree
In the B+ tree, every leaf node is at an equal distance
from the root node.
The B+ tree is of order n, where n is fixed for a given
B+ tree.
It contains internal nodes and leaf nodes.
86. Internal node
• An internal node of the B+ tree can contain at least n/2
child pointers, except the root node.
• At most, an internal node of the tree contains n pointers.
Leaf node
• A leaf node of the B+ tree can contain at least n/2 record
pointers and n/2 key values.
• At most, a leaf node contains n record pointers and n key
values.
• Every leaf node of the B+ tree contains one block pointer P
to point to the next leaf node.
87. Searching a record in B+ Tree
• Suppose we have to search for 55 in the B+ tree structure below.
First, we fetch the intermediary node, which will direct us
to the leaf node that can contain the record for 55.
• So, in the intermediary node, we follow the branch between
the 50 and 75 keys. At the end, we are directed to the
third leaf node, where the DBMS performs a sequential search
to find 55 (a search sketch follows).
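A minimal sketch of this search in Python, assuming nodes represented
as dicts (an illustrative structure, not a full B+ tree implementation):

import bisect

def bplus_search(node, key):
    while "children" in node:                        # descend internal nodes
        i = bisect.bisect_right(node["keys"], key)   # pick the branch for key
        node = node["children"][i]
    for k, rec in zip(node["keys"], node["records"]):   # sequential search in leaf
        if k == key:
            return rec
    return None

leaf1 = {"keys": [10, 25], "records": ["r10", "r25"]}
leaf2 = {"keys": [30, 40], "records": ["r30", "r40"]}
leaf3 = {"keys": [50, 55, 60], "records": ["r50", "r55", "r60"]}
leaf4 = {"keys": [75, 80], "records": ["r75", "r80"]}
root = {"keys": [30, 50, 75], "children": [leaf1, leaf2, leaf3, leaf4]}
print(bplus_search(root, 55))   # branch between 50 and 75 -> third leaf -> 'r55'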
88. B+ Tree Insertion
• Suppose we want to insert a record with key 60 in the structure below.
It belongs in the 3rd leaf node, after 55. However, that
leaf node is already full, so we cannot insert
60 there.
• In this case, we have to split the leaf node, so that the key can be
inserted into the tree without affecting the fill factor, balance and
order.
89. • The 3rd leaf node should hold the values (50, 55, 60, 65, 70), and its
current branch key in the parent is 50. We split the leaf node of the tree
in the middle so that its balance is not altered, grouping
(50, 55) and (60, 65, 70) into two leaf nodes.
• If these two are to be leaf nodes, the parent node
cannot branch on 50 alone. It should have 60 added to it, and
then it can have pointers to the new leaf nodes.
90. B+ Tree Deletion
• Suppose we want to delete 60 from the above example.
In this case, we have to remove 60 both from the intermediate
node and from the 4th leaf node. If we remove
it only from the intermediate node, the tree will not
satisfy the rules of the B+ tree, so we need to modify the
tree to keep it balanced.
• After deleting key 60 from the above B+ tree and
rearranging the nodes, the tree looks as follows:
92. Query Processing
Parsing and Translation:
The system must translate the query into a usable form.
A more useful internal representation is one based on
the extended relational algebra.
The parser checks the syntax of the user's query and verifies
that the relation names appearing in the query are names of
relations in the database.
93. Query Processing
Parsing and Translation - Example:
select salary from instructor where salary <
75000;
This query can be translated into either of the
following relational-algebra expressions:
σ_salary<75000 (Π_salary (instructor))
Π_salary (σ_salary<75000 (instructor))
94. Query Processing
Optimization:
During this process the query evaluation plan is prepared
from all the relational algebraic expressions.
The query cost for all the evaluation plans is calculated.
Amongst all equivalent evaluation plans, the one with the
lowest cost is chosen.
Cost is estimated using statistical information from the
database catalog, such as size of tuples, etc.
95. Query Processing
Evaluation:
The query-execution engine takes a query-evaluation plan,
executes that plan, and returns the answers to the query.
96. Measures of Query Cost
Many factors contribute to time cost:
disk accesses, CPU time, and even network communication.
Typically, disk access is the predominant cost; it is
also relatively easy to estimate, and is measured by taking
into account:
Number of seeks * average seek cost
Number of blocks read * average block-read cost
Number of blocks written * average block-write cost
97. Measures of Query Cost
The cost to write a block is greater than the cost to read a block,
because data is read back after being written to ensure that the write was successful.
For simplicity, we just use the number of block transfers
from disk and the
number of seeks as the cost measures:
tT – time to transfer one block
tS – time for one seek
Cost for b block transfers plus S seeks: b * tT + S * tS (see the sketch below)
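A minimal sketch of this cost measure in Python (the timing constants
are assumed for illustration, e.g. tT = 0.1 ms per block and tS = 4 ms
per seek):

def query_cost(b, S, tT=0.1, tS=4.0):
    # cost in milliseconds for b block transfers plus S seeks
    return b * tT + S * tS

print(query_cost(b=400, S=1))     # sequential scan of 400 blocks: ~44 ms
print(query_cost(b=400, S=400))   # the same blocks with a seek each: ~1640 ms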
98. Algorithm for Selection Operation
File scan – search algorithms that locate and
retrieve records that fulfill a selection
condition.
Algorithm A1 (linear search). Scan each file
block and test all records to see whether they
satisfy the selection condition.
99. Algorithm for Selection Operation
Algorithm A1 (linear search):
Cost estimate = br block transfers + 1 seek,
where br denotes the number of blocks containing records of relation r.
If the selection is on a key attribute, the scan can stop on finding the record:
average cost = (br/2) block transfers + 1 seek.
Linear search can be applied regardless of the
selection condition, the ordering of records in the file, or the
availability of indices.
100. Algorithm for Selection Operation
Algorithm A2 (binary search):
Applicable if selection is an equality comparison on the attribute
on which file is ordered.
Assume that the blocks of a relation are stored contiguously.
Cost estimate (number of disk blocks to be scanned):
cost of locating the first tuple by a binary search on the blocks:
⌈log2(br)⌉ * (tT + tS)
101. Algorithm for Selection Operation
Algorithm A2 (binary search):
If there are multiple records satisfying the selection,
add the transfer cost of the blocks containing
records that satisfy the selection condition (a cost-comparison sketch follows).
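A minimal sketch comparing the two estimates in Python (br and the
timing constants are assumed for illustration):

import math

def linear_search_cost(br, tT, tS):
    return br * tT + 1 * tS                        # scan every block after one seek

def binary_search_cost(br, tT, tS):
    return math.ceil(math.log2(br)) * (tT + tS)    # one seek + transfer per probe

br, tT, tS = 10_000, 0.1, 4.0                      # milliseconds
print(linear_search_cost(br, tT, tS))   # -> about 1004 ms
print(binary_search_cost(br, tT, tS))   # -> about 57.4 ms (14 probes)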
102. Algorithm for JOIN Operation
Nested Loop Join – To compute the theta join r ⋈θ s:
for each tuple tr in r do begin
    for each tuple ts in s do begin
        test pair (tr, ts) to see if they satisfy the join condition θ
        if they do, add tr · ts to the result
    end
end
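A minimal runnable version of this pseudocode in Python, with an
equality predicate standing in for a general theta condition (the
relation contents are illustrative):

r = [(1, "a"), (2, "b"), (3, "c")]          # outer relation
s = [(2, "x"), (3, "y"), (3, "z")]          # inner relation

def nested_loop_join(r, s, theta):
    result = []
    for tr in r:                            # for each tuple tr in r
        for ts in s:                        # for each tuple ts in s
            if theta(tr, ts):               # test the join condition
                result.append(tr + ts)      # add tr . ts to the result
    return result

print(nested_loop_join(r, s, lambda tr, ts: tr[0] == ts[0]))
# -> [(2, 'b', 2, 'x'), (3, 'c', 3, 'y'), (3, 'c', 3, 'z')]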
103. Algorithm for JOIN Operation
Nested Loop Join:
r is called the outer relation and s the inner relation of the
join.
Requires no indices and can be used with any kind of join
condition.
Expensive since it examines every pair of tuples in the two
relations.
104. Algorithm for JOIN Operation
Nested Loop Join:
In the worst case, if there is enough memory to hold only one
block of each relation, the estimated cost is
nr * bs + br block transfers, plus nr + br seeks.
If the smaller relation fits entirely in memory, use that as the
inner relation;
this reduces the cost to br + bs block transfers and 2 seeks.
105. Algorithm for JOIN Operation
Nested Loop Join:
Assuming worst-case memory availability, the cost estimates are
(here depositor has nr = 5000 tuples in br = 100 blocks, and
customer has 10,000 tuples in 400 blocks):
with depositor as the outer relation:
5000 * 400 + 100 = 2,000,100 block transfers,
5000 + 100 = 5,100 seeks
with customer as the outer relation:
10000 * 100 + 400 = 1,000,400 block transfers and 10,400 seeks
106. Algorithm for JOIN Operation
Block Nested Loop Join
A variant of nested-loop join in which every block of the inner
relation is paired with every block of the outer relation (a sketch follows).
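A minimal sketch in Python of the block-at-a-time pairing (blocks are
modeled as lists of tuples; the contents are illustrative):

def block_nested_loop_join(r_blocks, s_blocks, theta):
    result = []
    for rb in r_blocks:              # for each block of the outer relation
        for sb in s_blocks:          # pair it with each block of the inner relation
            for tr in rb:
                for ts in sb:
                    if theta(tr, ts):
                        result.append(tr + ts)
    return result

r_blocks = [[(1, "a"), (2, "b")], [(3, "c")]]
s_blocks = [[(2, "x"), (3, "y")], [(3, "z")]]
print(block_nested_loop_join(r_blocks, s_blocks, lambda tr, ts: tr[0] == ts[0]))
# -> [(2, 'b', 2, 'x'), (3, 'c', 3, 'y'), (3, 'c', 3, 'z')]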
107. Algorithm for JOIN Operation
Merge Join
Sort both relations on their join attribute (if not already sorted
on the join attributes).
Merge the sorted relations to join them.
Can be used only for equijoins and natural joins.
The cost of merge join is: br + bs block transfers + ⌈br/bb⌉ + ⌈bs/bb⌉ seeks,
plus the cost of sorting if the relations are unsorted (a sketch follows).
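A minimal sketch of the merge step in Python, assuming both inputs are
already sorted on the join attribute and, for brevity, that the outer
relation has no duplicate join keys (the data is illustrative):

def merge_join(r, s):
    result, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        if r[i][0] < s[j][0]:
            i += 1
        elif r[i][0] > s[j][0]:
            j += 1
        else:                          # matching join keys: emit and advance
            result.append(r[i] + s[j])
            j += 1
    return result

r = [(1, "a"), (2, "b"), (3, "c")]     # sorted on the join attribute
s = [(2, "x"), (3, "y"), (3, "z")]
print(merge_join(r, s))   # -> [(2, 'b', 2, 'x'), (3, 'c', 3, 'y'), (3, 'c', 3, 'z')]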
108. Algorithm for JOIN Operation
Hash Join:
A hash function h is used to partition the tuples of
both relations.
Cost: 3(br + bs) + 4 * nh block transfers + 2(⌈br/bb⌉ + ⌈bs/bb⌉)
seeks, where nh is the number of partitions (a sketch of the basic idea follows).
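A minimal sketch of the idea in Python, building an in-memory hash table
on one relation and probing it with the other (a real hash join would
first partition both relations to disk; the data is illustrative):

from collections import defaultdict

def hash_join(r, s):
    h = defaultdict(list)
    for ts in s:                     # build phase: hash the tuples of s
        h[ts[0]].append(ts)
    return [tr + ts for tr in r for ts in h[tr[0]]]   # probe phase with r

r = [(1, "a"), (2, "b"), (3, "c")]
s = [(2, "x"), (3, "y"), (3, "z")]
print(hash_join(r, s))   # -> [(2, 'b', 2, 'x'), (3, 'c', 3, 'y'), (3, 'c', 3, 'z')]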
109. Query Optimization
Heuristic Estimation:
A heuristic is a rule that leads to the least cost in most cases.
Systems may use heuristics to reduce the number of choices
that must be made in a cost-based fashion.
Heuristic optimization transforms the query tree by using a
set of rules that typically improve execution performance.
110. Query Optimization
Query Tree:
SELECT schedule, room FROM Student NATURAL
JOIN Enroll NATURAL JOIN Class WHERE
Major='Math'
112. Query Optimization
Heuristic Estimation – Rules:
Perform selection early (reduces the number of tuples)
Perform projection early (reduces the number of attributes)
Perform most restrictive selection and join operations
before other similar operations.
113. Query Optimization
Heuristic Estimation – Steps:
The scanner and parser generate the initial query representation.
The representation is optimized according to heuristic rules.
A query execution plan is developed.
114. Query Optimization
Cost based Estimation:
Look at all of the possible ways or scenarios in which a
query can be executed.
Each scenario is assigned a 'cost', which indicates
how efficiently that query can be run.
Pick the scenario that has the least cost and execute the
query using that scenario, because that is the most efficient
way to run the query.
115. Query Optimization
Cost based Estimation:
Scope of query optimization is a query block. Global
query optimization involves multiple query blocks.
Cost components: access cost to secondary storage, disk
storage cost, computation cost, memory usage cost and
communication cost.
116. Query Optimization
Cost based Estimation:
Information stored in the DBMS catalog and used by the optimizer:
File Size
Organization
Number of levels of each multilevel index
Number of distinct values of an attribute
Attribute selectivity.