2 December 2005
Introduction to Databases
Access Methods
Prof. Beat Signer
Department of Computer Science
Vrije Universiteit Brussel
http://www.beatsigner.com
Beat Signer - Department of Computer Science - bsigner@vub.ac.be - April 28, 2017
Context of Today's Lecture
[Figure: components of a DBMS, showing programmers, users and DB admins; application programs, queries and database schema; DML preprocessor, query compiler, DDL compiler, program object code, catalogue manager, database manager, authorisation control, integrity checker, command processor, query optimiser, transaction manager, scheduler, recovery manager, buffer manager, data manager, file manager, access methods and system buffers; data, indices and system catalogue]
Based on 'Components of a DBMS', Database Systems,
T. Connolly and C. Begg, Addison-Wesley 2010
Basic Index Concepts
 An index is used to efficiently access specific data
 e.g. index in a textbook
 Often database queries reference only a small number of
records in a file
 indices can be used to enhance the search for these records
 most DBMSs automatically create an index for primary keys
 A search key is an attribute (or set of attributes) that is
used to look up records in a file
 An index file consists of (searchKey, pointer) index
entries
 normally much smaller than the original file
 pointer identifies a disk block and a record offset within the block
Basic Index Concepts ...
 There are two basic types of indices
 ordered indices
- based on a sorted ordering of search keys
 hash indices
- search keys are uniformly distributed to different buckets based on a hash
function
 Indexing techniques have to be evaluated based on
 access types
- e.g. records with a given attribute value or an attribute value in a specific range
 access time
 insertion and deletion time
- if the underlying data is changed, all indices have to be updated which may
result in a significant overhead for modifications
 space overhead
Ordered Indices
 An ordered index stores the search key values in sorted
order
 e.g. author catalogue in a library
 A primary index (clustering index) is an index whose
search key also defines the sequential order of the file
 the primary key is often used as a search key of a primary index
 files with a clustering index on the search key are called
index-sequential files
 A secondary index (non-clustering index) is an index
whose search key has a different order than the
sequential file order
Dense Index
 A dense index contains an entry for each search key in
the file
 An index record of a dense primary index points to the
first file record with the given search key
 An index record of a dense secondary index stores a list
of pointers to the records with the same search key value
dense index (search key values): de Rover, Frei, Jones, Meier, Michelin
index-sequential file:
2 de Rover Pieter Brussels
1 Frei Urs Zurich
8 Frei Urs Zurich
14 Jones Andrew Paris
53 Meier Beat Zurich
5 Michelin Robert Paris
Dense Index Insertion
find the search key of the record to be inserted in the index
if (index contains search key) {
  if (index stores pointers to all records) {
    add a pointer to the new record to the index entry
  }
  else { // index stores pointer to the first record only
    ensure that the record is inserted after the first record with the
    same search key
  }
}
else { // index does not contain the search key
  insert an index record with the search key at the appropriate position
}
Dense Index Deletion
 Update can be modelled as a delete followed by an insert
look up the record to be deleted
if (deleted record is the only one with the search key value) {
  delete index record
}
else { // there are other records with the same search key
  if (index stores pointers to all records) {
    delete the corresponding pointer from the index record
  }
  else { // index stores pointer to the first record only
    if (deleted record was the first record with the search key value) {
      update the index record to point to the next record
    }
  }
}
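The dense index insertion and deletion steps can be sketched in Python for the variant in which the index stores pointers to all records. This is a minimal illustration, not a reference implementation: the class name `DenseIndex`, the in-memory dictionary and the integer record pointers are assumptions made for the example.

```python
import bisect


class DenseIndex:
    """Dense index storing, per search key, pointers to all matching records."""

    def __init__(self):
        self.keys = []      # search key values in sorted order
        self.pointers = {}  # search key -> list of record pointers

    def insert(self, key, pointer):
        if key in self.pointers:
            # index contains the search key: add a pointer to the new record
            self.pointers[key].append(pointer)
        else:
            # index does not contain the search key: insert an index record
            # at the appropriate (sorted) position
            bisect.insort(self.keys, key)
            self.pointers[key] = [pointer]

    def delete(self, key, pointer):
        self.pointers[key].remove(pointer)
        if not self.pointers[key]:
            # deleted record was the only one with this search key value
            del self.pointers[key]
            self.keys.remove(key)


# Hypothetical sample data: list positions serve as record pointers.
index = DenseIndex()
for pointer, name in enumerate(["Frei", "Frei", "Jones", "Meier"]):
    index.insert(name, pointer)
index.delete("Frei", 0)   # another Frei record remains, the index record stays
index.delete("Jones", 2)  # last Jones record, the index record is removed
```

An update of a record can then be expressed as `delete` followed by `insert`.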
Sparse Index
 Contains only some of the search key values
 can only be used for primary indices!
 To search for a given record via a sparse index the following
steps have to be performed
1) find the index record ri with the largest search key value that is
smaller than or equal to the search key value to be found
2) start a linear search at the record that ri points to until the record is found
sparse index (search key values): de Rover, Jones, Michelin
index-sequential file:
2 de Rover Pieter Brussels
1 Frei Urs Zurich
8 Frei Urs Zurich
14 Jones Andrew Paris
53 Meier Beat Zurich
5 Michelin Robert Paris
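The two search steps for a sparse index can be sketched as follows. The sample keys, the record names `r0` to `r5` and the function `sparse_lookup` are hypothetical; the sketch assumes that the file is sorted on the search key and that each index entry points to the first record with its key.

```python
import bisect

# Sorted file of (search key, record) pairs (hypothetical sample data).
records = [("Bush", "r0"), ("Frei", "r1"), ("Frei", "r2"),
           ("Jones", "r3"), ("Meier", "r4"), ("Otlet", "r5")]

# Sparse index: (search key, position in file) for only some of the records.
sparse_index = [("Bush", 0), ("Jones", 3), ("Otlet", 5)]
index_keys = [key for key, _ in sparse_index]


def sparse_lookup(search_key):
    """Return all records with the given search key, or [] if there are none."""
    # Step 1: index record with the largest key that is <= search_key.
    i = bisect.bisect_right(index_keys, search_key) - 1
    if i < 0:
        return []  # search key is smaller than every indexed key
    # Step 2: linear search in the sorted file starting at that record.
    position = sparse_index[i][1]
    found = []
    while position < len(records) and records[position][0] <= search_key:
        if records[position][0] == search_key:
            found.append(records[position][1])
        position += 1
    return found
```

The linear scan in step 2 is the price paid for the smaller index: fewer index entries mean a longer scan on average.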
Sparse Index ...
 A sparse index consumes less space and produces less
maintenance overhead for insertions and deletions
 On the other hand it takes more time to locate a record
based on a sparse index
 A good trade-off is a sparse index with one index record
for each block
[Figure: sparse index with one index record pointing to each block (block 1, block 2, ...)]
Sparse Index Insertion
// assumption: one index entry for each block
if (new block is created) {
  insert first search key value in the new block into the index
}
else { // no new block
  if (new record has the smallest search key value in the block) {
    update the index record pointing to the block
  }
}
Sparse Index Deletion
 Update can be modelled as a delete followed by an insert
if (index contains the search key value of deleted record) {
  if (deleted record is the only one with the search key value) {
    if (search key value of the next record is not in the index) {
      update the index record to point to the next record
    }
    else { // search key value of next record is already in the index
      delete index entry
    }
  }
  else { // there are other records with the same search key
    if (deleted record was the first record with the search key value) {
      update the index entry to point to the next record with the same
      search key value
    }
  }
}
Multilevel Index
 A multilevel index can be used if the
index grows too large to be kept in memory
 inner index forms the primary index file
 outer index is a sparse index on the inner index
[Figure: an outer index pointing to the blocks of the inner index, which in turn points to the data blocks (data block 1, data block 2, ...)]
Multilevel Index ...
 If even the outer index grows too large to be kept in
memory, the process can be repeated
 note that instead of this type of multilevel index it is preferred to
use B+-trees
 The update of the multilevel index (on insertion and
deletion) is an extension of the updates described for the
single level indices
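A two-level lookup can be sketched as below. The block layout, the integer keys and the helper names `floor_entry` and `multilevel_lookup` are assumptions for illustration: the inner index has one entry per data block and the outer index has one entry per inner index block.

```python
import bisect

# Hypothetical layout with two keys per data block.
data_blocks = [[1, 3], [5, 8], [14, 21], [34, 53]]
inner_index = [[(1, 0), (5, 1)], [(14, 2), (34, 3)]]  # (first key, data block no.)
outer_index = [(1, 0), (14, 1)]                       # (first key, inner block no.)


def floor_entry(entries, key):
    """Return the entry with the largest search key <= key, or None."""
    i = bisect.bisect_right([k for k, _ in entries], key) - 1
    return entries[i] if i >= 0 else None


def multilevel_lookup(key):
    """Return True if key is stored in one of the data blocks."""
    outer = floor_entry(outer_index, key)
    if outer is None:
        return False  # key is smaller than every key in the file
    # The outer entry repeats the first key of its inner block,
    # so the search in the inner block always finds an entry.
    inner = floor_entry(inner_index[outer[1]], key)
    return key in data_blocks[inner[1]]
```

Only one block per level has to be read, which is the point of adding the outer level once the inner index no longer fits in memory.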
Secondary Index
 A secondary index has to be a dense index and must
point to all the records with the same search key value
 e.g. indirection via buckets that contain the pointers
 The use of a secondary index normally requires more
I/O operations than a primary index
secondary index (search key values): Brussels, Paris, Zurich
buckets: one bucket of record pointers per search key value
file:
2 de Rover Pieter Brussels
1 Frei Urs Zurich
8 Frei Urs Zurich
14 Jones Andrew Paris
53 Meier Beat Zurich
5 Michelin Robert Paris
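The indirection via buckets can be sketched with the example records. Using list positions as record pointers and a dictionary as the index are assumptions of this sketch; `city_index` and `lookup_city` are hypothetical names.

```python
from collections import defaultdict

# Records in the order of the file; list positions act as record pointers.
records = [
    (2, "de Rover", "Pieter", "Brussels"),
    (1, "Frei", "Urs", "Zurich"),
    (8, "Frei", "Urs", "Zurich"),
    (14, "Jones", "Andrew", "Paris"),
    (53, "Meier", "Beat", "Zurich"),
    (5, "Michelin", "Robert", "Paris"),
]

# Dense secondary index on city: each city maps to a bucket of pointers.
city_index = defaultdict(list)
for pointer, record in enumerate(records):
    city_index[record[3]].append(pointer)


def lookup_city(city):
    """Follow every pointer in the bucket; each may hit a different block."""
    return [records[p] for p in city_index.get(city, [])]
```

The pointers in one bucket are scattered over the file (positions 1, 2 and 4 for Zurich), which is why a secondary index normally needs more I/O operations than a primary index.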
Index-Sequential File Problems
 Index-sequential files have a number of disadvantages
 degradation and loss of performance when the file grows since
many overflow blocks have to be created
 the entire file has to be periodically reorganised
 Alternative index structures that do not show a loss of
performance after updates can be used
 e.g. the B+-tree index file structure is a widely used multilevel
index structure that is based on a B+-tree (balanced and sorted)
B+-Tree
 Properties of a balanced B+-tree
 every path from the root to a leaf node has the same length
 for a given n, each leaf node has between ⌈(n-1)/2⌉ and n-1 values
- leaf nodes must always be at least half full
 each non-leaf (and non-root) node has between ⌈n/2⌉ and n
children (fanout)
 if the root node is not a leaf node it has at least 2 children;
otherwise it has between 0 and n-1 values
 typical B+-tree node: P1 K1 P2 ... Pm-1 Km-1 Pm
- K1,..., Km-1 are the ordered search key values (Ki < Ki+1)
- a non-leaf node has pointers P1,..., Pm to child nodes
- a leaf node has pointers P1,..., Pm to file records or to buckets of pointers
- typically the size of a disk block
B+-Tree Leaf Nodes
 Leaf node example (n=3)
 for 1 ≤ i < n the pointers Pi point to a file record with the
search key value Ki or to a bucket of pointers if the search key
is not a candidate key and the file is not ordered on the search key
 the pointer Pn points to the next leaf node in search key order
leaf node: [de Rover, Frei] with a pointer to the next leaf node
index-sequential file:
2 de Rover Pieter Brussels
1 Frei Urs Zurich
8 Frei Urs Zurich
...
B+-Tree Non-Leaf (Internal) Nodes
 The non-leaf nodes form a multilevel sparse index on the
leaf nodes
 For a non-leaf node with m pointers the following holds
 the values of the search keys in the subtree that P1 points to are
all smaller than K1
 for 2 ≤ i < m the search key values in the subtree to which Pi points
are greater than or equal to Ki-1 and smaller than Ki
 the search key values in the subtree to which Pm points are
greater than or equal to Km-1
P1 K1 P2 ... Pm-1 Km-1 Pm ...
B+-Tree Example
 Example B+-tree for n=3
 root node must have at least two children
 non-leaf nodes must have between 2 and 3 children (for n=3)
 leaf nodes must have between 1 and 2 values (for n=3)
 note that the node size is normally defined by the block size and
the fanout is typically in the range of 50-100 (e.g. n=100)
root: [Meier]
non-leaf nodes: [Jones] [Otlet]
leaf nodes: [Bush, Frei] [Jones] [Meier] [Otlet, Price]
B+-Tree Properties
 Logically close blocks do not have to be physically close
since the inter-node connections are realised via pointers
 B+-trees generally have a small height
 if there are k search key values, the B+-tree height is not
greater than ⌈log_⌈n/2⌉(k)⌉
- e.g. for 5'000'000 search keys and n = 100 we get a height not greater than 4
(note that the height would be 23 for a balanced binary tree)
 search operations can be performed efficiently (without many
block accesses)
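The height bound can be checked with a few lines of Python. The function name `max_height` is an assumption of this sketch; it evaluates ⌈log_⌈n/2⌉(k)⌉ with floating-point logarithms, which is adequate for values of this size.

```python
import math


def max_height(k, n):
    """Upper bound on the B+-tree height for k search keys and node size n."""
    return math.ceil(math.log(k) / math.log(math.ceil(n / 2)))


btree_height = max_height(5_000_000, 100)          # log base 50 of 5'000'000
binary_height = math.ceil(math.log2(5_000_000))    # balanced binary tree
print(btree_height, binary_height)
```

Since every level costs one block access, the difference between the two heights translates directly into the number of I/O operations per lookup.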
B+-Tree Lookup
 Find all records for a given search key v
root: [Meier]
non-leaf nodes: [Jones] [Otlet]
leaf nodes: [Bush, Frei] [Jones] [Meier] [Otlet, Price]
c = root node
while (c is not a leaf node) {
  let Ki = smallest search key value in c greater than v
  if (Ki exists) {
    c = node pointed to by Pi
  }
  else { // no search key greater than v
    c = node pointed to by Pm // where m is the number of pointers
  }
}
if (a key value Ki in c with the value v exists) {
  pointer Pi leads to the desired record or bucket
}
else {
  a record with the given search key value v does not exist
}
 The lookup requires O(log_n(k)) I/O operations in the worst case
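The lookup algorithm can be turned into a small runnable sketch. The `Node` class, the record names `r1` to `r6` and the in-memory tree (built to mirror the n=3 example with root Meier) are assumptions for illustration; a real B+-tree would read each node from a disk block.

```python
import bisect


class Node:
    def __init__(self, keys, children=None, records=None):
        self.keys = keys          # ordered search key values K1..Km-1
        self.children = children  # child nodes P1..Pm (non-leaf nodes only)
        self.records = records    # one record pointer per key (leaf nodes only)

    @property
    def is_leaf(self):
        return self.children is None


def lookup(root, v):
    """Return the record pointer for search key v, or None if v is absent."""
    c = root
    while not c.is_leaf:
        # Position of the smallest key greater than v. If no such key
        # exists the position equals len(keys) and selects the last pointer Pm.
        i = bisect.bisect_right(c.keys, v)
        c = c.children[i]
    i = bisect.bisect_left(c.keys, v)
    if i < len(c.keys) and c.keys[i] == v:
        return c.records[i]
    return None


leaves = [Node(["Bush", "Frei"], records=["r1", "r2"]),
          Node(["Jones"], records=["r3"]),
          Node(["Meier"], records=["r4"]),
          Node(["Otlet", "Price"], records=["r5", "r6"])]
root = Node(["Meier"], children=[
    Node(["Jones"], children=leaves[0:2]),
    Node(["Otlet"], children=leaves[2:4]),
])
```

Each loop iteration descends one level, so the number of visited nodes equals the height of the tree.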
B+-Tree Insertion
(1) Find the leaf node in which the search key value would
appear using the lookup algorithm presented before and
add the new record to the file
(2) If the search key is already present in the leaf node
 if necessary add a new pointer to the bucket
(3) If the search key value is not yet present
 if there is room in the leaf node then insert a new index record
 if there is no room then split the node
- take the n index records (including the one to be inserted) in sorted order and
place the first ⌈n/2⌉ records in the original node and the rest in a new node
- let the new node be p and the smallest value in p be k. Insert (p,k) in the parent
node (if there is no space in the parent node, the split is propagated upwards)
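The leaf split in step (3) can be sketched on its own. The function `split_leaf` is a hypothetical helper that works on plain key lists; record pointers and the propagation of the split to the parent are left out.

```python
import bisect
import math


def split_leaf(keys, new_key, n):
    """Split a full leaf (n-1 keys) that receives new_key.

    Returns (left_keys, right_keys, separator). The separator is the
    smallest key of the new right node and is the value k that is
    inserted in the parent node together with a pointer to that node.
    """
    assert len(keys) == n - 1, "only a full leaf needs to be split"
    entries = list(keys)
    bisect.insort(entries, new_key)  # the n index records in sorted order
    half = math.ceil(n / 2)          # first ceil(n/2) records stay in place
    left, right = entries[:half], entries[half:]
    return left, right, right[0]


# Inserting Codd into the full leaf [Bush, Frei] of a tree with n = 3.
left, right, separator = split_leaf(["Bush", "Frei"], "Codd", 3)
```

For this input the original node keeps Bush and Codd, the new node holds Frei, and Frei is the key passed to the parent.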
B+-Tree Insertion ...
 In the worst case, the root node has to be split and the
height of the tree is increased by 1
 The insertion requires O(log_n(k)) I/O operations in the worst
case
B+-Tree Insertion Example
 insertion of Codd
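before:
root: [Meier]
non-leaf nodes: [Jones] [Otlet]
leaf nodes: [Bush, Frei] [Jones] [Meier] [Otlet, Price]
after:
root: [Meier]
non-leaf nodes: [Frei, Jones] [Otlet]
leaf nodes: [Bush, Codd] [Frei] [Jones] [Meier] [Otlet, Price]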
B+-Tree Deletion
(1) Find the record to be deleted by using the lookup
algorithm and delete it from the file
(2) Remove the search key and value from the leaf index
node if there is no bucket or if the bucket is empty
(3) If the node no longer has enough entries due to the
deleted index record and the entries in the node and a
sibling fit into a single node then merge the siblings
 insert the search key values of the two nodes into the left node
and delete the right node
 recursively delete the entry for the deleted node from the parent
node (propagation)
B+-Tree Deletion ...
(4) If the node no longer has enough entries due to the
deleted index record but the entries in the node and a
sibling do not fit into a single node then redistribute the
pointers
 redistribute the pointers between the node and a sibling such that
both have at least the minimum number of entries
 update the search key values in the parent node
 If the root node has only one pointer after deletion, it is
deleted and its child becomes the root
 The deletion of a previously located record requires
O(log_n(k)) I/O operations in the worst case
 note that there are variants where leaf nodes are not merged
B+-Tree Deletion Example
 deletion of Frei
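before:
root: [Meier]
non-leaf nodes: [Frei, Jones] [Otlet]
leaf nodes: [Bush, Codd] [Frei] [Jones] [Meier] [Otlet, Price]
after:
root: [Meier]
non-leaf nodes: [Jones] [Otlet]
leaf nodes: [Bush, Codd] [Jones] [Meier] [Otlet, Price]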
B+-Tree Deletion Example ...
 deletion of Meier
before:
root: [Meier]
non-leaf nodes: [Jones] [Otlet]
leaf nodes: [Bush, Codd] [Jones] [Meier] [Otlet, Price]
after:
root: [Jones, Otlet]
leaf nodes: [Bush, Codd] [Jones] [Otlet, Price]
B+-Tree Deletion Example ...
 deletion of Meier
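before:
root: [Meier]
non-leaf nodes: [Frei, Jones] [Otlet]
leaf nodes: [Bush, Codd] [Frei] [Jones] [Meier] [Otlet, Price]
after:
root: [Jones]
non-leaf nodes: [Frei] [Otlet]
leaf nodes: [Bush, Codd] [Frei] [Jones] [Otlet, Price]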
Features of B+-Tree Index Files
 Advantages
 automatic reorganisation with small local changes for insertions
and deletions
 no reorganisation of the file is required to maintain the
performance (solves index-sequential file problem)
 Disadvantages
 insertion and deletion overhead
 space overhead
 The advantages of B+-trees outweigh the disadvantages
which makes them an extensively used data structure
B+-Tree File Organisation
 We can use the same B+-tree structure to organise the
records in a file and thereby solve the problem of data
file degradation
 In a B+-tree file organisation, the leaves of the tree store
records instead of pointers to records
 number of records in the leaf nodes normally smaller than the
number of pointers in non-leaf nodes
 leaf nodes still have to be at least half full
 two siblings can be used in splits/merges for space optimisation
 adjacent nodes in the tree may not be contiguous on the disk
Indexing Strings
 The use of strings as search keys introduces two major
problems
 the strings can be of variable length which leads to different
fanouts
 the split and merge operations should no longer be based on the
number of search keys but on the fraction of the used space
 The fanout of nodes can be increased by using a prefix
compression technique
 we no longer store the entire search key in non-leaf nodes but
only the part that is necessary to distinguish between the key
values in the subtrees
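One possible way to pick such a shortened key is sketched below. The function `shortest_separator` and the sample strings are assumptions for illustration: given the largest key of the left subtree and the smallest key of the right subtree, it returns the shortest prefix of the right key that still separates the two.

```python
def shortest_separator(left_max, right_min):
    """Shortest prefix s of right_min with left_max < s <= right_min."""
    assert left_max < right_min
    for length in range(1, len(right_min) + 1):
        prefix = right_min[:length]
        if prefix > left_max:
            return prefix
    return right_min  # not reached when left_max < right_min


short = shortest_separator("Frei", "Jones")
longer = shortest_separator("Silas", "Silberschatz")
```

Storing "J" instead of "Jones" or "Silb" instead of "Silberschatz" in a non-leaf node leaves room for more entries per node and therefore a higher fanout.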
B+-Trees vs. B-Trees
 B-trees store every search key value only once
(no redundancy)
 additional pointers to the file records have to be added for
non-leaf nodes
 Advantages of B-trees
 may use fewer nodes than the corresponding B+-trees
 may find the search key before reaching the leaf node
 Disadvantages of B-trees
 non-leaf nodes get larger, which results in deeper trees
 insertion and deletion get more complicated
 Normally the advantages of B-trees do not outweigh their
disadvantages
Multiple-Key Access
 For queries that need fast search on the combination of
multiple attributes we can define composite search keys
 e.g. search key (name, city) for customers
 The ordering of composite search keys is the
lexicographic ordering
 e.g. (a1, a2) < (b1, b2) if
- a1 <b1, or
- a1 = b1 and a2 <b2
 An index with composite search keys can be used to
efficiently answer queries of the form
 WHERE name = 'Max Frisch' AND city = 'Zurich'
 WHERE name = 'Max Frisch' AND city < 'Brussels'
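The lexicographic ordering and the two query forms can be illustrated with Python tuples, which are compared lexicographically. The sample keys and the function `name_and_city_below` are hypothetical; a sorted list stands in for the ordered index.

```python
import bisect

# Composite search keys (name, city) kept in lexicographic order.
keys = sorted([
    ("Max Frisch", "Zurich"),
    ("Max Frisch", "Basel"),
    ("Max Frisch", "Amsterdam"),
    ("Urs Frei", "Zurich"),
    ("Andrew Jones", "Paris"),
])

# (a1, a2) < (b1, b2) if a1 < b1, or a1 = b1 and a2 < b2
assert ("a", 2) < ("b", 1) and ("a", 1) < ("a", 2)


def name_and_city_below(name, city):
    """Keys matching: name = <name> AND city < <city>."""
    lo = bisect.bisect_left(keys, (name, ""))    # first key with this name
    hi = bisect.bisect_left(keys, (name, city))  # first key with city >= <city>
    return keys[lo:hi]


matches = name_and_city_below("Max Frisch", "Brussels")
exact = ("Max Frisch", "Zurich") in keys
```

Both queries touch one contiguous run of keys because the first attribute is fixed. A condition such as `name < 'Max Frisch' AND city = 'Zurich'` would not map to a single contiguous run in this ordering.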
Static Hashing
 A bucket is a storage unit containing one or more records
 typically a disk block
 In a hash file organisation we get the bucket in a file by
using a hash function on the search key value
 A hash function h is a function from the set of search key
values K to the set of buckets B
 distribution should be uniform and random
 The hash function can be used to locate, insert and
delete records
 A bucket may contain records with different search keys
 bucket has to be sequentially searched
Hash File Organisation Example
 example of a hash function: binary representation of search key
modulo 6
records:
14 Jones Andrew Paris
2 de Rover Pieter Brussels
53 Meier Beat Zurich
1 Frei Urs Zurich
8 Frei Urs Zurich
5 Michelin Robert Paris
[Figure: the records stored in bucket 0 to bucket 5 of the hash file]
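A hash file organisation with this hash function can be sketched as follows. The sketch assumes that the search key is the numeric value in the first column of each record; the shortened records and the function names `h`, `insert`, `lookup` and `delete` are chosen for illustration.

```python
NUM_BUCKETS = 6


def h(search_key):
    """Hash function from the example: search key modulo 6."""
    return search_key % NUM_BUCKETS


buckets = [[] for _ in range(NUM_BUCKETS)]


def insert(record):
    buckets[h(record[0])].append(record)


def lookup(search_key):
    # A bucket may contain records with different search keys,
    # so it has to be searched sequentially.
    return [r for r in buckets[h(search_key)] if r[0] == search_key]


def delete(search_key):
    bucket = buckets[h(search_key)]
    bucket[:] = [r for r in bucket if r[0] != search_key]


for record in [(14, "Jones"), (2, "de Rover"), (53, "Meier"),
               (1, "Frei"), (8, "Frei"), (5, "Michelin")]:
    insert(record)

ids_in_bucket_2 = [r[0] for r in buckets[2]]
ids_in_bucket_5 = [r[0] for r in buckets[5]]
```

Under this assumption the keys 14, 2 and 8 share bucket 2 and the keys 53 and 5 share bucket 5, while buckets 0, 3 and 4 stay empty.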
Bucket Overflows
 Buckets can overflow due to different reasons
 insufficient number of buckets
 a skew in the distribution of records
- many records with the same search key
- non-uniform hash function
 The possibility for bucket overflows can be reduced but
we cannot prevent overflows
 bucket overflows are handled by a linked list of additional
overflow buckets
 this form of hashing with overflow buckets is called closed
hashing
 Open hashing does not allow overflow buckets
 other existing buckets have to be "misused" (linear probing)
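Overflow handling with a linked list of overflow buckets can be sketched as follows. The `Bucket` class and the capacity of two entries are assumptions for illustration; in practice the capacity follows from the block size.

```python
class Bucket:
    """Fixed-capacity bucket with a link to an overflow bucket."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []
        self.overflow = None  # next bucket in the overflow chain

    def insert(self, entry):
        if len(self.entries) < self.capacity:
            self.entries.append(entry)
        else:
            # bucket is full: continue in the overflow chain
            if self.overflow is None:
                self.overflow = Bucket(self.capacity)
            self.overflow.insert(entry)

    def scan(self):
        """Yield all entries of the bucket and of its overflow chain."""
        bucket = self
        while bucket is not None:
            yield from bucket.entries
            bucket = bucket.overflow


bucket = Bucket(capacity=2)
for key in [2, 8, 14]:
    bucket.insert(key)
```

A lookup has to scan the whole chain, so long overflow chains turn the constant-time bucket access into a linear search.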
Hash Index
 Hashing can be used not only for file organisation but
also for the creation of an index
 A hash index manages the search keys with their record
pointers in a hash file structure
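Hash Index Example
2 de Rover Pieter Brussels
1 Frei Urs Zurich
8 Frei Urs Zurich
14 Jones Andrew Paris
53 Meier Beat Zurich
5 Michelin Robert Paris
[Figure: hash index with bucket 0 to bucket 3 and an overflow bucket; the buckets hold the search keys 1, 53, 5, 14, 2 and 8 together with pointers to the records listed above]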
Static Hashing Problems
 The hash function maps the search key values to a fixed
number of buckets B but the database will grow or shrink
over time
 if the file grows and the initial number of buckets is too small, the
performance decreases since overflow buckets are necessary
 if space is preallocated for future growth, some space will be
wasted initially
 A possible solution would be to reorganise the file with a
new hash function from time to time
 too expensive and has to be performed exclusively
 An alternative is to dynamically modify the number of
buckets (dynamic hashing)
Ordered Indexing vs. Hashing
 The database designer has to choose which form of
index should be used based on different criteria
 cost of periodic reorganisation of the index
 frequency of update operations
 optimised average or worst case access time
 expected types of queries
- hashing is generally better for retrieving records with a specific attribute value
• constant average lookup time
• worst case lookup time proportional to the number of attribute values!
- ordered indices are preferred for range queries where the attribute value lies
within a range of values (e.g. WHERE A > C1 AND A < C2)
 In practice
 most DBMSs support an ordered B+-tree index but not all of them
support hash indices
Bitmap Index
 A bitmap index can be used to efficiently query on
multiple keys/attributes
 The following assumptions must hold
 records in a relation must be numbered sequentially (e.g. starting
from 0)
 given a number i, it must be easy to find the ith record
- easy if records have fixed size
 Bitmaps are applicable on attributes with a relatively
small number of distinct values
 e.g. gender, city, ...
 A bitmap is an array with as many bits as records
 the ith bit is 1 if record i has the desired attribute value
Bitmap Index ...
 Queries can be answered by using bitmap operations
 and (intersection), or (union) and not (complementation)
- e.g. find all males in Zurich: 101101 AND 011010 = 001000
- e.g. find females not living in Paris: 010010 AND 111010 = 010010
 Retrieve records based on the resulting bit array
 can also be used to count the number of records with a given condition
2 de Rover Pieter m Brussels
21 Brown Kathy f Zurich
8 Frei Urs m Zurich
14 Jones Andrew m Paris
9 Jones Lea f Zurich
5 Michelin Robert m Paris
bitmaps for gender
m 101101
f 010010
bitmaps for location
Brussels 100000
Paris 000101
Zurich 011010
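The bitmaps and the two example queries can be reproduced with a short sketch. Representing a bitmap as a string of 0 and 1 characters (record 0 first) is an assumption made for readability, and the helper names are hypothetical; real systems pack the bits into machine words and use bitwise instructions.

```python
# Records in file order; position i in the list is record number i.
records = [
    (2, "de Rover", "Pieter", "m", "Brussels"),
    (21, "Brown", "Kathy", "f", "Zurich"),
    (8, "Frei", "Urs", "m", "Zurich"),
    (14, "Jones", "Andrew", "m", "Paris"),
    (9, "Jones", "Lea", "f", "Zurich"),
    (5, "Michelin", "Robert", "m", "Paris"),
]


def build_bitmaps(column):
    """One bitmap per distinct value: bit i is 1 if record i has the value."""
    values = {r[column] for r in records}
    return {v: "".join("1" if r[column] == v else "0" for r in records)
            for v in values}


def bit_and(a, b):
    return "".join("1" if x == "1" and y == "1" else "0" for x, y in zip(a, b))


def bit_not(a):
    return "".join("0" if x == "1" else "1" for x in a)


gender = build_bitmaps(3)
city = build_bitmaps(4)

males_in_zurich = bit_and(gender["m"], city["Zurich"])
females_not_in_paris = bit_and(gender["f"], bit_not(city["Paris"]))
number_of_males_in_zurich = males_in_zurich.count("1")
```

Counting the 1 bits answers a count query without touching the records themselves.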
Index Definition in SQL
 The create index command can be used to create an
index on a relation
 different DBMSs support different types of indices (e.g. B+-tree,
hash index, etc.)
 Example
CREATE INDEX nameIndex ON Customer USING BTREE (name);
Summary
 Different types of indices can be used to speed up the
search of specific records in a file
 The number and types of indices have to be designed
carefully since each index adds additional maintenance
costs on update operations
 The structures that are used for the indices (e.g. B+-tree
or hash index) can also be used to manage the records
in a file and reduce data file degradation
Homework
 Study the following chapter of the
Database System Concepts book
 chapter 11
- sections 11.1-11.11
- Indexing and Hashing
Exercise 9
 Storage, Indexing and Hashing
References
 A. Silberschatz, H. Korth and S. Sudarshan,
Database System Concepts (Sixth Edition),
McGraw-Hill, 2010
Next Lecture
Query Processing and Optimisation

More Related Content

What's hot

DATA STRUCTURE AND ALGORITHMS
DATA STRUCTURE AND ALGORITHMS DATA STRUCTURE AND ALGORITHMS
DATA STRUCTURE AND ALGORITHMS Adams Sidibe
 
Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)DheerajPachauri
 
Topic modeling using big data analytics
Topic modeling using big data analyticsTopic modeling using big data analytics
Topic modeling using big data analyticsFarheen Nilofer
 
(Hierarchical) Topic Modeling_Yueshen Xu
(Hierarchical) Topic Modeling_Yueshen Xu(Hierarchical) Topic Modeling_Yueshen Xu
(Hierarchical) Topic Modeling_Yueshen XuYueshen Xu
 
1 introductory slides (1)
1 introductory slides (1)1 introductory slides (1)
1 introductory slides (1)tafosepsdfasg
 
The Search of New Issues in the Detection of Near-duplicated Documents
The Search of New Issues in the Detection of Near-duplicated DocumentsThe Search of New Issues in the Detection of Near-duplicated Documents
The Search of New Issues in the Detection of Near-duplicated Documentsijceronline
 
Adversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrievalAdversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrievalBhaskar Mitra
 
Text categorization using Rough Set
Text categorization using Rough SetText categorization using Rough Set
Text categorization using Rough SetSreekumar Biswas
 
Analysis of different similarity measures: Simrank
Analysis of different similarity measures: SimrankAnalysis of different similarity measures: Simrank
Analysis of different similarity measures: SimrankAbhishek Mungoli
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Miningsnoreen
 
Chromatic Sparse Learning
Chromatic Sparse LearningChromatic Sparse Learning
Chromatic Sparse LearningDatabricks
 
Papers We Love Kyiv, July 2018: A Conflict-Free Replicated JSON Datatype
Papers We Love Kyiv, July 2018: A Conflict-Free Replicated JSON DatatypePapers We Love Kyiv, July 2018: A Conflict-Free Replicated JSON Datatype
Papers We Love Kyiv, July 2018: A Conflict-Free Replicated JSON DatatypeMax Klymyshyn
 
Chengqi zhang graph processing and mining in the era of big data
Chengqi zhang graph processing and mining in the era of big dataChengqi zhang graph processing and mining in the era of big data
Chengqi zhang graph processing and mining in the era of big datajins0618
 

What's hot (19)

DATA STRUCTURE AND ALGORITHMS
DATA STRUCTURE AND ALGORITHMS DATA STRUCTURE AND ALGORITHMS
DATA STRUCTURE AND ALGORITHMS
 
Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)
 
Topic modeling using big data analytics
Topic modeling using big data analyticsTopic modeling using big data analytics
Topic modeling using big data analytics
 
(Hierarchical) Topic Modeling_Yueshen Xu
(Hierarchical) Topic Modeling_Yueshen Xu(Hierarchical) Topic Modeling_Yueshen Xu
(Hierarchical) Topic Modeling_Yueshen Xu
 
Starting work with R
Starting work with RStarting work with R
Starting work with R
 
1 introductory slides (1)
1 introductory slides (1)1 introductory slides (1)
1 introductory slides (1)
 
Lists
ListsLists
Lists
 
The Search of New Issues in the Detection of Near-duplicated Documents
The Search of New Issues in the Detection of Near-duplicated DocumentsThe Search of New Issues in the Detection of Near-duplicated Documents
The Search of New Issues in the Detection of Near-duplicated Documents
 
Adversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrievalAdversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrieval
 
Big Data Overview
Big Data OverviewBig Data Overview
Big Data Overview
 
Text categorization using Rough Set
Text categorization using Rough SetText categorization using Rough Set
Text categorization using Rough Set
 
Analysis of different similarity measures: Simrank
Analysis of different similarity measures: SimrankAnalysis of different similarity measures: Simrank
Analysis of different similarity measures: Simrank
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
 
Networkx tutorial
Networkx tutorialNetworkx tutorial
Networkx tutorial
 
Chromatic Sparse Learning
Chromatic Sparse LearningChromatic Sparse Learning
Chromatic Sparse Learning
 
Papers We Love Kyiv, July 2018: A Conflict-Free Replicated JSON Datatype
Papers We Love Kyiv, July 2018: A Conflict-Free Replicated JSON DatatypePapers We Love Kyiv, July 2018: A Conflict-Free Replicated JSON Datatype
Papers We Love Kyiv, July 2018: A Conflict-Free Replicated JSON Datatype
 
Cr25555560
Cr25555560Cr25555560
Cr25555560
 
Chengqi zhang graph processing and mining in the era of big data
Chengqi zhang graph processing and mining in the era of big dataChengqi zhang graph processing and mining in the era of big data
Chengqi zhang graph processing and mining in the era of big data
 
Predicting the relevance of search results for e-commerce systems
Predicting the relevance of search results for e-commerce systemsPredicting the relevance of search results for e-commerce systems
Predicting the relevance of search results for e-commerce systems
 

Introduction to Database Access Methods

  • 1. 2 December 2005 Introduction to Databases Access Methods Prof. Beat Signer Department of Computer Science Vrije Universiteit Brussel http://www.beatsigner.com
  • 2. Context of Today's Lecture (Beat Signer - Department of Computer Science - bsigner@vub.ac.be, April 28, 2017)
    [Figure: components of a DBMS. Labels: Programmers, Users, DB Admins; Application Programs, Queries, Database Schema; DML Preprocessor, Query Compiler, DDL Compiler; Program Object Code; Database Manager (Authorisation Control, Integrity Checker, Command Processor, Query Optimiser, Transaction Manager, Scheduler, Buffer Manager, Recovery Manager); Catalogue Manager; Data Manager; Access Methods; File Manager; System Buffers; Data, Indices and System Catalogue. Based on 'Components of a DBMS', Database Systems, T. Connolly and C. Begg, Addison-Wesley 2010]
  • 3. Basic Index Concepts
    - An index is used to efficiently access specific data
      - e.g. index in a textbook
    - Often database queries reference only a small number of records in a file
      - indices can be used to speed up the search for these records
      - most DBMSs automatically create an index for primary keys
    - A search key is an attribute (or set of attributes) that is used to look up records in a file
    - An index file consists of (searchKey, pointer) index entries
      - normally much smaller than the original file
      - pointer identifies a disk block and a record offset within the block
  • 4. Basic Index Concepts ...
    - There are two basic types of indices
      - ordered indices: based on a sorted ordering of search keys
      - hash indices: search keys are uniformly distributed to different buckets based on a hash function
    - Indexing techniques have to be evaluated based on
      - access types: e.g. records with a given attribute value or an attribute value in a specific range
      - access time
      - insertion and deletion time: if the underlying data is changed, all indices have to be updated, which may result in a significant overhead for modifications
      - space overhead
  • 5. Ordered Indices
    - An ordered index stores the search key values in sorted order
      - e.g. author catalogue in a library
    - A primary index (clustering index) is an index whose search key also defines the sequential order of the file
      - the primary key is often used as a search key of a primary index
      - files with a clustering index on the search key are called index-sequential files
    - A secondary index (non-clustering index) is an index whose search key has a different order than the sequential file order
  • 6. Dense Index
    - A dense index contains an entry for each search key in the file
    - An index record of a dense primary index points to the first file record with the given search key
    - An index record of a dense secondary index stores a list of pointers to the records with the same search key value

    dense index        index-sequential file
    de Rover    ->      2   de Rover   Pieter   Brussels
    Frei        ->      1   Frei       Urs      Zurich
                        8   Frei       Urs      Zurich
    Jones       ->     14   Jones      Andrew   Paris
    Meier       ->     53   Meier      Beat     Zurich
    Michelin    ->      5   Michelin   Robert   Paris
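The lookup through a dense primary index can be sketched as follows. This is only an illustration under stated assumptions: the file is modelled as an in-memory list sorted by surname, list positions stand in for (block, offset) pointers, and the names `norm`, `build_dense_index` and `dense_lookup` are hypothetical. Keys are compared case-insensitively so that "de Rover" sorts before "Frei", as in the example file.

```python
from bisect import bisect_left

# Index-sequential file from the slide: (id, surname, first name, city),
# stored in search key (surname) order.
FILE = [
    (2, "de Rover", "Pieter", "Brussels"),
    (1, "Frei", "Urs", "Zurich"),
    (8, "Frei", "Urs", "Zurich"),
    (14, "Jones", "Andrew", "Paris"),
    (53, "Meier", "Beat", "Zurich"),
    (5, "Michelin", "Robert", "Paris"),
]


def norm(name):
    """Assumed collation: case-insensitive comparison of surnames."""
    return name.casefold()


def build_dense_index(records):
    """One (key, position) entry per distinct key, pointing to the first record."""
    index = []
    for pos, rec in enumerate(records):
        key = norm(rec[1])
        if not index or index[-1][0] != key:
            index.append((key, pos))
    return index


def dense_lookup(index, records, name):
    """Binary search in the index, then read the run of records with that key."""
    key = norm(name)
    keys = [k for k, _ in index]
    i = bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        return []  # a dense index has an entry for every key, so the key is absent
    pos = index[i][1]
    result = []
    while pos < len(records) and norm(records[pos][1]) == key:
        result.append(records[pos])
        pos += 1
    return result


INDEX = build_dense_index(FILE)
print([rec[0] for rec in dense_lookup(INDEX, FILE, "Frei")])
```

Because the index is dense, a miss in the index is enough to conclude that no matching record exists; the file itself is only read for keys that are present.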
  • 7. Dense Index Insertion
      find the search key of the record to be inserted in the index
      if (index contains search key) {
        if (index stores pointers to all records) {
          add a pointer to the new record to the index entry
        } else { // index stores pointer to the first record only
          ensure that the record is inserted after the first record with the same search key
        }
      } else { // index does not contain the search key
        insert an index record with the search key at the appropriate position
      }
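A minimal Python sketch of the insertion steps, assuming the variant in which the index stores pointers to all records. The class name `DenseIndex` and the use of record ids as pointers are illustrative assumptions, not part of the lecture.

```python
from bisect import insort


class DenseIndex:
    """Dense index that keeps a pointer list per search key (assumed layout)."""

    def __init__(self):
        self.keys = []      # search keys in sorted order
        self.pointers = {}  # search key -> list of record ids

    def insert(self, key, rid):
        if key in self.pointers:
            # index already contains the search key: add one more pointer
            self.pointers[key].append(rid)
        else:
            # new search key: insert an index record at the appropriate position
            insort(self.keys, key)
            self.pointers[key] = [rid]


idx = DenseIndex()
for city, rid in [("Zurich", 1), ("Brussels", 2), ("Zurich", 8), ("Paris", 14)]:
    idx.insert(city, rid)
print(idx.keys, idx.pointers["Zurich"])
```

The variant that stores a pointer to the first record only needs no index change for an existing key, as long as the file places the new record after the first record with that key.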
  • 8. Dense Index Deletion
    - Update can be modelled as a delete followed by an insert

      look up the record to be deleted
      if (deleted record is the only one with the search key value) {
        delete index record
      } else { // there are other records with the same search key
        if (index stores pointers to all records) {
          delete the corresponding pointer from the index record
        } else { // index stores pointer to the first record only
          if (deleted record was the first record with the search key value) {
            update the index record to point to the next record
          }
        }
      }
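The deletion steps can be sketched in the same spirit, again assuming an index that stores pointers to all records. The function name `dense_delete` and the representation (a sorted key list plus a dictionary of record id lists) are hypothetical choices for this illustration.

```python
def dense_delete(keys, pointers, key, rid):
    """Remove the pointer to record `rid` from a dense index.

    keys:     sorted list of search keys
    pointers: dict mapping each search key to a list of record ids
    Returns True if an index change was made, False if the record was not indexed.
    """
    rids = pointers.get(key)
    if rids is None or rid not in rids:
        return False
    if len(rids) == 1:
        # deleted record is the only one with this search key: drop the index record
        del pointers[key]
        keys.remove(key)
    else:
        # other records share the search key: drop only the corresponding pointer
        rids.remove(rid)
    return True


keys = ["Brussels", "Paris", "Zurich"]
pointers = {"Brussels": [2], "Paris": [14, 5], "Zurich": [1, 8, 53]}
dense_delete(keys, pointers, "Zurich", 8)
dense_delete(keys, pointers, "Brussels", 2)
print(keys, pointers)
```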
  • 9. Sparse Index
    - Contains only some of the search key values
      - can only be used for primary indices!
    - To search for a given record via a sparse index, the following steps have to be performed
      1) find the index record r_i with the largest search key value that is smaller than or equal to the search key value to be found
      2) start a linear search at the record that r_i points to until the record is found

    sparse index       index-sequential file
    de Rover    ->      2   de Rover   Pieter   Brussels
                        1   Frei       Urs      Zurich
                        8   Frei       Urs      Zurich
    Jones       ->     14   Jones      Andrew   Paris
                       53   Meier      Beat     Zurich
    Michelin    ->      5   Michelin   Robert   Paris
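The two search steps can be sketched as follows, assuming an in-memory file sorted by surname with list positions as pointers. The names `norm` and `sparse_lookup` are hypothetical, and the case-insensitive comparison is an assumption made so that "de Rover" sorts first as in the example.

```python
from bisect import bisect_right

# File from the slide, in search key (surname) order.
FILE = [
    (2, "de Rover", "Pieter", "Brussels"),
    (1, "Frei", "Urs", "Zurich"),
    (8, "Frei", "Urs", "Zurich"),
    (14, "Jones", "Andrew", "Paris"),
    (53, "Meier", "Beat", "Zurich"),
    (5, "Michelin", "Robert", "Paris"),
]


def norm(name):
    """Assumed collation: case-insensitive comparison of surnames."""
    return name.casefold()


# Sparse index from the slide: only some keys, each with the position of its record.
SPARSE = [(norm("de Rover"), 0), (norm("Jones"), 3), (norm("Michelin"), 5)]


def sparse_lookup(index, records, name):
    key = norm(name)
    keys = [k for k, _ in index]
    # step 1: index record with the largest key that is <= the key to be found
    i = bisect_right(keys, key) - 1
    if i < 0:
        return None  # key is smaller than every indexed key
    # step 2: linear search in the file, starting where the index record points
    pos = index[i][1]
    while pos < len(records):
        current = norm(records[pos][1])
        if current == key:
            return records[pos]
        if current > key:
            return None  # passed the place where the key would have been
        pos += 1
    return None


print(sparse_lookup(SPARSE, FILE, "Meier"))
```

Stopping as soon as a larger key is seen relies on the file being sorted on the search key, which is why a sparse index only works as a primary index.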
  • 10. Sparse Index
    - A sparse index consumes less space and produces less maintenance overhead for insertions and deletions
    - On the other hand it takes more time to locate a record based on a sparse index
    - A good trade-off is a sparse index with one index record for each block
    [Figure: sparse index with one index record pointing to each of block 1 and block 2]
  • 11. Sparse Index Insertion
      // assumption: one index entry for each block
      if (new block is created) {
        insert first search key value in the new block into the index
      } else { // no new block
        if (new record has the smallest search key value in the block) {
          update the index record pointing to the block
        }
      }
  • 12. Sparse Index Deletion
      - Update can be modelled as a delete followed by an insert
      if (index contains the search key value of deleted record) {
        if (deleted record is the only one with the search key value) {
          if (search key value of the next record is not in the index) {
            update the index record to point to the next record
          } else { // search key value of next record is already in the index
            delete index entry
          }
        } else { // there are other records with the same search key
          if (deleted record was the first record with the search key value) {
            update the index entry to point to the next record with the same search key value
          }
        }
      }
  • 13. Multilevel Index
      - A multilevel index can be used if the index grows too large to be kept in memory
        - inner index forms the primary index file
        - outer index is a sparse index on the inner index
      [Figure: outer index pointing to blocks of the inner index, which in turn point to data block 1, data block 2, ...]
  • 14. Multilevel Index ...
      - If even the outer index grows too large to be kept in memory, the process can be repeated
        - note that instead of this type of multilevel index it is preferred to use B+-trees
      - The update of the multilevel index (on insertion and deletion) is an extension of the updates described for the single level indices
  • 15. Secondary Index
      - A secondary index has to be a dense index and must point to all the records with the same search key value
        - e.g. indirection via buckets that contain the pointers
      - The use of a secondary index normally requires more I/O operations than a primary index
      [Figure: secondary index on the city with the entries Brussels, Paris and Zurich, each pointing to a bucket of pointers into the file (2, de Rover, Pieter, Brussels), (1, Frei, Urs, Zurich), (8, Frei, Urs, Zurich), (14, Jones, Andrew, Paris), (53, Meier, Beat, Zurich), (5, Michelin, Robert, Paris)]
  • 16. Index-Sequential File Problems
      - Index-sequential files have a number of disadvantages
        - degradation and loss of performance when the file grows since many overflow blocks have to be created
        - the entire file has to be periodically reorganised
      - Alternative index structures that do not show a loss of performance after updates can be used
        - e.g. the B+-tree index file structure is a widely used multilevel index structure that is based on a B+-tree (balanced and sorted)
  • 17. B+-Tree
      - Properties of a balanced B+-tree
        - every path from the root to a leaf node has the same length
        - for a given n, each leaf node has between ⌈(n-1)/2⌉ and n-1 values
          - leaf nodes must always be at least half full
        - each non-leaf (and non-root) node has between ⌈n/2⌉ and n children (fanout)
        - typical B+-tree node: P1 K1 P2 ... Pm-1 Km-1 Pm
          - K1, ..., Km-1 are the ordered search key values (Ki < Ki+1)
          - a non-leaf node has pointers P1, ..., Pm to child nodes
          - a leaf node has pointers P1, ..., Pm to file records or to buckets of pointers
          - typically the size of a disk block
        - if the root node is not a leaf node it has at least 2 children; otherwise it has between 0 and n-1 values
  • 18. B+-Tree Leaf Nodes
      - Leaf node example (n=3)
        - for 1 ≤ i < n the pointers Pi point to a file record with the search key value Ki or to a bucket of pointers for search keys that are non-ordered non-candidate keys
        - the pointer Pn points to the next leaf node in search key order
      [Figure: leaf node with the search keys de Rover and Frei pointing into the index-sequential file (2, de Rover, Pieter, Brussels), (1, Frei, Urs, Zurich), (8, Frei, Urs, Zurich), ...; the last pointer leads to the next leaf node]
  • 19. B+-Tree Non-Leaf (Internal) Nodes
      - The non-leaf nodes form a multilevel sparse index on the leaf nodes
      - For a non-leaf node with m pointers (P1 K1 P2 ... Pm-1 Km-1 Pm) the following holds
        - the values of the search keys in the subtree that P1 points to are all smaller than K1
        - for 2 ≤ i < m the search key values in the subtree to which Pi points are greater than or equal to Ki-1 and smaller than Ki
        - the search keys in the subtree to which Pm points have values greater than or equal to Km-1
  • 20. B+-Tree Example
      - Example B+-tree for n=3
        - root node must have at least two children
        - non-leaf nodes must have between 2 and 3 children (for n=3)
        - leaf nodes must have between 1 and 2 values (for n=3)
        - note that the node size is normally defined by the block size and the fanout is typically in the range of 50-100 (e.g. n=100)
      [Figure: example B+-tree; node labels: Bush, Frei, Jones, Meier, Otlet, Price, Jones, Otlet, Meier]
  • 21. B+-Tree Properties
      - Logically close blocks do not have to be physically close since the inter-node connections are realised via pointers
      - B+-trees generally have a small height
        - if there are k search key values, the B+-tree height is not greater than ⌈log_⌈n/2⌉(k)⌉
          - e.g. for 5'000'000 search keys and n = 100 we get a height not greater than 4 (note that the height would be 23 for a binary tree)
        - search operations can be performed efficiently (without many block accesses)
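The height bound can be checked numerically. The short sketch below evaluates ⌈log_⌈n/2⌉(k)⌉ for the numbers on the slide; the function name is made up for the example.

```python
import math

def max_bplus_tree_height(k, n):
    """Upper bound ceil(log_{ceil(n/2)}(k)) on the height of a B+-tree
    with k search key values and at most n pointers per node."""
    return math.ceil(math.log(k, math.ceil(n / 2)))

height_bplus = max_bplus_tree_height(5_000_000, 100)
height_binary = math.ceil(math.log2(5_000_000))
```

With k = 5'000'000 and n = 100 the minimum fanout is 50, which gives the bound of 4, whereas a balanced binary tree over the same keys needs 23 levels.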
  • 22. B+-Tree Lookup
      - Find all records for a given search key v
      [Figure: example B+-tree; node labels: Bush, Frei, Jones, Meier, Otlet, Price, Jones, Otlet, Meier]
      c = root node
      while (c is not a leaf node) {
        let Ki = smallest search key value in c greater than v
        if (Ki exists) {
          c = node pointed to by Pi
  • 23. B+-Tree Lookup ...
        } else { // no search key greater than v
          c = node pointed to by Pm // where m is the number of pointers
        }
      }
      if (a key value Ki in c with the value v exists) {
        pointer Pi leads to the desired record or bucket
      } else {
        a record with the given search key value v does not exist
      }
      - The lookup requires O(log_n(k)) I/O operations in the worst case
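The lookup pseudocode translates almost line by line into Python. The following sketch is an in-memory illustration only: the `Node` class, the record ids and the exact shape of the example tree (reconstructed from the node labels of the n=3 example) are assumptions, and the pointer to the next leaf node is omitted.

```python
from dataclasses import dataclass

@dataclass
class Node:
    keys: list       # ordered search key values K1..Km-1
    pointers: list   # child nodes (non-leaf) or record ids (leaf)
    is_leaf: bool = False

def lookup(root, v):
    c = root
    while not c.is_leaf:
        # position of the smallest search key value in c greater than v
        i = next((j for j, k in enumerate(c.keys) if k > v), None)
        c = c.pointers[i] if i is not None else c.pointers[-1]
    for k, p in zip(c.keys, c.pointers):
        if k == v:
            return p  # pointer to the desired record or bucket
    return None       # no record with search key value v

# Assumed reconstruction of the example tree; record ids are made up
leaf1 = Node(["Bush", "Frei"], [10, 11], is_leaf=True)
leaf2 = Node(["Jones"], [12], is_leaf=True)
leaf3 = Node(["Meier"], [13], is_leaf=True)
leaf4 = Node(["Otlet", "Price"], [14, 15], is_leaf=True)
root = Node(["Meier"], [Node(["Jones"], [leaf1, leaf2]),
                        Node(["Otlet"], [leaf3, leaf4])])
```

Each iteration of the loop descends one level, so the number of visited nodes equals the height of the tree.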
  • 24. B+-Tree Insertion
      (1) Find the leaf node in which the search key value would appear using the lookup algorithm presented before and add the new record to the file
      (2) If the search key is already present in the leaf node
        - if necessary add a new pointer to the bucket
      (3) If the search key value is not yet present
        - if there is room in the leaf node then insert a new index record
        - if there is no room then split the node
          - take the n index records (including the one to be inserted) in sorted order and place the first ⌈n/2⌉ records in the original node and the rest in a new node
          - let the new node be p and the smallest value in p be k. Insert (p, k) in the parent node (if there is no space in the parent node, the split is propagated upwards)
  • 25. B+-Tree Insertion ...
      - In the worst case, the root node has to be split and the height of the tree is increased by 1
      - The insertion requires O(log_n(k)) I/O operations in the worst case
  • 26. B+-Tree Insertion Example
      - insertion of Codd
      [Figure: B+-tree before and after the insertion of Codd; the leaf node (Bush, Frei) is split into (Bush, Codd) and (Frei), and Frei is added to the parent node that already contains Jones]
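The leaf split from step (3) of the insertion algorithm can be sketched as a small helper. It only shows how the keys are divided and which key is passed to the parent; pointers, the parent update and the upward propagation are left out, and the function name is hypothetical.

```python
import math

def split_leaf(keys, new_key, n):
    """Split a full leaf (n-1 keys) that receives one more key.

    Returns (left_keys, right_keys, k) where k is the smallest key of the
    new right node and has to be inserted into the parent node."""
    all_keys = sorted(keys + [new_key])  # the n index records in sorted order
    cut = math.ceil(n / 2)               # the first ceil(n/2) stay in the original node
    left, right = all_keys[:cut], all_keys[cut:]
    return left, right, right[0]

left, right, separator = split_leaf(["Bush", "Frei"], "Codd", n=3)
```

For the insertion of Codd with n=3 this yields the nodes (Bush, Codd) and (Frei) with Frei as the key for the parent node.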
  • 27. B+-Tree Deletion
      (1) Find the record to be deleted by using the lookup algorithm and delete it from the file
      (2) Remove the search key and value from the leaf index node if there is no bucket or if the bucket is empty
      (3) If the node no longer has enough entries due to the deleted index record and the entries in the node and a sibling fit into a single node then merge the siblings
        - insert the search key values of the two nodes into the left node and delete the right node
        - recursively delete the entry for the deleted node from the parent node (propagation)
  • 28. B+-Tree Deletion ...
      (4) If the node no longer has enough entries due to the deleted index record but the entries in the node and a sibling do not fit into a single node then redistribute the pointers
        - redistribute the pointers between the node and a sibling such that both have more than the minimum number of entries
        - update the search key values in the parent node
      - If the root node has only one pointer after deletion, it is deleted and its child becomes the root
      - The deletion of a previously located record requires O(log_n(k)) I/O operations in the worst case
        - note that there are variants where leaf nodes are not merged
  • 29. B+-Tree Deletion Example
      - deletion of Frei
      [Figure: B+-tree containing Bush, Codd, Frei, Jones, Meier, Otlet and Price before and after the deletion of Frei]
  • 30. B+-Tree Deletion Example ...
      - deletion of Meier
      [Figure: B+-tree containing Bush, Codd, Jones, Meier, Otlet and Price before and after the deletion of Meier; afterwards the remaining values sit below a single node with the keys Jones and Otlet]
  • 31. B+-Tree Deletion Example ...
      - deletion of Meier
      [Figure: B+-tree containing Bush, Codd, Frei, Jones, Meier, Otlet and Price before and after the deletion of Meier]
  • 32. Features of B+-Tree Index Files
      - Advantages
        - automatic reorganisation with small local changes for insertions and deletions
        - no reorganisation of the file is required to maintain the performance (solves index-sequential file problem)
      - Disadvantages
        - insertion and deletion overhead
        - space overhead
      - The advantages of B+-trees outweigh the disadvantages which makes them an extensively used data structure
  • 33. B+-Tree File Organisation
      - We can use the same B+-tree structure to organise the records in a file and thereby solve the problem of data file degradation
      - In a B+-tree file organisation, the leaves of the tree store records instead of pointers to records
        - the number of records in the leaf nodes is normally smaller than the number of pointers in non-leaf nodes
        - leaf nodes still have to be at least half full
        - two siblings can be used in splits/merges for space optimisation
        - adjacent nodes in the tree may not be contiguous on the disk
  • 34. Indexing Strings
      - The use of strings as search keys introduces two major problems
        - the strings can be of variable length which leads to different fanouts
        - the split and merge operations should no longer be based on the number of search keys but on the fraction of the used space
      - The fanout of nodes can be increased by using a prefix compression technique
        - we no longer store the entire search key in non-leaf nodes but only the part that is necessary to distinguish between the key values in the subtrees
  • 35. B+-Trees vs. B-Trees
      - B-trees store every search key value only once (no redundancy)
        - additional pointers to the file records have to be added for non-leaf nodes
      - Advantages of B-trees
        - may use fewer nodes than the corresponding B+-trees
        - may find the search key before reaching the leaf node
      - Disadvantages of B-trees
        - non-leaf nodes get larger which results in deeper trees
        - insertion and deletion get more complicated
      - Normally the advantages of B-trees do not outweigh their disadvantages
  • 36. Multiple-Key Access
      - For queries that need fast search on the combination of multiple attributes we can define composite search keys
        - e.g. search key (name, city) for customers
      - The ordering of composite search keys is the lexicographic ordering
        - e.g. (a1, a2) < (b1, b2) if
          - a1 < b1, or
          - a1 = b1 and a2 < b2
      - An index with composite search keys can be used to efficiently answer queries of the form
        - WHERE name = 'Max Frisch' AND city = 'Zurich'
        - WHERE name = 'Max Frisch' AND city < 'Brussels'
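Python tuples are compared lexicographically, so a sorted list of (name, city, record id) tuples can stand in for an ordered index on the composite search key (name, city). The customer data and the helper names below are invented for the illustration; a DBMS would keep these entries in a B+-tree rather than in a list.

```python
import bisect

# Ordered index on (name, city); the third component is a made-up record id
index = sorted([
    ("Max Frisch", "Basel", 3),
    ("Max Frisch", "Zurich", 1),
    ("Anna Keller", "Zurich", 2),
    ("Max Frisch", "Aarau", 4),
])

def equals_query(name, city):
    """WHERE name = <name> AND city = <city>"""
    lo = bisect.bisect_left(index, (name, city))
    result = []
    for n, c, record_id in index[lo:]:
        if (n, c) != (name, city):
            break
        result.append(record_id)
    return result

def city_below_query(name, city):
    """WHERE name = <name> AND city < <city>"""
    lo = bisect.bisect_left(index, (name,))       # first entry for the name
    hi = bisect.bisect_left(index, (name, city))  # first entry with city >= <city>
    return [record_id for _, _, record_id in index[lo:hi]]
```

Both queries touch one contiguous range of the ordered index, which is why the composite index answers them efficiently.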
  • 37. Static Hashing
      - A bucket is a storage unit containing one or more records
        - typically a disk block
      - In a hash file organisation we get the bucket in a file by using a hash function on the search key value
      - A hash function h is a function from the set of search key values K to the set of buckets B
        - distribution should be uniform and random
      - The hash function can be used to locate, insert and delete records
      - A bucket may contain records with different search keys
        - bucket has to be sequentially searched
  • 38. Hash File Organisation Example
      - example of a hash function: binary representation of search key modulo 6
      [Figure: hash file with bucket 0 to bucket 5 holding the records (14, Jones, Andrew, Paris), (2, Rover, Pieter, Brussels), (53, Meier, Beat, Zurich), (1, Frei, Urs, Zurich), (8, Frei, Urs, Zurich), (5, Michelin, Robert, Paris)]
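A hash file organisation with the modulo 6 function can be sketched in a few lines. The sketch assumes that the numeric id in the first column is the search key and keeps the buckets as in-memory lists; the resulting bucket assignment follows from that assumption and is not read off the figure.

```python
NUM_BUCKETS = 6

def h(search_key):
    """Hash function from the slide: search key modulo 6."""
    return search_key % NUM_BUCKETS

records = [
    (14, "Jones", "Andrew", "Paris"),
    (2, "Rover", "Pieter", "Brussels"),
    (53, "Meier", "Beat", "Zurich"),
    (1, "Frei", "Urs", "Zurich"),
    (8, "Frei", "Urs", "Zurich"),
    (5, "Michelin", "Robert", "Paris"),
]

buckets = [[] for _ in range(NUM_BUCKETS)]
for record in records:
    buckets[h(record[0])].append(record)

def find(search_key):
    # a bucket may hold records with different search keys,
    # so it has to be searched sequentially
    return [r for r in buckets[h(search_key)] if r[0] == search_key]
```

Under this assumption the keys 14, 2 and 8 all hash to bucket 2, which illustrates how several search keys end up sharing a bucket.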
  • 39. Bucket Overflows
      - Buckets can overflow due to different reasons
        - insufficient number of buckets
        - a skew in the distribution of records
          - many records with the same search key
          - non-uniform hash function
      - The possibility for bucket overflows can be reduced but we cannot prevent overflows
        - bucket overflows are handled by a linked list of additional overflow buckets
        - this form of hashing with overflow buckets is called closed hashing
      - Open hashing does not allow overflow buckets
        - other existing buckets have to be "misused" (linear probing)
  • 40. Hash Index
      - Hashing can be used not only for file organisation but also for the creation of an index
      - A hash index manages the search keys with their record pointers in a hash file structure
  • 41. Hash Index Example
      [Figure: hash index with bucket 0 to bucket 3 and one overflow bucket holding the search keys 1, 53, 5, 14, 2 and 8, each with a pointer to its record in the file (2, de Rover, Pieter, Brussels), (1, Frei, Urs, Zurich), (8, Frei, Urs, Zurich), (14, Jones, Andrew, Paris), (53, Meier, Beat, Zurich), (5, Michelin, Robert, Paris)]
  • 42. Static Hashing Problems
      - The hash function maps the search key values to a fixed number of buckets B but the database will grow or shrink over time
        - if the file grows and the initial number of buckets is too small, the performance decreases since overflow buckets are necessary
        - if space is preallocated for future growth, some space will be wasted initially
      - A possible solution would be to reorganise the file with a new hash function from time to time
        - too expensive and has to be performed exclusively
      - An alternative is to dynamically modify the number of buckets (dynamic hashing)
  • 43. Ordered Indexing vs. Hashing
      - The database designer has to choose which form of index should be used based on different criteria
        - cost of periodic reorganisation of the index
        - frequency of update operations
        - optimised average or worst case access time
        - expected types of queries
          - hashing is generally better for retrieving records with a specific attribute value
            • constant average lookup time
            • worst case lookup time proportional to the number of attribute values!
          - ordered indices are preferred for range queries where the attribute value lies within a range of values (e.g. WHERE A > C1 AND A < C2)
      - In practice
        - most DBMSs support an ordered B+-tree index but not all of them support hash indices
  • 44. Bitmap Index
      - A bitmap index can be used to efficiently query on multiple keys/attributes
      - The following assumptions must hold
        - records in a relation must be numbered sequentially (e.g. starting from 0)
        - given a number i, it must be easy to find the ith record
          - easy if records have fixed size
      - Bitmaps are applicable on attributes with a relatively small number of distinct values
        - e.g. gender, city, ...
      - A bitmap is an array with as many bits as records
        - the ith bit is 1 if record i has the desired attribute value
  • 45. Bitmap Index ...
      - Queries can be answered by using bitmap operations
        - and (intersection), or (union) and not (complementation)
          - e.g. find all males in Zurich: 101101 AND 011010 = 001000
          - e.g. find females not living in Paris: 010010 AND 111010 = 010010
      - Retrieve records based on the resulting bit array
        - can also be used to count the number of records with a given condition
      Records (numbered from 0):
        0: 2 de Rover Pieter m Brussels
        1: 21 Brown Kathy f Zurich
        2: 8 Frei Urs m Zurich
        3: 14 Jones Andrew m Paris
        4: 9 Jones Lea f Zurich
        5: 5 Michelin Robert m Paris
      Bitmaps for gender: m = 101101, f = 010010
      Bitmaps for location: Brussels = 100000, Paris = 000101, Zurich = 011010
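The two example queries can be reproduced with Python integers used as bit arrays. The bitmaps are the ones from the slide; treating the leftmost bit as record 0 and the helper `record_numbers` are conventions chosen for this sketch.

```python
N = 6                  # number of records
MASK = (1 << N) - 1    # keeps only the N valid bits after a NOT

bitmaps = {
    "m": 0b101101, "f": 0b010010,
    "Brussels": 0b100000, "Paris": 0b000101, "Zurich": 0b011010,
}

def record_numbers(bits):
    """Positions (0 = leftmost bit) of the records whose bit is set."""
    return [i for i in range(N) if (bits >> (N - 1 - i)) & 1]

males_in_zurich = bitmaps["m"] & bitmaps["Zurich"]
females_not_in_paris = bitmaps["f"] & (~bitmaps["Paris"] & MASK)

# counting the matching records only needs the number of set bits
count_females_not_in_paris = bin(females_not_in_paris).count("1")
```

The mask is needed because Python integers are unbounded, so a plain `~` would also set bits beyond the six records.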
  • 46. Index Definition in SQL
      - The create index command can be used to create an index on a relation
        - different DBMSs support different types of indices (e.g. B+-tree, hash index, etc.)
      - Example
        CREATE INDEX nameIndex ON Customer USING BTREE (name);
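To try out an index definition without a database server, the following sketch uses Python's built-in sqlite3 module. Note that SQLite is a swapped-in system for this illustration: it does not accept the USING BTREE clause, so the clause is omitted here. The Customer schema and the sample rows are assumptions based on the example records of this lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [(2, "de Rover", "Brussels"), (1, "Frei", "Zurich"), (14, "Jones", "Paris")],
)
# index on the name attribute (no USING clause in SQLite)
conn.execute("CREATE INDEX nameIndex ON Customer (name)")

# ask the query planner how it would evaluate a lookup by name
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Customer WHERE name = ?", ("Frei",)
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

rows = conn.execute("SELECT id FROM Customer WHERE name = ?", ("Frei",)).fetchall()
```

The text in `plan_text` names the index that the planner intends to use, which is a quick way to check whether a query benefits from an index.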
  • 47. Summary
      - Different types of indices can be used to speed up the search of specific records in a file
      - The number and types of indices have to be designed carefully since each index adds additional maintenance costs on update operations
      - The structures that are used for the indices (e.g. B+-tree or hash index) can also be used to manage the records in a file and reduce data file degradation
  • 48. Homework
      - Study the following chapter of the Database System Concepts book
        - chapter 11 - sections 11.1-11.11 - Indexing and Hashing
  • 49. Exercise 9
      - Storage, Indexing and Hashing
  • 50. References
      - A. Silberschatz, H. Korth and S. Sudarshan, Database System Concepts (Sixth Edition), McGraw-Hill, 2010
  • 51. Next Lecture
      - Query Processing and Optimisation