SlideShare a Scribd company logo
1 of 204
Download to read offline
Analysis of Algorithms
B-Trees
Andres Mendez-Vazquez
November 3, 2015
1 / 111
Outline
1 Introduction
Motivation for B-Trees
2 Basic Definitions
B-Trees definition
Application for B-Trees
3 Height of a B-Tree
The Height Property
4 Operations
B-Tree operations
Search
Create
Insertion
Insertion Example
Deletion
Delete Example for t = 3
Reasons for using B-Trees
B+-Trees
5 Exercises
Some Exercises that you can try
2 / 111
Outline
1 Introduction
Motivation for B-Trees
2 Basic Definitions
B-Trees definition
Application for B-Trees
3 Height of a B-Tree
The Height Property
4 Operations
B-Tree operations
Search
Create
Insertion
Insertion Example
Deletion
Delete Example for t = 3
Reasons for using B-Trees
B+-Trees
5 Exercises
Some Exercises that you can try
3 / 111
Disk-based Environments
Something Notable
We have the following hierarchy of data access speed
1 CPU
2 Cache
3 Main Memory
4 Secondary Storage: Magnetic Disks and SSD
5 Tertiary Storage: Tapes
We know the following
Data is stored on disk in units called blocks or pages.
Every disk access has to read/write one or multiple blocks.
Even if we need to access a single integer stored in a disk block which
contains thousands of integers, we need to read the whole block in.
4 / 111
Disk-based Environments
Something Notable
We have the following hierarchy of data access speed
1 CPU
2 Cache
3 Main Memory
4 Secondary Storage: Magnetic Disks and SSD
5 Tertiary Storage: Tapes
We know the following
Data is stored on disk in units called blocks or pages.
Every disk access has to read/write one or multiple blocks.
Even if we need to access a single integer stored in a disk block which
contains thousands of integers, we need to read the whole block in.
4 / 111
Disk-based Environments
Something Notable
We have the following hierarchy of data access speed
1 CPU
2 Cache
3 Main Memory
4 Secondary Storage: Magnetic Disks and SSD
5 Tertiary Storage: Tapes
We know the following
Data is stored on disk in units called blocks or pages.
Every disk access has to read/write one or multiple blocks.
Even if we need to access a single integer stored in a disk block which
contains thousands of integers, we need to read the whole block in.
4 / 111
Disk-based Environments
Something Notable
We have the following hierarchy of data access speed
1 CPU
2 Cache
3 Main Memory
4 Secondary Storage: Magnetic Disks and SSD
5 Tertiary Storage: Tapes
We know the following
Data is stored on disk in units called blocks or pages.
Every disk access has to read/write one or multiple blocks.
Even if we need to access a single integer stored in a disk block which
contains thousands of integers, we need to read the whole block in.
4 / 111
Disk-based Environments
Something Notable
We have the following hierarchy of data access speed
1 CPU
2 Cache
3 Main Memory
4 Secondary Storage: Magnetic Disks and SSD
5 Tertiary Storage: Tapes
We know the following
Data is stored on disk in units called blocks or pages.
Every disk access has to read/write one or multiple blocks.
Even if we need to access a single integer stored in a disk block which
contains thousands of integers, we need to read the whole block in.
4 / 111
Disk-based Environments
Something Notable
We have the following hierarchy of data access speed
1 CPU
2 Cache
3 Main Memory
4 Secondary Storage: Magnetic Disks and SSD
5 Tertiary Storage: Tapes
We know the following
Data is stored on disk in units called blocks or pages.
Every disk access has to read/write one or multiple blocks.
Even if we need to access a single integer stored in a disk block which
contains thousands of integers, we need to read the whole block in.
4 / 111
Disk-based Environments
Something Notable
We have the following hierarchy of data access speed
1 CPU
2 Cache
3 Main Memory
4 Secondary Storage: Magnetic Disks and SSD
5 Tertiary Storage: Tapes
We know the following
Data is stored on disk in units called blocks or pages.
Every disk access has to read/write one or multiple blocks.
Even if we need to access a single integer stored in a disk block which
contains thousands of integers, we need to read the whole block in.
4 / 111
Disk-based Environments
Something Notable
We have the following hierarchy of data access speed
1 CPU
2 Cache
3 Main Memory
4 Secondary Storage: Magnetic Disks and SSD
5 Tertiary Storage: Tapes
We know the following
Data is stored on disk in units called blocks or pages.
Every disk access has to read/write one or multiple blocks.
Even if we need to access a single integer stored in a disk block which
contains thousands of integers, we need to read the whole block in.
4 / 111
Now, What if you use a binary tree
In this structure the nodes are disk blocks
Disk Block 1
Disk Block 2 Disk Block 3
Disk Block 4 Disk Block 5 Disk Block 6
Still, We have the following problem
If a disk block is 8K (8192 bytes)
Problem the necessary information for a node is
A key = 4 bytes
A value = 4 bytes
Two Children = 8 bytes
5 / 111
Now, What if you use a binary tree
In this structure the nodes are disk blocks
Disk Block 1
Disk Block 2 Disk Block 3
Disk Block 4 Disk Block 5 Disk Block 6
Still, We have the following problem
If a disk block is 8K (8192 bytes)
Problem the necessary information for a node is
A key = 4 bytes
A value = 4 bytes
Two Children = 8 bytes
5 / 111
Now, What if you use a binary tree
In this structure the nodes are disk blocks
Disk Block 1
Disk Block 2 Disk Block 3
Disk Block 4 Disk Block 5 Disk Block 6
Still, We have the following problem
If a disk block is 8K (8192 bytes)
Problem the necessary information for a node is
A key = 4 bytes
A value = 4 bytes
Two Children = 8 bytes
5 / 111
Problem!!!
Then
We use only 0.2% of the block is full
Even
If we store multiple tree nodes in a disk!!!
6 / 111
Problem!!!
Then
We use only 0.2% of the block is full
Even
If we store multiple tree nodes in a disk!!!
6 / 111
However
The query and update need to access O (log2 n) nodes
Block i
Block i+1
Block i+2
Block i+3
Worst Case O (log2 n) accesses to disk!!!
7 / 111
Increase the branching
With a large B
logB n log2n (1)
Ok
We can minimize the number of disk access by increasing the
branching!!!
We need a way to access elements in the new branching.
8 / 111
Increase the branching
With a large B
logB n log2n (1)
Ok
We can minimize the number of disk access by increasing the
branching!!!
We need a way to access elements in the new branching.
8 / 111
Motivation for B-Trees
Some facts!
Index structures for large datasets cannot be stored in main memory
(Actually, not anymore the case!!!).
Storing it on disk requires different approach to efficiency.
Assuming that a disk spins at 3600 RPM, one revolution occurs in
1/60 of a second, or 16.7 ms.
Crudely speaking, one disk access takes about the same time as
200,000 instructions!
9 / 111
Motivation for B-Trees
Some facts!
Index structures for large datasets cannot be stored in main memory
(Actually, not anymore the case!!!).
Storing it on disk requires different approach to efficiency.
Assuming that a disk spins at 3600 RPM, one revolution occurs in
1/60 of a second, or 16.7 ms.
Crudely speaking, one disk access takes about the same time as
200,000 instructions!
9 / 111
Motivation for B-Trees
Some facts!
Index structures for large datasets cannot be stored in main memory
(Actually, not anymore the case!!!).
Storing it on disk requires different approach to efficiency.
Assuming that a disk spins at 3600 RPM, one revolution occurs in
1/60 of a second, or 16.7 ms.
Crudely speaking, one disk access takes about the same time as
200,000 instructions!
9 / 111
Motivation for B-Trees
Some facts!
Index structures for large datasets cannot be stored in main memory
(Actually, not anymore the case!!!).
Storing it on disk requires different approach to efficiency.
Assuming that a disk spins at 3600 RPM, one revolution occurs in
1/60 of a second, or 16.7 ms.
Crudely speaking, one disk access takes about the same time as
200,000 instructions!
9 / 111
Motivation for B-Trees
Now
Assume that we use a binary tree to store about 20 million records.
We end up with a very deep binary tree with lots of different disk
accesses; log2 20 × 106 is about 24, so this takes about 0.2 seconds.
We know we can’t improve on the log2 n lower bound on search for a
binary tree.
However, the solution is to use more branches and thus reduce the
height of the tree! As branching increases, depth decreases.
10 / 111
Motivation for B-Trees
Now
Assume that we use a binary tree to store about 20 million records.
We end up with a very deep binary tree with lots of different disk
accesses; log2 20 × 106 is about 24, so this takes about 0.2 seconds.
We know we can’t improve on the log2 n lower bound on search for a
binary tree.
However, the solution is to use more branches and thus reduce the
height of the tree! As branching increases, depth decreases.
10 / 111
Motivation for B-Trees
Now
Assume that we use a binary tree to store about 20 million records.
We end up with a very deep binary tree with lots of different disk
accesses; log2 20 × 106 is about 24, so this takes about 0.2 seconds.
We know we can’t improve on the log2 n lower bound on search for a
binary tree.
However, the solution is to use more branches and thus reduce the
height of the tree! As branching increases, depth decreases.
10 / 111
Motivation for B-Trees
Now
Assume that we use a binary tree to store about 20 million records.
We end up with a very deep binary tree with lots of different disk
accesses; log2 20 × 106 is about 24, so this takes about 0.2 seconds.
We know we can’t improve on the log2 n lower bound on search for a
binary tree.
However, the solution is to use more branches and thus reduce the
height of the tree! As branching increases, depth decreases.
10 / 111
Outline
1 Introduction
Motivation for B-Trees
2 Basic Definitions
B-Trees definition
Application for B-Trees
3 Height of a B-Tree
The Height Property
4 Operations
B-Tree operations
Search
Create
Insertion
Insertion Example
Deletion
Delete Example for t = 3
Reasons for using B-Trees
B+-Trees
5 Exercises
Some Exercises that you can try
11 / 111
B-Trees definition
Example
34
15 20
5 10 17 19 21 25 30 35 40 48 55 61 75 95 105
45 60 90
Note: Leaves at the same level
Definitions
Every node x has the following attributes:
x.n number of keys stored at node x.
Each key has an associated payload (Pointer, values, etc).
The keys are sorted key1 ≤ key2 ≤ ... ≤ keyx.n .
x.leaf is a boolean value and denotes a leaf when is set to TRUE.
12 / 111
B-Trees definition
Example
34
15 20
5 10 17 19 21 25 30 35 40 48 55 61 75 95 105
45 60 90
Note: Leaves at the same level
Definitions
Every node x has the following attributes:
x.n number of keys stored at node x.
Each key has an associated payload (Pointer, values, etc).
The keys are sorted key1 ≤ key2 ≤ ... ≤ keyx.n .
x.leaf is a boolean value and denotes a leaf when is set to TRUE.
12 / 111
B-Trees definition
Example
34
15 20
5 10 17 19 21 25 30 35 40 48 55 61 75 95 105
45 60 90
Note: Leaves at the same level
Definitions
Every node x has the following attributes:
x.n number of keys stored at node x.
Each key has an associated payload (Pointer, values, etc).
The keys are sorted key1 ≤ key2 ≤ ... ≤ keyx.n .
x.leaf is a boolean value and denotes a leaf when is set to TRUE.
12 / 111
B-Trees definition
Example
34
15 20
5 10 17 19 21 25 30 35 40 48 55 61 75 95 105
45 60 90
Note: Leaves at the same level
Definitions
Every node x has the following attributes:
x.n number of keys stored at node x.
Each key has an associated payload (Pointer, values, etc).
The keys are sorted key1 ≤ key2 ≤ ... ≤ keyx.n .
x.leaf is a boolean value and denotes a leaf when is set to TRUE.
12 / 111
B-Trees definition
Example
34
15 20
5 10 17 19 21 25 30 35 40 48 55 61 75 95 105
45 60 90
Note: Leaves at the same level
13 / 111
B-Trees definition
In addition
Every node x has the following attributes:
It contains x.n + 1 pointers to its children:
x.c1, x.c2, ..., x.cn+1
Leaf nodes do not have children then they leave this field undefined.
The keys are used to separate the keys stored at the B-Tree. For
example, if ki is any key stored in the subtree stored at tree with root
x.ci then
k1 ≤ x.key1 ≤ k2 ≤ x.key2 ≤ ... ≤ x.keyn ≤ kx.n+1
14 / 111
B-Trees definition
In addition
Every node x has the following attributes:
It contains x.n + 1 pointers to its children:
x.c1, x.c2, ..., x.cn+1
Leaf nodes do not have children then they leave this field undefined.
The keys are used to separate the keys stored at the B-Tree. For
example, if ki is any key stored in the subtree stored at tree with root
x.ci then
k1 ≤ x.key1 ≤ k2 ≤ x.key2 ≤ ... ≤ x.keyn ≤ kx.n+1
14 / 111
B-Trees definition
In addition
Every node x has the following attributes:
It contains x.n + 1 pointers to its children:
x.c1, x.c2, ..., x.cn+1
Leaf nodes do not have children then they leave this field undefined.
The keys are used to separate the keys stored at the B-Tree. For
example, if ki is any key stored in the subtree stored at tree with root
x.ci then
k1 ≤ x.key1 ≤ k2 ≤ x.key2 ≤ ... ≤ x.keyn ≤ kx.n+1
14 / 111
B-Trees definition
In addition
Every node x has the following attributes:
It contains x.n + 1 pointers to its children:
x.c1, x.c2, ..., x.cn+1
Leaf nodes do not have children then they leave this field undefined.
The keys are used to separate the keys stored at the B-Tree. For
example, if ki is any key stored in the subtree stored at tree with root
x.ci then
k1 ≤ x.key1 ≤ k2 ≤ x.key2 ≤ ... ≤ x.keyn ≤ kx.n+1
14 / 111
B-Trees definition
Example
34
15 20
5 10 17 19 21 25 30 35 40 48 55 61 75 95 105
45 60 90
Note: Leaves at the same level
Minimum Degree
A fixed integer t ≥ 2 is called the minimum degree or branching of
the tree:
if x = root → t − 1 ≤ x.n ≤ 2t − 1
If x = root→ 1 ≤ x.n ≤ 2t − 1
15 / 111
B-Trees definition
Example
34
15 20
5 10 17 19 21 25 30 35 40 48 55 61 75 95 105
45 60 90
Note: Leaves at the same level
Minimum Degree
A fixed integer t ≥ 2 is called the minimum degree or branching of
the tree:
if x = root → t − 1 ≤ x.n ≤ 2t − 1
If x = root→ 1 ≤ x.n ≤ 2t − 1
15 / 111
Outline
1 Introduction
Motivation for B-Trees
2 Basic Definitions
B-Trees definition
Application for B-Trees
3 Height of a B-Tree
The Height Property
4 Operations
B-Tree operations
Search
Create
Insertion
Insertion Example
Deletion
Delete Example for t = 3
Reasons for using B-Trees
B+-Trees
5 Exercises
Some Exercises that you can try
16 / 111
We want to store large sets of indexes
First
We assume that the set is so voluminous that only a small part can be
kept in main memory!!!
Thus
We want to minimize the number of access to hard drive by using the
locality principle!!!
17 / 111
We want to store large sets of indexes
First
We assume that the set is so voluminous that only a small part can be
kept in main memory!!!
Thus
We want to minimize the number of access to hard drive by using the
locality principle!!!
17 / 111
Application: Minimizing disk access when looking for
indexes in databases
Each node is stored as a page
Page size determines t. Since t is usually large, this implies a large
branching factor, so height is small.
Example with t = 1001, we have 1000 (key, elements) per node
18 / 111
Application: Minimizing disk access when looking for
indexes in databases
Each node is stored as a page
Page size determines t. Since t is usually large, this implies a large
branching factor, so height is small.
Example with t = 1001, we have 1000 (key, elements) per node
1000
1000 1000 1000
1000 1000 1000
Branching 1001
18 / 111
Application: Minimizing disk access when looking for
indexes in databases
Example with (2t − 1) + 1 = 1001, we have 1000 (key, elements) per
node
1000
1000 1000 1000
1000 1000 1000
Branching 1001
19 / 111
Application: Minimizing disk access when looking for
indexes in databases
The example above
It can hold over one billion keys.
the height is only 2 (Assuming root at height 0), so we can find any
key with only two disk accesses (Compared to red-black trees, where
the branching factor is 2).
Then, disk accesses are minimal!!!
20 / 111
Application: Minimizing disk access when looking for
indexes in databases
The example above
It can hold over one billion keys.
the height is only 2 (Assuming root at height 0), so we can find any
key with only two disk accesses (Compared to red-black trees, where
the branching factor is 2).
Then, disk accesses are minimal!!!
20 / 111
Outline
1 Introduction
Motivation for B-Trees
2 Basic Definitions
B-Trees definition
Application for B-Trees
3 Height of a B-Tree
The Height Property
4 Operations
B-Tree operations
Search
Create
Insertion
Insertion Example
Deletion
Delete Example for t = 3
Reasons for using B-Trees
B+-Trees
5 Exercises
Some Exercises that you can try
21 / 111
Height of a B-Tree
Theorem 18.1
Let n be the number of keys in T, n ≥ 1, t ≥ 2, and h be the height of T.
Then h ≤ logt
n+1
2
Proof
The root of a B-tree T contains at least one key, and all other nodes
contain at least t − 1 keys.
Thus, T , whose height is h,
It has at least 2 nodes at depth 1.
At least 2t nodes at depth 2.
At least 2t2
nodes at depth 3.
Then, depth h has at least 2th−1
nodes.
22 / 111
Height of a B-Tree
Theorem 18.1
Let n be the number of keys in T, n ≥ 1, t ≥ 2, and h be the height of T.
Then h ≤ logt
n+1
2
Proof
The root of a B-tree T contains at least one key, and all other nodes
contain at least t − 1 keys.
Thus, T , whose height is h,
It has at least 2 nodes at depth 1.
At least 2t nodes at depth 2.
At least 2t2
nodes at depth 3.
Then, depth h has at least 2th−1
nodes.
22 / 111
Height of a B-Tree
Theorem 18.1
Let n be the number of keys in T, n ≥ 1, t ≥ 2, and h be the height of T.
Then h ≤ logt
n+1
2
Proof
The root of a B-tree T contains at least one key, and all other nodes
contain at least t − 1 keys.
Thus, T , whose height is h,
It has at least 2 nodes at depth 1.
At least 2t nodes at depth 2.
At least 2t2
nodes at depth 3.
Then, depth h has at least 2th−1
nodes.
22 / 111
Height of a B-Tree
Theorem 18.1
Let n be the number of keys in T, n ≥ 1, t ≥ 2, and h be the height of T.
Then h ≤ logt
n+1
2
Proof
The root of a B-tree T contains at least one key, and all other nodes
contain at least t − 1 keys.
Thus, T , whose height is h,
It has at least 2 nodes at depth 1.
At least 2t nodes at depth 2.
At least 2t2
nodes at depth 3.
Then, depth h has at least 2th−1
nodes.
22 / 111
Height of a B-Tree
Theorem 18.1
Let n be the number of keys in T, n ≥ 1, t ≥ 2, and h be the height of T.
Then h ≤ logt
n+1
2
Proof
The root of a B-tree T contains at least one key, and all other nodes
contain at least t − 1 keys.
Thus, T , whose height is h,
It has at least 2 nodes at depth 1.
At least 2t nodes at depth 2.
At least 2t2
nodes at depth 3.
Then, depth h has at least 2th−1
nodes.
22 / 111
For example
We have the following
Depth
Number
Of Nodes
23 / 111
Height of a B-Tree
We have at least
1 Depth 0 - One key
2 Depth 1 - 2t0(t − 1)
3 Depth 2 - 2t1(t − 1)
4 Depth 3 - 2t2(t − 1)
5 ...
Thus
n ≥ 1 + (t − 1)
h
i=1
2ti−1
(2)
24 / 111
Height of a B-Tree
We have at least
1 Depth 0 - One key
2 Depth 1 - 2t0(t − 1)
3 Depth 2 - 2t1(t − 1)
4 Depth 3 - 2t2(t − 1)
5 ...
Thus
n ≥ 1 + (t − 1)
h
i=1
2ti−1
(2)
24 / 111
Height of a B-Tree
We have at least
1 Depth 0 - One key
2 Depth 1 - 2t0(t − 1)
3 Depth 2 - 2t1(t − 1)
4 Depth 3 - 2t2(t − 1)
5 ...
Thus
n ≥ 1 + (t − 1)
h
i=1
2ti−1
(2)
24 / 111
Height of a B-Tree
We have at least
1 Depth 0 - One key
2 Depth 1 - 2t0(t − 1)
3 Depth 2 - 2t1(t − 1)
4 Depth 3 - 2t2(t − 1)
5 ...
Thus
n ≥ 1 + (t − 1)
h
i=1
2ti−1
(2)
24 / 111
Height of a B-Tree
We have at least
1 Depth 0 - One key
2 Depth 1 - 2t0(t − 1)
3 Depth 2 - 2t1(t − 1)
4 Depth 3 - 2t2(t − 1)
5 ...
Thus
n ≥ 1 + (t − 1)
h
i=1
2ti−1
(2)
24 / 111
Height of a B-Tree
Finally
n ≥ 1 + 2(t − 1)
th − 1
t − 1
= 2th
− 1 (3)
Therefore
th
≤
n + 1
2
(4)
25 / 111
Height of a B-Tree
Finally
n ≥ 1 + 2(t − 1)
th − 1
t − 1
= 2th
− 1 (3)
Therefore
th
≤
n + 1
2
(4)
25 / 111
Height of a B-Tree
Finally
h ≤ logt
n + 1
2
(5)
26 / 111
Outline
1 Introduction
Motivation for B-Trees
2 Basic Definitions
B-Trees definition
Application for B-Trees
3 Height of a B-Tree
The Height Property
4 Operations
B-Tree operations
Search
Create
Insertion
Insertion Example
Deletion
Delete Example for t = 3
Reasons for using B-Trees
B+-Trees
5 Exercises
Some Exercises that you can try
27 / 111
Constraints on the Operations
The root of the B-tree is always in main memory
1 Disk-Read are never performed on it.
2 Only When is written, we use a Disk-Write.
If a node is passed as parameter
It has already had all the necessary Disk-Read operations performed on it
before hand.
In the code that follows, we use:
Disk-Read: To move node from disk to memory.
Disk-Write: To move node from memory to disk.
28 / 111
Constraints on the Operations
The root of the B-tree is always in main memory
1 Disk-Read are never performed on it.
2 Only When is written, we use a Disk-Write.
If a node is passed as parameter
It has already had all the necessary Disk-Read operations performed on it
before hand.
In the code that follows, we use:
Disk-Read: To move node from disk to memory.
Disk-Write: To move node from memory to disk.
28 / 111
Constraints on the Operations
The root of the B-tree is always in main memory
1 Disk-Read are never performed on it.
2 Only When is written, we use a Disk-Write.
If a node is passed as parameter
It has already had all the necessary Disk-Read operations performed on it
before hand.
In the code that follows, we use:
Disk-Read: To move node from disk to memory.
Disk-Write: To move node from memory to disk.
28 / 111
Constraints on the Operations
The root of the B-tree is always in main memory
1 Disk-Read are never performed on it.
2 Only When is written, we use a Disk-Write.
If a node is passed as parameter
It has already had all the necessary Disk-Read operations performed on it
before hand.
In the code that follows, we use:
Disk-Read: To move node from disk to memory.
Disk-Write: To move node from memory to disk.
28 / 111
Constraints on the Operations
The root of the B-tree is always in main memory
1 Disk-Read are never performed on it.
2 Only When is written, we use a Disk-Write.
If a node is passed as parameter
It has already had all the necessary Disk-Read operations performed on it
before hand.
In the code that follows, we use:
Disk-Read: To move node from disk to memory.
Disk-Write: To move node from memory to disk.
28 / 111
Outline
1 Introduction
Motivation for B-Trees
2 Basic Definitions
B-Trees definition
Application for B-Trees
3 Height of a B-Tree
The Height Property
4 Operations
B-Tree operations
Search
Create
Insertion
Insertion Example
Deletion
Delete Example for t = 3
Reasons for using B-Trees
B+-Trees
5 Exercises
Some Exercises that you can try
29 / 111
Search operation
Pseudo-Code
B-Tree-Search(x, k)
1 i = 1
2 while i ≤ x.n and k > x.key [i]
3 i = i + 1
4 if i ≤ x.n and k == x.key [i]
5 return (x, i)
6 elseif x.leaf
7 return NIL
8 else Disk-Read(x.c [i])
9 return B-Tree-Search(x.c [i] , k)
30 / 111
Search operation
Pseudo-Code
B-Tree-Search(x, k)
1 i = 1
2 while i ≤ x.n and k > x.key [i]
3 i = i + 1
4 if i ≤ x.n and k == x.key [i]
5 return (x, i)
6 elseif x.leaf
7 return NIL
8 else Disk-Read(x.c [i])
9 return B-Tree-Search(x.c [i] , k)
30 / 111
Search operation
Pseudo-Code
B-Tree-Search(x, k)
1 i = 1
2 while i ≤ x.n and k > x.key [i]
3 i = i + 1
4 if i ≤ x.n and k == x.key [i]
5 return (x, i)
6 elseif x.leaf
7 return NIL
8 else Disk-Read(x.c [i])
9 return B-Tree-Search(x.c [i] , k)
30 / 111
Search operation
Pseudo-Code
B-Tree-Search(x, k)
1 i = 1
2 while i ≤ x.n and k > x.key [i]
3 i = i + 1
4 if i ≤ x.n and k == x.key [i]
5 return (x, i)
6 elseif x.leaf
7 return NIL
8 else Disk-Read(x.c [i])
9 return B-Tree-Search(x.c [i] , k)
30 / 111
Search operation
Pseudo-Code
B-Tree-Search(x, k)
1 i = 1
2 while i ≤ x.n and k > x.key [i]
3 i = i + 1
4 if i ≤ x.n and k == x.key [i]
5 return (x, i)
6 elseif x.leaf
7 return NIL
8 else Disk-Read(x.c [i])
9 return B-Tree-Search(x.c [i] , k)
30 / 111
Using recursion to make the search easier
So, we use line 1 to 5
1 Move to the key x.key [i] such that k ≤ x.key [i]
2 To return the value if stored at the node by the sorted keys!!!
If the node is a leaf
Return NIL == “That key is not in the B-Tree”
The key could be in the next level
Then, Disk-Read(x.c [i]) and call the recursion in the children node
already in memory.
31 / 111
Using recursion to make the search easier
So, we use line 1 to 5
1 Move to the key x.key [i] such that k ≤ x.key [i]
2 To return the value if stored at the node by the sorted keys!!!
If the node is a leaf
Return NIL == “That key is not in the B-Tree”
The key could be in the next level
Then, Disk-Read(x.c [i]) and call the recursion in the children node
already in memory.
31 / 111
Using recursion to make the search easier
So, we use line 1 to 5
1 Move to the key x.key [i] such that k ≤ x.key [i]
2 To return the value if stored at the node by the sorted keys!!!
If the node is a leaf
Return NIL == “That key is not in the B-Tree”
The key could be in the next level
Then, Disk-Read(x.c [i]) and call the recursion in the children node
already in memory.
31 / 111
Using recursion to make the search easier
So, we use line 1 to 5
1 Move to the key x.key [i] such that k ≤ x.key [i]
2 To return the value if stored at the node by the sorted keys!!!
If the node is a leaf
Return NIL == “That key is not in the B-Tree”
The key could be in the next level
Then, Disk-Read(x.c [i]) and call the recursion in the children node
already in memory.
31 / 111
Search operation
Note
Search(root[t], k) returns (x, i) or NIL if no such key.
32 / 111
Cost of Search
Worst Cost
O(h) = O (logt n) disk reads when going through the entire tree.
x.n < 2t ⇒ O(t) for searching the key at each node
Finally, we have that O(th) = O (t logt n) CPU time.
33 / 111
Cost of Search
Worst Cost
O(h) = O (logt n) disk reads when going through the entire tree.
x.n < 2t ⇒ O(t) for searching the key at each node
Finally, we have that O(th) = O (t logt n) CPU time.
33 / 111
Cost of Search
Worst Cost
O(h) = O (logt n) disk reads when going through the entire tree.
x.n < 2t ⇒ O(t) for searching the key at each node
Finally, we have that O(th) = O (t logt n) CPU time.
33 / 111
Outline
1 Introduction
Motivation for B-Trees
2 Basic Definitions
B-Trees definition
Application for B-Trees
3 Height of a B-Tree
The Height Property
4 Operations
B-Tree operations
Search
Create
Insertion
Insertion Example
Deletion
Delete Example for t = 3
Reasons for using B-Trees
B+-Trees
5 Exercises
Some Exercises that you can try
34 / 111
Creating an empty tree
Pseudo-Code
B-Tree-Create(T)
1 x =Allocate-Node()
2 x.leaf =TRUE
3 x.n = 0
4 Disk-Write(x)
5 T.root = x
Note
To create a nonempty tree, first create an empty tree and then insert
nodes.
35 / 111
Creating an empty tree
Pseudo-Code
B-Tree-Create(T)
1 x =Allocate-Node()
2 x.leaf =TRUE
3 x.n = 0
4 Disk-Write(x)
5 T.root = x
Note
To create a nonempty tree, first create an empty tree and then insert
nodes.
35 / 111
Cost of Create
Worst Cost
O(1) disk accesses.
O(1) CPU time.
36 / 111
Outline
1 Introduction
Motivation for B-Trees
2 Basic Definitions
B-Trees definition
Application for B-Trees
3 Height of a B-Tree
The Height Property
4 Operations
B-Tree operations
Search
Create
Insertion
Insertion Example
Deletion
Delete Example for t = 3
Reasons for using B-Trees
B+-Trees
5 Exercises
Some Exercises that you can try
37 / 111
Insertion
Something Notable
Here is where the things become interesting!!!
Insertions can only be done in non-full nodes.
The holding data structures for keys and pointers are arrays!!!
What?
This means that if a node has 2t − 1 keys, something needs to be done in
order to make space in the node.
Process
1 Split the node around the median key.
2 You finish with two nodes of size t − 1 and the median key y.
3 Promote the median key to the father node to identify the new
ranges.
4 If the father is full recursively split the father to make room.
38 / 111
Insertion
Something Notable
Here is where the things become interesting!!!
Insertions can only be done in non-full nodes.
The holding data structures for keys and pointers are arrays!!!
What?
This means that if a node has 2t − 1 keys, something needs to be done in
order to make space in the node.
Process
1 Split the node around the median key.
2 You finish with two nodes of size t − 1 and the median key y.
3 Promote the median key to the father node to identify the new
ranges.
4 If the father is full recursively split the father to make room.
38 / 111
Insertion
Something Notable
Here is where the things become interesting!!!
Insertions can only be done in non-full nodes.
The holding data structures for keys and pointers are arrays!!!
What?
This means that if a node has 2t − 1 keys, something needs to be done in
order to make space in the node.
Process
1 Split the node around the median key.
2 You finish with two nodes of size t − 1 and the median key y.
3 Promote the median key to the father node to identify the new
ranges.
4 If the father is full recursively split the father to make room.
38 / 111
Insertion
Something Notable
Here is where the things become interesting!!!
Insertions can only be done in non-full nodes.
The holding data structures for keys and pointers are arrays!!!
What?
This means that if a node has 2t − 1 keys, something needs to be done in
order to make space in the node.
Process
1 Split the node around the median key.
2 You finish with two nodes of size t − 1 and the median key y.
3 Promote the median key to the father node to identify the new
ranges.
4 If the father is full recursively split the father to make room.
38 / 111
Insertion
Something Notable
Here is where the things become interesting!!!
Insertions can only be done in non-full nodes.
The holding data structures for keys and pointers are arrays!!!
What?
This means that if a node has 2t − 1 keys, something needs to be done in
order to make space in the node.
Process
1 Split the node around the median key.
2 You finish with two nodes of size t − 1 and the median key y.
3 Promote the median key to the father node to identify the new
ranges.
4 If the father is full recursively split the father to make room.
38 / 111
Insertion
Something Notable
Here is where the things become interesting!!!
Insertions can only be done in non-full nodes.
The holding data structures for keys and pointers are arrays!!!
What?
This means that if a node has 2t − 1 keys, something needs to be done in
order to make space in the node.
Process
1 Split the node around the median key.
2 You finish with two nodes of size t − 1 and the median key y.
3 Promote the median key to the father node to identify the new
ranges.
4 If the father is full recursively split the father to make room.
38 / 111
Insertion
Something Notable
Here is where the things become interesting!!!
Insertions can only be done in non-full nodes.
The holding data structures for keys and pointers are arrays!!!
What?
This means that if a node has 2t − 1 keys, something needs to be done in
order to make space in the node.
Process
1 Split the node around the median key.
2 You finish with two nodes of size t − 1 and the median key y.
3 Promote the median key to the father node to identify the new
ranges.
4 If the father is full recursively split the father to make room.
38 / 111
Important!!!
We always insert at...
THE LEAF LEVEL!!!
Therefore
What if the leaf child becomes full?
39 / 111
Important!!!
We always insert at...
THE LEAF LEVEL!!!
Therefore
What if the leaf child becomes full?
39 / 111
Splitting
Splitting
Applied to a full child of a non-full parent when full≡ 2t − 1 keys.
Example with t = 4
40 / 111
Splitting
Splitting
Applied to a full child of a non-full parent when full≡ 2t − 1 keys.
Example with t = 4
20 24 2920 29
21 22 23 24 25 26 27 21 22 23 25 26 27
40 / 111
Split-Child
Algorithm
B-Tree-Split-Child(x, i)
1. z = Allocate-Node()
2. y = x.ci
3. z.leaf = y.leaf
4. z.n = t − 1
5. for j = 1 and t − 1
6. z.key [j] = y.key [j + t]
7. if not y.leaf
8. for j = 1 to t
9.
z.c [j] = y.c [j + t]
10. y.n = t − 1
11. for j = x.n + 1 downto
i + 1
12. x.c [j + 1] = x.c [j]
13. x.c [i + 1] = z
14. for j = x.n downto i
15.
x.key [j + 1] = x.key [j]
16. x.key [i] = y.key [t]
17. x.n = x.n + 1
18. Disk-Write(y)
19. Disk-Write(z)
20. Disk-Write(x)
41 / 111
Explanation
First
The code works as follow:
the element y has 2t children (2t − 1 keys) but is reduced to t children.
For this, the new node z takes the t largest children from y, and z
becomes a new child of x.
42 / 111
Explanation
First
The code works as follow:
the element y has 2t children (2t − 1 keys) but is reduced to t children.
For this, the new node z takes the t largest children from y, and z
becomes a new child of x.
42 / 111
Explanation
First
The code works as follow:
the element y has 2t children (2t − 1 keys) but is reduced to t children.
For this, the new node z takes the t largest children from y, and z
becomes a new child of x.
42 / 111
Detailed Explanation
First
Lines 1-4 creates node z
1. z = Allocate-Node()
2. y = x.ci
3. z.leaf = y.leaf
4. z.n = t − 1
Lines 1-4
21 22 23 24 24 26 27
20 2920 29
21 22 23 24 25 26 27
43 / 111
Detailed Explanation
First
Lines 5-6 copies the keys from position j + 1 in the y node to position j in
node z:
5. for j = 1 and t − 1
6. z.key [j] = y.key [j + t]
Lines 5-6
21 22 23 24 26 27 2521 22 23 24 24 26 27
20 29 20 29
44 / 111
Detailed Explanation
First
Lines 5-6 copies the keys from position j + 1 in the y node to position j in
node z:
5. for j = 1 and t − 1
6. z.key [j] = y.key [j + t]
Lines 5-6
21 22 23 24 25 26 2721 22 23 24 27 25 26
20 29 20 29
45 / 111
Detailed Explanation
Then
Lines 7-8 are used to copy the children if you are not a leaf
7. if not y.leaf
8. for j = 1 to t
9. z.c [j] = y.c [j + t]
Lines 5-6
21 22 23 24 25 26 2721 22 23 24 25 26 27
20 29 20 29
46 / 111
Detailed Explanation
Then
Lines 7-8 are used to copy the children if you are not a leaf
7. if not y.leaf
8. for j = 1 to t
9. z.c [j] = y.c [j + t]
Lines 5-6
21 22 23 24 25 26 2721 22 23 24 25 26 27
20 29 20 29
47 / 111
Detailed Explanation
Then
Lines 7-8 are used to copy the children if you are not a leaf
7. if not y.leaf
8. for j = 1 to t
9. z.c [j] = y.c [j + t]
21 22 23 24 25 26 27
20 29
48 / 111
Detailed Explanation
Then
Line 10 adjust the count for y.
10. y.n = t − 1
49 / 111
Detailed Explanation
Then
Line 11-13 make space to the pointer for the z node
11. for j = x.n + 1 downto i + 1
12. x.c [j + 1] = x.c [j]
13. x.c [i + 1] = z
21 22 23 24 25 26 27
From the Back to
the Front
20 29
50 / 111
Detailed Explanation
Then
Line 11-13 make space to the pointer for the z node
11. for j = x.n + 1 downto i + 1
12. x.c [j + 1] = x.c [j]
13. x.c [i + 1] = z
21 22 23 24 25 26 27
20 29
51 / 111
Detailed Explanation
Then
Line 14-15 make space to key from the z node to the node x
14. for j = x.n downto i
15. x.key [j + 1] = x.key [j]
21 22 23 24 25 26 27
20 29
52 / 111
Detailed Explanation
Then
Line 16-17 copy the key to the correct place and increase the counter of x
16. x.key [i] = y.key [t]
17. x.n = x.n + 1
21 22 23 25 26 27
20 24 29
53 / 111
Detailed Explanation
Then
Line 18-20 Write everything to the hard drive
18. Disk-Write(y)
19. Disk-Write(z)
20. Disk-Write(x)
54 / 111
Cost of Split-Child
Complexity
Θ(t) CPU time the for loop to go through the keys
O(1) disk writes.
55 / 111
Insert
Code
B-Tree-Insert(T, k)
1 r = T.root
2 if r.n == 2t − 1
3 s =Allocate-Node()
4 T.root = s
5 s.leaf =FALSE
6 s.n = 0
7 s.c [1] = r
8 B-Tree-Split-Childs(s, 1)
9 B-Tree-Insert-Nonfull(s, k)
10 else B-Tree-Insert-Nonfull(s, k)
56 / 111
Insert
Code
B-Tree-Insert(T, k)
1 r = T.root
2 if r.n == 2t − 1
3 s =Allocate-Node()
4 T.root = s
5 s.leaf =FALSE
6 s.n = 0
7 s.c [1] = r
8 B-Tree-Split-Childs(s, 1)
9 B-Tree-Insert-Nonfull(s, k)
10 else B-Tree-Insert-Nonfull(s, k)
56 / 111
Insert
Code
B-Tree-Insert(T, k)
1 r = T.root
2 if r.n == 2t − 1
3 s =Allocate-Node()
4 T.root = s
5 s.leaf =FALSE
6 s.n = 0
7 s.c [1] = r
8 B-Tree-Split-Childs(s, 1)
9 B-Tree-Insert-Nonfull(s, k)
10 else B-Tree-Insert-Nonfull(s, k)
56 / 111
Insert
Code
B-Tree-Insert(T, k)
1 r = T.root
2 if r.n == 2t − 1
3 s =Allocate-Node()
4 T.root = s
5 s.leaf =FALSE
6 s.n = 0
7 s.c [1] = r
8 B-Tree-Split-Childs(s, 1)
9 B-Tree-Insert-Nonfull(s, k)
10 else B-Tree-Insert-Nonfull(s, k)
56 / 111
Explanation
First
Insert using the root of T and the key k to be inserted.
Second
1 Use a a temporary variable r to look at the root
2 If r.n == 2t − 1 Then prepare to split by creating an alternate s
father node.
1 Then Split the node s using Split-Child
2 Insert using the Insert-Non full operation.
3 else Insert using the Insert-Non full operation.
57 / 111
Explanation
First
Insert using the root of T and the key k to be inserted.
Second
1 Use a a temporary variable r to look at the root
2 If r.n == 2t − 1 Then prepare to split by creating an alternate s
father node.
1 Then Split the node s using Split-Child
2 Insert using the Insert-Non full operation.
3 else Insert using the Insert-Non full operation.
57 / 111
Explanation
First
Insert using the root of T and the key k to be inserted.
Second
1 Use a a temporary variable r to look at the root
2 If r.n == 2t − 1 Then prepare to split by creating an alternate s
father node.
1 Then Split the node s using Split-Child
2 Insert using the Insert-Non full operation.
3 else Insert using the Insert-Non full operation.
57 / 111
Explanation
First
Insert using the root of T and the key k to be inserted.
Second
1 Use a a temporary variable r to look at the root
2 If r.n == 2t − 1 Then prepare to split by creating an alternate s
father node.
1 Then Split the node s using Split-Child
2 Insert using the Insert-Non full operation.
3 else Insert using the Insert-Non full operation.
57 / 111
Explanation
First
Insert using the root of T and the key k to be inserted.
Second
1 Use a a temporary variable r to look at the root
2 If r.n == 2t − 1 Then prepare to split by creating an alternate s
father node.
1 Then Split the node s using Split-Child
2 Insert using the Insert-Non full operation.
3 else Insert using the Insert-Non full operation.
57 / 111
Insert-Full
Note
First, modify tree (if necessary) to create room for new key. Then, call
Insert-Nonfull()
Example
58 / 111
Insert-Full
Note
First, modify tree (if necessary) to create room for new key. Then, call
Insert-Nonfull()
Example
24
21 22 23 24 25 26 27 21 22 23 25 26 27
58 / 111
Insert-Nonfull
Algorithm
B-Tree-Insert-Nonfull(x, k)
1. i = x.n
2. if x.leaf
3. while i ≥ 1 and
k < x.key [i]
4.
x.key [i + 1] = x.key [i]
5. i = i − 1
6. x.key [i + 1] = k
7. x.n = x.n + 1
8. Disk-Write(x)
9. else while i ≥ 1 and
k < x.key [i]
10. i = i − 1
11. i = i + 1
12. Disk-Read(x.c [i])
13. if x.c [i] .n == 2t − 1
14.
B-Tree-Split-Child(x, i)
15. if k > x.key [i]
16. i = i + 1
17. B-Tree-Insert-
Nonfull(x.c [i] , k)
59 / 111
Explanation
Line 1
it gets the rightmost key of the B-Tree
1. i = x.n
if x.leaf == TRUE
We make space on the key array because we have space for it.
3. while i ≥ 1 and k < x.key [i]
4. x.key [i + 1] = x.key [i]
5. i = i − 1
60 / 111
Explanation
Line 1
it gets the rightmost key of the B-Tree
1. i = x.n
if x.leaf == TRUE
We make space on the key array because we have space for it.
3. while i ≥ 1 and k < x.key [i]
4. x.key [i + 1] = x.key [i]
5. i = i − 1
60 / 111
Explanation
Insert the key with the payload at the correct position and increase
the counter of x
6. x.key [i + 1] = k
7. x.n = x.n + 1
Write everything to the disk
8. Disk-Write(x)
61 / 111
Explanation
Insert the key with the payload at the correct position and increase
the counter of x
6. x.key [i + 1] = k
7. x.n = x.n + 1
Write everything to the disk
8. Disk-Write(x)
61 / 111
Explanation
if x.leaf ! = TRUE
Get into the correct child and bring it from the hard drive
9. else while i ≥ 1 and k < x.key [i]
10. i = i − 1
11. i = i + 1
12. Disk-Read(x.c [i])
if the child x.c [i] is full split it
13. if x.c [i] .n == 2t − 1
14. B-Tree-Split-Child(x, i)
62 / 111
Explanation
if x.leaf ! = TRUE
Get into the correct child and bring it from the hard drive
9. else while i ≥ 1 and k < x.key [i]
10. i = i − 1
11. i = i + 1
12. Disk-Read(x.c [i])
if the child x.c [i] is full split it
13. if x.c [i] .n == 2t − 1
14. B-Tree-Split-Child(x, i)
62 / 111
Explanation
Now we need to decide
if k == x.key[i]
Then, we take the left child of x.key[i]
If not,
we take the right child of x.key[i]
15. if k > x.key [i]
16. i = i + 1
After that, we insert in a non-full element
17. B-Tree-Insert-Nonfull(x.c [i] , k)
63 / 111
Explanation
Now we need to decide
if k == x.key[i]
Then, we take the left child of x.key[i]
If not,
we take the right child of x.key[i]
15. if k > x.key [i]
16. i = i + 1
After that, we insert in a non-full element
17. B-Tree-Insert-Nonfull(x.c [i] , k)
63 / 111
Explanation
Now we need to decide
if k == x.key[i]
Then, we take the left child of x.key[i]
If not,
we take the right child of x.key[i]
15. if k > x.key [i]
16. i = i + 1
After that, we insert in a non-full element
17. B-Tree-Insert-Nonfull(x.c [i] , k)
63 / 111
Explanation
Now we need to decide
if k == x.key[i]
Then, we take the left child of x.key[i]
If not,
we take the right child of x.key[i]
15. if k > x.key [i]
16. i = i + 1
After that, we insert in a non-full element
17. B-Tree-Insert-Nonfull(x.c [i] , k)
63 / 111
Explanation
Now we need to decide
if k == x.key[i]
Then, we take the left child of x.key[i]
If not,
we take the right child of x.key[i]
15. if k > x.key [i]
16. i = i + 1
After that, we insert in a non-full element
17. B-Tree-Insert-Nonfull(x.c [i] , k)
63 / 111
Cost of Insertion
Worst case
Θ(logt n) disk writes.
Θ(t logt n) CPU time.
64 / 111
Example of Constructing a B-Tree by Insertion
Proceed as follows
Suppose we start with an empty B-Tree and keys arrive in the following
order:
1, 12, 8, 2, 25, 6 ,14, 28, 19, 20, 17, 7, 52, 16, 48, 60, 68, 3, 26, 29,
53, 55, 24, 23, 22, 11.
Something Notable
We want to build a B-Tree with at most 5 keys. Thus:
2t − 1 = 5
2t = 6
t = 3
65 / 111
Example of Constructing a B-Tree by Insertion
Proceed as follows
Suppose we start with an empty B-Tree and keys arrive in the following
order:
1, 12, 8, 2, 25, 6 ,14, 28, 19, 20, 17, 7, 52, 16, 48, 60, 68, 3, 26, 29,
53, 55, 24, 23, 22, 11.
Something Notable
We want to build a B-Tree with at most 5 keys. Thus:
2t − 1 = 5
2t = 6
t = 3
65 / 111
Example of Constructing a B-Tree by Insertion
Proceed as follows
Suppose we start with an empty B-Tree and keys arrive in the following
order:
1, 12, 8, 2, 25, 6 ,14, 28, 19, 20, 17, 7, 52, 16, 48, 60, 68, 3, 26, 29,
53, 55, 24, 23, 22, 11.
Something Notable
We want to build a B-Tree with at most 5 keys. Thus:
2t − 1 = 5
2t = 6
t = 3
65 / 111
Example of Constructing a B-Tree by Insertion
Proceed as follows
Suppose we start with an empty B-Tree and keys arrive in the following
order:
1, 12, 8, 2, 25, 6 ,14, 28, 19, 20, 17, 7, 52, 16, 48, 60, 68, 3, 26, 29,
53, 55, 24, 23, 22, 11.
Something Notable
We want to build a B-Tree with at most 5 keys. Thus:
2t − 1 = 5
2t = 6
t = 3
65 / 111
First
We insert the first 5 elements in the root node
1 2 8 12 25
66 / 111
Constructing a B-Tree
Then, we want to insert 6 and for this we split promoting 8
1 2 8 12 25
67 / 111
Constructing a B-Tree
Then, we want to insert 6 and for this we split promoting 8
8
1 2 12 25
68 / 111
Constructing a B-Tree
6, 14, 28, 19 get added to the leaf nodes
8
1 2 6 12 14 19 25 28
69 / 111
Constructing a B-Tree
Add 20, Split necessary by promoting 19
8 19
1 2 6 12 14 25 28
70 / 111
Constructing a B-Tree
Add 20 to the leaf node
8 19
1 2 6 12 14 20 25 28
71 / 111
Constructing a B-Tree
Add 17, 7, 52, 16, 48 to the leaf nodes
8 19
1 2 6 7 12 14 16 17 20 25 28 48 52
72 / 111
Constructing a B-Tree
Add 60 to a leaf node, it is necessary to split by promoting 28 to the
root
8 19 28
1 2 6 7 12 14 16 17 20 25 48 52
73 / 111
Constructing a B-Tree
Add 60
8 19 28
1 2 6 7 12 14 16 17 20 25 48 52
74 / 111
Constructing a B-Tree
Add 68, 3, 26, 27, 53 to the leaf nodes
8 19 28
1 2 6 7 12 14 16 17 20 25 48 52 60
75 / 111
Constructing a B-Tree
Add 68, 3, 26, 27, 53 to the leaf nodes
8 19 28
1 2 3 6 7 12 14 16 17 20 25 26 27 48 52 53 60 68
76 / 111
Constructing a B-Tree
Add 55 by splitting a leaf node and promoting 54
8 19 28 53
1 2 3 6 7 12 14 16 17 20 25 26 27 48 52 60 68
77 / 111
Constructing a B-Tree
Add 55 to the leaf
8 19 28 53
1 2 3 6 7 12 14 16 17 20 25 26 27 48 52 55 60 68
78 / 111
Constructing a B-Tree
Add 24 to the leaf
8 19 28 53
1 2 3 6 7 12 14 16 17 20 24 25 26 27 48 52 55 60 68
79 / 111
Constructing a B-Tree
Add 22 by splitting a leaf node and promoting 25
8 19 25 28 53
1 2 3 6 7 12 14 16 17 20 22 24 26 27 48 52 55 60 68
80 / 111
Constructing a B-Tree
Add 11 to the leaf node by adding a empty root node
8 19 25 28 53
1 2 3 6 7 12 14 16 17 20 22 24 26 27 48 52 55 60 68
81 / 111
Constructing a B-Tree
Split the old root by promoting 25
8 19
1 2 3 6 7 20 22 24 26 2712 14 16 17
25
28 53
48 52 55 60 68
82 / 111
Constructing a B-Tree
Add 11 to the leaf
8 19
1 2 3 6 7 20 22 24 26 2711 12 14 16 17
25
28 53
48 52 55 60 68
83 / 111
Outline
1 Introduction
Motivation for B-Trees
2 Basic Definitions
B-Trees definition
Application for B-Trees
3 Height of a B-Tree
The Height Property
4 Operations
B-Tree operations
Search
Create
Insertion
Insertion Example
Deletion
Delete Example for t = 3
Reasons for using B-Trees
B+-Trees
5 Exercises
Some Exercises that you can try
84 / 111
Deletion
Main idea
Recursively descend the tree.
Ensure
Ensure any non-root node x that is considered for deletion has at least t
keys.
Note that
May have to move a key down from parent.
85 / 111
Deletion
Main idea
Recursively descend the tree.
Ensure
Ensure any non-root node x that is considered for deletion has at least t
keys.
Note that
May have to move a key down from parent.
85 / 111
Deletion
Main idea
Recursively descend the tree.
Ensure
Ensure any non-root node x that is considered for deletion has at least t
keys.
Note that
May have to move a key down from parent.
85 / 111
Deletion Cases
Case 0: You delete the only key at the root ≈ Empty root
Then, you make root’s only child the new root:
Case 1: k in x and x.leaf == TRUE, then delete k from x.
86 / 111
Deletion Cases
Case 0: You delete the only key at the root ≈ Empty root
Then, you make root’s only child the new root:
Case 1: k in x and x.leaf == TRUE, then delete k from x.
leaf
86 / 111
Deletion Cases
Case 2: k in x, x internal
not a
leaf
87 / 111
Deletion Cases
Subcase A: y has at least t keys; find predecessor k of k in subtree
rooted at y, recursively delete k , replace k by k in x.
not a
leaf
Predecesor
of k
88 / 111
Deletion Cases
Subcase B: z has at least t keys; find successor k in subtree rooted
at z, recursively delete k , replace k by k in x.
not a
leaf
Succesor
of k
89 / 111
Deletion Cases
Subcase C: y and z both have t − 1 keys; merge k and z into y, free
z, recursively delete k from y.
not a
leaf
not a
leaf
90 / 111
Deletion cases
Case 3
If the key k is not present in internal node x, determine the root x.ci
of the appropriate subtree that must contain k, if k is in the tree at
all.
If x.ci has only t − 1 keys, execute step 3a or 3b as necessary to
guarantee that we descend to a node containing at least t keys.
Then finish by recursing on the appropriate child of x.
91 / 111
Deletion cases
Case 3
If the key k is not present in internal node x, determine the root x.ci
of the appropriate subtree that must contain k, if k is in the tree at
all.
If x.ci has only t − 1 keys, execute step 3a or 3b as necessary to
guarantee that we descend to a node containing at least t keys.
Then finish by recursing on the appropriate child of x.
91 / 111
Deletion cases
Case 3
If the key k is not present in internal node x, determine the root x.ci
of the appropriate subtree that must contain k, if k is in the tree at
all.
If x.ci has only t − 1 keys, execute step 3a or 3b as necessary to
guarantee that we descend to a node containing at least t keys.
Then finish by recursing on the appropriate child of x.
91 / 111
Case 3.A
Subcase A
If x.ci has only t − 1 keys but has an immediate sibling with at least t keys, give
x.ci an extra key by moving a key from x down into x.ci , moving a key from
x.ci’s immediate left or right sibling up into x, and moving the appropriate child
pointer from the sibling into x.ci.
not a
leaf
Recursively
Descend
92 / 111
Case 3.B
Subcase B
If x.ci and both of x.ci’s immediate siblings have t − 1 keys, merge x.ci
with one sibling, which involves moving a key from x down into the new
merged node to become the median key for that node.
not a
leaf
Recursively
Descend
93 / 111
Delete Example
Delete 14 - Case 3.B
8 19
1 2 3 6 7 20 22 24 48 52 55 60 6826 2711 12 14 16 17
25
28 53
94 / 111
Delete Example
Delete 14 - move 25 down from the root and join the children nodes
8 19 25 28 53
1 2 3 6 7 20 22 24 26 2711 12 14 16 17 48 52 55 60 68
95 / 111
Delete Example
Delete 14
8 19 25 28 53
1 2 3 6 7 20 22 24 26 2711 12 16 17 48 52 55 60 68
96 / 111
Delete Example
Case 0
8 19 25 28 53
1 2 3 6 7 20 22 24 26 2711 12 16 17 48 52 55 60 68
97 / 111
Delete Example
Delete 28 - Case 2.C
8 19 25 28 53
1 2 3 6 7 20 22 24 26 2711 12 16 17 48 52 55 60 68
98 / 111
Delete Example
Join the left and right children of 28 and move it down
8 19 25 53
1 2 3 6 7 20 22 24 26 27 28 48 5211 12 16 17 55 60 68
99 / 111
Delete Example
Recursively Delete 28
8 19 25 53
1 2 3 6 7 20 22 24 26 27 48 5211 12 16 17 55 60 68
100 / 111
Delete Example
Delete 25 - Case 2.B
8 19 25 53
1 2 3 6 7 20 22 24 26 27 48 5211 12 16 17 55 60 68
101 / 111
Delete Example
Move 26 to the position of 25
8 19 25 53
1 2 3 6 7 20 22 24 26 27 48 5211 12 16 17 55 60 68
102 / 111
Delete Example
Move 26 to the position of 25
8 19 26 53
1 2 3 6 7 20 22 24 27 48 5211 12 16 17 55 60 68
103 / 111
Outline
1 Introduction
Motivation for B-Trees
2 Basic Definitions
B-Trees definition
Application for B-Trees
3 Height of a B-Tree
The Height Property
4 Operations
B-Tree operations
Search
Create
Insertion
Insertion Example
Deletion
Delete Example for t = 3
Reasons for using B-Trees
B+-Trees
5 Exercises
Some Exercises that you can try
104 / 111
Reasons for using B-Trees
Justification
When searching tables held on disc, the cost of each disc transfer is high,
but does not depend much on the amount of data transferred, especially if
consecutive items are transferred.
Example
If we use a B-Tree of order 101, say, we can transfer each node in one
disc read operation.
A B-Tree of order 101 and height 3 can hold 1014 − 1 items
(approximately 100 million) and any item can be accessed with 3 disc
reads (assuming we hold the root in memory).
105 / 111
Reasons for using B-Trees
Justification
When searching tables held on disc, the cost of each disc transfer is high,
but does not depend much on the amount of data transferred, especially if
consecutive items are transferred.
Example
If we use a B-Tree of order 101, say, we can transfer each node in one
disc read operation.
A B-Tree of order 101 and height 3 can hold 1014 − 1 items
(approximately 100 million) and any item can be accessed with 3 disc
reads (assuming we hold the root in memory).
105 / 111
Reasons for using B-Trees
Justification
When searching tables held on disc, the cost of each disc transfer is high,
but does not depend much on the amount of data transferred, especially if
consecutive items are transferred.
Example
If we use a B-Tree of order 101, say, we can transfer each node in one
disc read operation.
A B-Tree of order 101 and height 3 can hold 1014 − 1 items
(approximately 100 million) and any item can be accessed with 3 disc
reads (assuming we hold the root in memory).
105 / 111
Comparing trees
Binary trees
They can become unbalanced and lose their good time complexity
(big O).
AVL trees are strict binary trees that overcome the balance problem.
Heaps remain balanced, but only prioritize (not order) the keys.
Multi-way trees
B-Trees can be m-way, they have any even number of children.
The 2-3 (or 3 way) approximates a permanently balanced binary tree.
106 / 111
Comparing trees
Binary trees
They can become unbalanced and lose their good time complexity
(big O).
AVL trees are strict binary trees that overcome the balance problem.
Heaps remain balanced, but only prioritize (not order) the keys.
Multi-way trees
B-Trees can be m-way, they have any even number of children.
The 2-3 (or 3 way) approximates a permanently balanced binary tree.
106 / 111
Comparing trees
Binary trees
They can become unbalanced and lose their good time complexity
(big O).
AVL trees are strict binary trees that overcome the balance problem.
Heaps remain balanced, but only prioritize (not order) the keys.
Multi-way trees
B-Trees can be m-way, they have any even number of children.
The 2-3 (or 3 way) approximates a permanently balanced binary tree.
106 / 111
Comparing trees
Binary trees
They can become unbalanced and lose their good time complexity
(big O).
AVL trees are strict binary trees that overcome the balance problem.
Heaps remain balanced, but only prioritize (not order) the keys.
Multi-way trees
B-Trees can be m-way, they have any even number of children.
The 2-3 (or 3 way) approximates a permanently balanced binary tree.
106 / 111
Comparing trees
Binary trees
They can become unbalanced and lose their good time complexity
(big O).
AVL trees are strict binary trees that overcome the balance problem.
Heaps remain balanced, but only prioritize (not order) the keys.
Multi-way trees
B-Trees can be m-way, they have any even number of children.
The 2-3 (or 3 way) approximates a permanently balanced binary tree.
106 / 111
Outline
1 Introduction
Motivation for B-Trees
2 Basic Definitions
B-Trees definition
Application for B-Trees
3 Height of a B-Tree
The Height Property
4 Operations
B-Tree operations
Search
Create
Insertion
Insertion Example
Deletion
Delete Example for t = 3
Reasons for using B-Trees
B+-Trees
5 Exercises
Some Exercises that you can try
107 / 111
Extending the B-Tree Structure: B+ Trees
B+ Tree
A B+ Tree is like a B-tree except that the interior and leaf nodes have a
different structure.
Actually
A B+ tree can be viewed as a B-tree in which each node contains only
keys and pointers to the children.
Finally
At leaves level you have the real data items(They could be pointers to
specific data).
Node
This allows to pack more information in each node.
108 / 111
Extending the B-Tree Structure: B+ Trees
B+ Tree
A B+ Tree is like a B-tree except that the interior and leaf nodes have a
different structure.
Actually
A B+ tree can be viewed as a B-tree in which each node contains only
keys and pointers to the children.
Finally
At leaves level you have the real data items(They could be pointers to
specific data).
Node
This allows to pack more information in each node.
108 / 111
Extending the B-Tree Structure: B+ Trees
B+ Tree
A B+ Tree is like a B-tree except that the interior and leaf nodes have a
different structure.
Actually
A B+ tree can be viewed as a B-tree in which each node contains only
keys and pointers to the children.
Finally
At leaves level you have the real data items(They could be pointers to
specific data).
Node
This allows to pack more information in each node.
108 / 111
Extending the B-Tree Structure: B+ Trees
B+ Tree
A B+ Tree is like a B-tree except that the interior and leaf nodes have a
different structure.
Actually
A B+ tree can be viewed as a B-tree in which each node contains only
keys and pointers to the children.
Finally
At leaves level you have the real data items(They could be pointers to
specific data).
Node
This allows to pack more information in each node.
108 / 111
In the paper
Something Notable
“Modularizing B+-Trees: Three-Level B+-Trees Work Fine” by Shigero
Sasaki and Takuya Araki from NEC
NEC
NEC Corporation (Nippon Denki Kabushiki Gaisha) is a Japanese
multinational provider of information technology (IT) services and
products, with its headquarters in Minato, Tokyo, Japan.NEC provides
information technology (IT) and network solutions to business enterprises,
communications services providers and to government agencies.
109 / 111
In the paper
Something Notable
“Modularizing B+-Trees: Three-Level B+-Trees Work Fine” by Shigero
Sasaki and Takuya Araki from NEC
NEC
NEC Corporation (Nippon Denki Kabushiki Gaisha) is a Japanese
multinational provider of information technology (IT) services and
products, with its headquarters in Minato, Tokyo, Japan.NEC provides
information technology (IT) and network solutions to business enterprises,
communications services providers and to government agencies.
109 / 111
Outline
1 Introduction
Motivation for B-Trees
2 Basic Definitions
B-Trees definition
Application for B-Trees
3 Height of a B-Tree
The Height Property
4 Operations
B-Tree operations
Search
Create
Insertion
Insertion Example
Deletion
Delete Example for t = 3
Reasons for using B-Trees
B+-Trees
5 Exercises
Some Exercises that you can try
110 / 111
Exercises
You can try the following ones
1 18.1-3
2 18.1-4
3 18.2-3
4 18.2-5
5 18.2-4
6 18.2-6
7 18.2-7
8 18.3-1
111 / 111

More Related Content

What's hot

Building a userspace filesystem in node.js
Building a userspace filesystem in node.jsBuilding a userspace filesystem in node.js
Building a userspace filesystem in node.jsClay Smith
 
Raid- Redundant Array of Inexpensive Disks
Raid- Redundant Array of Inexpensive DisksRaid- Redundant Array of Inexpensive Disks
Raid- Redundant Array of Inexpensive DisksMudit Mishra
 
Chapter06 Managing Disks And Data Storage
Chapter06      Managing  Disks And  Data  StorageChapter06      Managing  Disks And  Data  Storage
Chapter06 Managing Disks And Data StorageRaja Waseem Akhtar
 
Management file and directory in linux
Management file and directory in linuxManagement file and directory in linux
Management file and directory in linuxZkre Saleh
 
Mass Storage Structure
Mass Storage StructureMass Storage Structure
Mass Storage StructureVimalanathan D
 
HARD DISK PARTITIONING,FORMATING
HARD DISK PARTITIONING,FORMATINGHARD DISK PARTITIONING,FORMATING
HARD DISK PARTITIONING,FORMATINGchiju chinnu
 
Resilient file system
Resilient file systemResilient file system
Resilient file systemAyush Gupta
 
Asif Jamal disk (it)
Asif Jamal disk (it)Asif Jamal disk (it)
Asif Jamal disk (it)Asif Jamal
 
Hard disk_
 Hard disk_ Hard disk_
Hard disk_carabung
 
Disk management / hard drive partition management / create drive or partition...
Disk management / hard drive partition management / create drive or partition...Disk management / hard drive partition management / create drive or partition...
Disk management / hard drive partition management / create drive or partition...Ajay Panchal
 

What's hot (20)

Building a userspace filesystem in node.js
Building a userspace filesystem in node.jsBuilding a userspace filesystem in node.js
Building a userspace filesystem in node.js
 
Raid- Redundant Array of Inexpensive Disks
Raid- Redundant Array of Inexpensive DisksRaid- Redundant Array of Inexpensive Disks
Raid- Redundant Array of Inexpensive Disks
 
Chapter06 Managing Disks And Data Storage
Chapter06      Managing  Disks And  Data  StorageChapter06      Managing  Disks And  Data  Storage
Chapter06 Managing Disks And Data Storage
 
Hard disk drive
Hard disk driveHard disk drive
Hard disk drive
 
Hard disk drive
Hard disk driveHard disk drive
Hard disk drive
 
Management file and directory in linux
Management file and directory in linuxManagement file and directory in linux
Management file and directory in linux
 
Sistema Operativo
Sistema OperativoSistema Operativo
Sistema Operativo
 
Drobo range ppt v.1.6
Drobo range ppt v.1.6Drobo range ppt v.1.6
Drobo range ppt v.1.6
 
NTFS and Inode
NTFS and InodeNTFS and Inode
NTFS and Inode
 
Computer with terms
Computer with termsComputer with terms
Computer with terms
 
Mass Storage Structure
Mass Storage StructureMass Storage Structure
Mass Storage Structure
 
Ntfs forensics
Ntfs forensicsNtfs forensics
Ntfs forensics
 
HARD DISK PARTITIONING,FORMATING
HARD DISK PARTITIONING,FORMATINGHARD DISK PARTITIONING,FORMATING
HARD DISK PARTITIONING,FORMATING
 
Resilient file system
Resilient file systemResilient file system
Resilient file system
 
Hard disk
Hard diskHard disk
Hard disk
 
Asif Jamal disk (it)
Asif Jamal disk (it)Asif Jamal disk (it)
Asif Jamal disk (it)
 
FAT vs NTFS
FAT vs NTFSFAT vs NTFS
FAT vs NTFS
 
Hard disk_
 Hard disk_ Hard disk_
Hard disk_
 
Hard drive partitions
Hard drive partitionsHard drive partitions
Hard drive partitions
 
Disk management / hard drive partition management / create drive or partition...
Disk management / hard drive partition management / create drive or partition...Disk management / hard drive partition management / create drive or partition...
Disk management / hard drive partition management / create drive or partition...
 

Similar to 15 B-Trees

Raid : Redundant Array of Inexpensive Disks
Raid : Redundant Array of Inexpensive DisksRaid : Redundant Array of Inexpensive Disks
Raid : Redundant Array of Inexpensive DisksCloudbells.com
 
6 chapter 6 record storage and primary file organization
6 chapter 6  record storage and primary file organization6 chapter 6  record storage and primary file organization
6 chapter 6 record storage and primary file organizationsiragezeynu
 
Native erasure coding support inside hdfs presentation
Native erasure coding support inside hdfs presentationNative erasure coding support inside hdfs presentation
Native erasure coding support inside hdfs presentationlin bao
 
Storage System and Backup Media
Storage System and Backup MediaStorage System and Backup Media
Storage System and Backup MediaImranulHasan6
 
Chapter 4 record storage and primary file organization
Chapter 4 record storage and primary file organizationChapter 4 record storage and primary file organization
Chapter 4 record storage and primary file organizationJafar Nesargi
 
Chapter 4 record storage and primary file organization
Chapter 4 record storage and primary file organizationChapter 4 record storage and primary file organization
Chapter 4 record storage and primary file organizationJafar Nesargi
 
Internal representation of file chapter 4 Sowmya Jyothi
Internal representation of file chapter 4 Sowmya JyothiInternal representation of file chapter 4 Sowmya Jyothi
Internal representation of file chapter 4 Sowmya JyothiSowmya Jyothi
 
CS 542 Putting it all together -- Storage Management
CS 542 Putting it all together -- Storage ManagementCS 542 Putting it all together -- Storage Management
CS 542 Putting it all together -- Storage ManagementJ Singh
 
Record storage and primary file organization
Record storage and primary file organizationRecord storage and primary file organization
Record storage and primary file organizationJafar Nesargi
 
Introduction to Hadoop Distributed File System(HDFS).pptx
Introduction to Hadoop Distributed File System(HDFS).pptxIntroduction to Hadoop Distributed File System(HDFS).pptx
Introduction to Hadoop Distributed File System(HDFS).pptxSakthiVinoth78
 
ITBIS105 5
ITBIS105 5ITBIS105 5
ITBIS105 5Suad 00
 

Similar to 15 B-Trees (20)

Raid : Redundant Array of Inexpensive Disks
Raid : Redundant Array of Inexpensive DisksRaid : Redundant Array of Inexpensive Disks
Raid : Redundant Array of Inexpensive Disks
 
Storage memory
Storage memoryStorage memory
Storage memory
 
Magnetic disk - Krishna Geetha.ppt
Magnetic disk  - Krishna Geetha.pptMagnetic disk  - Krishna Geetha.ppt
Magnetic disk - Krishna Geetha.ppt
 
ch11
ch11ch11
ch11
 
6 chapter 6 record storage and primary file organization
6 chapter 6  record storage and primary file organization6 chapter 6  record storage and primary file organization
6 chapter 6 record storage and primary file organization
 
File Allocation Methods.ppt
File Allocation Methods.pptFile Allocation Methods.ppt
File Allocation Methods.ppt
 
Native erasure coding support inside hdfs presentation
Native erasure coding support inside hdfs presentationNative erasure coding support inside hdfs presentation
Native erasure coding support inside hdfs presentation
 
Storage System and Backup Media
Storage System and Backup MediaStorage System and Backup Media
Storage System and Backup Media
 
March 2011 HUG: HDFS Federation
March 2011 HUG: HDFS FederationMarch 2011 HUG: HDFS Federation
March 2011 HUG: HDFS Federation
 
Chapter 4 record storage and primary file organization
Chapter 4 record storage and primary file organizationChapter 4 record storage and primary file organization
Chapter 4 record storage and primary file organization
 
Chapter 4 record storage and primary file organization
Chapter 4 record storage and primary file organizationChapter 4 record storage and primary file organization
Chapter 4 record storage and primary file organization
 
UNIT III.pptx
UNIT III.pptxUNIT III.pptx
UNIT III.pptx
 
Internal representation of file chapter 4 Sowmya Jyothi
Internal representation of file chapter 4 Sowmya JyothiInternal representation of file chapter 4 Sowmya Jyothi
Internal representation of file chapter 4 Sowmya Jyothi
 
CS 542 Putting it all together -- Storage Management
CS 542 Putting it all together -- Storage ManagementCS 542 Putting it all together -- Storage Management
CS 542 Putting it all together -- Storage Management
 
Storage4922
Storage4922Storage4922
Storage4922
 
ZFS
ZFSZFS
ZFS
 
Record storage and primary file organization
Record storage and primary file organizationRecord storage and primary file organization
Record storage and primary file organization
 
Introduction to Hadoop Distributed File System(HDFS).pptx
Introduction to Hadoop Distributed File System(HDFS).pptxIntroduction to Hadoop Distributed File System(HDFS).pptx
Introduction to Hadoop Distributed File System(HDFS).pptx
 
RAID.ppt
RAID.pptRAID.ppt
RAID.ppt
 
ITBIS105 5
ITBIS105 5ITBIS105 5
ITBIS105 5
 

More from Andres Mendez-Vazquez

01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectorsAndres Mendez-Vazquez
 
01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issuesAndres Mendez-Vazquez
 
05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiationAndres Mendez-Vazquez
 
01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep LearningAndres Mendez-Vazquez
 
25 introduction reinforcement_learning
25 introduction reinforcement_learning25 introduction reinforcement_learning
25 introduction reinforcement_learningAndres Mendez-Vazquez
 
Neural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusNeural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusAndres Mendez-Vazquez
 
Introduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusIntroduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusAndres Mendez-Vazquez
 
Ideas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesIdeas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesAndres Mendez-Vazquez
 
20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variationsAndres Mendez-Vazquez
 

More from Andres Mendez-Vazquez (20)

2.03 bayesian estimation
2.03 bayesian estimation2.03 bayesian estimation
2.03 bayesian estimation
 
05 linear transformations
05 linear transformations05 linear transformations
05 linear transformations
 
01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors
 
01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues
 
01.02 linear equations
01.02 linear equations01.02 linear equations
01.02 linear equations
 
01.01 vector spaces
01.01 vector spaces01.01 vector spaces
01.01 vector spaces
 
06 recurrent neural_networks
06 recurrent neural_networks06 recurrent neural_networks
06 recurrent neural_networks
 
05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation
 
Zetta global
Zetta globalZetta global
Zetta global
 
01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning
 
25 introduction reinforcement_learning
25 introduction reinforcement_learning25 introduction reinforcement_learning
25 introduction reinforcement_learning
 
Neural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusNeural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning Syllabus
 
Introduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusIntroduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabus
 
Ideas 09 22_2018
Ideas 09 22_2018Ideas 09 22_2018
Ideas 09 22_2018
 
Ideas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesIdeas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data Sciences
 
Analysis of Algorithms Syllabus
Analysis of Algorithms  SyllabusAnalysis of Algorithms  Syllabus
Analysis of Algorithms Syllabus
 
20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations
 
18.1 combining models
18.1 combining models18.1 combining models
18.1 combining models
 
17 vapnik chervonenkis dimension
17 vapnik chervonenkis dimension17 vapnik chervonenkis dimension
17 vapnik chervonenkis dimension
 
A basic introduction to learning
A basic introduction to learningA basic introduction to learning
A basic introduction to learning
 

Recently uploaded

Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2RajaP95
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZTE
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 

Recently uploaded (20)

🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 

15 B-Trees

  • 1. Analysis of Algorithms B-Trees Andres Mendez-Vazquez November 3, 2015 1 / 111
  • 2. Outline 1 Introduction Motivation for B-Trees 2 Basic Definitions B-Trees definition Application for B-Trees 3 Height of a B-Tree The Height Property 4 Operations B-Tree operations Search Create Insertion Insertion Example Deletion Delete Example for t = 3 Reasons for using B-Trees B+-Trees 5 Exercises Some Exercises that you can try 2 / 111
  • 3. Outline 1 Introduction Motivation for B-Trees 2 Basic Definitions B-Trees definition Application for B-Trees 3 Height of a B-Tree The Height Property 4 Operations B-Tree operations Search Create Insertion Insertion Example Deletion Delete Example for t = 3 Reasons for using B-Trees B+-Trees 5 Exercises Some Exercises that you can try 3 / 111
  • 4. Disk-based Environments Something Notable We have the following hierarchy of data access speed 1 CPU 2 Cache 3 Main Memory 4 Secondary Storage: Magnetic Disks and SSD 5 Tertiary Storage: Tapes We know the following Data is stored on disk in units called blocks or pages. Every disk access has to read/write one or multiple blocks. Even if we need to access a single integer stored in a disk block which contains thousands of integers, we need to read the whole block in. 4 / 111
  • 5. Disk-based Environments Something Notable We have the following hierarchy of data access speed 1 CPU 2 Cache 3 Main Memory 4 Secondary Storage: Magnetic Disks and SSD 5 Tertiary Storage: Tapes We know the following Data is stored on disk in units called blocks or pages. Every disk access has to read/write one or multiple blocks. Even if we need to access a single integer stored in a disk block which contains thousands of integers, we need to read the whole block in. 4 / 111
  • 6. Disk-based Environments Something Notable We have the following hierarchy of data access speed 1 CPU 2 Cache 3 Main Memory 4 Secondary Storage: Magnetic Disks and SSD 5 Tertiary Storage: Tapes We know the following Data is stored on disk in units called blocks or pages. Every disk access has to read/write one or multiple blocks. Even if we need to access a single integer stored in a disk block which contains thousands of integers, we need to read the whole block in. 4 / 111
  • 7. Disk-based Environments Something Notable We have the following hierarchy of data access speed 1 CPU 2 Cache 3 Main Memory 4 Secondary Storage: Magnetic Disks and SSD 5 Tertiary Storage: Tapes We know the following Data is stored on disk in units called blocks or pages. Every disk access has to read/write one or multiple blocks. Even if we need to access a single integer stored in a disk block which contains thousands of integers, we need to read the whole block in. 4 / 111
  • 8. Disk-based Environments Something Notable We have the following hierarchy of data access speed 1 CPU 2 Cache 3 Main Memory 4 Secondary Storage: Magnetic Disks and SSD 5 Tertiary Storage: Tapes We know the following Data is stored on disk in units called blocks or pages. Every disk access has to read/write one or multiple blocks. Even if we need to access a single integer stored in a disk block which contains thousands of integers, we need to read the whole block in. 4 / 111
  • 9. Disk-based Environments Something Notable We have the following hierarchy of data access speed 1 CPU 2 Cache 3 Main Memory 4 Secondary Storage: Magnetic Disks and SSD 5 Tertiary Storage: Tapes We know the following Data is stored on disk in units called blocks or pages. Every disk access has to read/write one or multiple blocks. Even if we need to access a single integer stored in a disk block which contains thousands of integers, we need to read the whole block in. 4 / 111
  • 10. Disk-based Environments Something Notable We have the following hierarchy of data access speed 1 CPU 2 Cache 3 Main Memory 4 Secondary Storage: Magnetic Disks and SSD 5 Tertiary Storage: Tapes We know the following Data is stored on disk in units called blocks or pages. Every disk access has to read/write one or multiple blocks. Even if we need to access a single integer stored in a disk block which contains thousands of integers, we need to read the whole block in. 4 / 111
  • 11. Disk-based Environments Something Notable We have the following hierarchy of data access speed 1 CPU 2 Cache 3 Main Memory 4 Secondary Storage: Magnetic Disks and SSD 5 Tertiary Storage: Tapes We know the following Data is stored on disk in units called blocks or pages. Every disk access has to read/write one or multiple blocks. Even if we need to access a single integer stored in a disk block which contains thousands of integers, we need to read the whole block in. 4 / 111
  • 12. Now, What if you use a binary tree In this structure the nodes are disk blocks Disk Block 1 Disk Block 2 Disk Block 3 Disk Block 4 Disk Block 5 Disk Block 6 Still, We have the following problem If a disk block is 8K (8192 bytes) Problem the necessary information for a node is A key = 4 bytes A value = 4 bytes Two Children = 8 bytes 5 / 111
  • 13. Now, What if you use a binary tree In this structure the nodes are disk blocks Disk Block 1 Disk Block 2 Disk Block 3 Disk Block 4 Disk Block 5 Disk Block 6 Still, We have the following problem If a disk block is 8K (8192 bytes) Problem the necessary information for a node is A key = 4 bytes A value = 4 bytes Two Children = 8 bytes 5 / 111
  • 14. Now, What if you use a binary tree In this structure the nodes are disk blocks Disk Block 1 Disk Block 2 Disk Block 3 Disk Block 4 Disk Block 5 Disk Block 6 Still, We have the following problem If a disk block is 8K (8192 bytes) Problem the necessary information for a node is A key = 4 bytes A value = 4 bytes Two Children = 8 bytes 5 / 111
  • 15. Problem!!! Then We use only 0.2% of the block is full Even If we store multiple tree nodes in a disk!!! 6 / 111
  • 16. Problem!!! Then We use only 0.2% of the block is full Even If we store multiple tree nodes in a disk!!! 6 / 111
  • 17. However The query and update need to access O (log2 n) nodes Block i Block i+1 Block i+2 Block i+3 Worst Case O (log2 n) accesses to disk!!! 7 / 111
  • 18. Increase the branching With a large B logB n log2n (1) Ok We can minimize the number of disk access by increasing the branching!!! We need a way to access elements in the new branching. 8 / 111
  • 19. Increase the branching With a large B logB n log2n (1) Ok We can minimize the number of disk access by increasing the branching!!! We need a way to access elements in the new branching. 8 / 111
  • 20. Motivation for B-Trees Some facts! Index structures for large datasets cannot be stored in main memory (Actually, not anymore the case!!!). Storing it on disk requires different approach to efficiency. Assuming that a disk spins at 3600 RPM, one revolution occurs in 1/60 of a second, or 16.7 ms. Crudely speaking, one disk access takes about the same time as 200,000 instructions! 9 / 111
  • 21. Motivation for B-Trees Some facts! Index structures for large datasets cannot be stored in main memory (Actually, not anymore the case!!!). Storing it on disk requires different approach to efficiency. Assuming that a disk spins at 3600 RPM, one revolution occurs in 1/60 of a second, or 16.7 ms. Crudely speaking, one disk access takes about the same time as 200,000 instructions! 9 / 111
  • 22. Motivation for B-Trees Some facts! Index structures for large datasets cannot be stored in main memory (Actually, not anymore the case!!!). Storing it on disk requires different approach to efficiency. Assuming that a disk spins at 3600 RPM, one revolution occurs in 1/60 of a second, or 16.7 ms. Crudely speaking, one disk access takes about the same time as 200,000 instructions! 9 / 111
  • 23. Motivation for B-Trees Some facts! Index structures for large datasets cannot be stored in main memory (Actually, not anymore the case!!!). Storing it on disk requires different approach to efficiency. Assuming that a disk spins at 3600 RPM, one revolution occurs in 1/60 of a second, or 16.7 ms. Crudely speaking, one disk access takes about the same time as 200,000 instructions! 9 / 111
  • 24. Motivation for B-Trees Now Assume that we use a binary tree to store about 20 million records. We end up with a very deep binary tree with lots of different disk accesses; log2 20 × 106 is about 24, so this takes about 0.2 seconds. We know we can’t improve on the log2 n lower bound on search for a binary tree. However, the solution is to use more branches and thus reduce the height of the tree! As branching increases, depth decreases. 10 / 111
  • 25. Motivation for B-Trees Now Assume that we use a binary tree to store about 20 million records. We end up with a very deep binary tree with lots of different disk accesses; log2 20 × 106 is about 24, so this takes about 0.2 seconds. We know we can’t improve on the log2 n lower bound on search for a binary tree. However, the solution is to use more branches and thus reduce the height of the tree! As branching increases, depth decreases. 10 / 111
  • 26. Motivation for B-Trees Now Assume that we use a binary tree to store about 20 million records. We end up with a very deep binary tree with lots of different disk accesses; log2 20 × 106 is about 24, so this takes about 0.2 seconds. We know we can’t improve on the log2 n lower bound on search for a binary tree. However, the solution is to use more branches and thus reduce the height of the tree! As branching increases, depth decreases. 10 / 111
  • 27. Motivation for B-Trees Now Assume that we use a binary tree to store about 20 million records. We end up with a very deep binary tree with lots of different disk accesses; log2 20 × 106 is about 24, so this takes about 0.2 seconds. We know we can’t improve on the log2 n lower bound on search for a binary tree. However, the solution is to use more branches and thus reduce the height of the tree! As branching increases, depth decreases. 10 / 111
  • 28. Outline 1 Introduction Motivation for B-Trees 2 Basic Definitions B-Trees definition Application for B-Trees 3 Height of a B-Tree The Height Property 4 Operations B-Tree operations Search Create Insertion Insertion Example Deletion Delete Example for t = 3 Reasons for using B-Trees B+-Trees 5 Exercises Some Exercises that you can try 11 / 111
  • 29. B-Trees definition Example 34 15 20 5 10 17 19 21 25 30 35 40 48 55 61 75 95 105 45 60 90 Note: Leaves at the same level Definitions Every node x has the following attributes: x.n number of keys stored at node x. Each key has an associated payload (Pointer, values, etc). The keys are sorted key1 ≤ key2 ≤ ... ≤ keyx.n . x.leaf is a boolean value and denotes a leaf when is set to TRUE. 12 / 111
  • 30. B-Trees definition Example 34 15 20 5 10 17 19 21 25 30 35 40 48 55 61 75 95 105 45 60 90 Note: Leaves at the same level Definitions Every node x has the following attributes: x.n number of keys stored at node x. Each key has an associated payload (Pointer, values, etc). The keys are sorted key1 ≤ key2 ≤ ... ≤ keyx.n . x.leaf is a boolean value and denotes a leaf when is set to TRUE. 12 / 111
  • 31. B-Trees definition Example 34 15 20 5 10 17 19 21 25 30 35 40 48 55 61 75 95 105 45 60 90 Note: Leaves at the same level Definitions Every node x has the following attributes: x.n number of keys stored at node x. Each key has an associated payload (Pointer, values, etc). The keys are sorted key1 ≤ key2 ≤ ... ≤ keyx.n . x.leaf is a boolean value and denotes a leaf when is set to TRUE. 12 / 111
  • 32. B-Trees definition Example 34 15 20 5 10 17 19 21 25 30 35 40 48 55 61 75 95 105 45 60 90 Note: Leaves at the same level Definitions Every node x has the following attributes: x.n number of keys stored at node x. Each key has an associated payload (Pointer, values, etc). The keys are sorted key1 ≤ key2 ≤ ... ≤ keyx.n . x.leaf is a boolean value and denotes a leaf when is set to TRUE. 12 / 111
  • 33. B-Trees definition Example 34 15 20 5 10 17 19 21 25 30 35 40 48 55 61 75 95 105 45 60 90 Note: Leaves at the same level 13 / 111
  • 34. B-Trees definition In addition Every node x has the following attributes: It contains x.n + 1 pointers to its children: x.c1, x.c2, ..., x.cn+1 Leaf nodes do not have children then they leave this field undefined. The keys are used to separate the keys stored at the B-Tree. For example, if ki is any key stored in the subtree stored at tree with root x.ci then k1 ≤ x.key1 ≤ k2 ≤ x.key2 ≤ ... ≤ x.keyn ≤ kx.n+1 14 / 111
  • 35. B-Trees definition In addition Every node x has the following attributes: It contains x.n + 1 pointers to its children: x.c1, x.c2, ..., x.cn+1 Leaf nodes do not have children then they leave this field undefined. The keys are used to separate the keys stored at the B-Tree. For example, if ki is any key stored in the subtree stored at tree with root x.ci then k1 ≤ x.key1 ≤ k2 ≤ x.key2 ≤ ... ≤ x.keyn ≤ kx.n+1 14 / 111
  • 36. B-Trees definition In addition Every node x has the following attributes: It contains x.n + 1 pointers to its children: x.c1, x.c2, ..., x.cn+1 Leaf nodes do not have children then they leave this field undefined. The keys are used to separate the keys stored at the B-Tree. For example, if ki is any key stored in the subtree stored at tree with root x.ci then k1 ≤ x.key1 ≤ k2 ≤ x.key2 ≤ ... ≤ x.keyn ≤ kx.n+1 14 / 111
  • 37. B-Trees definition In addition Every node x has the following attributes: It contains x.n + 1 pointers to its children: x.c1, x.c2, ..., x.cn+1 Leaf nodes do not have children then they leave this field undefined. The keys are used to separate the keys stored at the B-Tree. For example, if ki is any key stored in the subtree stored at tree with root x.ci then k1 ≤ x.key1 ≤ k2 ≤ x.key2 ≤ ... ≤ x.keyn ≤ kx.n+1 14 / 111
  • 38. B-Trees definition Example 34 15 20 5 10 17 19 21 25 30 35 40 48 55 61 75 95 105 45 60 90 Note: Leaves at the same level Minimum Degree A fixed integer t ≥ 2 is called the minimum degree or branching of the tree: if x = root → t − 1 ≤ x.n ≤ 2t − 1 If x = root→ 1 ≤ x.n ≤ 2t − 1 15 / 111
  • 39. B-Trees definition Example 34 15 20 5 10 17 19 21 25 30 35 40 48 55 61 75 95 105 45 60 90 Note: Leaves at the same level Minimum Degree A fixed integer t ≥ 2 is called the minimum degree or branching of the tree: if x = root → t − 1 ≤ x.n ≤ 2t − 1 If x = root→ 1 ≤ x.n ≤ 2t − 1 15 / 111
  • 40. Outline 1 Introduction Motivation for B-Trees 2 Basic Definitions B-Trees definition Application for B-Trees 3 Height of a B-Tree The Height Property 4 Operations B-Tree operations Search Create Insertion Insertion Example Deletion Delete Example for t = 3 Reasons for using B-Trees B+-Trees 5 Exercises Some Exercises that you can try 16 / 111
  • 41. We want to store large sets of indexes First We assume that the set is so voluminous that only a small part can be kept in main memory!!! Thus We want to minimize the number of access to hard drive by using the locality principle!!! 17 / 111
  • 42. We want to store large sets of indexes First We assume that the set is so voluminous that only a small part can be kept in main memory!!! Thus We want to minimize the number of access to hard drive by using the locality principle!!! 17 / 111
  • 43. Application: Minimizing disk access when looking for indexes in databases Each node is stored as a page Page size determines t. Since t is usually large, this implies a large branching factor, so height is small. Example with t = 1001, we have 1000 (key, elements) per node 18 / 111
  • 44. Application: Minimizing disk access when looking for indexes in databases Each node is stored as a page Page size determines t. Since t is usually large, this implies a large branching factor, so height is small. Example with t = 1001, we have 1000 (key, elements) per node 1000 1000 1000 1000 1000 1000 1000 Branching 1001 18 / 111
  • 45. Application: Minimizing disk access when looking for indexes in databases Example with (2t − 1) + 1 = 1001, we have 1000 (key, elements) per node 1000 1000 1000 1000 1000 1000 1000 Branching 1001 19 / 111
  • 46. Application: Minimizing disk access when looking for indexes in databases The example above It can hold over one billion keys. the height is only 2 (Assuming root at height 0), so we can find any key with only two disk accesses (Compared to red-black trees, where the branching factor is 2). Then, disk accesses are minimal!!! 20 / 111
  • 47. Application: Minimizing disk access when looking for indexes in databases The example above It can hold over one billion keys. the height is only 2 (Assuming root at height 0), so we can find any key with only two disk accesses (Compared to red-black trees, where the branching factor is 2). Then, disk accesses are minimal!!! 20 / 111
  • 48. Outline 1 Introduction Motivation for B-Trees 2 Basic Definitions B-Trees definition Application for B-Trees 3 Height of a B-Tree The Height Property 4 Operations B-Tree operations Search Create Insertion Insertion Example Deletion Delete Example for t = 3 Reasons for using B-Trees B+-Trees 5 Exercises Some Exercises that you can try 21 / 111
  • 49. Height of a B-Tree Theorem 18.1 Let n be the number of keys in T, n ≥ 1, t ≥ 2, and h be the height of T. Then h ≤ logt n+1 2 Proof The root of a B-tree T contains at least one key, and all other nodes contain at least t − 1 keys. Thus, T , whose height is h, It has at least 2 nodes at depth 1. At least 2t nodes at depth 2. At least 2t2 nodes at depth 3. Then, depth h has at least 2th−1 nodes. 22 / 111
  • 50. Height of a B-Tree Theorem 18.1 Let n be the number of keys in T, n ≥ 1, t ≥ 2, and h be the height of T. Then h ≤ logt n+1 2 Proof The root of a B-tree T contains at least one key, and all other nodes contain at least t − 1 keys. Thus, T , whose height is h, It has at least 2 nodes at depth 1. At least 2t nodes at depth 2. At least 2t2 nodes at depth 3. Then, depth h has at least 2th−1 nodes. 22 / 111
  • 51. Height of a B-Tree Theorem 18.1 Let n be the number of keys in T, n ≥ 1, t ≥ 2, and h be the height of T. Then h ≤ logt n+1 2 Proof The root of a B-tree T contains at least one key, and all other nodes contain at least t − 1 keys. Thus, T , whose height is h, It has at least 2 nodes at depth 1. At least 2t nodes at depth 2. At least 2t2 nodes at depth 3. Then, depth h has at least 2th−1 nodes. 22 / 111
  • 52. Height of a B-Tree Theorem 18.1 Let n be the number of keys in T, n ≥ 1, t ≥ 2, and h be the height of T. Then h ≤ logt n+1 2 Proof The root of a B-tree T contains at least one key, and all other nodes contain at least t − 1 keys. Thus, T , whose height is h, It has at least 2 nodes at depth 1. At least 2t nodes at depth 2. At least 2t2 nodes at depth 3. Then, depth h has at least 2th−1 nodes. 22 / 111
  • 53. Height of a B-Tree Theorem 18.1 Let n be the number of keys in T, n ≥ 1, t ≥ 2, and h be the height of T. Then h ≤ logt n+1 2 Proof The root of a B-tree T contains at least one key, and all other nodes contain at least t − 1 keys. Thus, T , whose height is h, It has at least 2 nodes at depth 1. At least 2t nodes at depth 2. At least 2t2 nodes at depth 3. Then, depth h has at least 2th−1 nodes. 22 / 111
  • 54. For example We have the following Depth Number Of Nodes 23 / 111
  • 55. Height of a B-Tree We have at least 1 Depth 0 - One key 2 Depth 1 - 2t0(t − 1) 3 Depth 2 - 2t1(t − 1) 4 Depth 3 - 2t2(t − 1) 5 ... Thus n ≥ 1 + (t − 1) h i=1 2ti−1 (2) 24 / 111
  • 56. Height of a B-Tree We have at least 1 Depth 0 - One key 2 Depth 1 - 2t0(t − 1) 3 Depth 2 - 2t1(t − 1) 4 Depth 3 - 2t2(t − 1) 5 ... Thus n ≥ 1 + (t − 1) h i=1 2ti−1 (2) 24 / 111
  • 57. Height of a B-Tree We have at least 1 Depth 0 - One key 2 Depth 1 - 2t0(t − 1) 3 Depth 2 - 2t1(t − 1) 4 Depth 3 - 2t2(t − 1) 5 ... Thus n ≥ 1 + (t − 1) h i=1 2ti−1 (2) 24 / 111
  • 58. Height of a B-Tree We have at least 1 Depth 0 - One key 2 Depth 1 - 2t0(t − 1) 3 Depth 2 - 2t1(t − 1) 4 Depth 3 - 2t2(t − 1) 5 ... Thus n ≥ 1 + (t − 1) h i=1 2ti−1 (2) 24 / 111
  • 59. Height of a B-Tree We have at least 1 Depth 0 - One key 2 Depth 1 - 2t0(t − 1) 3 Depth 2 - 2t1(t − 1) 4 Depth 3 - 2t2(t − 1) 5 ... Thus n ≥ 1 + (t − 1) h i=1 2ti−1 (2) 24 / 111
  • 60. Height of a B-Tree Finally n ≥ 1 + 2(t − 1) th − 1 t − 1 = 2th − 1 (3) Therefore th ≤ n + 1 2 (4) 25 / 111
  • 61. Height of a B-Tree Finally n ≥ 1 + 2(t − 1) th − 1 t − 1 = 2th − 1 (3) Therefore th ≤ n + 1 2 (4) 25 / 111
  • 62. Height of a B-Tree Finally h ≤ logt n + 1 2 (5) 26 / 111
  • 63. Outline 1 Introduction Motivation for B-Trees 2 Basic Definitions B-Trees definition Application for B-Trees 3 Height of a B-Tree The Height Property 4 Operations B-Tree operations Search Create Insertion Insertion Example Deletion Delete Example for t = 3 Reasons for using B-Trees B+-Trees 5 Exercises Some Exercises that you can try 27 / 111
  • 64. Constraints on the Operations The root of the B-tree is always in main memory 1 Disk-Read are never performed on it. 2 Only When is written, we use a Disk-Write. If a node is passed as parameter It has already had all the necessary Disk-Read operations performed on it before hand. In the code that follows, we use: Disk-Read: To move node from disk to memory. Disk-Write: To move node from memory to disk. 28 / 111
  • 65. Constraints on the Operations The root of the B-tree is always in main memory 1 Disk-Read are never performed on it. 2 Only When is written, we use a Disk-Write. If a node is passed as parameter It has already had all the necessary Disk-Read operations performed on it before hand. In the code that follows, we use: Disk-Read: To move node from disk to memory. Disk-Write: To move node from memory to disk. 28 / 111
  • 66. Constraints on the Operations The root of the B-tree is always in main memory 1 Disk-Read are never performed on it. 2 Only When is written, we use a Disk-Write. If a node is passed as parameter It has already had all the necessary Disk-Read operations performed on it before hand. In the code that follows, we use: Disk-Read: To move node from disk to memory. Disk-Write: To move node from memory to disk. 28 / 111
  • 67. Constraints on the Operations The root of the B-tree is always in main memory 1 Disk-Read are never performed on it. 2 Only When is written, we use a Disk-Write. If a node is passed as parameter It has already had all the necessary Disk-Read operations performed on it before hand. In the code that follows, we use: Disk-Read: To move node from disk to memory. Disk-Write: To move node from memory to disk. 28 / 111
  • 68. Constraints on the Operations The root of the B-tree is always in main memory 1 Disk-Read are never performed on it. 2 Only When is written, we use a Disk-Write. If a node is passed as parameter It has already had all the necessary Disk-Read operations performed on it before hand. In the code that follows, we use: Disk-Read: To move node from disk to memory. Disk-Write: To move node from memory to disk. 28 / 111
  • 69. Outline 1 Introduction Motivation for B-Trees 2 Basic Definitions B-Trees definition Application for B-Trees 3 Height of a B-Tree The Height Property 4 Operations B-Tree operations Search Create Insertion Insertion Example Deletion Delete Example for t = 3 Reasons for using B-Trees B+-Trees 5 Exercises Some Exercises that you can try 29 / 111
  • 70. Search operation Pseudo-Code B-Tree-Search(x, k) 1 i = 1 2 while i ≤ x.n and k > x.key [i] 3 i = i + 1 4 if i ≤ x.n and k == x.key [i] 5 return (x, i) 6 elseif x.leaf 7 return NIL 8 else Disk-Read(x.c [i]) 9 return B-Tree-Search(x.c [i] , k) 30 / 111
  • 71. Search operation Pseudo-Code B-Tree-Search(x, k) 1 i = 1 2 while i ≤ x.n and k > x.key [i] 3 i = i + 1 4 if i ≤ x.n and k == x.key [i] 5 return (x, i) 6 elseif x.leaf 7 return NIL 8 else Disk-Read(x.c [i]) 9 return B-Tree-Search(x.c [i] , k) 30 / 111
  • 72. Search operation Pseudo-Code B-Tree-Search(x, k) 1 i = 1 2 while i ≤ x.n and k > x.key [i] 3 i = i + 1 4 if i ≤ x.n and k == x.key [i] 5 return (x, i) 6 elseif x.leaf 7 return NIL 8 else Disk-Read(x.c [i]) 9 return B-Tree-Search(x.c [i] , k) 30 / 111
  • 73. Search operation Pseudo-Code B-Tree-Search(x, k) 1 i = 1 2 while i ≤ x.n and k > x.key [i] 3 i = i + 1 4 if i ≤ x.n and k == x.key [i] 5 return (x, i) 6 elseif x.leaf 7 return NIL 8 else Disk-Read(x.c [i]) 9 return B-Tree-Search(x.c [i] , k) 30 / 111
  • 74. Search operation Pseudo-Code B-Tree-Search(x, k) 1 i = 1 2 while i ≤ x.n and k > x.key [i] 3 i = i + 1 4 if i ≤ x.n and k == x.key [i] 5 return (x, i) 6 elseif x.leaf 7 return NIL 8 else Disk-Read(x.c [i]) 9 return B-Tree-Search(x.c [i] , k) 30 / 111
  • 75. Using recursion to make the search easier So, we use line 1 to 5 1 Move to the key x.key [i] such that k ≤ x.key [i] 2 To return the value if stored at the node by the sorted keys!!! If the node is a leaf Return NIL == “That key is not in the B-Tree” The key could be in the next level Then, Disk-Read(x.c [i]) and call the recursion in the children node already in memory. 31 / 111
  • 76. Using recursion to make the search easier So, we use line 1 to 5 1 Move to the key x.key [i] such that k ≤ x.key [i] 2 To return the value if stored at the node by the sorted keys!!! If the node is a leaf Return NIL == “That key is not in the B-Tree” The key could be in the next level Then, Disk-Read(x.c [i]) and call the recursion in the children node already in memory. 31 / 111
  • 77. Using recursion to make the search easier So, we use line 1 to 5 1 Move to the key x.key [i] such that k ≤ x.key [i] 2 To return the value if stored at the node by the sorted keys!!! If the node is a leaf Return NIL == “That key is not in the B-Tree” The key could be in the next level Then, Disk-Read(x.c [i]) and call the recursion in the children node already in memory. 31 / 111
  • 78. Using recursion to make the search easier So, we use line 1 to 5 1 Move to the key x.key [i] such that k ≤ x.key [i] 2 To return the value if stored at the node by the sorted keys!!! If the node is a leaf Return NIL == “That key is not in the B-Tree” The key could be in the next level Then, Disk-Read(x.c [i]) and call the recursion in the children node already in memory. 31 / 111
  • 79. Search operation Note Search(root[t], k) returns (x, i) or NIL if no such key. 32 / 111
  • 80. Cost of Search Worst Cost O(h) = O (logt n) disk reads when going through the entire tree. x.n < 2t ⇒ O(t) for searching the key at each node Finally, we have that O(th) = O (t logt n) CPU time. 33 / 111
  • 81. Cost of Search Worst Cost O(h) = O (logt n) disk reads when going through the entire tree. x.n < 2t ⇒ O(t) for searching the key at each node Finally, we have that O(th) = O (t logt n) CPU time. 33 / 111
  • 82. Cost of Search Worst Cost O(h) = O (logt n) disk reads when going through the entire tree. x.n < 2t ⇒ O(t) for searching the key at each node Finally, we have that O(th) = O (t logt n) CPU time. 33 / 111
  • 83. Outline 1 Introduction Motivation for B-Trees 2 Basic Definitions B-Trees definition Application for B-Trees 3 Height of a B-Tree The Height Property 4 Operations B-Tree operations Search Create Insertion Insertion Example Deletion Delete Example for t = 3 Reasons for using B-Trees B+-Trees 5 Exercises Some Exercises that you can try 34 / 111
  • 84. Creating an empty tree Pseudo-Code B-Tree-Create(T) 1 x =Allocate-Node() 2 x.leaf =TRUE 3 x.n = 0 4 Disk-Write(x) 5 T.root = x Note To create a nonempty tree, first create an empty tree and then insert nodes. 35 / 111
  • 85. Creating an empty tree Pseudo-Code B-Tree-Create(T) 1 x =Allocate-Node() 2 x.leaf =TRUE 3 x.n = 0 4 Disk-Write(x) 5 T.root = x Note To create a nonempty tree, first create an empty tree and then insert nodes. 35 / 111
  • 86. Cost of Create Worst Cost O(1) disk accesses. O(1) CPU time. 36 / 111
  • 87. Outline 1 Introduction Motivation for B-Trees 2 Basic Definitions B-Trees definition Application for B-Trees 3 Height of a B-Tree The Height Property 4 Operations B-Tree operations Search Create Insertion Insertion Example Deletion Delete Example for t = 3 Reasons for using B-Trees B+-Trees 5 Exercises Some Exercises that you can try 37 / 111
  • 88. Insertion Something Notable Here is where the things become interesting!!! Insertions can only be done in non-full nodes. The holding data structures for keys and pointers are arrays!!! What? This means that if a node has 2t − 1 keys, something needs to be done in order to make space in the node. Process 1 Split the node around the median key. 2 You finish with two nodes of size t − 1 and the median key y. 3 Promote the median key to the father node to identify the new ranges. 4 If the father is full recursively split the father to make room. 38 / 111
  • 89. Insertion Something Notable Here is where the things become interesting!!! Insertions can only be done in non-full nodes. The holding data structures for keys and pointers are arrays!!! What? This means that if a node has 2t − 1 keys, something needs to be done in order to make space in the node. Process 1 Split the node around the median key. 2 You finish with two nodes of size t − 1 and the median key y. 3 Promote the median key to the father node to identify the new ranges. 4 If the father is full recursively split the father to make room. 38 / 111
  • 90. Insertion Something Notable Here is where the things become interesting!!! Insertions can only be done in non-full nodes. The holding data structures for keys and pointers are arrays!!! What? This means that if a node has 2t − 1 keys, something needs to be done in order to make space in the node. Process 1 Split the node around the median key. 2 You finish with two nodes of size t − 1 and the median key y. 3 Promote the median key to the father node to identify the new ranges. 4 If the father is full recursively split the father to make room. 38 / 111
  • 91. Insertion Something Notable Here is where the things become interesting!!! Insertions can only be done in non-full nodes. The holding data structures for keys and pointers are arrays!!! What? This means that if a node has 2t − 1 keys, something needs to be done in order to make space in the node. Process 1 Split the node around the median key. 2 You finish with two nodes of size t − 1 and the median key y. 3 Promote the median key to the father node to identify the new ranges. 4 If the father is full recursively split the father to make room. 38 / 111
  • 92. Insertion Something Notable Here is where the things become interesting!!! Insertions can only be done in non-full nodes. The holding data structures for keys and pointers are arrays!!! What? This means that if a node has 2t − 1 keys, something needs to be done in order to make space in the node. Process 1 Split the node around the median key. 2 You finish with two nodes of size t − 1 and the median key y. 3 Promote the median key to the father node to identify the new ranges. 4 If the father is full recursively split the father to make room. 38 / 111
  • 93. Insertion Something Notable Here is where the things become interesting!!! Insertions can only be done in non-full nodes. The holding data structures for keys and pointers are arrays!!! What? This means that if a node has 2t − 1 keys, something needs to be done in order to make space in the node. Process 1 Split the node around the median key. 2 You finish with two nodes of size t − 1 and the median key y. 3 Promote the median key to the father node to identify the new ranges. 4 If the father is full recursively split the father to make room. 38 / 111
  • 94. Insertion Something Notable Here is where the things become interesting!!! Insertions can only be done in non-full nodes. The holding data structures for keys and pointers are arrays!!! What? This means that if a node has 2t − 1 keys, something needs to be done in order to make space in the node. Process 1 Split the node around the median key. 2 You finish with two nodes of size t − 1 and the median key y. 3 Promote the median key to the father node to identify the new ranges. 4 If the father is full recursively split the father to make room. 38 / 111
  • 95. Important!!! We always insert at... THE LEAF LEVEL!!! Therefore What if the leaf child becomes full? 39 / 111
  • 96. Important!!! We always insert at... THE LEAF LEVEL!!! Therefore What if the leaf child becomes full? 39 / 111
  • 97. Splitting Splitting Applied to a full child of a non-full parent when full≡ 2t − 1 keys. Example with t = 4 40 / 111
  • 98. Splitting Splitting Applied to a full child of a non-full parent when full≡ 2t − 1 keys. Example with t = 4 20 24 2920 29 21 22 23 24 25 26 27 21 22 23 25 26 27 40 / 111
  • 99. Split-Child Algorithm B-Tree-Split-Child(x, i) 1. z = Allocate-Node() 2. y = x.ci 3. z.leaf = y.leaf 4. z.n = t − 1 5. for j = 1 and t − 1 6. z.key [j] = y.key [j + t] 7. if not y.leaf 8. for j = 1 to t 9. z.c [j] = y.c [j + t] 10. y.n = t − 1 11. for j = x.n + 1 downto i + 1 12. x.c [j + 1] = x.c [j] 13. x.c [i + 1] = z 14. for j = x.n downto i 15. x.key [j + 1] = x.key [j] 16. x.key [i] = y.key [t] 17. x.n = x.n + 1 18. Disk-Write(y) 19. Disk-Write(z) 20. Disk-Write(x) 41 / 111
  • 100. Explanation First The code works as follow: the element y has 2t children (2t − 1 keys) but is reduced to t children. For this, the new node z takes the t largest children from y, and z becomes a new child of x. 42 / 111
  • 101. Explanation First The code works as follow: the element y has 2t children (2t − 1 keys) but is reduced to t children. For this, the new node z takes the t largest children from y, and z becomes a new child of x. 42 / 111
  • 102. Explanation First The code works as follow: the element y has 2t children (2t − 1 keys) but is reduced to t children. For this, the new node z takes the t largest children from y, and z becomes a new child of x. 42 / 111
  • 103. Detailed Explanation First Lines 1-4 creates node z 1. z = Allocate-Node() 2. y = x.ci 3. z.leaf = y.leaf 4. z.n = t − 1 Lines 1-4 21 22 23 24 24 26 27 20 2920 29 21 22 23 24 25 26 27 43 / 111
  • 104. Detailed Explanation First Lines 5-6 copies the keys from position j + 1 in the y node to position j in node z: 5. for j = 1 and t − 1 6. z.key [j] = y.key [j + t] Lines 5-6 21 22 23 24 26 27 2521 22 23 24 24 26 27 20 29 20 29 44 / 111
  • 105. Detailed Explanation First Lines 5-6 copies the keys from position j + 1 in the y node to position j in node z: 5. for j = 1 and t − 1 6. z.key [j] = y.key [j + t] Lines 5-6 21 22 23 24 25 26 2721 22 23 24 27 25 26 20 29 20 29 45 / 111
  • 106. Detailed Explanation Then Lines 7-8 are used to copy the children if you are not a leaf 7. if not y.leaf 8. for j = 1 to t 9. z.c [j] = y.c [j + t] Lines 5-6 21 22 23 24 25 26 2721 22 23 24 25 26 27 20 29 20 29 46 / 111
  • 107. Detailed Explanation Then Lines 7-8 are used to copy the children if you are not a leaf 7. if not y.leaf 8. for j = 1 to t 9. z.c [j] = y.c [j + t] Lines 5-6 21 22 23 24 25 26 2721 22 23 24 25 26 27 20 29 20 29 47 / 111
  • 108. Detailed Explanation Then Lines 7-8 are used to copy the children if you are not a leaf 7. if not y.leaf 8. for j = 1 to t 9. z.c [j] = y.c [j + t] 21 22 23 24 25 26 27 20 29 48 / 111
  • 109. Detailed Explanation Then Line 10 adjust the count for y. 10. y.n = t − 1 49 / 111
  • 110. Detailed Explanation Then Line 11-13 make space to the pointer for the z node 11. for j = x.n + 1 downto i + 1 12. x.c [j + 1] = x.c [j] 13. x.c [i + 1] = z 21 22 23 24 25 26 27 From the Back to the Front 20 29 50 / 111
  • 111. Detailed Explanation Then Line 11-13 make space to the pointer for the z node 11. for j = x.n + 1 downto i + 1 12. x.c [j + 1] = x.c [j] 13. x.c [i + 1] = z 21 22 23 24 25 26 27 20 29 51 / 111
  • 112. Detailed Explanation Then Line 14-15 make space to key from the z node to the node x 14. for j = x.n downto i 15. x.key [j + 1] = x.key [j] 21 22 23 24 25 26 27 20 29 52 / 111
  • 113. Detailed Explanation Then Line 16-17 copy the key to the correct place and increase the counter of x 16. x.key [i] = y.key [t] 17. x.n = x.n + 1 21 22 23 25 26 27 20 24 29 53 / 111
  • 114. Detailed Explanation Then Line 18-20 Write everything to the hard drive 18. Disk-Write(y) 19. Disk-Write(z) 20. Disk-Write(x) 54 / 111
  • 115. Cost of Split-Child Complexity Θ(t) CPU time the for loop to go through the keys O(1) disk writes. 55 / 111
  • 116. Insert Code B-Tree-Insert(T, k) 1 r = T.root 2 if r.n == 2t − 1 3 s =Allocate-Node() 4 T.root = s 5 s.leaf =FALSE 6 s.n = 0 7 s.c [1] = r 8 B-Tree-Split-Childs(s, 1) 9 B-Tree-Insert-Nonfull(s, k) 10 else B-Tree-Insert-Nonfull(s, k) 56 / 111
  • 117. Insert Code B-Tree-Insert(T, k) 1 r = T.root 2 if r.n == 2t − 1 3 s =Allocate-Node() 4 T.root = s 5 s.leaf =FALSE 6 s.n = 0 7 s.c [1] = r 8 B-Tree-Split-Childs(s, 1) 9 B-Tree-Insert-Nonfull(s, k) 10 else B-Tree-Insert-Nonfull(s, k) 56 / 111
  • 118. Insert Code B-Tree-Insert(T, k) 1 r = T.root 2 if r.n == 2t − 1 3 s =Allocate-Node() 4 T.root = s 5 s.leaf =FALSE 6 s.n = 0 7 s.c [1] = r 8 B-Tree-Split-Childs(s, 1) 9 B-Tree-Insert-Nonfull(s, k) 10 else B-Tree-Insert-Nonfull(s, k) 56 / 111
  • 119. Insert Code B-Tree-Insert(T, k) 1 r = T.root 2 if r.n == 2t − 1 3 s =Allocate-Node() 4 T.root = s 5 s.leaf =FALSE 6 s.n = 0 7 s.c [1] = r 8 B-Tree-Split-Childs(s, 1) 9 B-Tree-Insert-Nonfull(s, k) 10 else B-Tree-Insert-Nonfull(s, k) 56 / 111
  • 120. Explanation First Insert using the root of T and the key k to be inserted. Second 1 Use a a temporary variable r to look at the root 2 If r.n == 2t − 1 Then prepare to split by creating an alternate s father node. 1 Then Split the node s using Split-Child 2 Insert using the Insert-Non full operation. 3 else Insert using the Insert-Non full operation. 57 / 111
  • 121. Explanation First Insert using the root of T and the key k to be inserted. Second 1 Use a a temporary variable r to look at the root 2 If r.n == 2t − 1 Then prepare to split by creating an alternate s father node. 1 Then Split the node s using Split-Child 2 Insert using the Insert-Non full operation. 3 else Insert using the Insert-Non full operation. 57 / 111
  • 122. Explanation First Insert using the root of T and the key k to be inserted. Second 1 Use a a temporary variable r to look at the root 2 If r.n == 2t − 1 Then prepare to split by creating an alternate s father node. 1 Then Split the node s using Split-Child 2 Insert using the Insert-Non full operation. 3 else Insert using the Insert-Non full operation. 57 / 111
  • 123. Explanation First Insert using the root of T and the key k to be inserted. Second 1 Use a a temporary variable r to look at the root 2 If r.n == 2t − 1 Then prepare to split by creating an alternate s father node. 1 Then Split the node s using Split-Child 2 Insert using the Insert-Non full operation. 3 else Insert using the Insert-Non full operation. 57 / 111
  • 124. Explanation First Insert using the root of T and the key k to be inserted. Second 1 Use a a temporary variable r to look at the root 2 If r.n == 2t − 1 Then prepare to split by creating an alternate s father node. 1 Then Split the node s using Split-Child 2 Insert using the Insert-Non full operation. 3 else Insert using the Insert-Non full operation. 57 / 111
  • 125. Insert-Full Note First, modify tree (if necessary) to create room for new key. Then, call Insert-Nonfull() Example 58 / 111
  • 126. Insert-Full Note First, modify tree (if necessary) to create room for new key. Then, call Insert-Nonfull() Example 24 21 22 23 24 25 26 27 21 22 23 25 26 27 58 / 111
  • 127. Insert-Nonfull Algorithm B-Tree-Insert-Nonfull(x, k) 1. i = x.n 2. if x.leaf 3. while i ≥ 1 and k < x.key [i] 4. x.key [i + 1] = x.key [i] 5. i = i − 1 6. x.key [i + 1] = k 7. x.n = x.n + 1 8. Disk-Write(x) 9. else while i ≥ 1 and k < x.key [i] 10. i = i − 1 11. i = i + 1 12. Disk-Read(x.c [i]) 13. if x.c [i] .n == 2t − 1 14. B-Tree-Split-Child(x, i) 15. if k > x.key [i] 16. i = i + 1 17. B-Tree-Insert- Nonfull(x.c [i] , k) 59 / 111
  • 128. Explanation Line 1 it gets the rightmost key of the B-Tree 1. i = x.n if x.leaf == TRUE We make space on the key array because we have space for it. 3. while i ≥ 1 and k < x.key [i] 4. x.key [i + 1] = x.key [i] 5. i = i − 1 60 / 111
  • 129. Explanation Line 1 it gets the rightmost key of the B-Tree 1. i = x.n if x.leaf == TRUE We make space on the key array because we have space for it. 3. while i ≥ 1 and k < x.key [i] 4. x.key [i + 1] = x.key [i] 5. i = i − 1 60 / 111
  • 130. Explanation Insert the key with the payload at the correct position and increase the counter of x 6. x.key [i + 1] = k 7. x.n = x.n + 1 Write everything to the disk 8. Disk-Write(x) 61 / 111
  • 131. Explanation Insert the key with the payload at the correct position and increase the counter of x 6. x.key [i + 1] = k 7. x.n = x.n + 1 Write everything to the disk 8. Disk-Write(x) 61 / 111
  • 132. Explanation if x.leaf ! = TRUE Get into the correct child and bring it from the hard drive 9. else while i ≥ 1 and k < x.key [i] 10. i = i − 1 11. i = i + 1 12. Disk-Read(x.c [i]) if the child x.c [i] is full split it 13. if x.c [i] .n == 2t − 1 14. B-Tree-Split-Child(x, i) 62 / 111
  • 133. Explanation if x.leaf ! = TRUE Get into the correct child and bring it from the hard drive 9. else while i ≥ 1 and k < x.key [i] 10. i = i − 1 11. i = i + 1 12. Disk-Read(x.c [i]) if the child x.c [i] is full split it 13. if x.c [i] .n == 2t − 1 14. B-Tree-Split-Child(x, i) 62 / 111
  • 134. Explanation Now we need to decide if k == x.key[i] Then, we take the left child of x.key[i] If not, we take the right child of x.key[i] 15. if k > x.key [i] 16. i = i + 1 After that, we insert in a non-full element 17. B-Tree-Insert-Nonfull(x.c [i] , k) 63 / 111
  • 135. Explanation Now we need to decide if k == x.key[i] Then, we take the left child of x.key[i] If not, we take the right child of x.key[i] 15. if k > x.key [i] 16. i = i + 1 After that, we insert in a non-full element 17. B-Tree-Insert-Nonfull(x.c [i] , k) 63 / 111
  • 136. Explanation Now we need to decide if k == x.key[i] Then, we take the left child of x.key[i] If not, we take the right child of x.key[i] 15. if k > x.key [i] 16. i = i + 1 After that, we insert in a non-full element 17. B-Tree-Insert-Nonfull(x.c [i] , k) 63 / 111
  • 137. Explanation Now we need to decide if k == x.key[i] Then, we take the left child of x.key[i] If not, we take the right child of x.key[i] 15. if k > x.key [i] 16. i = i + 1 After that, we insert in a non-full element 17. B-Tree-Insert-Nonfull(x.c [i] , k) 63 / 111
  • 138. Explanation Now we need to decide if k == x.key[i] Then, we take the left child of x.key[i] If not, we take the right child of x.key[i] 15. if k > x.key [i] 16. i = i + 1 After that, we insert in a non-full element 17. B-Tree-Insert-Nonfull(x.c [i] , k) 63 / 111
  • 139. Cost of Insertion Worst case Θ(logt n) disk writes. Θ(t logt n) CPU time. 64 / 111
  • 140. Example of Constructing a B-Tree by Insertion Proceed as follows Suppose we start with an empty B-Tree and keys arrive in the following order: 1, 12, 8, 2, 25, 6 ,14, 28, 19, 20, 17, 7, 52, 16, 48, 60, 68, 3, 26, 29, 53, 55, 24, 23, 22, 11. Something Notable We want to build a B-Tree with at most 5 keys. Thus: 2t − 1 = 5 2t = 6 t = 3 65 / 111
  • 141. Example of Constructing a B-Tree by Insertion Proceed as follows Suppose we start with an empty B-Tree and keys arrive in the following order: 1, 12, 8, 2, 25, 6 ,14, 28, 19, 20, 17, 7, 52, 16, 48, 60, 68, 3, 26, 29, 53, 55, 24, 23, 22, 11. Something Notable We want to build a B-Tree with at most 5 keys. Thus: 2t − 1 = 5 2t = 6 t = 3 65 / 111
  • 142. Example of Constructing a B-Tree by Insertion Proceed as follows Suppose we start with an empty B-Tree and keys arrive in the following order: 1, 12, 8, 2, 25, 6 ,14, 28, 19, 20, 17, 7, 52, 16, 48, 60, 68, 3, 26, 29, 53, 55, 24, 23, 22, 11. Something Notable We want to build a B-Tree with at most 5 keys. Thus: 2t − 1 = 5 2t = 6 t = 3 65 / 111
  • 143. Example of Constructing a B-Tree by Insertion Proceed as follows Suppose we start with an empty B-Tree and keys arrive in the following order: 1, 12, 8, 2, 25, 6 ,14, 28, 19, 20, 17, 7, 52, 16, 48, 60, 68, 3, 26, 29, 53, 55, 24, 23, 22, 11. Something Notable We want to build a B-Tree with at most 5 keys. Thus: 2t − 1 = 5 2t = 6 t = 3 65 / 111
  • 144. First We insert the first 5 elements in the root node 1 2 8 12 25 66 / 111
  • 145. Constructing a B-Tree Then, we want to insert 6 and for this we split promoting 8 1 2 8 12 25 67 / 111
  • 146. Constructing a B-Tree Then, we want to insert 6 and for this we split promoting 8 8 1 2 12 25 68 / 111
  • 147. Constructing a B-Tree 6, 14, 28, 19 get added to the leaf nodes 8 1 2 6 12 14 19 25 28 69 / 111
  • 148. Constructing a B-Tree Add 20, Split necessary by promoting 19 8 19 1 2 6 12 14 25 28 70 / 111
  • 149. Constructing a B-Tree Add 20 to the leaf node 8 19 1 2 6 12 14 20 25 28 71 / 111
  • 150. Constructing a B-Tree Add 17, 7, 52, 16, 48 to the leaf nodes 8 19 1 2 6 7 12 14 16 17 20 25 28 48 52 72 / 111
  • 151. Constructing a B-Tree Add 60 to a leaf node, it is necessary to split by promoting 28 to the root 8 19 28 1 2 6 7 12 14 16 17 20 25 48 52 73 / 111
  • 152. Constructing a B-Tree Add 60 8 19 28 1 2 6 7 12 14 16 17 20 25 48 52 74 / 111
  • 153. Constructing a B-Tree Add 68, 3, 26, 27, 53 to the leaf nodes 8 19 28 1 2 6 7 12 14 16 17 20 25 48 52 60 75 / 111
  • 154. Constructing a B-Tree Add 68, 3, 26, 27, 53 to the leaf nodes 8 19 28 1 2 3 6 7 12 14 16 17 20 25 26 27 48 52 53 60 68 76 / 111
  • 155. Constructing a B-Tree Add 55 by splitting a leaf node and promoting 54 8 19 28 53 1 2 3 6 7 12 14 16 17 20 25 26 27 48 52 60 68 77 / 111
  • 156. Constructing a B-Tree Add 55 to the leaf 8 19 28 53 1 2 3 6 7 12 14 16 17 20 25 26 27 48 52 55 60 68 78 / 111
  • 157. Constructing a B-Tree Add 24 to the leaf 8 19 28 53 1 2 3 6 7 12 14 16 17 20 24 25 26 27 48 52 55 60 68 79 / 111
  • 158. Constructing a B-Tree Add 22 by splitting a leaf node and promoting 25 8 19 25 28 53 1 2 3 6 7 12 14 16 17 20 22 24 26 27 48 52 55 60 68 80 / 111
  • 159. Constructing a B-Tree Add 11 to the leaf node by adding a empty root node 8 19 25 28 53 1 2 3 6 7 12 14 16 17 20 22 24 26 27 48 52 55 60 68 81 / 111
  • 160. Constructing a B-Tree Split the old root by promoting 25 8 19 1 2 3 6 7 20 22 24 26 2712 14 16 17 25 28 53 48 52 55 60 68 82 / 111
  • 161. Constructing a B-Tree Add 11 to the leaf 8 19 1 2 3 6 7 20 22 24 26 2711 12 14 16 17 25 28 53 48 52 55 60 68 83 / 111
  • 162. Outline 1 Introduction Motivation for B-Trees 2 Basic Definitions B-Trees definition Application for B-Trees 3 Height of a B-Tree The Height Property 4 Operations B-Tree operations Search Create Insertion Insertion Example Deletion Delete Example for t = 3 Reasons for using B-Trees B+-Trees 5 Exercises Some Exercises that you can try 84 / 111
  • 163. Deletion Main idea Recursively descend the tree. Ensure Ensure any non-root node x that is considered for deletion has at least t keys. Note that May have to move a key down from parent. 85 / 111
  • 164. Deletion Main idea Recursively descend the tree. Ensure Ensure any non-root node x that is considered for deletion has at least t keys. Note that May have to move a key down from parent. 85 / 111
  • 165. Deletion Main idea Recursively descend the tree. Ensure Ensure any non-root node x that is considered for deletion has at least t keys. Note that May have to move a key down from parent. 85 / 111
  • 166. Deletion Cases Case 0: You delete the only key at the root ≈ Empty root Then, you make root’s only child the new root: Case 1: k in x and x.leaf == TRUE, then delete k from x. 86 / 111
  • 167. Deletion Cases Case 0: You delete the only key at the root ≈ Empty root Then, you make root’s only child the new root: Case 1: k in x and x.leaf == TRUE, then delete k from x. leaf 86 / 111
  • 168. Deletion Cases Case 2: k in x, x internal not a leaf 87 / 111
  • 169. Deletion Cases Subcase A: y has at least t keys; find predecessor k of k in subtree rooted at y, recursively delete k , replace k by k in x. not a leaf Predecesor of k 88 / 111
  • 170. Deletion Cases Subcase B: z has at least t keys; find successor k in subtree rooted at z, recursively delete k , replace k by k in x. not a leaf Succesor of k 89 / 111
  • 171. Deletion Cases Subcase C: y and z both have t − 1 keys; merge k and z into y, free z, recursively delete k from y. not a leaf not a leaf 90 / 111
  • 172. Deletion cases Case 3 If the key k is not present in internal node x, determine the root x.ci of the appropriate subtree that must contain k, if k is in the tree at all. If x.ci has only t − 1 keys, execute step 3a or 3b as necessary to guarantee that we descend to a node containing at least t keys. Then finish by recursing on the appropriate child of x. 91 / 111
  • 173. Deletion cases Case 3 If the key k is not present in internal node x, determine the root x.ci of the appropriate subtree that must contain k, if k is in the tree at all. If x.ci has only t − 1 keys, execute step 3a or 3b as necessary to guarantee that we descend to a node containing at least t keys. Then finish by recursing on the appropriate child of x. 91 / 111
  • 174. Deletion cases Case 3 If the key k is not present in internal node x, determine the root x.ci of the appropriate subtree that must contain k, if k is in the tree at all. If x.ci has only t − 1 keys, execute step 3a or 3b as necessary to guarantee that we descend to a node containing at least t keys. Then finish by recursing on the appropriate child of x. 91 / 111
  • 175. Case 3.A Subcase A If x.ci has only t − 1 keys but has an immediate sibling with at least t keys, give x.ci an extra key by moving a key from x down into x.ci , moving a key from x.ci’s immediate left or right sibling up into x, and moving the appropriate child pointer from the sibling into x.ci. not a leaf Recursively Descend 92 / 111
  • 176. Case 3.B Subcase B If x.ci and both of x.ci’s immediate siblings have t − 1 keys, merge x.ci with one sibling, which involves moving a key from x down into the new merged node to become the median key for that node. not a leaf Recursively Descend 93 / 111
  • 177. Delete Example Delete 14 - Case 3.B 8 19 1 2 3 6 7 20 22 24 48 52 55 60 6826 2711 12 14 16 17 25 28 53 94 / 111
  • 178. Delete Example Delete 14 - move 25 down from the root and join the children nodes 8 19 25 28 53 1 2 3 6 7 20 22 24 26 2711 12 14 16 17 48 52 55 60 68 95 / 111
  • 179. Delete Example Delete 14 8 19 25 28 53 1 2 3 6 7 20 22 24 26 2711 12 16 17 48 52 55 60 68 96 / 111
  • 180. Delete Example Case 0 8 19 25 28 53 1 2 3 6 7 20 22 24 26 2711 12 16 17 48 52 55 60 68 97 / 111
  • 181. Delete Example Delete 28 - Case 2.C 8 19 25 28 53 1 2 3 6 7 20 22 24 26 2711 12 16 17 48 52 55 60 68 98 / 111
  • 182. Delete Example Join the left and right children of 28 and move it down 8 19 25 53 1 2 3 6 7 20 22 24 26 27 28 48 5211 12 16 17 55 60 68 99 / 111
  • 183. Delete Example Recursively Delete 28 8 19 25 53 1 2 3 6 7 20 22 24 26 27 48 5211 12 16 17 55 60 68 100 / 111
  • 184. Delete Example Delete 25 - Case 2.B 8 19 25 53 1 2 3 6 7 20 22 24 26 27 48 5211 12 16 17 55 60 68 101 / 111
  • 185. Delete Example Move 26 to the position of 25 8 19 25 53 1 2 3 6 7 20 22 24 26 27 48 5211 12 16 17 55 60 68 102 / 111
  • 186. Delete Example Move 26 to the position of 25 8 19 26 53 1 2 3 6 7 20 22 24 27 48 5211 12 16 17 55 60 68 103 / 111
  • 187. Outline 1 Introduction Motivation for B-Trees 2 Basic Definitions B-Trees definition Application for B-Trees 3 Height of a B-Tree The Height Property 4 Operations B-Tree operations Search Create Insertion Insertion Example Deletion Delete Example for t = 3 Reasons for using B-Trees B+-Trees 5 Exercises Some Exercises that you can try 104 / 111
  • 188. Reasons for using B-Trees Justification When searching tables held on disc, the cost of each disc transfer is high, but does not depend much on the amount of data transferred, especially if consecutive items are transferred. Example If we use a B-Tree of order 101, say, we can transfer each node in one disc read operation. A B-Tree of order 101 and height 3 can hold 1014 − 1 items (approximately 100 million) and any item can be accessed with 3 disc reads (assuming we hold the root in memory). 105 / 111
  • 189. Reasons for using B-Trees Justification When searching tables held on disc, the cost of each disc transfer is high, but does not depend much on the amount of data transferred, especially if consecutive items are transferred. Example If we use a B-Tree of order 101, say, we can transfer each node in one disc read operation. A B-Tree of order 101 and height 3 can hold 1014 − 1 items (approximately 100 million) and any item can be accessed with 3 disc reads (assuming we hold the root in memory). 105 / 111
  • 190. Reasons for using B-Trees Justification When searching tables held on disc, the cost of each disc transfer is high, but does not depend much on the amount of data transferred, especially if consecutive items are transferred. Example If we use a B-Tree of order 101, say, we can transfer each node in one disc read operation. A B-Tree of order 101 and height 3 can hold 1014 − 1 items (approximately 100 million) and any item can be accessed with 3 disc reads (assuming we hold the root in memory). 105 / 111
  • 191. Comparing trees Binary trees They can become unbalanced and lose their good time complexity (big O). AVL trees are strict binary trees that overcome the balance problem. Heaps remain balanced, but only prioritize (not order) the keys. Multi-way trees B-Trees can be m-way, they have any even number of children. The 2-3 (or 3 way) approximates a permanently balanced binary tree. 106 / 111
  • 192. Comparing trees Binary trees They can become unbalanced and lose their good time complexity (big O). AVL trees are strict binary trees that overcome the balance problem. Heaps remain balanced, but only prioritize (not order) the keys. Multi-way trees B-Trees can be m-way, they have any even number of children. The 2-3 (or 3 way) approximates a permanently balanced binary tree. 106 / 111
  • 193. Comparing trees Binary trees They can become unbalanced and lose their good time complexity (big O). AVL trees are strict binary trees that overcome the balance problem. Heaps remain balanced, but only prioritize (not order) the keys. Multi-way trees B-Trees can be m-way, they have any even number of children. The 2-3 (or 3 way) approximates a permanently balanced binary tree. 106 / 111
  • 194. Comparing trees Binary trees They can become unbalanced and lose their good time complexity (big O). AVL trees are strict binary trees that overcome the balance problem. Heaps remain balanced, but only prioritize (not order) the keys. Multi-way trees B-Trees can be m-way, they have any even number of children. The 2-3 (or 3 way) approximates a permanently balanced binary tree. 106 / 111
  • 195. Comparing trees Binary trees They can become unbalanced and lose their good time complexity (big O). AVL trees are strict binary trees that overcome the balance problem. Heaps remain balanced, but only prioritize (not order) the keys. Multi-way trees B-Trees can be m-way, they have any even number of children. The 2-3 (or 3 way) approximates a permanently balanced binary tree. 106 / 111
  • 196. Outline 1 Introduction Motivation for B-Trees 2 Basic Definitions B-Trees definition Application for B-Trees 3 Height of a B-Tree The Height Property 4 Operations B-Tree operations Search Create Insertion Insertion Example Deletion Delete Example for t = 3 Reasons for using B-Trees B+-Trees 5 Exercises Some Exercises that you can try 107 / 111
  • 197. Extending the B-Tree Structure: B+ Trees B+ Tree A B+ Tree is like a B-tree except that the interior and leaf nodes have a different structure. Actually A B+ tree can be viewed as a B-tree in which each node contains only keys and pointers to the children. Finally At leaves level you have the real data items(They could be pointers to specific data). Node This allows to pack more information in each node. 108 / 111
  • 198. Extending the B-Tree Structure: B+ Trees B+ Tree A B+ Tree is like a B-tree except that the interior and leaf nodes have a different structure. Actually A B+ tree can be viewed as a B-tree in which each node contains only keys and pointers to the children. Finally At leaves level you have the real data items(They could be pointers to specific data). Node This allows to pack more information in each node. 108 / 111
  • 199. Extending the B-Tree Structure: B+ Trees B+ Tree A B+ Tree is like a B-tree except that the interior and leaf nodes have a different structure. Actually A B+ tree can be viewed as a B-tree in which each node contains only keys and pointers to the children. Finally At leaves level you have the real data items(They could be pointers to specific data). Node This allows to pack more information in each node. 108 / 111
  • 200. Extending the B-Tree Structure: B+ Trees B+ Tree A B+ Tree is like a B-tree except that the interior and leaf nodes have a different structure. Actually A B+ tree can be viewed as a B-tree in which each node contains only keys and pointers to the children. Finally At leaves level you have the real data items(They could be pointers to specific data). Node This allows to pack more information in each node. 108 / 111
  • 201. In the paper Something Notable “Modularizing B+-Trees: Three-Level B+-Trees Work Fine” by Shigero Sasaki and Takuya Araki from NEC NEC NEC Corporation (Nippon Denki Kabushiki Gaisha) is a Japanese multinational provider of information technology (IT) services and products, with its headquarters in Minato, Tokyo, Japan.NEC provides information technology (IT) and network solutions to business enterprises, communications services providers and to government agencies. 109 / 111
  • 202. In the paper Something Notable “Modularizing B+-Trees: Three-Level B+-Trees Work Fine” by Shigero Sasaki and Takuya Araki from NEC NEC NEC Corporation (Nippon Denki Kabushiki Gaisha) is a Japanese multinational provider of information technology (IT) services and products, with its headquarters in Minato, Tokyo, Japan.NEC provides information technology (IT) and network solutions to business enterprises, communications services providers and to government agencies. 109 / 111
  • 203. Outline 1 Introduction Motivation for B-Trees 2 Basic Definitions B-Trees definition Application for B-Trees 3 Height of a B-Tree The Height Property 4 Operations B-Tree operations Search Create Insertion Insertion Example Deletion Delete Example for t = 3 Reasons for using B-Trees B+-Trees 5 Exercises Some Exercises that you can try 110 / 111
  • 204. Exercises You can try the following ones 1 18.1-3 2 18.1-4 3 18.2-3 4 18.2-5 5 18.2-4 6 18.2-6 7 18.2-7 8 18.3-1 111 / 111