B-Tree
An Analysis


       By:
              Nikhil Sharma
                BE/8034/09
Definition
A B-tree is a tree data structure that keeps data sorted and allows
  searches, sequential access, insertions, and deletions in
  logarithmic time. The B-tree is a generalization of
  a binary search tree in which more than two paths can diverge from a
  single node.

A B-tree of order m (the maximum number of children for each
  node) is a tree which satisfies the following properties:
 Every node has at most m children.
 Every node (except root and leaves) has at least $\lceil m/2 \rceil$ children.
 The root has at least two children if it is not a leaf node.
 All leaves appear in the same level, and carry information.
 A non-leaf node with k children contains k−1 keys.

              Declaration in C:
              typedef struct {
                  int Count;        // number of keys stored in the current node
                  ItemType Key[3];  // array to hold the 3 keys
                  long Branch[4];   // array of fake pointers (record numbers)
              } NodeType;
Order & Key of a B-Tree




The following is an example of a B-tree of order 5. This means
that (other than the root node) all internal nodes have at least 3
children (and hence at least 2 keys). Of course, the maximum
number of children that a node can have is 5 (so that 4 is the
maximum number of keys). In practice B-trees usually have
orders a lot bigger than 5. In the figure, the first row in each node shows the
keys, while the second row shows the pointers to the child nodes.
Height of B-Tree

   If n ≥ 1, then for any n-key B-tree T of height h and
    minimum degree t ≥ 2, $h \le \log_t \frac{n+1}{2}$.

   The height of a B-tree with n keys is important because it bounds
    the number of disk accesses.

   The height of the tree is maximum when each node has the
    minimum number of subtree pointers, $q = \lceil m/2 \rceil$.

   Note: If the number of keys in a B-tree is 2,000,000 (2
    million) and m = 200, then the maximum height of the B-tree is 3
    (counting edges below the root), whereas a balanced binary tree
    would have a height of about 20.
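
The arithmetic behind this note can be checked with a short sketch. The helper name `btree_max_height` is hypothetical and not part of the slides. It assumes the bound $q^h \le (n+1)/2$ with $q = \lceil m/2 \rceil$, counts height in edges below the root, and uses integer arithmetic only.

```c
#include <assert.h>

/* Hypothetical helper (not from the slides): largest height h, counted in
   edges below the root, that a B-tree of order m holding n keys can reach.
   It relies on the bound n >= 2*q^h - 1, i.e. q^h <= (n + 1) / 2,
   where q = ceil(m / 2) is the minimum number of children. */
int btree_max_height(unsigned long long n, unsigned m)
{
    assert(n >= 1 && m >= 3);
    unsigned long long q = (m + 1) / 2;      /* ceil(m / 2) */
    unsigned long long limit = (n + 1) / 2;  /* q^h must not exceed this */
    unsigned long long power = 1;            /* q^h for the current h */
    int h = 0;
    while (power <= limit / q) {             /* overflow-safe test for power * q <= limit */
        power *= q;
        h++;
    }
    return h;
}
```

For n = 2,000,000 and m = 200 the loop multiplies by q = 100 three times before exceeding 1,000,000, which gives the height of 3 quoted in the note.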
Search in a B-Tree

   Search in a B-tree is similar to search in a BST, except that at
    each node we make a multiway branching decision instead of the
    binary branching decision made in a BST.


                                        25 62



            12 19                     32 39                     73 84


    3   5    15 17    21 23   30 31    34 37    45 51   69 71    75 79   90 94




                                      Search key 71
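
The multiway branching decision can be sketched as follows. This is a minimal in-memory illustration, not the disk-based layout of the declaration slide: the `BNode` type, the `MAX_KEYS` constant, and the function name `btree_search` are assumptions made for the example, and child links are ordinary pointers instead of record numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_KEYS 4  /* assumed order 5: at most 4 keys and 5 children */

/* Hypothetical in-memory node used only for this sketch. */
typedef struct BNode {
    int count;                          /* number of keys in use */
    int key[MAX_KEYS];                  /* keys in ascending order */
    struct BNode *child[MAX_KEYS + 1];  /* all NULL in a leaf */
} BNode;

/* Returns true if target is stored in the subtree rooted at node. */
bool btree_search(const BNode *node, int target)
{
    while (node != NULL) {
        int i = 0;
        /* Multiway branching: skip every key smaller than the target. */
        while (i < node->count && target > node->key[i])
            i++;
        if (i < node->count && target == node->key[i])
            return true;            /* found in this node */
        node = node->child[i];      /* descend; NULL when node is a leaf */
    }
    return false;
}
```

Searching for key 71 in a tree whose root holds 25 and 62 skips both root keys and descends into the third child, where the same scan is repeated.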
B-Tree Insert Operation
   Insertion in a B-tree is more complicated than in a BST.

   In a BST, keys are added in top-down fashion, which can
    result in an unbalanced tree.

   A B-tree is built bottom up: keys are added in a
    leaf node. If the leaf node is full, another node is
    created, the keys are evenly distributed, and the middle key
    is promoted to the parent. If the parent is full, the
    process is repeated.

   A B-tree can also be built in top-down fashion using the
    pre-splitting technique.
Basic Idea : Insertion
   1. Find the position for the key in the appropriate leaf node.
   2. Is the node full?
      • No: insert the key in order and adjust the pointers.
      • Yes: split the node:
        ◦ Create a new node.
        ◦ Move half of the keys from the full node to
          the new node and adjust pointers.
        ◦ Promote the median key (before split)
          to the parent.
        ◦ If the parent is full, repeat the split on the parent.
      A split guarantees that each node has at least $\lceil m/2 \rceil - 1$
      keys.
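
The split step can be sketched for a leaf as follows. The `Leaf` type, the `ORDER` constant, and the function name `leaf_split_insert` are assumptions made for this example. It also assumes distinct integer keys and leaves the insertion of the promoted key into the parent to the caller.

```c
#include <assert.h>
#include <string.h>

#define ORDER 5               /* assumed m: maximum number of children */
#define MAX_KEYS (ORDER - 1)  /* maximum number of keys per node */

/* Hypothetical leaf type for this sketch (no child pointers needed). */
typedef struct {
    int count;
    int key[MAX_KEYS];
} Leaf;

/* Inserts new_key into a full leaf by splitting it. On return, leaf holds
   the lower keys and right holds the upper keys. The return value is the
   median key, which the caller must insert into the parent. */
int leaf_split_insert(Leaf *leaf, int new_key, Leaf *right)
{
    assert(leaf->count == MAX_KEYS);
    int all[ORDER];           /* the old keys plus the new key, in order */
    int i = 0, j = 0;
    while (i < MAX_KEYS && leaf->key[i] < new_key)
        all[j++] = leaf->key[i++];
    all[j++] = new_key;
    while (i < MAX_KEYS)
        all[j++] = leaf->key[i++];

    int mid = ORDER / 2;      /* index of the median among the m keys */
    leaf->count = mid;
    memcpy(leaf->key, all, (size_t)mid * sizeof all[0]);
    right->count = ORDER - mid - 1;
    memcpy(right->key, all + mid + 1, (size_t)right->count * sizeof all[0]);
    return all[mid];
}
```

With order 5, both halves end up with 2 keys, which matches the minimum of $\lceil m/2 \rceil - 1$ keys per node.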
Cases in B-Tree Insert Operation


   In B-tree insertion we have the following
    cases:
    ◦ Case 1: The leaf node has room for the new
      key.
    ◦ Case 2: The leaf in which the key is to be placed is
      full.
      This case can lead to an increase in tree height.
B-Tree Insert Operation

 Case 1: The leaf node has room for the new key.

                                              Find appropriate leaf
      Insert 3                                   node for key 3
                          3
                               10 25



              5   8            14 19 20 23   32 38

           Insert 3 in order
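
Case 1 amounts to an ordered insertion into an array that still has a free slot. The sketch below assumes a hypothetical `Leaf` type for a tree of order 5 and a hypothetical function name `leaf_insert`. It is an illustration, not the deck's own code.

```c
#include <assert.h>

#define MAX_KEYS 4  /* assumed order 5: at most 4 keys per node */

/* Hypothetical leaf type for this sketch. */
typedef struct {
    int count;
    int key[MAX_KEYS];
} Leaf;

/* Inserts new_key into a leaf that still has room, keeping keys sorted. */
void leaf_insert(Leaf *leaf, int new_key)
{
    assert(leaf->count < MAX_KEYS);
    int i = leaf->count;
    /* Shift larger keys one slot to the right to open a gap. */
    while (i > 0 && leaf->key[i - 1] > new_key) {
        leaf->key[i] = leaf->key[i - 1];
        i--;
    }
    leaf->key[i] = new_key;
    leaf->count++;
}
```

Inserting 3 into the leaf holding 5 and 8 shifts both keys right and places 3 in the first slot.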
B-Tree Insert Operation
        Case 2: The leaf in which the key is to be placed is full.

            Insert 16: find the appropriate leaf node for key 16.

                                        10 25



                      3   5   8         14 19 20 23   32 38

            No room for key 16 in the leaf node. Split node: create a new node,
            move keys to the new node, and move the median key 19 up.
            Insert key 19 in the parent node in order.

                                        10 19 25



                      3   5   8      14 16      20 23      32 38
B-Tree Insert Operation


   Case 2: The leaf in which the key is to be placed is full,
    and this leads to an increase in tree height.




                              45   55   67   81
B-Tree Insert Operation
       Case 2: The height of the tree increases.
   Insert 16

                                        45   55   67   81


       13   27   33   38        48   52        57   61        72   77        86   92


  9   12        14   19   20   23        29   31        35   36        41   42

   (Only the leaves below the node 13 27 33 38 are shown.)

   1. No room for key 16 in the leaf 14 19 20 23: move the median key 19 up
      and split the node into 14 16 and 20 23.
   2. Insert 19 in the parent node in order: no room for 19 in the parent
      13 27 33 38, so split the parent node and move its median key 27 up.
   3. Insert 27 in the parent in order: no room for 27 in the root
      45 55 67 81, so split the node and move the median key 55 up into a
      new root. The height of the tree increases.
B-Tree Delete Operation
   Deletion is analogous to insertion, but a
    little more complicated.
   Two major cases
    ◦ Case 1: Deletion from leaf node
    ◦ Case 2: Deletion from non-leaf node
      Apply the delete-by-copy technique used in BST; this
       reduces this case to case 1.
      In delete by copy, the key to be deleted is
       replaced by the largest key in its left subtree or the
       smallest key in its right subtree (which is always in a leaf).
B-Tree Delete Operation

   Leaf node deletion cases:
    ◦ After deletion node is at least half full.
    ◦ After deletion underflow occurs
      Redistribute: if a sibling holds more than $\lceil m/2 \rceil - 1$ keys.

      Merge nodes: if the siblings hold only the minimum, $\lceil m/2 \rceil - 1$ keys.
      Merging can lead to a decrease in tree height.
B-Tree Delete Operation

   After deletion node is at least half full. (inverse of insertion
    case 1)

                                                       Search key 3

                                   10 25



                  3   5   8        14 19             32 38 40 45

                 Key found, delete key 3.
                 Move the other keys in the node to eliminate
                 the gap.
B-Tree Delete Operation

   Underflow occurs: evenly redistribute the keys if the left or right
    sibling has more than $\lceil m/2 \rceil - 1$ keys.


        Delete 14                                      Search key 14

                               10 25



                    5   8      14 19             32 38 40 45


                             Underflow occurs: evenly redistribute the keys
                             of the underflow node, its sibling,
                             and the separator key.
B-Tree Delete Operation

   Underflow occurs and the left and right siblings each hold exactly
      $\lceil m/2 \rceil - 1$ keys. Merge the underflow node and a sibling.


        Delete 25

                             10 32



                    5   8    19 25            38 40


                             Underflow occurs, merge nodes: move the separator
                             key down, move the keys to the underflow node, and
                             discard the sibling.
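
The merge step can be sketched for two leaves as follows. The `Leaf` type and the function name `merge_leaves` are assumptions made for this example. Removing the separator key and the pointer to the discarded sibling from the parent is left to the caller.

```c
#include <assert.h>
#include <string.h>

#define MAX_KEYS 4  /* assumed order 5: at most 4 keys per node */

/* Hypothetical leaf type for this sketch. */
typedef struct {
    int count;
    int key[MAX_KEYS];
} Leaf;

/* Merges right into left: the separator key from the parent moves down
   between the two key runs, and right can then be discarded. */
void merge_leaves(Leaf *left, int sep, const Leaf *right)
{
    assert(left->count + 1 + right->count <= MAX_KEYS);
    left->key[left->count++] = sep;   /* separator moves down */
    memcpy(&left->key[left->count], right->key,
           (size_t)right->count * sizeof right->key[0]);
    left->count += right->count;
}
```

In the example on this slide, deleting 25 leaves a node holding only 19. Merging it into the left sibling 5 8 with separator 10 produces the node 5 8 10 19, and the parent keeps the single key 32.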
B-Tree Delete Operation

           Underflow occurs, height decreases after merging.
             Delete 21
                                          70

                   8   32                                  79 85



3       5          21 27            47 66       73 75 78   81 83   88 90 92

                  Underflow occurs in the leaf: merge nodes by moving the
                  separator key and the keys in the sibling node to the
                  underflow node.
                  Underflow then occurs in the parent: merge nodes again,
                  which moves the root key 70 down and decreases the height.
B-Tree V/s Binary Tree
             Advantages
 Efficient in real-life problems where the
  number of records is very large (i.e.
  large datasets)
 Frees up RAM, as nodes can be kept
  on secondary memory
 A B-tree reduces the depth of the tree;
  hence, the desired record is located
  faster

           Disadvantages
 The decision process at each node is
  more complicated in a B-tree
 A more sophisticated program is required
  to execute the operations in a B-tree

                                          Fig. Comparison of linear growth rate vs.
                                          logarithmic growth rate
The End
Insert Algorithm

Insert
• Cannot just create a new leaf node and insert it
– the resulting tree is not a B-tree
• Insert the new key into an existing leaf node
• If the leaf node is full
– Split the full node y (with 2t−1 keys) around its median
key $key_t[y]$ into two nodes, each having t−1 keys
– Move the median key into y’s parent.
– If the parent is full, recursively split, all the way to the
  root node if necessary.
– If the root is full, split the root; the height of the tree increases by
  one.
Delete Algorithm

• If k is in an internal node, swap k with its inorder
successor (in a leaf node), then delete k from the
leaf node.
• Deleting k from a leaf x may cause n[x] < t−1.
– If the left sibling has more than t−1 elements, we can
transfer an element from there to retain the property
n[x] ≥ t−1. To retain the order of the elements, this is
done by moving the largest element in the left sibling
to the parent and moving the parent element to the leftmost
position in x.
Delete Algorithm

– Else, if the right sibling has more than t−1 elements,
transfer from the right sibling through the parent.
– Else, merge x with the left sibling. One pointer
from the parent needs to be removed in this
case. This is done by moving the parent
element into the new merged node. If the parent
now has fewer than t−1 elements, recurse on the
parent.
• The height of the tree may be reduced by 1 if the
root contains no element after the delete.
• Deletion can also be done in one pass down, similar
to insert (see textbook).
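
The transfer from the left sibling through the parent can be sketched for leaves as follows. The `Leaf` type, the constants, and the function name `borrow_from_left` are assumptions made for this example, with t = 3 so that a non-root node needs at least t−1 = 2 keys. The parent's separator key is passed by pointer so the sketch does not need a full node type.

```c
#include <assert.h>
#include <string.h>

#define MAX_KEYS 4  /* assumed t = 3: between 2 and 4 keys per non-root node */
#define MIN_KEYS 2

/* Hypothetical leaf type for this sketch. */
typedef struct {
    int count;
    int key[MAX_KEYS];
} Leaf;

/* Moves one key from left into x through the parent. sep points at the
   parent key that separates left from x. */
void borrow_from_left(Leaf *left, int *sep, Leaf *x)
{
    assert(left->count > MIN_KEYS && x->count < MAX_KEYS);
    /* Shift the keys of x right to free its leftmost position. */
    memmove(&x->key[1], &x->key[0], (size_t)x->count * sizeof x->key[0]);
    x->key[0] = *sep;                    /* parent key moves down into x */
    x->count++;
    *sep = left->key[left->count - 1];   /* largest key of left moves up */
    left->count--;
}
```

The transfer from the right sibling is symmetric: the parent key is appended to x, and the smallest key of the right sibling moves up.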
Insertion
Deletion
Height of B-Tree
   The height of a B-tree is maximum if all nodes have the minimum
    number of keys:

    1 key in the root + 2(q−1) keys on the second level + … + $2q^{h-2}(q-1)$ keys in
      the leaves (level h).

     $$n \ge 1 + 2(q-1) + 2q(q-1) + \dots + 2q^{h-2}(q-1) = 1 + (q-1)\sum_{i=0}^{h-2} 2q^{i}$$

     Applying the formula of geometric progression:

     $$1 + 2(q-1)\,\frac{q^{h-1}-1}{q-1} = -1 + 2q^{h-1}$$

     Thus, the number of keys in a B-tree of height h satisfies:

     $$n \ge -1 + 2q^{h-1} \quad\Longrightarrow\quad h \le \log_q \frac{n+1}{2} + 1$$
Height of B-Tree
   The height of a B-tree is minimum if all nodes are full, thus we have
    m−1 keys in the root + m(m−1) keys on the second level + … + $m^{h-1}(m-1)$ keys in the leaf nodes.

          $$n \le (m-1) + m(m-1) + m^{2}(m-1) + \dots + m^{h-1}(m-1) = \sum_{i=0}^{h-1} (m-1)m^{i} = (m-1)\sum_{i=0}^{h-1} m^{i}$$

          Applying the formula of geometric progression:

          $$(m-1)\,\frac{m^{h}-1}{m-1} = m^{h} - 1$$

          Thus, the number of keys in a B-tree of height h satisfies:

          $$n \le m^{h} - 1 \quad\Longrightarrow\quad h \ge \log_m (n+1)$$

          Combining both bounds:

          $$\log_m (n+1) \le h \le \log_q \frac{n+1}{2} + 1$$
Height of B-Tree

   Note: Order m is chosen so that B-tree node size is
    nearly equal to the disk block size.


Editor's Notes

  1. A B-tree is a specialized multiway tree designed especially for use on disk. In a B-tree each node may contain a large number of keys. The number of subtrees of each node, then, may also be large. A B-tree is designed to branch out in this large number of directions and to contain a lot of keys in each node so that the height of the tree is relatively small. This means that only a small number of nodes must be read from disk to retrieve an item. The goal is to get fast access to the data, and with disk drives this means reading a very small number of records. Note that a large node size (with lots of keys in the node) also fits with the fact that with a disk drive one can usually read a fair amount of data at once.
  2. Insert the following letters into what is originally an empty B-tree of order 5: C N G A H E K Q M F W L T Z D P R X Y S. Order 5 means that a node can have a maximum of 5 children and 4 keys. All nodes other than the root must have a minimum of 2 keys. The first 4 letters get inserted into the same node, which then holds A C G N. When we try to insert the H, we find no room in this node, so we split it into 2 nodes, moving the median item G up into a new root node. Note that in practice we just leave the A and C in the current node and place the H and N into a new node to the right of the old one. Inserting E, K, and Q proceeds without requiring any splits. Inserting M requires a split. Note that M happens to be the median key and so is moved up into the parent node.
  3. In the B-tree as we left it at the end of the last section, delete H. Of course, we first do a lookup to find H. Since H is in a leaf and the leaf has more than the minimum number of keys, this is easy. We move the K over where the H had been and the L over where the K had been. Next, delete the T. Since T is not in a leaf, we find its successor (the next item in ascending order), which happens to be W, and move W up to replace the T. That way, what we really have to do is to delete W from the leaf, which we already know how to do, since this leaf has extra keys. In ALL cases we reduce deletion to a deletion in a leaf by using this method. Next, delete R. Although R is in a leaf, this leaf does not have an extra key; the deletion results in a node with only one key, which is not acceptable for a B-tree of order 5. If the sibling node to the immediate left or right has an extra key, we can then borrow a key from the parent and move a key up from this sibling. In our specific case, the sibling to the right has an extra key. So, the successor W of S (the last key in the node where the deletion occurred) is moved down from the parent, and the X is moved up. (Of course, the S is moved over so that the W can be inserted in its proper place.) Finally, let's delete E. This one causes lots of problems. Although E is in a leaf, the leaf has no extra keys, nor do the siblings to the immediate right or left. In such a case the leaf has to be combined with one of these two siblings. This includes moving down the parent's key that was between those of these two leaves. In our example, let's combine the leaf containing F with the leaf containing A C. We also move down the D.