Sudan University of Science and Technology
          College of Graduate Studies
          MSc in Computer Science



               B-Trees
           Space & time tradeoffs

Presented by
Mohamed Zeinelabdeen Abdelgader
Outline

   Space & time tradeoffs
   B-tree

   Definition

   Search in B-Tree




Space & time tradeoffs
Two varieties of space-for-time tradeoff algorithms:
   Input enhancement: preprocess the input (or part of it) to
    store information that is used later in solving the problem
        e.g., string searching algorithms

   Prestructuring: preprocess the input to make accessing its
    elements easier
        e.g., indexing schemes such as B-trees
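As a small illustration of prestructuring, the sketch below sorts a list of keys once so that later lookups can use binary search instead of a linear scan. The data and the function name `contains` are made up for this example; a B-tree applies the same idea to data kept on disk.

```python
from bisect import bisect_left

def contains(sorted_keys, key):
    """Binary search in a presorted list: O(log n) comparisons per lookup."""
    i = bisect_left(sorted_keys, key)
    return i < len(sorted_keys) and sorted_keys[i] == key

# Prestructuring step: pay O(n log n) time once, up front.
keys = sorted(["T", "Q", "X", "B", "M"])

print(contains(keys, "Q"))  # True
print(contains(keys, "R"))  # False
```

The extra space here is the sorted copy of the keys; the time saved is on every lookup after the first.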
B-tree
   The B-tree was created by R. Bayer and E.
    McCreight. The most common belief is
    that B stands for balanced, as all the leaf
    nodes are at the same level in the tree. B
    may also stand for Bayer, or for Boeing,
    because they were working for Boeing
    Scientific Research Labs at the time.
Definition

[Figure: example node x with n[x] = 3, leaf[x] = FALSE, and keys key_i[x] = Q, T, X for i = 1..n[x]]

 A B-tree T is a rooted tree having the following
   properties:
  1 Every node x has the following fields:
      n[x], the number of keys currently stored in node x
      The n[x] keys themselves, stored in nondecreasing
       order, so that key_1[x] ≤ key_2[x] ≤ … ≤ key_{n[x]}[x]
      leaf[x], a boolean value that is TRUE if x is a leaf and
       FALSE if x is an internal node
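The fields of property 1 map naturally onto a small record type. The following is a minimal sketch, assuming an in-memory node with string keys; the class name `BTreeNode` and the method `keys_in_order` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class BTreeNode:
    """One B-tree node holding the fields of property 1."""
    keys: List[str] = field(default_factory=list)  # key_1[x] .. key_n[x]
    leaf: bool = True                              # leaf[x]

    @property
    def n(self) -> int:
        """n[x]: the number of keys currently stored in the node."""
        return len(self.keys)

    def keys_in_order(self) -> bool:
        """Check key_1[x] <= key_2[x] <= ... <= key_n[x]."""
        return all(a <= b for a, b in zip(self.keys, self.keys[1:]))

# The node from the figure: three keys, internal node.
x = BTreeNode(keys=["Q", "T", "X"], leaf=False)
print(x.n, x.leaf, x.keys_in_order())  # 3 False True
```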
Definition (cont.)

[Figure: node x with keys Q, T, X (key_i[x], i = 1..n[x]) and child pointers c_i[x], i = 1..n[x]+1]

 2 Each internal node x also contains n[x]+1
  pointers c_1[x], c_2[x], …, c_{n[x]+1}[x] to its children.
  Leaf nodes have no children, so their c_i fields are
  undefined.
 3 The keys key_i[x] separate the ranges of keys
  stored in each subtree: if k_i is any key stored in
  the subtree with root c_i[x], then
  k_1 ≤ key_1[x] ≤ k_2 ≤ key_2[x] ≤ … ≤ key_{n[x]}[x] ≤ k_{n[x]+1}
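Property 3 can be checked mechanically for a single node. The sketch below is only an illustration: the function name `keys_separate_subtrees` and the child key lists are made up, and each subtree is represented simply as a flat list of the keys it contains.

```python
def keys_separate_subtrees(node_keys, child_key_lists):
    """Check k_1 <= key_1 <= k_2 <= ... <= key_n <= k_{n+1} for one node.

    node_keys:       the n keys of an internal node, in nondecreasing order
    child_key_lists: n + 1 lists, the keys stored under each child
    """
    if len(child_key_lists) != len(node_keys) + 1:
        return False  # an internal node needs exactly n[x] + 1 children
    for i, subtree_keys in enumerate(child_key_lists):
        # Child i (0-based) lies between node_keys[i-1] and node_keys[i],
        # when those neighbours exist.
        low = node_keys[i - 1] if i > 0 else None
        high = node_keys[i] if i < len(node_keys) else None
        for k in subtree_keys:
            if low is not None and k < low:
                return False
            if high is not None and k > high:
                return False
    return True

node_keys = ["Q", "T", "X"]
children = [["B", "M"], ["R", "S"], ["U", "W"], ["Y", "Z"]]  # made-up subtrees
print(keys_separate_subtrees(node_keys, children))                      # True
print(keys_separate_subtrees(node_keys, [["B"], ["U"], ["R"], ["Z"]]))  # False
```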
Definition (cont.)

   4 All leaves have the same depth, which is
    the tree’s height h.
   5 There are lower and upper bounds on
    the number of keys a node can
    contain. These bounds can be expressed in
    terms of a fixed integer t ≥ 2 called the
    minimum degree of the B-tree:
Definition (cont.)

    Every node other than the root must have
     at least t-1 keys. Every internal node other
     than the root thus has at least t children. If
     the tree is nonempty, the root must have
     at least one key.
    Every node can contain at most 2t-1 keys.
     Therefore, an internal node can have at most
     2t children. We say that a node is full if it
     contains exactly 2t-1 keys.
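The bounds in property 5, together with the equal-depth rule of property 4, can be checked with a short traversal. This is a sketch under stated assumptions: `Node`, `is_full`, and `check_degree_bounds` are hypothetical helpers, nodes live in memory, and the tree is assumed to be nonempty.

```python
class Node:
    """Minimal in-memory B-tree node (hypothetical helper for this sketch)."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # empty for a leaf

    @property
    def leaf(self):
        return not self.children

def is_full(node, t):
    """A node is full if it contains exactly 2t - 1 keys."""
    return len(node.keys) == 2 * t - 1

def check_degree_bounds(root, t):
    """True if every node respects the key-count bounds for minimum degree t
    and all leaves are at the same depth."""
    leaf_depths = set()

    def visit(node, depth, is_root):
        n = len(node.keys)
        min_keys = 1 if is_root else t - 1
        if not (min_keys <= n <= 2 * t - 1):
            return False
        if node.leaf:
            leaf_depths.add(depth)
            return True
        if len(node.children) != n + 1:
            return False
        return all(visit(c, depth + 1, False) for c in node.children)

    return visit(root, 0, True) and len(leaf_depths) == 1

root = Node(["M"], [Node(["B", "F"]), Node(["Q", "T", "X"])])
print(check_degree_bounds(root, 2))    # True
print(is_full(root.children[1], 2))    # True: 3 keys = 2t - 1 for t = 2

bad = Node(["M"], [Node(["B"]), Node(["Q", "T", "X"])])
print(check_degree_bounds(bad, 3))     # False: a non-root node has fewer than t - 1 = 2 keys
```

With t = 2, every non-root node holds 1 to 3 keys and every internal node has 2 to 4 children.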
Search-overview
      The search operation on a B-tree is
       analogous to a search on a binary search tree.
      Instead of choosing between a left and a
       right child as in a binary search tree, a B-tree
       search must make an (n[x]+1)-way choice. The
       correct child is chosen by performing a linear
       search of the keys in the node.
Search-example




Search for key R

Search analysis

    The keys of the node are scanned from left to
     right. At the first key greater than or equal to
     the desired key, the search stops if the two are
     equal; otherwise the child pointer to the
     immediate left of that key is followed. If all
     keys are less than the desired key, the
     rightmost child pointer is followed.
    Of course, the search terminates as soon as
     the desired key is found. If a leaf is reached
     without finding it, the key is not in the tree.
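The rule for picking a child pointer can be sketched for a single node as follows. The function name `choose_child` and its return convention are assumptions made for this example, and indices are 0-based, unlike the 1-based notation on the slides.

```python
def choose_child(keys, k):
    """Linear scan of one node's keys.

    Returns ("found", i) if keys[i] == k, otherwise ("descend", i), where i is
    the 0-based index of the child pointer to follow.
    """
    i = 0
    while i < len(keys) and k > keys[i]:
        i += 1                      # skip keys smaller than k
    if i < len(keys) and keys[i] == k:
        return ("found", i)
    return ("descend", i)           # i == len(keys) means the rightmost child

keys = ["Q", "T", "X"]
print(choose_child(keys, "R"))  # ('descend', 1): the pointer just left of T
print(choose_child(keys, "T"))  # ('found', 1)
print(choose_child(keys, "Z"))  # ('descend', 3): the rightmost pointer
```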
Thm: Let T be a B-tree with n > 2 keys and minimum degree t ≥ 2.
Then the height h of the B-tree is bounded above by
    h ≤ log_t((n+1)/2)
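The standard height bound h ≤ log_t((n+1)/2) can be justified by counting the fewest keys a B-tree of height h can hold. A sketch of the usual argument, using the symbols n, t, and h from the theorem: the root has at least one key and every other node at least t − 1 keys, so there are at least 2 nodes at depth 1, at least 2t at depth 2, and in general at least 2t^(i−1) at depth i.

```latex
n \;\ge\; 1 + (t-1)\sum_{i=1}^{h} 2t^{\,i-1}
  \;=\; 1 + 2(t-1)\,\frac{t^{h}-1}{t-1}
  \;=\; 2t^{h} - 1
\quad\Longrightarrow\quad
t^{h} \le \frac{n+1}{2}
\quad\Longrightarrow\quad
h \le \log_{t}\frac{n+1}{2}.
```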
Algorithm

  B-TREE-SEARCH(x, k)
   i ← 1
   while i ≤ n[x] and k > key_i[x]
       do i ← i + 1
   if i ≤ n[x] and k = key_i[x]
       then return (x, i)
   if leaf[x]
       then return NIL
       else DISK-READ(c_i[x])
            return B-TREE-SEARCH(c_i[x], k)
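A runnable version of the search procedure might look like the sketch below. It assumes nodes held in memory, so `disk_read` is only a stand-in that counts how many child nodes are fetched; `Node`, `disk_read`, and the example tree are hypothetical, and indices are 0-based rather than 1-based as in the pseudocode.

```python
class Node:
    """In-memory stand-in for a B-tree node (an assumption of this sketch)."""
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted keys of this node
        self.children = children or []    # len(keys) + 1 children, or none for a leaf

disk_reads = 0

def disk_read(node):
    """Stand-in for DISK-READ: a real B-tree would fetch the node's page here."""
    global disk_reads
    disk_reads += 1
    return node

def b_tree_search(x, k):
    """Return (node, index) with node.keys[index] == k, or None if k is absent."""
    i = 0
    while i < len(x.keys) and k > x.keys[i]:
        i += 1
    if i < len(x.keys) and k == x.keys[i]:
        return (x, i)
    if not x.children:          # leaf: nowhere left to look
        return None
    child = disk_read(x.children[i])
    return b_tree_search(child, k)

# Made-up two-level tree with minimum degree t = 2.
root = Node(["M"], [Node(["B", "F"]), Node(["Q", "R", "T"])])

node, i = b_tree_search(root, "R")
print(node.keys, i, disk_reads)    # ['Q', 'R', 'T'] 1 1
print(b_tree_search(root, "C"))    # None
```

Each recursive call costs one disk read, so the number of reads is at most the height of the tree.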
Algorithm analysis
   Input size = n, the number of keys in the tree.
   Basic operation =
         DISK-READ(c_i[x])
   Cases:
       Best case: the key is found in the current node.
            T(n) ∈ Θ(1).
       Average case: the key is not found in the
        current node, so the search continues in a child.
Algorithm analysis

     The recurrence relation:
          T(n) = T(n/t) + Θ(1), where t ≥ 2

     To solve the recurrence, use the Master
      Theorem:
            T(n) = aT(n/b) + f(n), where f(n) ∈ Θ(n^d), a ≥ 1, b ≥ 2, d ≥ 0
                        If a < b^d, T(n) ∈ Θ(n^d)
                        If a = b^d, T(n) ∈ Θ(n^d log n)
                        If a > b^d, T(n) ∈ Θ(n^(log_b a))
      Here a = 1, b = t, d = 0, so a = b^d and

                        T(n) ∈ Θ(log n)
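As a cross-check on the Master Theorem result, the recurrence T(n) = T(n/t) + Θ(1) can also be unrolled directly. This is a sketch that assumes, for simplicity, that n is a power of t and writes c for the constant cost of one step.

```latex
T(n) = T\!\left(\frac{n}{t}\right) + c
     = T\!\left(\frac{n}{t^{2}}\right) + 2c
     = \cdots
     = T\!\left(\frac{n}{t^{k}}\right) + kc .
\qquad \text{With } k = \log_{t} n:\quad
T(n) = T(1) + c\,\log_{t} n \in \Theta(\log n).
```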
References
Introduction to the Design and Analysis of Algorithms