Buffer Trees - Utility and Applications for External Memory Data Processing
CSCI-B 561 Advanced Database Concepts Project Report
Milind Gokhale
mgokhale@indiana.edu
November 16, 2014
1 ABSTRACT
Nowadays, because of the very large amounts of data involved, dependence on external memory for
data processing has increased tremendously. However, there are not many generic external memory
tools designed for processing the data in a database on external memory.
This report focuses on the basics of the buffer tree and on some of its possible uses as a generic
tool for processing data in external memory. We introduce the bottleneck problem in external
memory data processing and the motivation for creating buffer trees. We then describe the buffer
tree data structure and the observations from experiments conducted on buffer trees in [1].
Finally, we list some possible applications of buffer trees and conclude.
2 INTRODUCTION
Today users have plenty of high-quality, high-resolution data available through various
technologies, and more data keeps being generated in many domains and fields. As a result, moving
huge data sets between the external memory and internal memory of computers has become
commonplace. However, there is a vast difference between data access speeds on internal memory
and external memory. Internal memory is very fast while external memory is about $10^5$ to $10^6$
times slower in performing random access than the main memory. This has resulted in the growing
demand for high performance input and output mechanisms to pass the huge data between fast
internal memory and slower external storage. The I/O bandwidth is a bottleneck in many large
scale applications like multimedia, GIS, land information seismic databases, virtual reality
applications, satellite imagery, digital libraries and real-time online applications.
3 MOTIVATION
Communication between internal memory and external memory is a bottleneck. The present
methodologies for addressing this issue are [1]:
1. Increasing secondary storage device parallelism - thus improving the bandwidth
between secondary memory and main memory.
2. Exploiting locality of reference via organization of the data and the processing sequence.
3. Overlapping I/O with computation, e.g. using prefetching.
Much work has been done on designing external versions of data structures originally designed for
internal memory. Most of these data structures are used in an on-line setting, where queries should
be answered immediately and within a good worst-case number of I/Os. Because they are used in the
on-line setting, they often do not take advantage of the available main memory [4].
In many cases, however, the problem or system operates in a batch setting, where similar
processing operations are performed on many data sets. Problems in which the sequence of
operations on the data structure is known in advance are called batched dynamic problems.
Bulk operations help in processing such batched dynamic problems. A bulk operation is a
collection of individual operations that are executed consecutively without being interrupted by
other requests [3]. In industry, bulk order processing, end-of-day job processing,
pre-provisioning of logical resources, and temporal and spatial database processing are some of
the biggest batched problems where bulk operations are performed. Although queries in batched
problems need not be answered instantly as in the on-line setting, there are often tight service
level agreements to finish processing an enormous number of records in a rather short time.
An important paradigm for batched problems in the internal memory setting is to use a dynamic
data structure to process a sequence of updates [2]. For example, to sort N items, we can insert
them one by one into a priority queue and then perform a sequence of N deletemin operations.
However, if the same paradigm is used naively in the External Memory (EM) setting, with a B-tree
as the dynamic data structure, it results in sub-optimal I/O performance [2]. For example, if
we use the B-tree as a priority queue in sorting, each update and query operation takes
$O(\log_B N)$ I/Os, resulting in a total of $O(N \log_B N)$ I/Os, which is larger than the optimal
sorting bound $Sort(N) = \Theta(n \log_m n)$ (with n = N/B and m = M/B as defined in Section 4) by
a substantial factor of roughly B. L. Arge developed an elegant buffer tree data structure to
support batched dynamic operations, such as those arising in sweep-line algorithms, where the
queries do not have to be answered right away or in any particular order [4].
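To make the gap of roughly B concrete, the following is a minimal sketch that evaluates both bounds with constants dropped. The function names and the parameter values are assumptions chosen for illustration, not measurements from any of the cited works.

```python
import math

def btree_pq_sort_ios(N: int, B: int) -> float:
    """N B-tree operations at about log_B(N) I/Os each (constants dropped)."""
    return N * math.log(N, B)

def sort_bound_ios(N: int, M: int, B: int) -> float:
    """The sorting bound n * log_m(n) with n = N/B and m = M/B (constants dropped)."""
    n = N / B
    m = M / B
    return n * math.log(n, m)

# Hypothetical sizes, all counted in elements.
N, M, B = 10**9, 10**6, 10**3
naive = btree_pq_sort_ios(N, B)
optimal = sort_bound_ios(N, M, B)
ratio = naive / optimal
print(f"B-tree paradigm: {naive:.3g} I/Os, sorting bound: {optimal:.3g} I/Os, ratio: {ratio:.3g}")
```

For these assumed values the ratio is on the order of B, which is the substantial factor mentioned above.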
4 BUFFERING METHOD AND BUFFER TREE
The I/O Model being used: We list the terms that we use to denote various components of the
I/O model.
i. N = number of elements in the problem instance.
ii. M = number of elements that can fit into the main memory.
iii. B = number of elements per block.
iv. n = N/B = number of blocks in the problem
v. m = M/B = number of blocks that fit into the main memory
vi. M < N
vii. 0 < B < M/2
viii. An I/O operation is a swap of B elements from the internal memory with B consecutive
elements from external memory.
4.1 WHAT IS THE BUFFERING TECHNIQUE?
The main idea of the buffering technique is to perform operations on an external tree data
structure in a lazy manner [4]. This is achieved by associating main-memory-sized buffers with the
internal nodes of the tree: we assign a buffer of m blocks to each internal node of the
structure. When inserting an element, we do not search down the tree for the relevant leaf right
away. Instead, we wait until we have collected a block of insertions (or other operations) and
then insert this block into the buffer of the root. When a buffer runs full, a buffer overflow is
said to have occurred, and the buffer emptying process pushes the elements in the buffer one level
down to the buffers on the next level of the tree [4]. As a result of this laziness, several
insertions and deletions of the same element can be present in the tree at the same time [4].
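The lazy collection step can be sketched as follows. This is a toy model under stated assumptions: the class name `RootBufferSketch` is hypothetical, a Python list stands in for blocks on disk, and overflow handling is left out entirely. It only shows that N insertions reach the root buffer in about N/B block writes.

```python
class RootBufferSketch:
    """Toy model: collect requests in memory and write them to the root buffer one block at a time."""

    def __init__(self, B: int):
        self.B = B              # elements per block
        self.pending = []       # partial block held in main memory
        self.root_buffer = []   # list of full blocks; stands in for blocks on disk
        self.ios = 0            # number of simulated block writes

    def insert(self, key) -> None:
        self.pending.append(key)
        if len(self.pending) == self.B:
            # One simulated I/O moves a whole block of B requests to the root buffer.
            self.root_buffer.append(self.pending)
            self.pending = []
            self.ios += 1

buf = RootBufferSketch(B=4)
for key in range(10):
    buf.insert(key)
print(buf.ios, len(buf.pending))  # two full blocks written, two requests still in memory
```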
4.2 BUFFER TREE
4.2.1 Data Structure
Figure 1
The buffer tree is an (a,b)-tree with a = m/4 and b = m, built over n
leaves containing B elements each, and extended with a
buffer of size m blocks attached to each node [3]. All leaves
are on the same level, and every node has a fan-out
between a = m/4 and b = m. Internal
nodes are the nodes that do not have leaves as
children; all other nodes are called leaf nodes.
The height of the tree is $O(\log_m n)$ and
the structure uses $O(n)$ blocks of space for the N elements.
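The height bound follows from the minimum fan-out. As a short worked derivation, assuming every node except possibly the root has at least m/4 children (the root accounts for the additive constant), and writing h for the height:

```latex
\[
h \;\le\; \log_{m/4} n + O(1)
  \;=\; \frac{\ln n}{\ln m - \ln 4} + O(1)
  \;=\; \frac{\log_m n}{1 - \log_m 4} + O(1)
  \;\le\; 2 \log_m n + O(1)
  \qquad (m \ge 16),
\]
```

The last step uses $\log_m 4 \le 1/2$ for $m \ge 16$, so $h = O(\log_m n)$.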
Figure 2
4.2.2 Operations on Buffer Tree
a. Update:
A request element is created, consisting of the record to be inserted or deleted, a flag
indicating the type of operation, and an automatically generated timestamp [1]. Requests
are collected in internal memory until a block of B requests has been formed. This block of
request elements is then inserted into the buffer of the root using one I/O.
b. Buffer Emptying process:
If the buffer of the root contains fewer than m/2 blocks, nothing needs to be done. If there
are more than m/2 blocks in the root buffer, it is emptied using the buffer emptying process
[1]. A buffer emptying process at an internal node requires $O(m)$ I/Os, since we load m/2
blocks into internal memory and distribute the elements among the $\Theta(m)$ children of
that node. Throughout, the process maintains the invariant that
the buffers of the nodes on the path from the root to the leaf node with a full buffer are all
empty. The buffer emptying process is not applied to all the internal nodes recursively, but
rather along the path from the root to a leaf node. This prevents different
rebalancing operations from interfering with each other. The deletion of a block may
initiate several buffer emptying processes at the node involved. The buffer
emptying process can be protected from interference from other processes by using dummy
blocks [1].
c. Rebalancing:
A buffer emptying process at a leaf node may require rebalancing the underlying (a,b)-tree. An
(a,b)-tree is rebalanced by performing a series of "split" operations in the case of an insertion,
and "fuse" or "share" operations in the case of a deletion [1]. Before performing a rebalance
operation, we ensure that the buffers of the corresponding nodes are empty by running the
buffer emptying process at the nodes involved.
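The request element and the distribution step of buffer emptying described above can be sketched as follows. This is a simplified illustration, not the implementation from [1]: the names `Request`, `Node` and `empty_buffer` are hypothetical, everything lives in Python lists rather than disk blocks, and the m/2 threshold, dummy blocks and rebalancing are all omitted.

```python
import bisect
from dataclasses import dataclass, field
from itertools import count

@dataclass(frozen=True, order=True)
class Request:
    key: int          # record key
    timestamp: int    # automatically generated, orders requests on the same key
    op: str           # flag: "insert" or "delete"

@dataclass
class Node:
    splitters: list   # sorted routing keys; len(children) == len(splitters) + 1
    children: list    # child i receives keys in [splitters[i-1], splitters[i])
    buffer: list = field(default_factory=list)

def empty_buffer(node: Node) -> None:
    """Push every buffered request one level down, in (key, timestamp) order."""
    batch = sorted(node.buffer)
    node.buffer = []
    for req in batch:
        i = bisect.bisect_right(node.splitters, req.key)
        node.children[i].buffer.append(req)

leaves = [Node([], []) for _ in range(3)]
root = Node([10, 20], leaves)
clock = count()
for key, op in [(25, "insert"), (5, "insert"), (15, "insert"), (5, "delete"), (10, "insert")]:
    root.buffer.append(Request(key, next(clock), op))
empty_buffer(root)
for leaf in leaves:
    print([(r.key, r.op) for r in leaf.buffer])
```

Note that the insert and the later delete of key 5 travel down together and stay in timestamp order, which is what allows several operations on the same element to coexist in the tree.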
4.2.3 I/O complexity analysis of the buffer tree operations [1]
a. Update:
Each update element is assigned $O((\log_m n)/B)$ credits when it is inserted into the root buffer.
Each block in the buffer of a node v holds O(h) credits, where h is the height of the tree rooted at v.
b. Buffer Emptying:
A buffer emptying process at any node costs $O(m)$ I/Os. Ignoring the cost of rebalancing, the
total cost of all buffer emptying processes on internal nodes is bounded by $O(n \log_m n)$ I/Os.
c. Rebalancing:
The total number of rebalance operations required over k update operations on an initially
empty (a,b)-tree with b > 2a is bounded by k/(b/2 - a).
d. Theorem
The total cost of an arbitrary sequence of N intermixed insert and delete operations on an
initially empty buffer tree is $O(n \log_m n)$ I/Os, i.e. the amortized cost of an operation is
$O((\log_m n)/B)$ I/Os. The tree uses $O(n)$ space [4].
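The bounds above fit together as follows. This is a sketch of the standard accounting, assuming that a buffer emptying process moves $\Theta(m)$ blocks for its $O(m)$ I/Os and that each element is pushed down at most once per level:

```latex
\begin{align*}
\text{cost per element per level}
  &= \frac{O(m)\ \text{I/Os}}{\Theta(m)\ \text{blocks} \cdot B\ \text{elements per block}}
   = O\!\left(\frac{1}{B}\right), \\
\text{amortized cost per element}
  &= O\!\left(\frac{1}{B}\right) \cdot O(\log_m n)
   = O\!\left(\frac{\log_m n}{B}\right), \\
\text{total for } N \text{ operations}
  &= N \cdot O\!\left(\frac{\log_m n}{B}\right)
   = O(n \log_m n), \\
\text{rebalance operations for } a = \tfrac{m}{4},\ b = m
  &\le \frac{k}{b/2 - a} = \frac{k}{m/4} = \frac{4k}{m}.
\end{align*}
```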
5 EXPERIMENTS AND COMPARISONS
As reported in [1], several tests were performed on buffer trees in order to obtain meaningful
performance results. Factors such as contention with other processes for machine resources and
virtual memory effects were controlled.
5.1 COMPARISON TO QUICK SORT
Figure 3 [1]
Sorting with the buffer tree (Buffer Tree Sort, BTS) outperformed the built-in internal
memory quicksort on large inputs. Although the internal quicksort requires less running time for
smaller inputs, it ran out of memory for large input sets. According to the test results in [1],
shown in Figure 3, the internal quicksort failed for lack of internal memory on problem sizes
larger than 2.8 million items.
5.2 BUFFER TREE TUNING
Figure 4 Number of block pushes for different b [1]
Figure 5 Running time of BTS for different b [1]
The values of a and b in terms of m were found to be important for the performance of buffer
tree sort (BTS), as seen in Figure 4. Adjusting the parameter b reduces the number of I/O
operations performed by BTS. The values (a, b) = (m/32, m/8) gave the best performance in terms of
running time (almost linear). In addition to choosing good values of a and b, reducing the fan-out
increases the expected size of the data in a block push during the buffer emptying process, and
thus reduces the required number of I/O operations, as seen in Figure 5.
6 APPLICATIONS
Buffer trees can be used in several places. A buffer tree can serve as a subroutine in the
standard sweep algorithm to obtain an optimal external memory algorithm for orthogonal segment
intersection. Buffer trees can also be extended to implement segment trees in external memory in
a batched dynamic setting, by reducing the node degrees to $\Theta(\sqrt{m})$ and introducing
multi-slabs in each node. Buffer trees also provide a natural amortized implementation of
the priority queue for time-forward processing applications such as discrete event simulation,
sweeping, and list ranking.
6.1 SORTING
First, the N items are inserted into the buffer tree. Then a write (empty) operation is
performed: the buffer emptying process is started at the root of the buffer tree and carried all
the way down to the leaves. Finally, the leaves of the buffer tree are read sequentially from left
to right to obtain the elements in sorted order. This takes no more than the cost of building the
buffer tree, i.e. $O(n \log_m n)$ I/Os. Hence the corollary: N elements can be sorted in
$O(n \log_m n)$ I/O operations using the buffer tree [1]. In practice, however, other factors such
as CPU time also affect the running time.
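The three steps (insert everything, empty all buffers top-down, read the leaves left to right) can be illustrated with a deliberately simplified sketch. It is not a full buffer tree: the names `ToyNode`, `flush` and `toy_buffer_sort` are hypothetical, the routing keys are fixed in advance instead of being maintained by rebalancing, there is only one level of internal nodes, and each leaf is sorted in memory, which assumes a leaf is small enough to fit there.

```python
import bisect

class ToyNode:
    def __init__(self, splitters=None, children=None):
        self.splitters = splitters or []   # routing keys (internal nodes only)
        self.children = children or []     # empty list for leaves
        self.buffer = []                   # pending keys (internal) or stored keys (leaf)

def flush(node):
    """Empty buffers top-down so that every key ends up in a leaf."""
    if not node.children:
        node.buffer.sort()                 # assumption: a leaf fits in memory
        return
    for key in node.buffer:
        i = bisect.bisect_right(node.splitters, key)
        node.children[i].buffer.append(key)
    node.buffer = []
    for child in node.children:
        flush(child)

def leaves_left_to_right(node):
    if not node.children:
        yield from node.buffer
    else:
        for child in node.children:
            yield from leaves_left_to_right(child)

def toy_buffer_sort(keys, splitters):
    root = ToyNode(splitters, [ToyNode() for _ in range(len(splitters) + 1)])
    root.buffer.extend(keys)               # step 1: the N insertions
    flush(root)                            # step 2: the write/empty operation
    return list(leaves_left_to_right(root))  # step 3: read leaves left to right

print(toy_buffer_sort([7, 3, 9, 1, 5, 3], splitters=[4, 8]))
```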
6.2 PRIORITY QUEUES
In an ordinary search tree, the leftmost leaf contains the smallest element. In a buffer tree,
the smallest element need not be in the leftmost leaf, because it may still reside in a buffer. A
buffer tree can nevertheless be used to maintain a priority queue in external memory by allowing
update operations on the queue and adding a deletemin operation. To extract the minimum element,
the buffer emptying process is first performed on all the nodes on the path from the root to the
leftmost leaf. After this, the leftmost leaf contains the B smallest elements, and the children of
the leftmost leaf node together contain at least the Bm/4 smallest elements. So once a deletemin
operation has been executed, at least Bm/4 further deletemin operations can be answered without
additional I/Os, which reduces the amortized cost per operation. Thus we have Theorem 2: the total
cost of an arbitrary sequence of N insert, delete and deletemin operations on an initially empty
buffer tree is $O(n \log_m n)$ I/Os. However, the buffer tree does not support changing the
priorities of elements already in the priority queue.
7 CONCLUSION
We conclude that, although generic internal memory data structures serve well for
various problems, the growing dependence on external memory calls for data structures designed
for processing databases on external memory. Simply transferring internal memory data structures
to external memory does not yield optimal structures, because they do not use the internal memory
effectively. The buffer tree uses both external and internal memory effectively to give optimal
running time performance. Considering the theory and the tests performed on the buffer tree in
[1], the buffer tree appears to perform well as a generic data structure both in theory and in
practice. Because the buffer tree takes advantage of the large internal memory, we get good
amortized performance when processing batched dynamic operations.
8 FUTURE DIRECTION
For any external memory algorithm using the buffer tree, the actual running time may be
improved by tuning various other parameters. Measuring I/O efficiency experimentally is an
important topic that can be explored further, both for known parameters and for parameters that
are currently unknown.
REFERENCES
[1] D. Hutchinson, A. Maheshwari, J.-R. Sack, and R. Velicescu, “Early experiences in
implementing the buffer tree,” in Proceedings of the Workshop on Algorithm Engineering,
Springer-Verlag, 1997.
[2] J. S. Vitter, “Algorithms and data structures for external memory,” Foundations and
Trends in Theoretical Computer Science, vol. 2, no. 4, 2006.
[3] J. van den Bercken, B. Seeger, and P. Widmayer, “A generic approach to bulk loading
multidimensional index structures,” in Proceedings of the International Conference on Very
Large Databases, pp. 406–415, 1997.
[4] L. Arge, “The buffer tree: A technique for designing batched external data structures,”
Algorithmica, vol. 37, no. 1, pp. 1–24, 2003.