MS Thesis

Introduction
Background Knowledge
Problems Solved
Conclusion and Future Work
Efficient Data Structures and Algorithms for
Range Aggregate Problems
Jatin Agarwal
Advisors: Dr. Kannan Srinathan and Dr. Kishore Kothapalli
CSTAR, IIIT Hyderabad
November 27, 2013

Computational Geometry
What is Computational Geometry?
The study of geometric algorithms.
Why study CG?
Geometric problems arise in databases, Geographical Information
Systems (GIS), graphics, VLSI, etc.

Classic Problems
In health GIS there is a need to demarcate disease-infected areas.
Figure: Convex hull of a finite set of points

Classic Problems
Finding the closest pair of points with respect to Euclidean distance.
Figure: Closest pair of a finite set of points

Classic Problems
Location-based services (finding the nearest restaurant or the
nearest theater)
Figure: Nearest neighbour problem. This led to the discovery of the
Voronoi diagram.

Outline
1 Introduction
2 Background Knowledge
Output Sensitivity
Computation Models
Range Search
3 Problems Solved
Planar Range Maxima
Planar Range Convex Hull and Related problems
4 Conclusion and Future Work

Example
Given an unsorted array of n numbers and a query q = [a, b],
report all the numbers x in the range a ≤ x ≤ b.
5 9 3 1 7 6 4 12 2
Figure: Unsorted Array of Numbers

Naive Algorithm
Scan the whole array.
Query time: O(n).

Smart Algorithm
Preprocessing step: Sort the array.
Query algorithm: Binary search for a, then scan forward until b is
exceeded.
Query time: O(log n + k), where k is the number of values reported.
1 2 3 4 5 6 7 9 12
Figure: Sorted Array of Numbers
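
The sort-then-binary-search idea can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the closed range a ≤ x ≤ b is an assumption about how the query is meant to be read.

```python
from bisect import bisect_left, bisect_right

def preprocess(values):
    """Sort once in O(n log n); the sorted list is the whole data structure."""
    return sorted(values)

def range_report(sorted_values, a, b):
    """Report every x with a <= x <= b in O(log n + k) time."""
    lo = bisect_left(sorted_values, a)    # first index holding a value >= a
    hi = bisect_right(sorted_values, b)   # first index holding a value > b
    return sorted_values[lo:hi]           # the k reported values

data = preprocess([5, 9, 3, 1, 7, 6, 4, 12, 2])
print(range_report(data, 3, 7))  # [3, 4, 5, 6, 7]
```

The two binary searches cost O(log n) and the slice costs O(k), which is where the output-sensitive bound comes from.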

Output Sensitive
Reporting problem: Lower bound of Ω(k) on the query time, where k
is the number of items reported.
Objective: O(f(n) + k), where f(n) is small (for example,
polylogarithmic in n). An algorithm which satisfies this objective
is considered output-sensitive.

Ways to Compute
All results in this thesis are in either the comparison-based model
or the word-RAM model.
Figure: Hierarchy of the computation models (Cell Probe,
Transdichotomous RAM, AC0 RAM, Word RAM, Pointer Machine, BST,
Pentium RAM)

Models
Cell Probe Model: Used for proving lower bounds, as we pay only
for each memory operation.
Word RAM Model: The word size of the machine grows with the input
size n.
Comparison Based Model: Comparison is the only operation allowed
on the input elements.
Pointer Machine Model: Described by a directed graph with
constant out-degree.

Applications
Range trees have many applications in:
GIS
Databases
VLSI

Range Query
Figure: Two dimensional Range Tree (axes X and Y, query box q)
S is the set of given points and q is the query hyperbox. The
objective is to efficiently report or count the points of S inside q.

Range Searching
How do we solve this problem?

Naive Technique
Linear scan of the database.
O(n) query time. No preprocessing involved.
Not output sensitive.

Reduction to Binary Search
Preprocessing step: Build a binary search tree T on the
x-coordinates of the points.
Query algorithm: Query T with the x-range qx of the query, then
filter the points found by their y-coordinates.
Query time: O(log n + |S ∩ qx|). Not good if |S ∩ qx| >> k.
Therefore, more efficient data structures are needed.
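
A small sketch of this reduction shows why it is not output sensitive. A sorted list stands in for the binary search tree T here, which is an assumption made to keep the example short; the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

def build(points):
    """Sort the points by x; the sorted list plays the role of the tree T."""
    pts = sorted(points)
    xs = [p[0] for p in pts]
    return pts, xs

def query(pts, xs, x_lo, x_hi, y_lo, y_hi):
    """Return the points inside the box and the number of points scanned."""
    lo = bisect_left(xs, x_lo)
    hi = bisect_right(xs, x_hi)
    slab = pts[lo:hi]                                   # S ∩ qx
    answer = [p for p in slab if y_lo <= p[1] <= y_hi]  # filter by y
    return answer, len(slab)

pts, xs = build([(1, 9), (2, 1), (3, 8), (4, 2), (5, 7), (6, 3)])
answer, visited = query(pts, xs, 1, 6, 2, 3)
print(answer, visited)  # [(4, 2), (6, 3)] 6
```

In this run only k = 2 points are reported, yet all 6 points of the slab are scanned.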

Popular Structures for Range Searching
Reporting data structures
Data Structure   Query Time            Storage Space   Query Region
Range Tree       O(log^(d-1) n + k)    O(n log^d n)    Orthogonal
kd-Tree          O(n^(1-1/d) + k)      O(n)            Arbitrary
Table: n is the number of points, d is the dimension of the space and
k is the size of the output

Range Tree Example
Figure: Two dimensional Range Tree

Partition based on Range Tree
Figure: (a) Standard range tree (b) The query region gets partitioned
into O(log^2 n) regions

Four Possible Range Successor queries
Figure: Four different types of range successor queries possible in a
plane, panels (a) to (d)

Compact Range Trees
Navarro & Nekrich in WADS 2012 proposed the following result.
Lemma
There exists a data structure that uses O(n) space and answers
orthogonal range successor queries in O(log n) time.
We will use the above result to answer all types of range successor
queries.

Problems Solved
Planar Range Skyline
2-sided Range Reporting
2-sided Range Counting
3-sided Range Counting
Planar Range Convex hull
Orthogonal Range Convex Hull Reporting
Orthogonal Range Convex Hull Counting
Area of Convex Hull in an Orthogonal Range
Perimeter of Convex Hull in an Orthogonal Range

Common Challenge
Maintain order in more than one dimension
Output sensitivity
Starting by finding all points in the range is not output
sensitive
Single-shot algorithm: O(log n + m) time, where m is the total
number of points inside the query range

Dominating or Maximal Point
Point A = (xa, ya) is said to dominate point B = (xb, yb)
iff xa > xb and ya > yb.
Maximal point or skyline point: a point whose north-east quadrant
is empty.
The red point is a maximal point.
The terms maxima and skyline are used interchangeably.
Figure: Dominating point or maximal point or skyline point (its
north-east quadrant is empty)

Maxima
Figure: Skyline of a finite set of points (axes x and y)

Planar Range Maxima
Figure: Range Maxima for a given orthogonal range query q (axes x
and y)

Orthogonal range maxima queries

Problem Statements
Given a set S of n points in a plane and an orthogonal query Q of
the form [a, b] × [c, d], where [a, b] is a range on the x-axis and
[c, d] is a range on the y-axis, report the skyline points in S ∩ Q.
Problem 1: Report the skyline of S ∩ Q. Here
Q = [a, +∞) × [c, +∞).
Problem 2: Count the number of points on the skyline of S ∩ Q.
Here Q = [a, +∞) × [c, +∞).
Problem 3: Report the skyline of S ∩ Q. Here
Q = [a, +∞) × [c, d].
Problem 4: Count the number of points on the skyline of S ∩ Q.
Here Q = [a, +∞) × [c, d].

Skyline in 2-sided and 3-sided queries
Figure: (a) 2-sided query with corner (a,c), unbounded on top and
right (b) 3-sided query with corners (a,c) and (a,d), unbounded on
the right. The points pt and pb are marked in both panels.

2-sided Range Maxima
Figure: 2-sided query with corner (a,c), unbounded on top and right,
with the points pt and pb marked

Preprocessing for 2-sided queries
Compute the single-shot skyline of the given set S of points and
discard the remaining points.
Figure: Preprocessing on finite set S of points
Time complexity: O(n log n)
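
One standard way to compute a single-shot skyline in O(n log n) time is to sort the points and sweep from right to left. The sketch below is a minimal version of that sweep; it assumes distinct coordinates, and the function name is hypothetical.

```python
def skyline(points):
    """Return the maximal points, ordered by increasing x (so decreasing y)."""
    result = []
    best_y = float("-inf")
    # Sweep from right to left: a point is maximal exactly when it is
    # higher than every point already seen to its right.
    for x, y in sorted(points, reverse=True):
        if y > best_y:
            result.append((x, y))
            best_y = y
    result.reverse()
    return result

S = [(1, 5), (2, 3), (3, 4), (4, 1), (0, 2)]
print(skyline(S))  # [(1, 5), (3, 4), (4, 1)]
```

The sort dominates the running time; the sweep itself is linear.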

Preprocessing Array for 2-sided queries
Construct a range successor data structure D on these points.
Figure: Preprocessed finite set S of points
Preprocessing time complexity: O(n log n)

2-sided query
Given any query Q = [a, +∞) × [c, +∞).
Figure: 2-sided query with corner q = (a,c), unbounded on top and
right

Range Successor Query for Reporting
Find the point pt by making a range successor query unbounded on
the bottom.
Figure: Range Successor query unbounded on bottom, returning pt

Keep reporting all the points from pt until we encounter some
point whose x co-ordinate is greater than a.
Figure: 2-sided Range Reporting till x = a
Query time complexity: O(log n + k), where k is the total number of
points reported.

2-sided Range counting
To solve the counting problem we find the point pb.
Figure: 2-sided Range Counting, with corner (a,c) and the points pt
and pb marked

Range Successor Query for Counting
Find the point pb by making a range successor query unbounded on
the left.
Figure: Range Successor query unbounded on left
Time complexity: O(log n)

Computing count = index[pt] − index[pb] + 1 takes constant O(1) time.
Query time complexity: O(log n)

Our results on 2-sided queries
Theorem
Given a set S of n points in R^2, we can pre-process S into a data
structure of size O(n) in time O(n log n) such that, given a
2-sided query Q = [a, ∞) × [c, ∞), we can report the maximal
points of S ∩ Q in O(log n + k) time and count them in O(log n)
time, where k is the number of points reported.
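
The following sketch illustrates the 2-sided bounds under simplifying assumptions. It keeps only the skyline of S in a sorted array and uses plain binary search in place of the range successor structure D, so it is a stand-in for the approach of the slides rather than a reproduction of it. Distinct coordinates are assumed and all names are hypothetical.

```python
from bisect import bisect_left, bisect_right

def skyline(points):
    """Maximal points ordered by increasing x, hence decreasing y."""
    result, best_y = [], float("-inf")
    for x, y in sorted(points, reverse=True):
        if y > best_y:
            result.append((x, y))
            best_y = y
    return result[::-1]

class TwoSidedMaxima:
    def __init__(self, points):
        self.sky = skyline(points)
        self.xs = [p[0] for p in self.sky]        # increasing
        self.neg_ys = [-p[1] for p in self.sky]   # increasing, so bisect applies

    def _span(self, a, c):
        lo = bisect_left(self.xs, a)          # first staircase point with x >= a
        hi = bisect_right(self.neg_ys, -c)    # one past the last point with y >= c
        return lo, hi

    def report(self, a, c):
        """Maximal points of S ∩ ([a, inf) x [c, inf)) in O(log n + k)."""
        lo, hi = self._span(a, c)
        return self.sky[lo:hi]

    def count(self, a, c):
        """Number of those points in O(log n), by an index difference."""
        lo, hi = self._span(a, c)
        return max(0, hi - lo)

D = TwoSidedMaxima([(1, 5), (2, 3), (3, 4), (4, 1), (0, 2)])
print(D.report(2, 2), D.count(2, 2))  # [(3, 4)] 1
```

This works because a quadrant that is unbounded on top and right contains every dominator of each of its points, so the maxima of S ∩ Q are exactly the skyline points of S that lie in Q, and they form a contiguous run of the staircase.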

Finite set of points on a plane
Figure: Finite Set S of points in 2d-plane

Preprocessing for 3-sided queries
The point with maximum y co-ordinate in the South-west quadrant of
point pi is its next maximal point.
Every point in set S points to its next maximal point in the array
next.
Construct a range successor data structure D on these points.
Preprocessing takes O(n log n) time.
Figure: Preprocessing 3-sided maxima queries

3-sided Reporting query
Report the skyline of S ∩ Q. Here Q = [a, +∞) × [c, d].
Figure: 3-sided query to find maxima, with corners (a,c) and (a,d)

Find the point pt by making a range successor query unbounded on
the bottom.
Figure: 3-sided Range Reporting requires point pt

Keep reporting all the points from pt until we encounter some
point whose x co-ordinate is greater than a.
Figure: 3-sided Range Reporting requires point pt
Query time complexity: O(log n + k)

Reporting and then counting is not output sensitive.

Preprocessing for 3-sided Counting
We store a level count for each maximal point, as shown in the figure.

Range Successor Query for Counting
We find the point pb using a range successor query as shown.

Computing count = count[pt] − count[pb] + 1 takes constant O(1) time.
In this example the count is 4 − 1 + 1 = 4.
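
The level-count idea can be sketched as follows. Several things here are assumptions made for illustration: the chain orientation (each point's successor is taken to be the highest point strictly to its right and strictly below it), the quadratic brute-force preprocessing, and the linear scans that find pt and pb in place of range successor queries. Distinct coordinates are assumed and the names are hypothetical.

```python
def preprocess(points):
    """Store, for every point, its successor and its level.

    level[p] is the number of points on the chain that starts at p and
    follows successor pointers to the end.
    """
    nxt = {}
    for p in points:
        cands = [q for q in points if q[0] > p[0] and q[1] < p[1]]
        nxt[p] = max(cands, key=lambda q: q[1]) if cands else None
    level = {}
    # A successor always has a larger x, so a right-to-left pass sees it first.
    for p in sorted(points, reverse=True):
        level[p] = 1 if nxt[p] is None else level[nxt[p]] + 1
    return nxt, level

def count_maxima(points, level, a, c, d):
    """Count the maximal points of S ∩ ([a, inf) x [c, d])."""
    inside = [p for p in points if p[0] >= a and c <= p[1] <= d]
    if not inside:
        return 0
    pt = max(inside, key=lambda p: p[1])   # topmost point: start of the chain
    pb = max(inside, key=lambda p: p[0])   # rightmost point: end of the chain
    return level[pt] - level[pb] + 1       # O(1) once pt and pb are known

S = [(1, 6), (2, 3), (3, 5), (4, 1), (0, 2), (5, 4), (6, 0)]
nxt, level = preprocess(S)
print(count_maxima(S, level, 1, 1, 5))  # 2
```

In the printed query the maximal points are (3, 5) and (5, 4), with levels 3 and 2, so the count is 3 − 2 + 1 = 2.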

Our Results
Query Type                 Query Time      Storage Space
2-sided Range Reporting    O(log n + k)    O(n)
2-sided Range Counting     O(log n)        O(n)
3-sided Range Reporting    O(log n + k)    O(n)
3-sided Range Counting     O(log n)        O(n)
Table: Our results for the Planar Range Skyline problem
Our results hold only for queries that are unbounded either on the
top and right, or on the right.

Conjecture
Conjecture
Given a set S of n points in R^2, it may be possible to pre-process
S into a data structure of size O(n log n) in time O(n log n) such
that, given a 4-sided orthogonal query Q = [a, b] × [c, d], we can
report and count the maximal points of S ∩ Q in time
O(log n + k) and O(log n + k) respectively.

Green Nails on a board
Given a point set of size n in a plane.
Figure: Finite set of points
Consider the green nails as shown in the figure. Stretch a rubber
band around these green nails and release it.

Convex Hull
The points on the rubber band form the convex hull.
Figure: Finite set of points

Problem Statements
We are given a set P of n points in R^2 and a query
q = [xlt, xrt] × [yb, yt]. We wish to pre-process P into a data
structure such that given an orthogonal query region q, we can
efficiently solve the following problems.
Problem 1: Report the points on the convex hull of P ∩ q.
Problem 2: Count the number of points on the convex hull of
P ∩ q.
Problem 3: Find the area of the convex hull of P ∩ q.
Problem 4: Find the perimeter of the convex hull of P ∩ q.
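
To make the four problems concrete, here is a brute-force baseline that filters the points in q and then computes the hull with Andrew's monotone chain. It is a reference point only, not the data structure of the thesis: its cost depends on the number of points inside q rather than on the output. The function names are hypothetical.

```python
from math import hypot

def cross(o, a, b):
    """Twice the signed area of triangle o, a, b (positive if counter-clockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def range_hull_stats(points, xlt, xrt, yb, yt):
    """Hull, count, area and perimeter of the convex hull of P ∩ q."""
    inside = [p for p in points if xlt <= p[0] <= xrt and yb <= p[1] <= yt]
    hull = convex_hull(inside)
    n = len(hull)
    edges = [(hull[i], hull[(i + 1) % n]) for i in range(n)] if n > 1 else []
    perimeter = sum(hypot(b[0] - a[0], b[1] - a[1]) for a, b in edges)
    area = abs(sum(a[0] * b[1] - a[1] * b[0] for a, b in edges)) / 2
    return hull, n, area, perimeter

P = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (9, 9)]
print(range_hull_stats(P, 0, 5, 0, 5))
```

For this input the hull is the 4 by 3 rectangle, so the count is 4, the area is 12 and the perimeter is 14.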

Planar Range Convex Hull
Figure: Range Convex hull for a given query q

Challenges
Output sensitivity
Reporting the points and then computing the convex hull takes
O(log n + m log h) time, where m is the number of points in q and
h is the number of points on the convex hull.
Counting/Area/Perimeter by actually reporting all points on the
hull is not output sensitive
Output size is O(1)
Space constraint
Total number of possible query ranges is O(n^4)

Preliminaries
Combining two disjoint convex hulls:
Only the common outer tangents (bridges) need to be computed
Can be done in O(log n) time (Overmars and van Leeuwen 1994)

Preliminaries
The convex hull can be divided into 4 parts based on its extreme
points
We will compute the "Upper Right Convex Hull" (urc-hull)

Preprocessing
Use a standard 2D range tree.
In each node, store the convex hull of all points in that subtree
A query is split into O(log n) horizontal regions, each further
split into O(log n) vertical regions ("blocks")

Crux Idea
Total of O(log^2 n) nodes per query
Do the polygon version of Graham's scan
Takes O(log^3 n + h) time per query, where h is the number of
points on the hull

Efficient processing
Many blocks can be discarded from the urc-hull without processing
Figure: Non-empty block

Worst Case
Define a "candidate block" as a non-empty block which is not
discarded
In the worst case we still have O(log^2 n) candidate blocks

Analysis
The range tree divides the query into O(log^2 n) regions
Merging two convex hulls takes O(log n) time
Linear merging of the O(log^2 n) hulls in the query region takes
O(log^3 n) time
Reporting takes O(h) time, where h is the total number of points
on the convex hull
Therefore it takes O(log^3 n + h) time to report the convex hull
points of S ∩ q

Convex Hull as a Cyclic Array
Points on the convex hull are stored as an array A.
Points in the array are stored in anti-clockwise direction as shown
in the figure.
Starting from the maximum y co-ordinate to the minimum y
co-ordinate and vice versa.
Figure: Points p1 to p11 and the hull array
A = [p4, p2, p1, p5, p9, p11, p8]

Counting points on a Convex hull
Each candidate convex hull contributes some part of its array A.
The difference of the indices at these boundaries gives us the
count.
For example, index[p8] − index[p9] + 1 as shown in the figure.
Therefore computing the count for a candidate convex hull takes
O(1) time.
Figure: Points p1 to p11 and the hull array
A = [p4, p2, p1, p5, p9, p11, p8]

Perimeter of the convex hull
We store the cumulative sum of the perimeter in array P.
Each candidate convex hull contributes some perimeter to the total
perimeter.
The difference of the values of P at these boundaries gives us
perimeter_i.
P[p8] − P[p9] gives the perimeter as shown in the figure.
Therefore computing perimeter_i for a candidate convex hull takes
O(1) time.
Figure: Points p1 to p11, the hull array
A = [p4, p2, p1, p5, p9, p11, p8] and the cumulative perimeter array P
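
The count and perimeter contributions of one stored hull can be sketched with an index difference and a prefix-sum array. The layout below (P[i] holds the length of the hull boundary from the first stored vertex up to vertex i) is an assumption about how such an array could be organised; the names are hypothetical.

```python
from itertools import accumulate
from math import hypot

def build_prefix_perimeter(chain):
    """P[i] = length of the polyline chain[0], ..., chain[i]; P[0] = 0."""
    steps = [hypot(b[0] - a[0], b[1] - a[1]) for a, b in zip(chain, chain[1:])]
    return [0.0] + list(accumulate(steps))

def chain_count(i, j):
    """Number of vertices on the piece chain[i..j]: an index difference plus one."""
    return j - i + 1

def chain_length(P, i, j):
    """Boundary length of the piece chain[i..j]: a plain difference of prefix sums."""
    return P[j] - P[i]

chain = [(0, 0), (3, 4), (3, 10), (11, 16)]
P = build_prefix_perimeter(chain)
print(P)                                        # [0.0, 5.0, 11.0, 21.0]
print(chain_count(1, 3), chain_length(P, 1, 3)) # 3 16.0
```

Both lookups take O(1) time after linear-time preprocessing of each stored hull. Note that counting vertices needs the extra one, while a length difference does not.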

Area of the convex hull
We store the cumulative sum of areas in array Ar.
Each candidate convex hull contributes some area to the total area.
The difference of the values of Ar at these boundaries gives us
area_i.
Therefore computing area_i for a candidate convex hull takes
O(1) time.
Figure: Finite set of points
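
One way to realise a cumulative area array is to store prefix sums of the shoelace terms of consecutive hull edges. This is an assumption about the representation, not a detail given on the slide. With it, the contribution of a piece of the hull is a difference of two entries, and the bridges that join pieces add one shoelace term each.

```python
from itertools import accumulate

def edge_term(a, b):
    """Shoelace term of the directed edge a -> b."""
    return a[0] * b[1] - a[1] * b[0]

def build_prefix_area(hull):
    """Ar[i] = sum of the shoelace terms of edges hull[0]->hull[1], ..., hull[i-1]->hull[i]."""
    return [0] + list(accumulate(edge_term(a, b) for a, b in zip(hull, hull[1:])))

def chain_term(Ar, i, j):
    """Sum of the shoelace terms along hull[i..j], obtained in O(1)."""
    return Ar[j] - Ar[i]

def closed_area(hull, Ar, i, j):
    """Area of the polygon formed by the chain hull[i..j] and the edge hull[j] -> hull[i]."""
    return abs(chain_term(Ar, i, j) + edge_term(hull[j], hull[i])) / 2

hull = [(0, 0), (4, 0), (4, 3), (0, 3)]   # counter-clockwise rectangle
Ar = build_prefix_area(hull)
print(Ar)                                  # [0, 0, 12, 24]
print(closed_area(hull, Ar, 0, 3))         # 12.0
print(closed_area(hull, Ar, 1, 3))         # 6.0
```

The second call closes the chain (4, 0), (4, 3), (0, 3) with a single straight edge, which gives the right triangle of area 6.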

Previous Work
Brass et al. gave the first results on the Planar Range Convex Hull
problem.
Query Type   Query Time        Storage Space
Reporting    O(log^5 n + h)    O(n log^2 n)
Area         O(log^5 n)        O(n log^2 n)
Table: Previous Work on Convex hull and Related problems

Our Results
We studied the Counting and Perimeter problems for the first time in
the literature.
Query Type   Query Time        Storage Space
Reporting    O(log^3 n + h)    O(n log^2 n)
Counting     O(log^3 n)        O(n log^2 n)
Area         O(log^3 n)        O(n log^2 n)
Perimeter    O(log^3 n)        O(n log^2 n)
Table: Our Results of Planar Range Convex hull and Related problems

Conclusion
Skyline and convex hull provide ordering in higher dimensions.
Planar range queries provide searching in higher dimensions.
Our structures are efficient for a large number of queries.
Our structures provide information like count, area and
perimeter without processing individual data elements.

Future Work
Generalization: Categories can be modelled by assigning a color
to each point and finding the distinct colors.
Other models of computation: Studying both planar range maxima and
planar range convex hull in the cell-probe model to understand
lower bounds.
Higher dimensions: It would be interesting to study these problems
in dimensions greater than two.
Dynamism: Solving the problems on dynamic point sets.

Related Work
Improved bounds for Smallest Enclosing Disk Range Queries (accepted
as full paper), Sankalp Khare, Jatin Agarwal, Nadeem Moidu, and
Kannan Srinathan. Conference: To appear in the proceedings of the
16th Japan Conference on Discrete and Computational Geometry and
Graphs (JCDCG^2 2013)
Efficient Range Reporting of Convex Hull (submitted as e-print),
Jatin Agarwal, Nadeem Moidu, Kishore Kothapalli and Kannan
Srinathan. Conference: Computing Research Repository (CoRR)

Related Work
Planar Convex Hull Range Query and Related Problems (accepted as
full paper), Nadeem Moidu, Jatin Agarwal and Kishore Kothapalli.
Conference: To appear in the proceedings of the 25th Canadian
Conference on Computational Geometry (CCCG 2013)
On Counting Range Maxima Points in Plane (accepted as full paper),
Anil Kishore Kalavagattu, Jatin Agarwal, Ananda Swarup Das, and
Kishore Kothapalli. Conference: The 23rd International Workshop on
Combinatorial Algorithms (IWOCA 2012)
Range Aggregate Maximal Points in the Plane (accepted as full
paper), Ananda Swarup Das, Prosenjit Gupta, Anil Kishore
Kalavagattu, Jatin Agarwal, Kannan Srinathan, Kishore Kothapalli.
Conference: The 6th Workshop on Algorithms and Computation
(WALCOM 2012)

Thank You!
Questions?
