R-Trees are an excellent data structure for managing geospatial data. They are commonly used by mapping applications and by other applications that use location to customize content. The Minimum Bounding Rectangle (MBR) is a central concept in R-Trees, which are a modified form of B-Trees.
2. Instructor
Prof. Amrinder Arora
amrinder@gwu.edu
Please copy TA on emails
Please feel free to call as well
TA
Iswarya Parupudi
iswarya2291@gwmail.gwu.edu
L7 - R-Trees, CS 6213 - Advanced Data Structures - Arora
3. CS 6213
Basics: Record / Struct / Arrays / LLs, Stacks / Queues, Graphs / Trees / BSTs, Heaps and PQs
Advanced: Trie, B-Tree, Splay Trees, R-Tree, Union Find
Applications: Databases, Spatial, String, In Memory
4. Antonin Guttman, U. C. Berkeley
K. A. Mohamed
6. Given a city map, "index" all university buildings in an efficient structure for quick topological search.
7. "Index" buildings in an efficient structure for quick search.
Spatial object: contour (outline) of the area around the building(s).
Minimum bounding region (MBR) of the object.
8. MBR of the city neighbourhoods.
MBR of the city defining the overall search region.
9. Mostly involves 2D regions.
Need to support 2D range queries.
Multiple return values desired: answering a query region by reporting all spatial objects that are fully contained in or overlapping the query region (Spatial-Access Method, SAM).
In general:
Spatial data objects often cover areas in multidimensional spaces.
Spatial data objects are not well represented by point locations.
An "index" based on an object's spatial location is desirable.
Problem Summary: To retrieve data items quickly and efficiently according to their spatial locations.
10. A B-Tree is an ordered, dynamic, multi-way structure of order m (i.e. each node has at most m children).
The keys and the subtrees are arranged in the fashion of a search tree.
Each node may contain a large number of keys, and the number of subtrees in each node, then, may also be large.
The B-Tree is designed (among other objectives):
to branch out in a large number of directions, and
to contain a lot of keys in each node so that the height of the tree is relatively short.
[Figure: example B-Tree with root key M; second-level nodes (E, H) and (P, T, X); leaf keys B D, F G, I K L, N O, Q S, V W, Y Z.]
11. A height-balanced tree, similar to a B-Tree.
Index records in the leaf nodes contain pointers to the actual
spatial-objects (entries) they represent.
Each entry has a unique identifier that points to one spatial object,
and its MBR; i.e., entry = (MBR, pointer).
Spatial searching requires visiting only a small number of nodes.
The index is completely dynamic: inserts and deletes can be
intermixed with searches. (No periodic reorganization is
required.)
12. Let M be the maximum number of entries that will fit in one node.
Let m ≤ M/2 be a parameter specifying the minimum number of entries in one
node.
Then an R-Tree must satisfy the following properties:
1. Every leaf node contains between m and M index records, unless it is the
root.
2. For each index-record Entry (I, tuple-identifier) in a leaf node, I is the MBR
that spatially contains the n-dimensional data object represented by the
tuple-identifier.
3. Every non-leaf node has between m and M children, unless it is the root.
4. For each Entry (I, child-pointer) in a non-leaf node, I is the MBR that
spatially contains the regions in the child node.
5. The root has at least two children unless it is a leaf.
6. All leaves appear on the same level.
13. An entry E in a leaf node is defined as:
E = (I, tuple-identifier)
Where I refers to the smallest bounding n-dimensional region (MBR) that encompasses the spatial data pointed to by its tuple-identifier.
I is a series of closed intervals that make up each dimension of the bounding region.
Example. In 2D, I = (Ix, Iy), where Ix = [xa, xb], and Iy = [ya, yb].
14. [Not limited to 2D: higher dimensions are certainly possible.]
In general I = (I0, I1, …, In-1) for n dimensions, where Ik = [ka, kb].
If either ka or kb (or both) are equal to ∞, this means that the spatial object extends outward indefinitely along that dimension.
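A minimal sketch of the interval representation, assuming an MBR is stored as a Python tuple of (lower, upper) pairs, one per dimension. The helper names `overlaps` and `mbr` are illustrative and do not come from the slides.

```python
import math

def overlaps(a, b):
    """True if the closed intervals of a and b intersect in every dimension."""
    return all(alo <= bhi and blo <= ahi
               for (alo, ahi), (blo, bhi) in zip(a, b))

def mbr(rects):
    """Smallest rectangle that encloses every rectangle in rects."""
    rects = list(rects)
    return tuple((min(r[k][0] for r in rects), max(r[k][1] for r in rects))
                 for k in range(len(rects[0])))

# I = (Ix, Iy) with Ix = [xa, xb] and Iy = [ya, yb]
a = ((0.0, 2.0), (0.0, 1.0))
b = ((1.0, 3.0), (0.5, 4.0))
# An object that extends outward indefinitely along x (upper bound) and y
half_plane = ((2.5, math.inf), (-math.inf, math.inf))

print(overlaps(a, b))   # True
print(mbr([a, b]))      # ((0.0, 3.0), (0.0, 4.0))
```

Because the intervals are closed, two rectangles that only share an edge count as overlapping in this sketch.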
15. An entry E in a non-leaf node is defined as: E = (I, child-pointer)
Where the child-pointer points to the child of this node, and I is the MBR that encompasses all the regions in the entries of that child node.
[Figure: a non-leaf node with entries I(A), I(B), …, I(M); the entry for B points to a child node with entries I(a), I(b), I(c), I(d), and the rectangle B encloses the rectangles a, b, c, d.]
17-20. [Figure sequence: the data rectangles a to l are grouped step by step under the parent-level bounding rectangles m, n, o, p.]
21. Typical query: Find and report all university building sites that are within 5 km of the city centre.
Approach:
i. Build the R-Tree using rectangular regions a, b, …, i.
ii. Formulate the query range Q.
iii. Query the R-Tree and report all regions overlapping Q.
22. Let Q be the query region.
Let T be the root of the R-Tree.
Search all entry-records whose regions overlap Q.
Search sub-trees:
If T is not a leaf, then apply Search on every child-node entry E whose I overlaps Q.
Search leaf nodes:
If T is a leaf, then check each entry E in the leaf and return E if E.I overlaps Q.
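The Search routine can be sketched as a short recursion. This is an illustration under assumptions, not the lecture's implementation: the `Node` class, the tuple-based rectangles, and the tiny hand-built tree are all hypothetical.

```python
class Node:
    """Hypothetical node: leaf entries are (I, tuple_id), others are (I, child)."""
    def __init__(self, leaf, entries):
        self.leaf = leaf
        self.entries = entries

def overlaps(a, b):
    return all(alo <= bhi and blo <= ahi
               for (alo, ahi), (blo, bhi) in zip(a, b))

def search(t, q):
    """Return the identifiers of all leaf entries whose rectangle overlaps q."""
    hits = []
    for rect, ref in t.entries:
        if not overlaps(rect, q):
            continue              # prune: nothing below this entry can overlap q
        if t.leaf:
            hits.append(ref)
        else:
            hits.extend(search(ref, q))
    return hits

leaf1 = Node(True, [(((0, 1), (0, 1)), "a"), (((2, 3), (2, 3)), "b")])
leaf2 = Node(True, [(((10, 11), (10, 11)), "c")])
root = Node(False, [(((0, 3), (0, 3)), leaf1), (((10, 11), (10, 11)), leaf2)])

print(search(root, ((0.5, 2.5), (0.5, 2.5))))   # ['a', 'b']
```

The subtree under `leaf2` is never visited for this query, because its covering rectangle does not overlap Q.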
23. [Figure: example R-Tree used for the search. Node @r6 holds the entries r2, r5, r8; the nodes @r0, @r1, @r7, @r3, @r4 sit below @r2, @r5, @r8 and hold the data entries a to i.]
R-Tree settings:
M =
m =
24. The search algorithm descends the tree from the root in a manner similar to a B-Tree.
More than one subtree under a visited node may need to be searched.
Good worst-case performance cannot be guaranteed.
This is countered by the insertion, deletion, and update algorithms, which maintain the tree in a form that allows the search algorithm to eliminate irrelevant regions of the indexed space, so that only data near the search area need to be examined.
Emphasis is on the optimal placement of spatial objects with respect to the spatial location of other objects in the structure.
25. A Node-Overflow happens when a new Entry is added to a fully packed node, causing the resulting number of entries in the node to exceed the upper bound M.
The "overflow" node must be split, and all its current entries, as well as the new one, consolidated for a locally optimal arrangement.
A Node-Underflow happens when one or more Entries are removed from a node, causing the remaining number of entries in that node to fall below the lower bound m.
The underflow node must be condensed, and its entries dispersed for a globally optimal arrangement.
26. New index entry-records are added to the leaves.
Nodes that overflow are split, and splits propagate up the tree.
A split-propagation may cause the tree to grow in height.
The main Insert routine
Let E = (I, tuple-identifier) be the new entry to be inserted.
Let T be the root of the R-Tree.
[Ins_1] Locate a leaf L starting from T to insert E.
[Ins_2] Add E to L. If L is already full (overflow), split L into L and L'.
[Ins_3] Propagate MBR changes (enlarged or reduced) upwards.
[Ins_4] Grow tree taller if node split propagation causes T to split.
27. Similar to insertion into a B+-tree, but an entry may be inserted into any leaf; a leaf is split if its capacity is exceeded.
Which leaf to insert into? (ChooseLeaf)
How to split a node? (SplitNode)
28. [Figure: the parent-level bounding rectangles m, n, o, p.]
29. [Ins_1] Locate a leaf L starting from T to insert E = (I, tuple-identifier).
Notion (i): Select the path that would require the least enlargement to include E.I.
Notion (ii): Resolve ties by choosing the child-node with the smallest MBR.
Invoke: L = ChooseLeaf (E, T).
[Figure: node @rN with entries A, B, C; the new rectangle E.I is drawn alongside the rectangles A, B, C inside rN.]
30. Algorithm: ChooseLeaf (E, N)
Inputs: (i) Entry E = (I, tuple-identifier), (ii) A valid R-Tree node N.
Output: The leaf L where E should be inserted.
If N is leaf Then Return N
Let FS be the set of current entries in the node N
Let F = (I, child-pointer) ∈ FS, so that F.I satisfies the Insertion-Notions
Return ChooseLeaf (E, F.child-pointer)
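ChooseLeaf with its two insertion notions (least enlargement, then smallest MBR as the tie-breaker) might look like the following sketch. The `Node` class, the helper names, and the sample rectangles are assumptions made for illustration.

```python
from math import prod

class Node:
    """Hypothetical node: non-leaf entries are (I, child) pairs."""
    def __init__(self, leaf, entries=()):
        self.leaf = leaf
        self.entries = list(entries)

def area(r):
    return prod(hi - lo for lo, hi in r)

def enlarged(r, s):
    """MBR of r after it has been grown to include s."""
    return tuple((min(a, c), max(b, d)) for (a, b), (c, d) in zip(r, s))

def choose_leaf(e_rect, n):
    if n.leaf:
        return n
    def cost(entry):
        rect, _child = entry
        # Notion (i): least enlargement; Notion (ii): smallest MBR breaks ties
        return (area(enlarged(rect, e_rect)) - area(rect), area(rect))
    _rect, child = min(n.entries, key=cost)
    return choose_leaf(e_rect, child)

leaf_a, leaf_b, leaf_c = Node(True), Node(True), Node(True)
root = Node(False, [
    (((0, 4), (0, 4)), leaf_a),
    (((10, 12), (10, 12)), leaf_b),
    (((3, 5), (3, 5)), leaf_c),
])

# Needs the least enlargement under B
print(choose_leaf(((11, 13), (11, 12)), root) is leaf_b)   # True
# Fits inside both A and C without enlargement; C has the smaller MBR
print(choose_leaf(((3, 4), (3, 4)), root) is leaf_c)       # True
```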
31. [Ins_2] Add E to L.
Notion (i): If L has room for another entry, install E.
Notion (ii): Otherwise split L to obtain L and L', which between them will contain all previous entries in L and the new E (consolidated for local optima).
[Ins_3] Propagate MBR changes upwards by invoking AdjustTree (L, L').
Notion (i): Ascend from leaf L to the root T while adjusting the covering rectangles (MBRs).
Notion (ii): If L' exists, propagate node splits as necessary; i.e. attempt to install a new entry in the parent of L to point to L'.
32. Example. Found L = @Y to insert new E = e. R-Tree settings: M = 3, m = 1.
[Figure: node @G holds entry K; node @K holds entries X, Y, Z; leaf @Y holds entries a, b, c.]
33. Algorithm: AdjustTree (N, N')
Inputs: (i) A node N that has had its contents modified, (ii) The resultant split node N', if not NULL, that accompanies N.
Outputs: (i) N as above, (ii) N' as above.
If N is the root Then Return {(i) N, (ii) N'}
Let PN be the parent node of N.
Let EN = (I_N, child-pointer_N) in PN, where child-pointer_N points to N.
Adjust I_N so that it tightly encloses all entry regions in N.
34. If N' is Not NULL Then
  If number of entries in PN < M Then
    Create a new Entry EN' = (I_N', child-pointer_N')
    Install EN' in PN
    Return AdjustTree (PN, NULL)
  Else
    Set {PN, PN'} = SplitNode (PN, EN')
    Return AdjustTree (PN, PN')
  End If
Else
  Return AdjustTree (PN, NULL)
End If
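The four insertion steps can be put together in a compact sketch. Several things here are assumptions rather than the lecture's method: the `Node` class with parent pointers, the capacity M = 3, and above all `split_node`, which is a deliberately naive placeholder (sort by lower x bound and cut in half) because the slides do not spell out SplitNode. The sketch appends first and splits on overflow, which has the same effect as checking for room before installing.

```python
from math import prod

M = 3  # hypothetical node capacity, kept small so that splits happen quickly

class Node:
    def __init__(self, leaf, entries=None, parent=None):
        self.leaf = leaf
        # leaf entries: (I, tuple_id); non-leaf entries: (I, child Node)
        self.entries = entries if entries is not None else []
        self.parent = parent

def mbr(rects):
    rects = list(rects)
    return tuple((min(r[k][0] for r in rects), max(r[k][1] for r in rects))
                 for k in range(len(rects[0])))

def area(r):
    return prod(hi - lo for lo, hi in r)

def choose_leaf(rect, n):
    # [Ins_1] least enlargement first, smallest area as tie-breaker
    while not n.leaf:
        _, n = min(n.entries,
                   key=lambda e: (area(mbr([e[0], rect])) - area(e[0]), area(e[0])))
    return n

def split_node(n):
    # Placeholder for SplitNode (an assumption): sort by lower x bound, cut in half.
    n.entries.sort(key=lambda e: e[0][0][0])
    half = len(n.entries) // 2
    twin = Node(n.leaf, n.entries[half:], n.parent)
    n.entries = n.entries[:half]
    if not twin.leaf:
        for _, child in twin.entries:
            child.parent = twin
    return twin

def adjust_tree(n, twin):
    # [Ins_3] ascend to the root, tightening I_N and installing or splitting for N'
    while n.parent is not None:
        p = n.parent
        i = next(k for k, (_, c) in enumerate(p.entries) if c is n)
        p.entries[i] = (mbr(r for r, _ in n.entries), n)
        if twin is not None:
            p.entries.append((mbr(r for r, _ in twin.entries), twin))
            twin = split_node(p) if len(p.entries) > M else None
        n = p
    return n, twin

def insert(root, rect, tuple_id):
    leaf = choose_leaf(rect, root)
    leaf.entries.append((rect, tuple_id))                        # [Ins_2]
    twin = split_node(leaf) if len(leaf.entries) > M else None
    root, twin = adjust_tree(leaf, twin)
    if twin is not None:                      # [Ins_4] the root itself was split
        new_root = Node(False)
        for half in (root, twin):
            half.parent = new_root
            new_root.entries.append((mbr(r for r, _ in half.entries), half))
        root = new_root
    return root

root = Node(True)
for i in range(10):
    root = insert(root, ((i, i + 1), (0, 1)), i)
```

`insert` returns the (possibly new) root, so callers need to keep the returned reference. A real implementation would replace `split_node` with a split heuristic that keeps the two resulting MBRs small.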
35. [Ins_4] Grow Tree taller.
Notion: If the recursive node split propagation causes the root to split, then create a new root whose children are the two resulting nodes.
[Figure: root @T with entries A, B, C; node @C with entries E, F and its split partner @C' with entries G, H.]
36. The height of the R-Tree containing n entry-records is at most ⌈log_m n⌉ - 1, because the branching factor of each node is at least m.
The maximum number of nodes is: ⌈n/m⌉ + ⌈n/m²⌉ + … + 1.
Worst case space utilisation for all nodes except the root is: m/M.
Nodes will tend to have more than m entries, and this will: decrease the tree height and improve space utilisation.
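A small worked calculation of these bounds, assuming the height bound ⌈log_m n⌉ - 1 and the node-count bound ⌈n/m⌉ + ⌈n/m²⌉ + … + 1 from Guttman's analysis of the R-Tree. The function names and the sample values of n and m are illustrative; integer arithmetic is used to avoid floating-point rounding in the logarithm.

```python
def max_height(n, m):
    """ceil(log_m n) - 1, computed with integers (assumes n >= 1, m >= 2)."""
    h, reach = 0, 1
    while reach < n:
        reach *= m
        h += 1
    return h - 1

def max_nodes(n, m):
    """ceil(n/m) + ceil(n/m^2) + ... + 1 (assumes n >= 1, m >= 2)."""
    total, level = 0, n
    while True:
        level = -(-level // m)    # ceiling division: nodes needed one level up
        total += level
        if level <= 1:
            return total

# One million entry-records with at least 50 entries per node
print(max_height(1_000_000, 50))   # 3
print(max_nodes(1_000_000, 50))    # 20409
```

For the sample values the levels hold at most 20000, 400, 8, and 1 nodes, which gives four levels of nodes and therefore a height of 3.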
37. Current index entry-records are removed from the leaves.
Nodes that underflow are condensed, and their contents redistributed appropriately throughout the tree.
A condense propagation may cause the tree to shorten in height.
The main Delete routine
Let E = (I, tuple-identifier) be a current entry to be removed.
Let T be the root of the R-Tree.
[Del_1] Find the leaf L starting from T that contains E.
[Del_2] Remove E from L, and condense "underflow" nodes.
[Del_3] Propagate MBR changes upwards.
[Del_4] Shorten tree if T contains only 1 entry after condense propagation.
38. [Del_1] Find the leaf L starting from T that contains E.
Algorithm: FindLeaf (E, N)
Inputs: (i) Entry E = (I, tuple-identifier), (ii) A valid R-Tree node N.
Output: The leaf L containing E.
If N is leaf Then
  If N contains E Then Return N
  Else Return NULL
Else
  Let FS be the set of current entries in N.
  For each F = (I, child-pointer) ∈ FS where F.I overlaps E.I Do
    Set L = FindLeaf (E, F.child-pointer)
    If L is not NULL Then Return L
  Next F
  Return NULL
End If
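FindLeaf translates almost line for line into code. The sketch below assumes a hypothetical `Node` class and tuple-based rectangles; the sample tree is built so that the covering rectangles of two leaves overlap, which forces the search to back out of the first subtree.

```python
class Node:
    """Hypothetical node: leaf entries are (I, tuple_id), others are (I, child)."""
    def __init__(self, leaf, entries):
        self.leaf = leaf
        self.entries = entries

def overlaps(a, b):
    return all(alo <= bhi and blo <= ahi
               for (alo, ahi), (blo, bhi) in zip(a, b))

def find_leaf(rect, tuple_id, n):
    """Return the leaf that holds the entry (rect, tuple_id), or None."""
    if n.leaf:
        return n if (rect, tuple_id) in n.entries else None
    for f_rect, child in n.entries:
        if overlaps(f_rect, rect):
            found = find_leaf(rect, tuple_id, child)
            if found is not None:
                return found
    return None

leaf1 = Node(True, [(((0, 1), (0, 1)), "a"), (((3, 5), (3, 5)), "x")])
leaf2 = Node(True, [(((4, 5), (4, 5)), "b"), (((8, 9), (8, 9)), "c")])
root = Node(False, [(((0, 5), (0, 5)), leaf1), (((4, 9), (4, 9)), leaf2)])

# b's rectangle overlaps both covering rectangles; leaf1 is tried first and fails
print(find_leaf(((4, 5), (4, 5)), "b", root) is leaf2)   # True
```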
39. [Del_2] Remove E from L, and condense "underflow" nodes.
[Del_3] Propagate MBR changes upwards.
Notion (i): Ascend from leaf L to root T while adjusting the covering rectangles (MBRs).
Notion (ii): If, after removing the entry E from L, the number of entries in L becomes fewer than m, then the node L has to be eliminated and its remaining contents relocated.
40. Propagate these notions upwards by invoking CondenseTree (N, QS), where N is an R-Tree node whose entries have been modified, and QS is the set of eliminated nodes.
Start the propagation by setting N = L, and QS = ∅.
Re-insert the entries from the eliminated nodes in QS back into the tree.
Entries from eliminated leaf nodes are re-inserted as new entries using the Insert routine discussed earlier.
Entries from higher-level nodes must be placed higher in the tree so that leaves of their dependent subtrees will be on the same level as the leaves on the main tree.
41. Example: Delete the index entry-record b. R-Tree settings: M = 4, m = 2.
Spatial constraint: a.I will form smallest MBR with r4.
[Figure: R-Tree with root @r7 (entries r2, r6); @r2 (entries r0, r1); @r6 (entries r3, r4, r5); leaves @r0 (a, b), @r1 (c, d, e), @r3 (f, g, h), @r4 (i, j), @r5 (k, l, m); an entry n is also shown.]
42. Algorithm: CondenseTree (N, QS)
Inputs: (i) A node N whose entries have been modified, (ii) A set of eliminated nodes QS.
If N is NOT the root Then
  Let PN be the parent node of N.
  Let EN = (I_N, child-pointer_N) in PN.
  If N.entries < m Then
    Delete EN from PN
    Add N to QS
  Else
    Adjust I_N so that it tightly encloses all entry regions in N.
  End If
  CondenseTree (PN, QS)
43. Else If N is root AND QS is NOT ∅ Then
  For each Q ∈ QS Do
    For each E ∈ Q Do
      If Q is leaf Then Insert (E)
      Else Insert (E) as a node entry at the same node level as Q
      End If
    Next E
  Next Q
End If
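The upward pass of CondenseTree can be sketched as a loop. This is a partial illustration under assumptions: the `Node` class with parent pointers and the helper names are hypothetical, and the final re-insertion of the orphaned entries is left to the caller, since it depends on an Insert routine that is not part of this sketch.

```python
class Node:
    def __init__(self, leaf, entries=None, parent=None):
        self.leaf = leaf
        self.entries = entries if entries is not None else []
        self.parent = parent

def mbr(rects):
    rects = list(rects)
    return tuple((min(r[k][0] for r in rects), max(r[k][1] for r in rects))
                 for k in range(len(rects[0])))

def attach(parent, child):
    """Helper for building the sample tree: link child under parent."""
    child.parent = parent
    parent.entries.append((mbr(r for r, _ in child.entries), child))

def condense_tree(n, m, qs=None):
    """Walk from n to the root; detach underfull nodes into qs.

    Returns (root, qs). The entries of every node in qs still have to be
    re-inserted by the caller.
    """
    qs = [] if qs is None else qs
    while n.parent is not None:
        p = n.parent
        i = next(k for k, (_, c) in enumerate(p.entries) if c is n)
        if len(n.entries) < m:
            del p.entries[i]          # Delete EN from PN
            qs.append(n)              # Add N to QS
        else:
            p.entries[i] = (mbr(r for r, _ in n.entries), n)   # tighten I_N
        n = p
    return n, qs

leaf1 = Node(True, [(((0, 1), (0, 1)), "a"), (((2, 3), (2, 3)), "b")])
leaf2 = Node(True, [(((5, 6), (5, 6)), "c"), (((7, 8), (7, 8)), "d"),
                    (((9, 10), (9, 10)), "e")])
root = Node(False)
attach(root, leaf1)
attach(root, leaf2)

leaf1.entries.pop()                         # delete b; leaf1 now underflows (m = 2)
top, orphans = condense_tree(leaf1, m=2)
print(orphans == [leaf1], len(root.entries))   # True 1
```

After this call the surviving entry a sits in the eliminated node `leaf1` and would be re-inserted as a new entry. If the root is left with a single child, the tree is shortened as a separate step.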
44. Why "re-insert" orphaned entries?
Alternatively, like the delete routine in B-Tree (Rosenberg & Snyder, 1981), an "underflow" node can be merged with whichever adjacent sibling will have its area increased the least, or its entries re-distributed among sibling nodes.
Both methods can cause the nodes to split.
Eventually all changes need to be propagated upwards, anyway.
Re-insertion accomplishes the same thing, and:
It is simpler to implement (and at comparable efficiency).
It incrementally refines the spatial structure of the tree.
It prevents the gradual deterioration that would occur if each entry were located permanently under the same parent node.
45. A high value of m, nearer to M, is useful when the underlying database represented by the R-Tree is mostly used for search inquiries with very few updates.
The height of the tree will be kept to a minimum.
High search performance is maintained.
However, the risk of overflow and underflow is also high.
A small value of m is good when frequent updates and modifications of the underlying database are required.
The nodes are less dense.
Maintenance is less costly.
46. R-Tree Variants: R+-Tree
Avoids multiple paths during searching.
Objects may be stored in multiple nodes.
MBRs of nodes at the same tree level do not overlap.
On insertion/deletion the tree may change downward or upward in order to maintain the structure.
48. R-Tree Variants: Hilbert R-Tree
Similar to other R-Trees except that the Hilbert value of its rectangle centroid is calculated.
That key is used to guide the insertion.
On an overflow, entries are divided evenly between two nodes.
Experiments have shown that this scheme significantly improves performance and decreases insertion complexity.
The Hilbert R-tree achieves up to 28% saving in the number of pages touched compared to the R*-tree.
49. R-Tree Variants: Hilbert value
The Hilbert value of an object is found by interleaving the bits of its x and y coordinates, and then chopping the binary string into 2-bit strings.
Then, for every 2-bit string, if the value is 0, we replace every 1 in the original string with a 3, and vice-versa.
If the value of the 2-bit string is 3, we replace all 2's and 0's in a similar fashion.
After this is done, all the 2-bit strings are put back together and the decimal value of the resulting binary string is computed; this is the Hilbert value of the object.
http://www-users.cs.umn.edu/research/shashi-group/CS8715/exercise_ans.doc
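For experimenting with Hilbert values, here is a sketch that uses a different but widely known formulation: it walks the grid quadrant by quadrant and rotates the coordinates at each level, instead of rewriting 2-bit strings as described above. It is a swapped-in technique, not the slide's procedure, and its orientation of the curve may differ from the one the slide assumes. The function name and the grid-size parameter `n` (a power of two) are assumptions.

```python
def hilbert_value(n, x, y):
    """Position of cell (x, y) along a Hilbert curve filling an n-by-n grid.

    Assumes n is a power of two and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)    # which quadrant, in curve order
        if ry == 0:                     # rotate/flip so the sub-curve lines up
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d

# Curve order on a 2x2 grid
print([hilbert_value(2, x, y) for x, y in [(0, 0), (0, 1), (1, 1), (1, 0)]])
# [0, 1, 2, 3]
```

In a Hilbert R-Tree, the value computed for the centroid of a rectangle would serve as the ordering key; cells that are consecutive along the curve are always neighbours in the grid, which is why the key tends to keep nearby objects together.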
50. R-Tree Variants: R*-Tree
Proposed by Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, and Bernhard Seeger in 1990.
Same algorithm as the regular R-tree for query and delete operations.
When inserting, the R*-tree uses a combined strategy.
For leaf nodes, overlap is minimized.
For inner nodes, enlargement and area are minimized.
When splitting, the R*-tree uses a topological split that chooses a split axis based on perimeter, then minimizes overlap.
In addition to an improved split strategy, the R*-tree also tries to avoid splits by reinserting objects and subtrees into the tree, inspired by the concept of balancing a B-tree.
51. MBR: Minimum Bounding Rectangle
R-Trees are an extremely compelling data structure for spatial data.
Largely based on the B-Tree (can be considered a generalization of the B-Tree).
Can support more than two dimensions.
Support the same basic operations (deletion, searching, insertion, update, etc.).
Many variants of R-Trees are available.