Advanced data structures vol. 1

Instructor
Mr. S.Christalin Nelson
AP(SG)/SoCSE
ADVANCED DATA STRUCTURES

At a Glance
• Overview
• Hash Tables
• Tries
• Binary Trees
• Binary Search Trees
• Threaded Binary Trees
• AVL Trees
17-Apr-17
INFO121 - Advanced Data Structures
Instructor: Mr.S.Christalin Nelson
2 of 96

Overview of Data Structures (1/8)
• Data structures provide a means to store, organize, fetch,
and manage large amounts of data efficiently.
• Examples of Data Structures
– Default Types: Array, Structure, Union
– Other Basic Types: Stack, Queue, Linked List
– Advanced Types: Hash Tables, Trees, Graphs
• Efficient data structures are key to designing efficient algorithms.
• Different data structures are suited to different applications.
• Some data structures are highly specialized to specific tasks.
– Examples:
• Database Implementation (B-trees)
• Compiler Implementation (Hash Tables)

• Arrays
– Stores multiple data elements of similar data type in
contiguous memory locations.
– Note: Usage of Arrays (Refer previous lectures)
– Logical memory representation of 1-D Array
• A - Array Name , X - Array Starting Memory Location, V - Size of
each element in Bits

• Structures
– Stores multiple data elements of different data types in
contiguous memory locations (padding between members depends on the system).
– Note: Usage of Structures (Refer previous lectures)
– Memory Layout of “struct PART”
struct COST {
    int amount;
    char currency_type;
};
struct PART {
    struct COST cost;   /* nested structure */
    char id[2];
    int num_avail;
};
[Figure: memory layout of struct PART, in order: cost (amount, currency_type), id, num_avail]

• Unions
– Stores multiple data elements of different types.
– Note: Usage of Unions (Refer previous lecture)
– Memory Allocation
• The total memory allocated for a union variable equals the
memory required by its largest member, since all members
share the same storage.
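The allocation rule can be checked with sizeof. This is a minimal sketch; the union VALUE and the helper largest_member_size() are hypothetical names introduced only for illustration, and a compiler may round the union size up for alignment, so the check uses "at least".

```c
#include <stddef.h>

/* Hypothetical union for illustration: all members share the same storage. */
union VALUE {
    char   c;
    int    i;
    double d;
};

/* Size of the largest member of union VALUE. */
size_t largest_member_size(void)
{
    size_t max = sizeof(char);
    if (sizeof(int) > max)    max = sizeof(int);
    if (sizeof(double) > max) max = sizeof(double);
    return max;
}
```

Comparing sizeof(union VALUE) with largest_member_size() shows that the union is far smaller than a structure holding the same three members would be.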

• Stack
– Stores a set of elements in a particular order (Last in, First out).
– Primitive Operations: Push, Pop
– Note: Access to elements (vs. Arrays)
• Arrays - any element can be accessed
• Stack - only top element can be accessed
– Implemented with Arrays (Static), List (Dynamic) & ADTs
– Applications:
• Parenthesis matcher (Balancing symbols)
• Infix to Postfix & Postfix to Infix Expressions, etc.
[Figure: stack contents after successive Push and Pop operations; "top" marks the most recently pushed element]
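The Push/Pop primitives and the parenthesis-matcher application can be sketched with a static array. This is only one possible layout; STACK_MAX and all function names are assumptions made for this example.

```c
#include <stdbool.h>

#define STACK_MAX 100   /* assumed capacity for this sketch */

struct stack {
    char items[STACK_MAX];
    int  top;           /* index of the top element, -1 when empty */
};

void stack_init(struct stack *s) { s->top = -1; }
bool stack_empty(const struct stack *s) { return s->top == -1; }

/* Push: returns false if the stack is full. */
bool push(struct stack *s, char c)
{
    if (s->top == STACK_MAX - 1) return false;
    s->items[++s->top] = c;
    return true;
}

/* Pop: returns false if the stack is empty. */
bool pop(struct stack *s, char *out)
{
    if (stack_empty(s)) return false;
    *out = s->items[s->top--];
    return true;
}

/* Returns true if every (, [ and { in expr is closed in the right order. */
bool is_balanced(const char *expr)
{
    struct stack s;
    char open;
    stack_init(&s);
    for (; *expr != '\0'; expr++) {
        char c = *expr;
        if (c == '(' || c == '[' || c == '{') {
            if (!push(&s, c)) return false;      /* nesting deeper than STACK_MAX */
        } else if (c == ')' || c == ']' || c == '}') {
            if (!pop(&s, &open)) return false;   /* closer without an opener */
            if ((c == ')' && open != '(') ||
                (c == ']' && open != '[') ||
                (c == '}' && open != '{')) return false;
        }
    }
    return stack_empty(&s);   /* leftover openers mean the symbols are unbalanced */
}
```

Only the top element is ever read or written, which is exactly the access restriction that separates a stack from an array.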

• Queue
– Stores a set of elements in a particular order (First in, First out).
– Primitive Operations: Enqueue, Dequeue
– Implemented with Arrays (Static), List (Dynamic) & ADT
– Variants
• Circular Queue
• Double-Ended Queue (Deque)
• Priority Queue
– Real-time Applications
• Server requests (Instant messaging servers queue up incoming
messages, Database requests)
• Print queues (One printer for dozens of computers)
• OS use queues to schedule CPU jobs
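A circular queue over a static array is one way to realise Enqueue and Dequeue. The sketch below is illustrative only; QUEUE_MAX and the function names are assumptions, not part of the lecture material.

```c
#include <stdbool.h>

#define QUEUE_MAX 5     /* assumed capacity for this sketch */

struct queue {
    int items[QUEUE_MAX];
    int front;          /* index of the oldest element */
    int count;          /* number of stored elements */
};

void queue_init(struct queue *q) { q->front = 0; q->count = 0; }

/* Enqueue at the rear; returns false if the queue is full. */
bool enqueue(struct queue *q, int value)
{
    if (q->count == QUEUE_MAX) return false;
    q->items[(q->front + q->count) % QUEUE_MAX] = value;  /* wraps around */
    q->count++;
    return true;
}

/* Dequeue from the front; returns false if the queue is empty. */
bool dequeue(struct queue *q, int *out)
{
    if (q->count == 0) return false;
    *out = q->items[q->front];
    q->front = (q->front + 1) % QUEUE_MAX;
    q->count--;
    return true;
}
```

Elements leave in the order they arrived, and the modulo arithmetic lets the rear reuse slots freed at the front.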

• Linked list
– Linear collection of nodes connected by pointer links and
accessed via the link-pointer member of the current node.
– Link pointer in the last node is set to null to mark the list’s end
– Use a linked list (instead of an array)
• Unpredictable number of data elements
• List needs to be sorted/grow quickly
– Variants
• Single & Double Linked List
• Circular Linked List (Single/Double)
[Figure: a node, consisting of a data field and a pointer field]

• Array vs. Linked lists
– Linked lists are more complex to code and manage than arrays,
but they have some distinct advantages.
• Dynamic: A linked list can easily grow and shrink in size
– In contrast, the size of a C++ array is fixed at compilation time
• Easy and fast insertions and deletions
– With a linked list, it is not required to move other nodes. It is ONLY
required to reset some pointers.
– In contrast, with an array, existing elements must be shifted
to make room for a new element (insertion) or to close the
gap left by a deleted element.
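The pointer-only insertion and deletion can be sketched for a singly linked list as follows. The names lnode, list_push_front, list_remove and list_length are hypothetical and chosen for this example.

```c
#include <stdlib.h>

struct lnode {
    int data;
    struct lnode *next;   /* NULL marks the end of the list */
};

/* Insert at the front; returns the new head (the old head if allocation fails). */
struct lnode *list_push_front(struct lnode *head, int data)
{
    struct lnode *n = malloc(sizeof *n);
    if (n == NULL) return head;
    n->data = data;
    n->next = head;       /* only one pointer is set; no node moves */
    return n;
}

/* Remove the first node holding data; returns the (possibly new) head. */
struct lnode *list_remove(struct lnode *head, int data)
{
    struct lnode **link = &head;          /* the link that points at the current node */
    while (*link != NULL && (*link)->data != data)
        link = &(*link)->next;
    if (*link != NULL) {
        struct lnode *victim = *link;
        *link = victim->next;             /* bypass the node by resetting one pointer */
        free(victim);
    }
    return head;
}

/* Number of nodes in the list. */
int list_length(const struct lnode *head)
{
    int n = 0;
    for (; head != NULL; head = head->next) n++;
    return n;
}
```

Neither operation copies or shifts the other elements, which is the advantage over an array described above.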

Hashing
• Hashing is a method of storing elements in a table in a way
that reduces the time for search.
• Elements are assumed to be records with several fields.
– One of the fields is called the "key" and is used for search.
• Idea: Map the keys to indices in an array (table)
– Array elements are accessed by index.
– Once the mapping is defined, each record can be stored at
the index its key maps to. Thus each element can be found
with one operation only.

Direct Addressing
• The most elementary form of hashing.
• Assumption:
– Direct one-to-one correspondence between the keys and
numbers 0, 1, …, m-1. Here, m is not very large.
• Restrictions
– Keys must be integer
– Range of keys must be small
• Pros & Cons
– Searching is fast, but there is cost – array size is determined by
the largest key. Not very useful if only a few keys are widely
distributed.
• Note: A hash table is a generalization of the direct-addressing
table that removes these restrictions.

Hash Functions (1/4)
• It is a function that transforms the search key into a number
(table address) within a predetermined interval. These
numbers are then used as indices in an array (hash table) to
store the records (keys and pointers).
• For an m-sized hash table, a possible hash function h(key) can
be h(key) = key % m.
– Note: Here, “key” can be a number or a string.
• Case 1: A key can be a number
– Hash function, h(key) will map each key (number) into a
suitable number/index within the interval [0, m-1]. The key is
then stored in the hash table corresponding to the index.

• Case 2: A key can be a string of characters
– Consider alternative representation (e.g. binary) of a key as a
number and then apply the hash function.
• If each character is represented with p bits, then the string (key)
can be treated as a base-2^p number.
– Example: Find the hash code of the key/string “AKEY”. Assume: each
character is represented by 5 bits, the size of the table is 5, and the
hash function is h(key) = key % m.
• The string is treated as a base-32 (i.e. 2^5) number.
• Consider the decimal representation: A = 1, B = 2, ..., Z = 26.
• Now, key = (1 x 32^3) + (11 x 32^2) + (5 x 32^1) + (25 x 32^0)
= 32768 + 11264 + 160 + 25 = 44217.
• Calculate the hash code with h(key) = key % m = 44217 % 5 = 2.
• Insert the given key/string “AKEY” at index [2] of the
hash table.
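The string-to-number conversion of this example can be sketched as below, assuming keys made only of uppercase letters with A = 1, ..., Z = 26. The function names key_value and hash_key are hypothetical.

```c
/* Treat an uppercase string as a base-32 number with A = 1, ..., Z = 26. */
unsigned long key_value(const char *s)
{
    unsigned long key = 0;
    for (; *s != '\0'; s++)
        key = key * 32 + (unsigned long)(*s - 'A' + 1);   /* Horner's rule */
    return key;
}

/* h(key) = key % m for a table of m slots. */
unsigned long hash_key(const char *s, unsigned long m)
{
    return key_value(s) % m;
}
```

Calling key_value("AKEY") evaluates (1 x 32^3) + (11 x 32^2) + (5 x 32) + 25, and hash_key("AKEY", 5) reduces that value modulo the table size.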

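• Sample Code: Hashing an n-character key (a string ‘tab’) with
the hash function hash() yields an index (h). The key (string)
is then stored at index (h) of the m-sized hash table.
int hash (const char tab[], int n, int m)
{
    int h = 0;
    int i;
    /* Horner's rule in base 32; reducing modulo m at each step keeps h small */
    for (i = 0; i < n; i++)
    {
        h = (32 * h + tab[i]) % m;   /* tab[i] is the character code */
    }
    return h;   /* index in the interval [0, m-1] */
}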

• Properties of a Good Hash Function
– Makes use of all information provided by the key.
– Uniformly distribute records across the hash table [reduced
length of linked lists]
– Maps similar keys to very different hash values.
– Uses very fast operations.

Hash Tables (1/2)
• Assumption:
– (1) Size of table (M) can be different from the number of
records/keys to be stored (N).
– (2) M should be a prime number to obtain a more even distribution
of the keys over the table.
– (3) Suitable Hash function should be defined.
• Hence, the integer (0 to M-1) generated by the hash function is
used as an index in a hash table of M elements.
• Note: Selection of M depends on the collision resolution method
opted for.
• Load factor (λ) = N/M

Hash Tables (2/2)
• Operations on Hash Table
– Assumption: A hash function is defined prior to any operation.
– Insert record
• (1) A hash function is used to generate an address/index for each
key value.
• (2) Insert the key value in the corresponding address/index.
– Search for record
• (1) Generate an address/index for each key value using the same
Hash function.
• (2) Search is successful if a matching key value is available in the
corresponding address/index of the hash table.
– Delete record (with a given key)
• (1) Search the key to be deleted.
• (2) Delete the record if search is successful.

Collision Resolution Techniques
• Collision
– Ideally, a hash function maps each key to a unique index.
A collision occurs when the hash function maps more than one
key to the same index/slot.
• Collision Resolution Techniques
– Open Addressing Method [Invented by A. P. Ershov and W. W.
Peterson, 1957]
• Linear Probing
• Quadratic Probing
• Double Hashing
– Separate Chaining Method [Invented by H. P. Luhn, IBM
engineer, January 1953]

Open Addressing (1/5)
• The keys that hash to produce the same address/index are
kept in the hash table itself by probing.
• Probing is done as follows:
– h_i(x) = (hash(x) + f(i)) mod TableSize
– where:
• h_i(x) is the index in the table at which to insert x, hash(x) is the
hash function, f(i) is the collision resolution function, and i is the
current attempt to insert the element
• Open addressing methods are distinguished by the type of f(i)
– Linear probing : f(i) = i
– Quadratic probing : f(i) = i^2
– Double hashing : f(i) = i * hash2(x)
• where hash2(x) is a second hash function and the value of ‘i’ starts at zero

• Insertion
– When a collision occurs, examine successive table entries
after the collided position (call it p) using the function f(i). If a
slot is unoccupied the key is stored there; otherwise, continue probing.
All positions are taken mod M.
• Linear Probing as collision resolution technique, i.e. f(i) = i
– If position p is occupied, then check position p+1. If position p+1
is also occupied then check position p+2, next check p+3,
etc.
• Quadratic Probing as collision resolution technique, i.e. f(i) = i^2
– If position p is occupied, then check position p+1^2. If position p+1^2
is also occupied then check position p+2^2, next check
p+3^2, etc.
• Double Hashing as collision resolution technique, i.e. f(i) = i * hash2(x)
– hash2(x) should be chosen so that the increment & M are co-prime.
Otherwise there will be slots that would remain unexamined.

• Search
– A key hashes to a position. Remember position.
– Search for key in this position – If match – Search is successful
– There is no match in this position – probe the next position.
• When the end of the table is reached, probing continues from
the beginning, until the original starting position is reached.
– Empty position – unsuccessful search

• Disadvantage
– Primary clustering: Large clusters tend to build up if an empty
slot is preceded by filled slots.
– Linear probing runs slowly for nearly full tables.
• Example
– Construct a Hash table for given keys [A, S, E, A, R, C, H, I, N, G,
E, X, A, M, P, L, E] using Linear Probing. Assume M=19.
Key        A S E A R  C H I N  G E X A M  P  L  E
Hash Value 1 0 5 1 18 3 8 9 14 7 5 5 1 13 16 12 5
Resulting table (empty slots shown as -):
Index 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
Key   S A A C A E E G H I X  E  L  M  N  -  P  -  R
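The linear probing example can be reproduced with a short sketch. It assumes single uppercase letters as keys, hashed by their position in the alphabet (A = 1) modulo M = 19, which matches the hash values listed in the example; the names hash_letter, lp_insert and lp_search are hypothetical.

```c
#define M 19            /* table size from the example */
#define EMPTY '\0'      /* marker for an unoccupied slot */

/* Hash of one uppercase letter: position in the alphabet (A = 1) mod M. */
int hash_letter(char key) { return (key - 'A' + 1) % M; }

/* Insert with linear probing, f(i) = i; returns the slot used, or -1 if the table is full. */
int lp_insert(char table[M], char key)
{
    int home = hash_letter(key);
    for (int i = 0; i < M; i++) {
        int slot = (home + i) % M;       /* wrap around at the end of the table */
        if (table[slot] == EMPTY) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}

/* Search: returns the slot of the first match, or -1 once an empty
   slot is met or the whole table has been probed. */
int lp_search(const char table[M], char key)
{
    int home = hash_letter(key);
    for (int i = 0; i < M; i++) {
        int slot = (home + i) % M;
        if (table[slot] == EMPTY) return -1;
        if (table[slot] == key)   return slot;
    }
    return -1;
}
```

Inserting the letters of the example in order into a zero-initialised table shows how later keys that hash to 1 or 5 drift several slots away from their home position, which is the primary clustering mentioned above.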

• Time to hash a key to an address in table is a constant O(1).
– If there were no collisions, the search/insert time is O(1).
– In case of collisions we will have to count all positions in the
hash table that have to be probed in order to find the wanted
record. Hence,
• For unsuccessful search, the expected run time is O(1/(1 - λ))
• For successful search, the expected run time is O((1/λ) ln(1/(1 - λ)))
• Note:
– Search/insertion gets faster as λ decreases (open addressing requires λ ≤ 1).
– Good strategy: λ < 0.5 (i.e. M > 2N)
– If the table is close to full (i.e. for max. N) then, search time
grows and may become equal to M (table size).
– If λ > 0.5, then re-hashing is preferable [with bigger table]

Separate Chaining Method (1/4)
• The keys that hash to produce the same address/index are
kept in linked-lists attached to that address.
– Hash table has M lists with M list header nodes.
– List search/insert procedures are used for sequential search.
• Choice of M (Table Size or No. of indices)
– This method suits cases where the number of keys (N) cannot be
predicted in advance, so the choice of M basically depends on
other factors such as available memory.
– Typically M is chosen to be relatively small so as not to use up
a large area of contiguous memory, but large enough to have
short lists for more efficient sequential search.
– In general, M can vary from 0.1N to N. i.e. λ = 1 to 10.

• Example:
– Construct a Hash table for given keys [A, S, E, A, R, C, H, I, N, G,
E, X, A, M, P, L, E] using separate chaining technique. Assume
M=11. [Note: Each character is a key]
Key        A S E A R C H I N G E X A M P L E
Hash Value 1 8 5 1 7 3 8 9 3 7 5 2 1 2 5 1 5
Index  List (most recently inserted key first)
0      (empty)
1      L → A → A → A
2      M → X
3      N → C
4      (empty)
5      E → P → E → E
6      (empty)
7      G → R
8      H → S
9      I
10     (empty)
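The chaining example can be sketched with one list header per index and insertion at the head of the list, which is what produces the "newest key first" order seen in the lists. The names NBUCKETS, chain_node, chain_hash, chain_insert and chain_search are assumptions made for this example.

```c
#include <stdlib.h>

#define NBUCKETS 11     /* M from the example */

struct chain_node {
    char key;
    struct chain_node *next;
};

/* Hash of one uppercase letter: position in the alphabet (A = 1) mod M. */
int chain_hash(char key) { return (key - 'A' + 1) % NBUCKETS; }

/* Insert at the head of the list for the key's index; returns 0 on success. */
int chain_insert(struct chain_node *table[NBUCKETS], char key)
{
    struct chain_node *n = malloc(sizeof *n);
    if (n == NULL) return -1;
    int h = chain_hash(key);
    n->key = key;
    n->next = table[h];
    table[h] = n;
    return 0;
}

/* Sequential search within one list; returns the node or NULL. */
struct chain_node *chain_search(struct chain_node *table[NBUCKETS], char key)
{
    struct chain_node *n = table[chain_hash(key)];
    while (n != NULL && n->key != key)
        n = n->next;
    return n;
}
```

A search only walks the list attached to one index, so its cost depends on that list's length rather than on the total number of keys.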

• Advantages
– Used when memory availability is of concern.
– This method is useful for highly dynamic situations, where
• The number of keys/records (N) to be stored cannot be predicted
in advance.
– Simpler to implement.
• Disadvantage
– It requires additional time for list processing.
– In case of unevenly distributed keys there may be long lists and
many empty spaces in the table.

• Analysis of Separate Chaining Method
– The time to compute the index of a given key is constant, O(1).
Thereafter the search for a record in list starts. Hence, the time
depends on the list length (On average list length is λ).
– Runtime complexity of separate chaining is O(λ).
• On average the number of comparisons in
– Unsuccessful searches is λ (search up to the end of list)
– Successful search is λ/2.

Extendible Hashing (1/2)
• Used when the amount of data/key is too large to fit in main
memory and external storage is used.
• Search in Extendible Vs. Ordinary hashing
– In extendible hashing no more than two blocks are examined.
– In contrast, ordinary hashing would examine several disk
blocks and is also a time consuming process.
• Procedure
– Keys are grouped (wrt first m-bits) and each group is stored in
one disk block.
– If the block becomes full and no more records can be inserted,
certain groups can be split into two, and first (m+1) bits are
considered to determine the location of a record.

Extendible Hashing (2/2)
• Example:
– (Q1-a) Find the no. of blocks required & Store the keys [00010,
01001, 10001, 11000, 00100, 01010, 01100, 10100, 11010] using
Extendible hashing. Each block can hold only 3 records.
• Assumption: Use first 2 bits [00/01/10/11] of key for grouping.
– [00010, 00100], [01001, 01010, 01100], [10001, 10100], [11000,
11010]
• Hence, 4 disk blocks are required to store these groups.
– (Q1-b) Using Extendible hashing, Insert new key 01011.
• 01011 should be stored in Block-2. But Block-2 is full. Hence re-
grouping is mandatory.
• Regroup using first 3 bits [000/001/010/011/100/101/110/111]
– [00010, 00100], [01001, 01010, 01011], [01100], [10001, 10100],
[11000, 11010]
• Hence, 5 disk blocks are required to store these groups.

Tries (1/4)
• Why one more Data Structure?
• Few data structures handle the mapping of key-value pairs
– Array
• Key is the index, Value is the data at that location.
– Hash Table
• Key is the hash code of data, Value is the linked list of data
hashing to that hash code.
• Combination of Arrays and Linked Lists to store data.
• Complexity in designing a perfect hash function that does not
lead to collision (no guarantee).
• Can we have operations in constant time, i.e. O(1)?

Tries (2/4)
• What’s new in a Trie?
– Combination of Structures and Pointers to store data.
– No Collision.
– Key is guaranteed to be unique, value of which could be simple
as a boolean that says whether data is available in the structure.
• Operations
– Insertion: Build the correct path from root to leaf.
– Search: Traverse the path from root to leaf and compare data.

Tries (3/4)
• Example
– Try to map key-value pairs where keys are 4-digit years [YYYY]
and values are names of verticals of CIT founded during these
years.
– Every branch (path) from a central root node to a leaf node
(Vertical name) would be labelled with digits of the year.
– Each node on the path from root to leaf could have 10 pointers
emanating from it, one for each digit.
– Sample code for creation of a new node (try1)
typedef struct trys
{
char vertical[20]; //data
struct trys *path[10]; //array of pointer
}try1;

Tries (4/4)
• Example (contd.)
– Insert BFSI, key is 1636.
– Insert BAO, key is 1701.
– Insert ECRA, key 2004.
– Search for ECRA with key 2004.
– Search for OGI with key 2010.
• Advantage: Constant time operations.
• Disadvantage: Consumes more space.
• Note:
– Why 10 pointers per node?
– Consider creating a dictionary using tries.
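The insert and search operations of this example can be sketched on top of the node type try1 shown earlier. The functions trie_new_node, trie_insert and trie_search are hypothetical names; the sketch assumes keys are strings of decimal digits, uses an empty string in vertical to mean "no value stored here", and relies on calloc producing NULL pointers, which holds on common platforms.

```c
#include <stdlib.h>
#include <string.h>

typedef struct trys
{
    char vertical[20];        /* data; empty string means no value at this node */
    struct trys *path[10];    /* one pointer per digit 0-9 */
} try1;

/* Allocate a node with no data and no children. */
try1 *trie_new_node(void)
{
    return calloc(1, sizeof(try1));
}

/* Store name under a key of decimal digits, e.g. "2004"; returns 0 on success. */
int trie_insert(try1 *root, const char *key, const char *name)
{
    try1 *cur = root;
    for (; *key != '\0'; key++) {
        int d = *key - '0';
        if (d < 0 || d > 9) return -1;            /* not a digit */
        if (cur->path[d] == NULL) {
            cur->path[d] = trie_new_node();       /* build the missing part of the path */
            if (cur->path[d] == NULL) return -1;  /* out of memory */
        }
        cur = cur->path[d];
    }
    strncpy(cur->vertical, name, sizeof cur->vertical - 1);
    cur->vertical[sizeof cur->vertical - 1] = '\0';
    return 0;
}

/* Follow the digits of key from the root; returns the stored name or NULL. */
const char *trie_search(const try1 *root, const char *key)
{
    const try1 *cur = root;
    for (; *key != '\0' && cur != NULL; key++) {
        int d = *key - '0';
        if (d < 0 || d > 9) return NULL;
        cur = cur->path[d];
    }
    if (cur == NULL || cur->vertical[0] == '\0') return NULL;
    return cur->vertical;
}
```

Each operation follows at most one pointer per digit of the key, so its cost depends on the key length (4 here) and not on how many keys are stored.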

Terminologies (1/3)
• Root
• Parent
• Child
• Leaf
• Subtree
• Level
• Key
• Degree
• Size
• Internal nodes

Terminologies (2/3)
• Path: Sequence of adjacent elements.
• Path length: No. of adjacent connections.
• Depth of node: Length of its path up to root.
• Height of tree: Greatest depth among all its nodes.
• Empty Tree & Singleton Tree
• Degenerate or pathological tree (Degree = 1)
• Full Tree: Internal nodes have same degree.
• Perfect Tree: Full Tree with all leaf nodes in same depth/level
• Complete Tree
– Full tree with all leaf nodes located as far left as possible.
– Has a provision to be converted to a Full Tree by adding leaves
in an uninterrupted way (satisfies “natural mapping” into an
array).

Terminologies (3/3)
• Examples

Binary Tree (1/2)
• Trees are data structures used for data storage purposes.
• Specific Condition: each node has at most two children (degree ≤ 2).
• A binary tree has the benefits of both an ordered array and a
linked list
– Insertion/Deletion operations are as fast as in a linked list.
– Search operation is as quick as in a sorted array.
• Definition of a Tree node
struct node
{
int data;
struct node *leftChild, *rightChild;
};

Binary Tree (2/2)
• Properties of Binary Tree
– In a Full binary tree of height (h)
• No. of nodes (n) = at least 2h+1 & at most 2^(h+1) – 1
• A tree consisting of only a root node has a height of 0.
– In a Perfect binary tree
• No. of leaf nodes (l) = (n+1)/2 = 2^h
• No. of nodes (n) = 2l – 1 = 2^(h+1) – 1
• No. of Internal nodes = l – 1
– In a Complete binary tree of n nodes
• No. of Internal nodes = floor value of (n/2)
– For any non-empty binary tree with leaf nodes (n0) and nodes
of degree-2 (n2), n0 = n2 + 1

Storage of Binary Tree in Array (1/2)
• Storage of Binary trees in Array by Natural Mapping
– Root is located at index 0 and hence, a node located at index[i]
has its
• Left child is located at index [2i+1]
• Right child is located at index [2i+2]
• Parent is located at index [floor of (i-1)/2]
– This method benefits from more compact storage and better
locality of reference, particularly during a preorder traversal.
– This method of storage is often used for binary heaps.
– No space is wasted because nodes are added in breadth-first
order.
– A Complete binary tree wastes no space.
– However, it is expensive to grow and wastes space
proportional to (2^h - n) for a tree with n nodes and depth h.
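The index arithmetic of natural mapping can be written as three small helpers. This is a sketch for a 0-based array with the root at index 0; the function names are hypothetical, and the parent helper relies on C integer division rounding down for non-negative values.

```c
/* Natural mapping of a binary tree into a 0-based array (root at index 0). */
int left_child(int i)  { return 2 * i + 1; }
int right_child(int i) { return 2 * i + 2; }

/* Valid for i >= 1; the root (i == 0) has no parent. */
int parent(int i)      { return (i - 1) / 2; }
```

For a seven-element array such as {27, 14, 35, 10, 19, 31, 42}, the children of index 1 sit at indices 3 and 4, and both index 5 and index 6 report index 2 as their parent.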

Storage of Binary Tree in Array (2/2)
• Example:
– Store the elements of the tree into an array. (Hint: use natural
mapping).
– Array contents (index 0 to 6): 27 14 35 10 19 31 42

Binary Tree Traversal (1/8)
• Process of visiting all the nodes of a tree, starting from the
root node, to search/locate a given item/key in the tree or
to print all the values it contains.
– Note: Nodes cannot be accessed randomly.
• Types of Binary Tree Traversal
– Breadth First Traversal (based on levels; process nodes
left to right)
– Depth First Traversal (based on processing descendants before
moving to next child)
• Inorder
• Preorder
• Postorder

• Inorder Traversal (1/2)
– Until all nodes are traversed
• (1) Recursively traverse left subtree.
• (2) Visit root node.
• (3) Recursively traverse right subtree.
– Note: Every node may represent a subtree itself.

• Inorder Traversal (2/2)
• D → B → E → A → F → C → G
[Figure: example tree with root A; the traversal starts at the leftmost leaf (D) and ends at the rightmost leaf (G)]

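• Preorder Traversal (1/2)
– Until all nodes are traversed
• (1) Visit root node.
• (2) Recursively traverse left subtree.
• (3) Recursively traverse right subtree.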

• Preorder Traversal (2/2)
• A → B → D → E → C → F → G
[Figure: example tree with root A; the traversal starts at the root (A) and ends at the rightmost leaf (G)]

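• Postorder Traversal (1/2)
– Until all nodes are traversed
• (1) Recursively traverse left subtree.
• (2) Recursively traverse right subtree.
• (3) Visit root node.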

• Postorder Traversal (2/2)
• D → E → B → F → G → C → A
[Figure: example tree with root A; the traversal starts at the leftmost leaf (D) and ends at the root (A)]

• Question-1: Find the Inorder, Preorder, and Postorder
traversal patterns of the given expression tree.
[Figure: expression tree, written as node(left, right): -(*(+(A, /(B, C)), D), E)]
Preorder: – * + A / B C D E
Inorder: A + B / C * D – E
Postorder: A B C / + D * E –
(These are the Prefix, Infix & Postfix expressions respectively.)
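The three depth-first traversals differ only in when the root is visited. The sketch below follows the struct node layout used for binary trees in this deck, but stores a char in data so that operators and operands fit in one field; that change and the output-buffer interface are assumptions of this example.

```c
#include <stddef.h>

struct node
{
    char data;                               /* operator or operand label */
    struct node *leftChild, *rightChild;
};

/* Each function writes the visited labels to out and returns the next free position. */
char *preorder(const struct node *t, char *out)
{
    if (t == NULL) return out;
    *out++ = t->data;                        /* root first */
    out = preorder(t->leftChild, out);
    return preorder(t->rightChild, out);
}

char *inorder(const struct node *t, char *out)
{
    if (t == NULL) return out;
    out = inorder(t->leftChild, out);
    *out++ = t->data;                        /* root between the two subtrees */
    return inorder(t->rightChild, out);
}

char *postorder(const struct node *t, char *out)
{
    if (t == NULL) return out;
    out = postorder(t->leftChild, out);
    out = postorder(t->rightChild, out);
    *out++ = t->data;                        /* root last */
    return out;
}
```

The caller supplies a buffer with room for one character per node and terminates the string at the returned position, for example *inorder(root, buf) = '\0'.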

Binary Search Tree (1/4)
• A Binary tree can be considered as a Binary Search Tree (BST)
when the following special condition is satisfied
– Value (every node in left subtree) < Value (Parent) < Value (every node in right subtree)
• Example
• Operations
– Insert
– Search
– Traversal
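Insert and Search both follow the ordering condition from the root downwards. The following is a minimal sketch using the struct node layout from the Binary Tree slides; bst_insert and bst_search are hypothetical names, and duplicate keys are simply ignored, which is a choice made for this example.

```c
#include <stdlib.h>

struct node
{
    int data;
    struct node *leftChild, *rightChild;
};

/* Insert data into the subtree rooted at root; returns the subtree's root. */
struct node *bst_insert(struct node *root, int data)
{
    if (root == NULL) {
        struct node *n = malloc(sizeof *n);
        if (n != NULL) {
            n->data = data;
            n->leftChild = n->rightChild = NULL;
        }
        return n;
    }
    if (data < root->data)
        root->leftChild = bst_insert(root->leftChild, data);
    else if (data > root->data)
        root->rightChild = bst_insert(root->rightChild, data);
    /* equal keys are ignored in this sketch */
    return root;
}

/* Returns the node holding data, or NULL if it is not in the tree. */
struct node *bst_search(struct node *root, int data)
{
    while (root != NULL && root->data != data)
        root = (data < root->data) ? root->leftChild : root->rightChild;
    return root;
}
```

Each comparison discards one subtree, so both operations take time proportional to the height of the tree.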

• Q1: Elements of Binary Tree are stored in an array in
Breadth-First Order {7, 1, 0, 3, 2, 5, 4, 6, 9, 8, 10}. Create a
BST & find the different traversal patterns.
[Figure: resulting BST, written as node(left, right): 7(1(0, 3(2, 5(4, 6))), 9(8, 10))]
Preorder: 7, 1, 0, 3, 2, 5, 4, 6, 9, 8, 10
Inorder: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Postorder: 0, 2, 4, 6, 5, 3, 1, 8, 10, 9, 7

• Q2: The Preorder Traversal pattern for a BST is 12, 9, 5, 4, 7,
10, 15, 13, 19, 16. Find the Postorder Traversal pattern.
[Figure: resulting BST, written as node(left, right) with - for a missing child: 12(9(5(4, 7), 10), 15(13, 19(16, -)))]
Postorder: 4, 7, 5, 10, 9, 13, 16, 19, 15, 12

• Q3: The Postorder Traversal pattern for a BST is 4, 7, 5, 10, 9,
13, 16, 19, 15, 12. Find the Inorder Traversal pattern.
[Figure: resulting BST, written as node(left, right) with - for a missing child: 12(9(5(4, 7), 10), 15(13, 19(16, -)))]
Inorder: 4, 5, 7, 9, 10, 12, 13, 15, 16, 19

Binary Tree Vs. Threaded Bin. Tree
• In Binary Trees, leaf nodes are not connected further to
other nodes, so their child pointers are unused (Null) and that
space is wasted.
• Example
– [Figure: binary tree, written as node(left, right): 1(10(9, 32), 13(20, 2)), with the Null pointers of its leaf nodes highlighted as wasted memory]
• Threaded Binary Trees make use of these wasted storage
spaces by pointing to other nodes for quick traversal.

Construction
• Store Inorder Predecessor and Successor at Leaf Nodes
• Example: Consider the following Binary Tree
– Inorder Traversal Pattern of given binary tree is: 9, 10, 32, 1,
20, 13, 2
– Note: The pointers of right/left-most nodes can be made to
point to a head node. Root is attached to this head node.
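[Figure: the binary tree, written as node(left, right): 1(10(9, 32), 13(20, 2)), shown before and after threading; the Null pointers that remain belong to the leftmost and rightmost nodes]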

Self-Balancing BSTs (1/4)
• BSTs can become unbalanced/skewed after insert or delete
operations
– E.g. Insertion of sorted elements into a BST

• Self-Balancing BSTs store information pertaining to balance
in their nodes and automatically restore balance at key
insertion/deletion times
– Perform transformations on the tree (i.e. tree rotations) in
order to keep the height or weight within a constant factor of
the lower bound of log2(n).
– Classification:
• Height balanced BST
• Weight balanced BST
• Variants
– Height balanced BSTs: AVL tree, Splay tree, Red Black tree
– Weight balanced BSTs & Adam’s Tree of Bounded Balance

• Comparison
– Height-balanced trees maintain less information than the
weight-balanced trees (WBT).
– WBTs have some extra flexibility that are needed for some
applications.
– Applications of Height Balanced BSTs
• Construct and maintain ordered lists, such as priority queues.
• Key-value pairs are simply inserted with an ordering based on the
key alone, like Associative Arrays.
– Applications of Weight Balanced BSTs
• Dynamic Sets, Dictionaries, and Sequences.

• Vs. Hash Tables
– Advantages of Self-Balancing BSTs
• Allows fast enumeration of the items in key order.
• Better worst-case lookup performance. i.e. O(log n) compared to
O(n).
– Disadvantages of Self-Balancing BSTs
• Lookup algorithms get more complicated when there may be
multiple items with the same key.
• Poor average-case performance. i.e. O(log n) compared to O(1)
because of tree-balancing overhead and cache access patterns.

AVL Tree (1/29)
• Height-balanced BST named after its two Soviet inventors,
Georgy Adelson-Velsky and Evgenii Landis.
• Lookup, insertion, and deletion all take O(log n) time in both
the average and worst cases, where n is the number of
nodes in the tree prior to the operation.
• Most operations on a BST take time directly proportional to
the height of the tree. Hence, height (max. no. of levels
below the root) is kept small.
• Insertions and deletions may require the tree to be
rebalanced by one or more tree rotations.

AVL Tree (2/29)
• Special Property
– The heights of the two child subtrees of any node differ by
at most one (this difference is the node's balance factor). If at any
time they differ by more than one, rebalancing is done to
restore this property.
• Left/Right Heavy Nodes & Balanced Nodes
[Figure: examples of a left heavy node, a right heavy node, and a balanced node]

AVL Tree (3/29)
• Rotations
– Performed for balancing the AVL tree automatically
– Types
• Single Rotation
– Left Rotation
– Right Rotation
• Double Rotation
– Left-Right Rotation
– Right-Left Rotation

AVL Tree (4/29)
• Left Rotations
– Performed when a BST becomes unbalanced and right
skewed/heavy while inserting a node to the right sub-tree of a
right sub-tree.
– Rotation is performed around child of the unbalanced node.
– Example: Insert 4, 5, 6 into AVL tree
• In the following figure consider A=4, B=5, C=6

AVL Tree (5/29)
• Right Rotations
– Performed when a BST becomes unbalanced and left
skewed/heavy while inserting a node to the left sub-tree of a
left sub-tree.
– Rotation is performed around child of the unbalanced node.

AVL Tree (6/29)
• Left-Right Rotations
– Performed when a BST becomes unbalanced while inserting a
node to the right sub-tree of a left sub-tree.
– Rotation is performed in two steps
• (1) Left rotate the Left sub-tree of the unbalanced node.
• (2) Right rotate around the recent child of the unbalanced node.
[Figure: the tree after the first rotation and after the second rotation]

AVL Tree (7/29)
• Right-Left Rotations
– Performed when a BST becomes unbalanced while inserting a
node to the left sub-tree of a right sub-tree.
– Rotation is performed in two steps
• (1) Right rotate the Right sub-tree of the unbalanced node.
• (2) Left rotate around the recent child of the unbalanced node.
[Figure: the tree after the first rotation and after the second rotation]
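The four rotation cases can be sketched as an insertion routine that rebalances on the way back up from the new leaf. This is an illustrative sketch, not the lecture's own code: the names avl_node, rotate_left, rotate_right and avl_insert are hypothetical, heights are stored with a single node counted as 1 purely for convenience, and duplicate keys are ignored.

```c
#include <stdlib.h>

struct avl_node {
    int key;
    int height;                        /* height of this subtree; a single node counts as 1 here */
    struct avl_node *left, *right;
};

static int height(const struct avl_node *n) { return n ? n->height : 0; }
static int max_int(int a, int b) { return a > b ? a : b; }
static void update(struct avl_node *n)
{
    n->height = 1 + max_int(height(n->left), height(n->right));
}

/* Balance factor = height(left subtree) - height(right subtree). */
static int balance(const struct avl_node *n)
{
    return n ? height(n->left) - height(n->right) : 0;
}

/* Right rotation: the left child becomes the root of this subtree. */
static struct avl_node *rotate_right(struct avl_node *y)
{
    struct avl_node *x = y->left;
    y->left = x->right;
    x->right = y;
    update(y);
    update(x);
    return x;
}

/* Left rotation: the right child becomes the root of this subtree. */
static struct avl_node *rotate_left(struct avl_node *x)
{
    struct avl_node *y = x->right;
    x->right = y->left;
    y->left = x;
    update(x);
    update(y);
    return y;
}

/* Insert key and rebalance; returns the root of the updated subtree. */
struct avl_node *avl_insert(struct avl_node *n, int key)
{
    if (n == NULL) {
        struct avl_node *leaf = malloc(sizeof *leaf);
        if (leaf != NULL) {
            leaf->key = key;
            leaf->height = 1;
            leaf->left = leaf->right = NULL;
        }
        return leaf;
    }
    if (key < n->key)      n->left  = avl_insert(n->left, key);
    else if (key > n->key) n->right = avl_insert(n->right, key);
    else return n;                                     /* duplicate keys are ignored */

    update(n);
    int bf = balance(n);
    if (bf > 1 && key < n->left->key)                  /* left sub-tree of left sub-tree */
        return rotate_right(n);
    if (bf < -1 && key > n->right->key)                /* right sub-tree of right sub-tree */
        return rotate_left(n);
    if (bf > 1) {                                      /* right sub-tree of left sub-tree */
        n->left = rotate_left(n->left);
        return rotate_right(n);
    }
    if (bf < -1) {                                     /* left sub-tree of right sub-tree */
        n->right = rotate_right(n->right);
        return rotate_left(n);
    }
    return n;
}
```

A tree is built by repeated calls of the form root = avl_insert(root, key), starting from root = NULL. Deletion needs the same rebalancing step but is left out to keep the sketch short.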

AVL Tree Example:
(Trees are written as node(left, right); - marks a missing child.)
• Insert 14, 17, 11, 7, 53, 4, 13 into an empty AVL tree
– After inserting 4, node 11 is unbalanced: 14(11(7(4, -), -), 17(-, 53))
– After a right rotation at 11 and inserting 13: 14(7(4, 11(-, 13)), 17(-, 53))
• Now insert 12
– Node 11 is unbalanced: 14(7(4, 11(-, 13(12, -))), 17(-, 53))
– After the first rotation (right rotation at 13): 14(7(4, 11(-, 12(-, 13))), 17(-, 53))
– After the second rotation (left rotation at 11): 14(7(4, 12(11, 13)), 17(-, 53))
• Now the AVL tree is balanced.
• Now insert 8
– Node 7 is unbalanced: 14(7(4, 12(11(8, -), 13)), 17(-, 53))
– From the unbalanced node, traverse 2 nodes downward in the direction of insertion.
– After the first rotation (right rotation at 12): 14(7(4, 11(8, 12(-, 13))), 17(-, 53))
– After the second rotation (left rotation at 7): 14(11(7(4, 8), 12(-, 13)), 17(-, 53))
• Now the AVL tree is balanced.
• Now remove 53
– Node 14 is unbalanced: 14(11(7(4, 8), 12(-, 13)), 17)
– Traverse 2 nodes downward in the direction of next leaf (with minimal rotation).
– After a right rotation at 14: 11(7(4, 8), 14(12(-, 13), 17))
• Balanced! Remove 11
– Replace it with the largest in its left branch (8): 8(7(4, -), 14(12(-, 13), 17))
• Remove 8, unbalanced
– 8 is replaced by 7; node 7 is unbalanced: 7(4, 14(12(-, 13), 17))
– Traverse 2 nodes downward in the direction of unbalance.
– After the first rotation (right rotation at 14): 7(4, 12(-, 14(13, 17)))
– After the second rotation (left rotation at 7): 12(7(4, -), 14(13, 17))
• Balanced!!
AVL Tree (23/29)
• Build an AVL tree with the following values:
15, 20, 24, 10, 13, 7, 30, 36, 25
(Trees are written as node(left, right); - marks a missing child.)
• Insert 15, 20, 24: a left rotation at 15 gives 20(15, 24)
• Insert 10, 13: a left-right rotation at 15 gives 20(13(10, 15), 24)
• Insert 7: a right rotation at 20 gives 13(10(7, -), 20(15, 24))
• Insert 30, 36: a left rotation at 24 gives 13(10(7, -), 20(15, 30(24, 36)))
• Insert 25: a right-left rotation at 20 gives 13(10(7, -), 24(20(15, -), 30(25, 36)))

Remove 24 and 20 from the AVL tree.
• Remove 24 (replaced by 20, the largest in its left branch): 13(10(7, -), 20(15, 30(25, 36)))
• Remove 20 (replaced by 15); node 15 is unbalanced: 13(10(7, -), 15(-, 30(25, 36)))
• A left rotation at 15 gives 13(10(7, -), 30(15(-, 25), 36))

AVL Tree (28/29)
• Exercises
– Q1: Insert the elements 5, 10, 15, 12, 20, 18, 19 into AVL tree
by performing the necessary rotations.
– Q2: Insert the elements 21, 26, 30, 9, 4, 14, 28, 18, 15, 10, 2, 3,
7 into AVL tree by performing the necessary rotations.

AVL Tree (29/29)
(Trees are written as node(left, right); - marks a missing child.)
Solution to Question-1: 15(10(5, 12), 19(18, 20))
Solution to Question-2: 14(4(3(2, -), 9(7, 10)), 21(15(-, 18), 28(26, 30)))

More of Advanced Data Structures (Heap Trees, m-way Trees,
Graphs, etc.) in my next Upload
