Upcoming SlideShare
×

# 358 33 powerpoint-slides_13-graphs_chapter-13

669
-1

Published on

Reema Thareja Data Structure on Graphs

Published in: Education, Technology
1 Like
Statistics
Notes
• Full Name
Comment goes here.

Are you sure you want to Yes No
Your message goes here
• Be the first to comment

Views
Total Views
669
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
0
0
Likes
1
Embeds 0
No embeds

No notes for slide

### 358 33 powerpoint-slides_13-graphs_chapter-13

1. 1. © Oxford University Press 2011. All rights reserved. Data Structures Using CData Structures Using C Reema TharejaReema Thareja, Assistant Professor, Institute of Information Technology and Management, New Delhi
3. 3. © Oxford University Press 2011. All rights reserved. INTRODUCTION • A graph is an abstract data structure that is used to implement the graph concept from mathematics. A graph is basically, a collection of vertices (also called nodes) and edges that connect these vertices. A graph is often viewed as a generalization of the tree structure, where instead of a having a purely parent-to-child relationship between tree nodes, any kind of complex relationships between the nodes can be represented. Why graphs are useful? • Graphs are widely used to model any situation where entities or things are related to each other in pairs; for example, the following information can be represented by graphs: • Family trees in which the member nodes have an edge from parent to each of their children. • Transportation networks in which nodes are airports, intersections, ports, etc. The edges can be airline flights, one-way roads, shipping routes, etc.
4. 4. © Oxford University Press 2011. All rights reserved. Definition • A graph G is defined as an ordered set (V, E), where V(G) represent the set of vertices and E(G) represents the edges that connect the vertices. • The figure given shows a graph with V(G) = { A, B, C, D and E} and E(G) = { (A, B), (B, C), (A, D), (B, D), (D, E), (C, E) }. Note that there are 5 vertices or nodes and 6 edges in the graph. A B C D E A graph can be directed or undirected. In an undirected graph, the edges do not have any direction associated with them. That is, if an edge is drawn between nodes A and B, then the nodes can be traversed from A to B as well as from B to A. The above figure shows an undirected graph because it does not gives any information about the direction of the edges. The given figure shows a directed graph. In a directed graph, edges form an ordered pair. If there is an edge from A to B, then there is a path from A to B but not from B to A. The edge (A, B) is said to initiate from node A (also known as initial node) and terminate at node B (terminal node). A B C D E
5. 5. © Oxford University Press 2011. All rights reserved. Graph Terminology • Adjacent Nodes or Neighbors: For every edge, e = (u, v) that connects nodes u and v; the nodes u and v are the end-points and are said to be the adjacent nodes or neighbors. • Degree of a node: Degree of a node u, deg(u), is the total number of edges containing the node u. If deg(u) = 0, it means that u does not belong to any edge and such a node is known as an isolated node. • Regular graph: Regular graph is a graph where each vertex has the same number of neighbors. That is every node has the same degree. A regular graph with vertices of degree k is called a k regular graph or regular graph of degree‑ k. O regular graph 1 regular graph 2 regular graph Path: A path P, written as P = {v0, v1, v2,….., vn), of length n from a node u to v is defined as a sequence of (n+1) nodes. Here, u = v0, v = vn and vi-1 is adjacent to vi for i = 1, 2, 3, …, n. Closed path: A path P is known as a closed path if the edge has the same end-points. That is, if v0 = vn. Simple path: A path P is known as a simple path if all the nodes in the path are distinct with an exception that v0 may be equal to vn. If v0 = vn, then the path is called a closed simple path.
6. 6. © Oxford University Press 2011. All rights reserved. • Cycle: A closed simple path with length 3 or more is known as a cycle. A cycle of length k is called a k – cycle. • Connected graph: A graph in which there exists a path between any two of its nodes is called a connected graph. That is to say that there are no isolated nodes in a connected graph. A connected graph that does not have any cycle is called a tree. • Complete graph: A graph G is said to be a complete, if all its nodes are fully connected, that is, there is a path from one node to every other node in the graph. A complete graph has n(n-1)/2 edges, where n is the number of nodes in G. • Labeled graph or weighted graph: A graph is said to be labeled if every edge in the graph is assigned some data. In a weighted graph, the edges of the graph are assigned some weight or length. Weight of the edge, denoted by w(e) is a positive value which indicates the cost of traversing the edge. • Multiple edges: Distinct edges which connect the same end points are called multiple edges. That is, e = {u, v) and e’ = (u, v) are known as multiple edges of G. • Loop: An edge that has identical end-points is called a loop. That is, e = (u, u). • Multi- graph: A graph with multiple edges and/or a loop is called a multi-graph. • Size of the graph: The size of a graph is the total number of edges in it. (a) Multi-graph (b) Tree (c) Weighted Graph A B C e1 e2 e3 e4 e5 e6 e7 A D E B C F BA D C B 3 2 4 7 1 3
7. 7. © Oxford University Press 2011. All rights reserved. Directed Graph A directed graph G, also known as a digraph, is a graph in which every edge has a direction assigned to it. An edge of a directed graph is given as an ordered pair (u, v) of nodes in G. For an edge (u, v)- • the edge begins at u and terminates at v • u is known as the origin or initial point of e. Correspondingly, v is known as the destination or terminal point of e • u is the predecessor of v. Correspondingly, v is the successor of u • nodes u and v are adjacent to each other. Terminology of a directed graph • Out-degree of a node: The out degree of a node u, written as outdeg(u), is the number of edges that originate at u. • In-degree of a node: The in degree of a node u, written as indeg(u), is the number of edges that terminate at u. • Degree of a node: Degree of a node written as deg(u) is equal to the sum of in-degree and out- degree of that node. Therefore, deg(u) = indeg(u) + outdeg(u) • Source: A node u is known as a source if it has a positive out-degree but an in-degree = 0. • Sink: A node u is known as a sink if it has a positive in degree but a zero out-degree. • Reachability: A node v is said to be reachable from node u, if and only if there exists a (directed) path from node u to node v. • Strongly connected directed graph: A digraph is said to be strongly connected if and only if there exists a path from every pair of nodes in G. That is, if there is a path from node u to v, then there must be a path from node v to u. • Unilaterally connected graph: A digraph is said to be unilaterally connected if there exists a path from any pair of nodes u, v in G such that there is a path from u to v or a path from v to u but not both. • Parallel/Multiple edges: Distinct edges which connect the same end points are called multiple edges. That is, e = {u, v) and e’ = (u, v) are known as multiple edges of G. • Simple directed graph: A directed graph G is said to be a simple directed graph if and only if it has no parallel edges. However, a simple directed graph may contain cycle with an exception that it cannot have more than one loop at a given node
8. 8. © Oxford University Press 2011. All rights reserved. (a): Directed Acyclic Graph (b): Strongly Connected Directed Acyclic Graph A B C D e8 e5 e6 e3 e2 e7 e4 A B C D TRANSITIVE CLOSURE OF A DIRECTED GRAPH A transitive closure of a graph is constructed to answer reachability questions. That is, is there a path from a node A to node E in one or more hops? A binary relation indicates only whether the node A is connected to node B, and that node B is connected to node C, etc. But once the transitive closure is constructed as shown below, we can easily determine in O(1) time that whether node E is reachable from node A or not. Like the adjacency list, transitive closure is also stored as a matrix T, so if T[1][5] = 1, then it is the case that node 1 can reach node 5 in one or more hops. A B C D E Graph G A B C D E Transitive Closure G*
9. 9. © Oxford University Press 2011. All rights reserved. Definition: For a directed graph G = (V,E), where V is the set of vertices and E is the set of edges, the transitive closure of G is a graph G* = (V,E*). In G*, for every vertex v, w in V there is an edge (v, w) in E* if and only if there is a valid path from v to w in G. Why and where is it needed? Finding the transitive closure of a directed graph is an important problem in many computational tasks that listed below. • Transitive closure is used to find the reachability analysis of transition networks representing distributed and parallel systems • It is used in the construction of parsing automata in compiler construction • Recently, transitive closure computation is being used to evaluate recursive database queries (because almost all practical recursive queries are transitive in nature). Algorithm The algorithm to find transitive enclosure of a graph G is given in figure below. In order to determine the transitive closure of a graph, we define a matrix t where tkij = 1, (for i, j, k = 1, 2, 3, …n) if there exists a path in G from the vertex i to vertex j with intermediate vertices in the set (1, 2, 3, .., k) and 0 otherwise. That is, G* is constructed by adding an edge (i, j) into E* if and only if tkij = 1. When k = 0 T0 ij = 0 if (i, j) is not in E 1 if (I, j) is in E When k ≥ 1 Tk ij = Tk-1 ij V (Tk-1 ik Λ Tk-1 kj )
10. 10. © Oxford University Press 2011. All rights reserved. Transitive_Closure( A, t, n) Step 1: SET i=1, j=1, k=1 Step 2: Repeat Steps 3 and 4 while i<=n Step 3: Repeat Step 4 while j<=n Step 4: IF (A[i][j] = 1), then SET t[i][j] = 1 ELSE SET t[i][j] = 0 INCREMENT j INCREMENT i Step 5: Repeat Steps 6 to 11 while k<=n Step 6: Repeat Step 7 to 10 while i<=n Step 7: Repeat Step 8 and 9 while j<=n Step 8: SET t[i,j] = t[i][j] V ( t[i][k] Λ t[k][j]) Step 9: INCREMENT j Step 10: INCREMENT i Step 11: INCREMENT k Step 12: END
11. 11. © Oxford University Press 2011. All rights reserved. BI-CONNECTED GRAPHS • A vertex v of G is called an articulation point if removing v along with the edges incident to v, results in a graph that has at least two connected components. • A bi-connected graph is defined as a connected graph that has no articulation vertices. That is, a bi-connected graph is connected and non-separable, in the sense that even if we remove any vertex from the graph, the resultant graph is still connected. By definition, • A bi-connected undirected graph is a connected graph that is not broken into disconnected pieces by deleting any single vertex • In a bi-connected directed graph for any two vertices v and w there are two directed paths from v to w which have no vertices in common other than v and w. A CB D FE H J I A B D FE H J I Note that the graph shown on the left is not a bi-connected graph as deleting vertex C from the graphs results in two connected components of the original graph (figure on right)
12. 12. © Oxford University Press 2011. All rights reserved. A B D C Bi-connected graph. Removal of any vertex from the graph will not disconnect the graph Like graphs, there is a related concept for edges. An edge in a graph is called a bridge, if removing that edge results in a disconnected graph. Also, an edge on the graph that does not lie on a cycle is a bridge. This means that a bridge has at least one articulation point at its end, although it is not necessary that the articulation point is linked in a bridge. A B C D E REPRESENTATION OF GRAPHS There are two common ways of storing graphs in computer’s memory. They are:  Sequential representation by using an adjacency matrix  Linked representation by using an adjacency list that stores the neighbors of a node using a linked list
15. 15. © Oxford University Press 2011. All rights reserved. From adjacency matrix A1, we have learnt that an entry 1 in the ith row and jth column means that there exists a path of length 1 from vi to vj. Now consider, A2, A3 and A4 aij 2 = ∑ aik akj Any entry aij = 1 if aik = akj = 1. That is, if there is an edge (vi, vk) and (vk, vj). This implies that there is a path from vi to vj of length 2. Similarly, every entry in the ith row and jth column of A3 gives the number of paths of length 3 from node vi to vj. In general terms, we can conclude that every entry in the ith row and jth column of An (where n is the number of nodes in the graph) gives the number of paths of length n from node vi to vj.
17. 17. © Oxford University Press 2011. All rights reserved. A B C D A B C D C XB B D X D X B XA Graph G and its adjacency list For a directed graph, the sum of lengths of all adjacency lists is equal to the number of edges in G. However, for an undirected graph, the sum of lengths of all adjacency lists is equal to twice the number of edges in G because an edge (u, v) means an edge from node u to v as well as an edge v to u. The adjacency list can also be modified to store weighted graphs. A B C D E A B C D E B D X A D XC B E X A E XB C D X
18. 18. © Oxford University Press 2011. All rights reserved. GRAPH TRAVERSAL ALGORITHMS By traversing a graph, we mean the method of examining the nodes and edges of the graph. There are two standard methods of graph traversal which we will discuss in this section. These two methods are- • Breadth first search • Depth first search While breadth first search will use a queue as an auxiliary data structure to store nodes for further processing, the depth-first search scheme will use a stack. But both these algorithms will make use of a variable STATUS. During the execution of the algorithm, every node in the graph will have the variable STATUS set to 1, 2 or depending on its current state. Table I shows the value of status and its significance. STATUS STATE OF THE NODE DESCRIPTION 1 Ready The initial state of the node N 2 Waiting Node N is placed on the queue or stack and waiting to be processed 3 Processed Node N has been completely processed
19. 19. © Oxford University Press 2011. All rights reserved. Breadth First Search • Breadth-first search (BFS) is a graph search algorithm that begins at the root node and explores all the neighboring nodes. Then for each of those nearest nodes, the algorithm explores their unexplored neighbor nodes, and so on, until it finds the goal. • That is, we start examining the node A and then all the neighbors of A are examined. In the next step we examine the neighbors of neighbors of A, so on and so forth Algorithm for breadth-first search in a graph G beginning at a starting node A Step 1: SET STATUS = 1 (ready state) for each node in G. Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state) Step 3: Repeat Steps 4 and 5 until QUEUE is empty Step 4: Dequeue a node N. Process it and set its STATUS = 3 (processed state). Step 5: Enqueue all the neighbors of N that are in the ready state (whose STATUS = 1) and set their STATUS = 2 (waiting state) [END OF LOOP] Step 6: EXIT Example: Consider the graph G given below. The adjacency list of G is also given. Assume that G represents the daily flights between different cities and we want to fly from city A to H with minimum stops. That is, find the minimum path P from A to H given that every edge has length = 1.
22. 22. © Oxford University Press 2011. All rights reserved. Algorithm for depth-first search in a graph G beginning at a starting node A Step 1: SET STATUS = 1 (ready state) for each node in G. Step 2: Push the starting node A on the stack and set its STATUS = 2 (waiting state) Step 3: Repeat Steps 4 and 5 until STACK is empty Step 4: Pop the top node N. Process it and set its STATUS = 3 (processed state). Step 5: Push on to the stack all the neighbors of N that are in the ready state (whose STATUS = 1) and set their STATUS = 2 (waiting state) [END OF LOOP] Step 6: EXIT Example: Consider the graph G given below. The adjacency list of G is also given. Suppose we want to print all nodes that can be reached from the node H (including H itself). One alternative is to use a Depth- First Search of G starting at node H. the procedure can be explained as below. A DCB GFE IH Adjacency Lists A: B, C, D B: E C: B, G D: C, G E: C, F F: C, H G: F, H, I H: E, I I: F
24. 24. © Oxford University Press 2011. All rights reserved. Pop and Print the top element of the STACK, that is, E. Push all the neighbors of E on to the stack that are in the ready state. Since there are no neighbors of E that are in the ready state no Push operation is performed. The STACK now becomes empty: PRINT: E Since the STACK is now empty, the depth-first search of G starting at node H is complete and the nodes which were printed are- H, I, F, C, G, B E. These are the nodes which are reachable from the node H.
25. 25. © Oxford University Press 2011. All rights reserved. TOPOLOGICAL SORTING • Topological sort of a directed acyclic graph (DAG) G, is defined as a linear ordering of its nodes in which each node comes before all nodes to which it has outbound edges. Every DAG has one or more number of topological sorts. • That is, a topological sort of a directed acyclic graph (DAG) G is an ordering of the vertices of G such that if G contains an edge (u, v), then u appears before v in the ordering. • Note that topological sort is possible only on directed acyclic graphs that do not have any cycle. For a DAG that contains cycle, no linear ordering of its vertices is possible. • In simple words, a topological ordering of a DAG G, is an ordering of its vertices such that any directed path in G, traverses vertices in increasing order. • Topological sorting is widely used in scheduling applications, jobs or tasks. The jobs that have to be completed are represented by nodes, and there is an edge from node u to v if job u must be completed before job v can be started. A topological sort of such a graph gives an order in which the given jobs must be performed.
26. 26. © Oxford University Press 2011. All rights reserved. Algorithm to find a Topological Sort T of a Directed Acyclic Graph, G Step 1: Find the indegree INDEG(N) of every node in the graph Step 2: Enqueue all the nodes with a zero in-degree Step 3: Repeat Steps 4 and 5 until the QUEUE is empty Step 4: Remove the front node N of the QUEUE by setting FRONT = FRONT + 1 Step 5: Repeat for each neighbor M of node N: a) Delete the edge from N to M by setting INDEG(M) = INDEG(M) – 1 b) IF INDEG(M) = 0, then Enqueue M, that is, add M to the rear of the queue [END OF INNER LOOP] [END OF LOOP] Step 6: Exit
27. 27. © Oxford University Press 2011. All rights reserved. MINIMUM SPANNING TREE • A spanning tree of a connected, undirected graph G, is a sub-graph of G which is a tree that connects all the vertices together. A graph G can have many different spanning trees. • When we assign weights to each edge (which is a number that represents how unfavorable the edge is), and use it to assign a weight to a spanning tree by calculating the sum of the weights of the edges in that spanning tree. • A minimum spanning tree (MST) is defined as a spanning tree with weight less than or equal to the weight of every other spanning tree. In other words, a minimum spanning tree is a spanning tree that has weights associated with its edges, and the total weight of the tree (the sum of the weights of its edges) is at a minimum. • Example: Consider an un-weighted graph G given below. From G we can draw many distinct spanning trees. (Eight of them are given here). For an un-weighted graph, every spanning tree is a minimum spanning tree. A B C D A B C D A B C D A B C D A B C D Unweighted Graph A B C D A B C D
28. 28. © Oxford University Press 2011. All rights reserved. A B C D A B C D A B C D Example: Consider a weighted graph G given below. From G we can draw three distinct spanning trees. But only a single minimum spanning tree can be obtained, that is, the one that has minimum weight (cost) associated with it. 3 3 A B C D A B C D A B C D A B C D 3 4 2 6 7 5 7 2 4 2 A B C D 7 6 2 3 5 2 A B C D A B C D A B C D A B C D A B C D 3 6 2 4 5 2 4 7 3 3 7 5 3 5 6 Thus, we see that of all the given spanning trees, the tree given below is called the minimum spanning tree as it has the lowest cost associated with it. 3A B C D 4 2 Total Cost = 9
29. 29. © Oxford University Press 2011. All rights reserved. PRIM’S ALGORITHM FOR CREATING A MINIMUM SPANNING TREE • Tree vertices: Vertices that are a part of the minimum spanning tree T. • Fringe vertices: Vertices that are currently not a part of T, but are adjacent to some tree vertex. • Unseen Vertices: Vertices that are neither tree vertices nor fringe vertices fall under this category. The steps in the Prim algorithm can be given as shown in figure. • Choose a starting vertex • Branch out from the starting vertex and during each iteration select a new vertex and edge. Basically, during each iteration of the algorithm, we have to select a vertex from the fringe vertices in such a way that the edge connecting the tree vertex and the new vertex has minimum weight assigned to it Prim Algorithm Step 1: Select a starting vertex Step 2: Repeat Steps 3 and 4 until there are fringe vertices Step 3: Select an edge e connecting the tree vertex and fringe vertex that has minimum weight Step 4: Add the selected edge and the vertex to the minimum spanning tree T [END OF LOOP] Step 5: EXIT
30. 30. © Oxford University Press 2011. All rights reserved. Example: Construct a minimum spanning tree of the graph given below. A B E C D 3 4 7 9 10 11 Choose a starting vertex, A for example. A Add the fringe vertices (that are adjacent to A). The edges connecting the vertex and fringe vertices are shown with dotted lines. A B C 3 7 Select an edge connecting the tree vertex and fringe vertex that has minimum weight and add the selected edge and the vertex to the minimum spanning tree T. Since the edge connecting A and C has less weight, so add C to the tree. Now C is not a fringe vertex but a tree vertex. A C 3 B 7 Add the fringe vertices (that are adjacent to C). A B C D 3 4 10 7