Mathematical Analysis of Graph and Huffman Coding
Transcript

  • 1. Mathematical Analysis of Graph and Huffman Problems. Anjan.K, II Sem M.Tech CSE, M.S.R.I.T. 05/22/09. DAA: Analysis of Graph and Huffman Problem
  • 2. Outline
    • Key Points: Recap
    • Graph Problems: Minimum Spanning Tree (MST)
      ◦ Prim-Jarnik Algorithm
      ◦ Kruskal Algorithm
    • Data Compression using Huffman Coding
      ◦ Need for Huffman Codes
      ◦ Huffman’s Algorithm
    • Summary
    • References
  • 3. Key Points: Recap
    • Algorithmic complexity depends on many factors; the major ones are:
      ◦ Underlying data structures
      ◦ Design strategy used
      ◦ Size and pattern of the input
    • Every algorithm falls into one of the efficiency classes.
    • An algorithm is analyzed for its best, average and worst cases; the resulting bounds are written asymptotically with Ω (lower bound), Θ (tight bound), O (Big Oh, upper bound) and o (Small Oh, strict upper bound).
  • 4. Graph Problems: MST
    • The MST problem, a cornerstone of combinatorial optimization, originated with Otakar Boruvka.
    • MST is a fundamental problem with diverse applications:
      ◦ Network design
      ◦ Cluster analysis (bioinformatics)
      ◦ Approximation for NP-hard problems: TSP, Steiner tree
      ◦ Indirect applications: LDPC, image processing
    • Minimum spanning tree algorithms:
      ◦ Prim-Jarnik’s Algorithm (Jarnik, Prim, Dijkstra)
      ◦ Kruskal’s Algorithm
      ◦ Boruvka’s Algorithm
  • 5. Prim-Jarnik’s MST
    • Discovered independently by three people (Jarnik, Prim and Dijkstra) and commonly referred to as Prim’s MST algorithm.
    • Employs a greedy strategy: “Nearest Neighbor”.
    • Working principle:
      ◦ Given a graph G = (V, E), the tree starts from an arbitrary vertex ‘r’ and grows until it spans all the vertices in V.
      ◦ At each step, a tree vertex ‘s’ is joined to its nearest neighbor ‘y’ outside the tree, such that edge (s, y) has the smallest weight among all edges leaving the tree. One vertex is added at a time.
      ◦ The algorithm terminates when all vertices in V have been reached.
  • 6. Prim-Jarnik’s Algorithm
    MST-Prim(G, w, r)
    01 Q ← V[G]                     // Q: vertices not yet in T; Q is a priority queue
    02 for each u ∈ Q
    03     key[u] ← ∞
    04 key[r] ← 0
    05 π[r] ← NIL
    06 while Q ≠ ∅
    07     do u ← EXTRACT-MIN(Q)    // making u part of T
    08        for each v ∈ Adj[u]
    09            do if v ∈ Q and w(u,v) < key[v]
    10                then π[v] ← u
    11                     key[v] ← w(u,v)
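As an illustration of the pseudocode on this slide, here is a minimal Python sketch. It is not the author's code: the function name `mst_prim` and the adjacency-dictionary input format are assumptions made for this example. Python's `heapq` has no DECREASE-KEY operation, so the sketch pushes a new heap entry whenever a cheaper edge is found and skips stale entries on extraction; the heap then holds O(E) entries, and the running time is still O(E log V).

```python
import heapq

def mst_prim(adj, r):
    """Return (parent, total_weight) for an MST of a connected undirected graph.

    adj maps each vertex to a list of (neighbor, weight) pairs (assumed format).
    Vertices are assumed to be mutually comparable (all ints or all strings).
    """
    parent = {}                 # plays the role of pi[] in the pseudocode
    in_tree = set()             # vertices already extracted, i.e. no longer in Q
    total = 0
    heap = [(0, r, r)]          # entries are (key, vertex, candidate parent)
    while heap:
        key, u, p = heapq.heappop(heap)      # EXTRACT-MIN
        if u in in_tree:
            continue                         # stale entry: u joined T earlier
        in_tree.add(u)
        parent[u] = p
        total += key
        for v, w in adj[u]:
            if v not in in_tree:
                # Stand-in for DECREASE-KEY: push a new candidate edge (u, v).
                heapq.heappush(heap, (w, v, u))
    parent[r] = None            # the root has no parent (pi[r] = NIL)
    return parent, total

graph = {
    "a": [("b", 4), ("c", 1)],
    "b": [("a", 4), ("c", 2), ("d", 5)],
    "c": [("a", 1), ("b", 2), ("d", 8)],
    "d": [("b", 5), ("c", 8)],
}
parent, total = mst_prim(graph, "a")
print(parent, total)
```

On this small example graph the tree consists of the edges a-c, c-b and b-d, with total weight 8.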
  • 7. Prim-Jarnik’s: Proof of Correctness
    Theorem: Upon termination of the algorithm, T is an MST.
    Exchange property: given a spanning tree T and an edge f not in T, adding f to T forms a unique cycle; for any other edge e on that cycle, T ∪ {f} − {e} is again a spanning tree.
    Proof: by induction on the invariant A: there exists an MST T’ that contains all the edges of T.
    Basis: T = ∅, so every MST satisfies A.
    Inductive step: assume A is true at the start of an iteration, and let f be the edge chosen by the algorithm.
      ◦ If f ∈ T’, then T’ still satisfies A.
      ◦ Otherwise, adding f to T’ forms a cycle C. C contains another edge e that leaves the current tree, and w(f) ≤ w(e) because the algorithm chose f as the lightest such edge. Hence T’ ∪ {f} − {e} is a spanning tree no heavier than T’, so it is an MST containing T ∪ {f}, and A still holds.
  • 8. Prim-Jarnik’s Algorithm Analysis
    • Run-time efficiency depends on how the priority queue is implemented.
    • With Q implemented as a binary heap:
      ◦ Lines 1-5, which perform the initialization, take O(V).
      ◦ EXTRACT-MIN is called |V| times, for a total of O(V log V).
      ◦ The inner for loop is executed O(E) times in total, and each key update costs O(log V).
    • Total time for the algorithm is O(V log V + E log V) = O(E log V).
  • 10. Different Ways To Implement Priority Queue
  • 11. Kruskal’s MST
    • Discovered by J. B. Kruskal.
    • Employs a greedy strategy: “Smallest Edge First”.
    • Working principle:
      ◦ Given a graph G = (V, E), sort the edges in non-decreasing order of their weights.
      ◦ At each step, add a safe edge to the forest, examining the edges in order from smallest to largest, one edge at a time.
      ◦ An edge is safe if it joins two different trees of the forest (it creates no cycle); in the end the forest must be connected, with no isolated vertex.
      ◦ The algorithm terminates when the required n−1 edges are present in the forest, where n = |V|.
  • 12. Kruskal’s Algorithm
    MST-Kruskal(G, w)
    A ← ∅
    for each vertex v ∈ V[G]
        do MAKE-SET(v)
    sort the edges of E by non-decreasing weight w
    for each edge (u,v) ∈ E, in order by non-decreasing weight
        do if FIND-SET(u) ≠ FIND-SET(v)
            then A ← A ∪ {(u,v)}
                 UNION(u,v)
    return A
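The pseudocode on this slide can be sketched in Python roughly as follows. This is an illustrative version under stated assumptions, not the presenter's implementation: the name `mst_kruskal`, the `(weight, u, v)` edge format and the dictionary-based disjoint-set forest are choices made for this example. The disjoint-set forest uses union-by-rank and path compression.

```python
def mst_kruskal(vertices, edges):
    """Return the list of MST edges of an undirected graph.

    edges is a list of (weight, u, v) tuples (assumed format).
    """
    parent = {v: v for v in vertices}   # MAKE-SET for every vertex
    rank = {v: 0 for v in vertices}

    def find_set(x):
        # First pass: locate the root of x's tree.
        root = x
        while parent[root] != root:
            root = parent[root]
        # Second pass: path compression, point every node on the path at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    def union(rx, ry):
        # Union by rank; rx and ry must be roots of different trees.
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1

    tree = []                           # the edge set A of the pseudocode
    for w, u, v in sorted(edges):       # non-decreasing order of weight
        ru, rv = find_set(u), find_set(v)
        if ru != rv:                    # (u, v) joins two different trees: safe
            tree.append((w, u, v))
            union(ru, rv)
    return tree

edges = [(4, "a", "b"), (1, "a", "c"), (2, "b", "c"), (5, "b", "d"), (8, "c", "d")]
print(mst_kruskal("abcd", edges))
```

The example graph is small enough to check by hand: edges of weight 1, 2 and 5 are accepted, while the edges of weight 4 and 8 are rejected because both endpoints already lie in the same tree.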
  • 13. Kruskal’s: Proof of Correctness
    Theorem: Upon termination of the algorithm, the forest F is an MST.
    Exchange property: given a spanning tree S and an edge f not in S, adding f to S forms a unique cycle; for any other edge e on that cycle, S ∪ {f} − {e} is again a spanning tree.
    Proof: by induction on the invariant A: there exists an MST F’ that contains all the edges of F.
    Basis: F = ∅, so every MST satisfies A.
    Inductive step: assume A is true at the start of an iteration, and let f be the edge chosen by the algorithm.
      ◦ If f ∈ F’, then F’ still satisfies A.
      ◦ Otherwise, adding f to F’ forms a cycle C. Since F ∪ {f} is acyclic, C contains an edge e that is not in F. Edge e cannot have been examined before f, because it would have been accepted, so w(f) ≤ w(e). Hence F’ ∪ {f} − {e} is a spanning tree no heavier than F’, so it is an MST containing F ∪ {f}, and A still holds.
  • 14. Kruskal’s Algorithm Analysis
    • Run-time efficiency depends on how the disjoint-set structure S is implemented.
    • With S implemented using union-by-rank and path compression:
      ◦ Sorting the edges takes O(E log E).
      ◦ The loop performs O(E) FIND-SET and UNION operations on S, in addition to |V| MAKE-SET operations, for a total of O((V + E) · α(V)) time, where α is the very slowly growing inverse Ackermann function.
      ◦ The graph is connected, so |E| ≥ |V| − 1; the disjoint-set operations therefore take O(E · α(V)), and α(V) = O(log V) = O(log E).
      ◦ The total time for the algorithm is then O(E log E).
      ◦ Since |E| < |V|², log |E| = O(log V), hence the running time of the algorithm is O(E log V).
  • 16. Data Compression using Huffman Coding
    • Proposed by Dr. David A. Huffman in the 1950s.
    • A method for the construction of minimum-redundancy codes.
    • Also known as probabilistic variable-length coding.
    • Used in many compression schemes such as gzip, bzip2, JPEG (as an option) and fax compression.
    • Properties:
      ◦ Generates optimal prefix codes
      ◦ Low cost to generate codes
      ◦ Low cost to encode and decode
      ◦ Average code length close to the source entropy
  • 17. Information Theory: Entropy
    • Entropy is a measure of information content.
    • Related forms include conditional entropy and the entropy of the English language.
    • For a set of messages S with probabilities p(s), s ∈ S, the self-information i(s) of s and the entropy H(S) are:
      $i(s) = \log \frac{1}{p(s)} = -\log p(s)$
      $H(S) = \sum_{s \in S} p(s) \log \frac{1}{p(s)}$
    • An example (logarithms to base 2):
      $p(S) = \{.25, .25, .25, .125, .125\}$
      $H(S) = 3 \cdot .25 \log 4 + 2 \cdot .125 \log 8 = 2.25$
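The entropy example on this slide can be checked numerically. The short Python sketch below assumes base-2 logarithms, so the result is in bits per message; the function name `entropy` is chosen for this illustration only.

```python
from math import log2

def entropy(probs):
    """Entropy H(S) = sum of p(s) * log2(1 / p(s)), in bits per message."""
    # Messages with probability 0 contribute nothing and are skipped.
    return sum(p * log2(1 / p) for p in probs if p > 0)

print(entropy([0.25, 0.25, 0.25, 0.125, 0.125]))  # prints 2.25
```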
  • 18. Huffman Algorithm
  • 19. Huffman Coding: Proof of Correctness
    • The goal is to prove that optimal prefix codes exhibit the greedy-choice and optimal-substructure properties.
    • Proof by induction:
      ◦ Basis: for n = 1, where n is the number of code words, the algorithm finds an optimal code.
      ◦ Inductive step: assume the claim is true for n and consider n+1.
      ◦ Lemma: given a code tree T, we can find a tree T’ in which the two minimum-cost leaves are siblings and C(T’) ≤ C(T).
  • 20. Proof (Cont’d)
    • We need to show that C(T’) ≤ C(X’), where X is any code tree and T’ and X’ are the trees obtained from T and X by the lemma.
    • By the lemma, C(X’) ≤ C(X).
    • Next, let T’’ and X’’ be the trees with the minimum-cost leaves x and y removed. Then:
      C(X’’) = C(X’) − x − y
      C(T’’) = C(T’) − x − y
      C(T’’) ≤ C(X’’)   (by the inductive hypothesis, since T’’ and X’’ have n code words)
      C(T) = C(T’) = C(T’’) + x + y ≤ C(X’’) + x + y = C(X’) ≤ C(X)
  • 21. Huffman’s Algorithm Analysis
    • Run-time efficiency depends on how the queue Q is implemented.
    • With Q implemented as a binary min-heap, for a set C of n characters:
      ◦ Line 2, which performs the initialization, takes O(n).
      ◦ Lines 3-8 are executed n−1 times, and each heap operation requires O(log n).
      ◦ The for loop therefore contributes O(n log n).
    • Total time for the algorithm is O(n log n).
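The algorithm slide itself survives only as a title in this transcript, so the following Python sketch is an assumption about the usual heap-based construction, not a reproduction of the slide. The name `huffman_codes`, the frequency-dictionary input and the use of nested tuples for internal nodes are all choices made for this example; symbols are assumed not to be tuples themselves. It mirrors the analysis above: one O(n) heap build followed by n−1 merges, each costing O(log n).

```python
import heapq
from itertools import count

def huffman_codes(freq):
    """Map each symbol in freq (symbol -> weight) to a prefix-free bit string."""
    tiebreak = count()   # unique counter so the heap never compares tree nodes
    heap = [(w, next(tiebreak), sym) for sym, w in freq.items()]
    heapq.heapify(heap)                        # initialization: O(n)
    if not heap:
        return {}
    if len(heap) == 1:                         # a single symbol still needs one bit
        return {heap[0][2]: "0"}
    while len(heap) > 1:                       # n-1 merges, O(log n) each
        w1, _, left = heapq.heappop(heap)      # the two minimum-cost nodes
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):            # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                  # leaf: an original symbol
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes

# Weights proportional to the probabilities {.25, .25, .25, .125, .125}.
freq = {"a": 2, "b": 2, "c": 2, "d": 1, "e": 1}
codes = huffman_codes(freq)
print(codes)
print(sum(freq[s] * len(codes[s]) for s in freq) / sum(freq.values()))
```

For these weights the three frequent symbols receive 2-bit codes and the two rare ones 3-bit codes, so the average code length is 2.25 bits, equal to the entropy of the same distribution computed on the entropy slide.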
  • 23. Summary
    • Graph Problems: Prim’s and Kruskal’s MST
      ◦ Data structure used
      ◦ Algorithm
      ◦ Proof of correctness
      ◦ Mathematical analysis
    • Huffman Coding
      ◦ Need for Huffman techniques for data compression
      ◦ Algorithm
      ◦ Proof of correctness
      ◦ Mathematical analysis
  • 24. References
    [1] Thomas H. Cormen et al., “Introduction to Algorithms”, 2nd Edition, PHI.
    [2] Anany Levitin, “Design and Analysis of Algorithms”, 2004 Reprint, Pearson Education.
    [3] Sartaj Sahni and Narasingh Deo, “Handbook on Data Structures and Applications”, 2005 Reprint, Chapman & Hall.
    [4] Documents and Internet resources from popular universities across the globe.