Lesson: Data Structure and Algorithm in C/C++


These are all the lessons for the course Data Structure and Algorithm in C/C++


  • Give an example, such as database of a company. Or more examples, sound, video games, …..
  • But, unlike a linked list, the connections between the nodes are more than a simple one-to-another progression. An example can illustrate the connections in a binary tree.
  • This is an example of a binary tree with nine nodes. Presumably each node contains information about one of the 50 states. In this example, the states are not arranged in any particular order, except insofar as I need to illustrate the different special kinds of nodes and connections in a binary tree.
  • Each node in a binary tree is permitted to have two links downward to other nodes, called the left child and the right child .
  • Some nodes have no children, and those nodes are called leaves . In this example, there are four leaves: Massachusetts, Oklahoma, New Hampshire (or is that Vermont?) and Nebraska. (Yes, that really is Nebraska. Either the author ran out of room on the slide and had to shrink it, or the author is from rival state Colorado.)
  • There are two rules about parents in any tree: 1. The root never has a parent. 2. Every other node has exactly one parent. There is also a related rule which is not written here, but is part of the definition of a tree: If you start at the root, there is always one way to get from the root to any particular node by following a sequence of downward links (from a parent to its child).
  • Two nodes that have the same parent are called siblings , as shown here. In a binary tree, a node has at most one sibling.
  • When a complete binary tree is built, its nodes are generally added one at a time. As with any tree, the first node must be the root.
  • The next node must be the right child of the root.
  • A quick summary . . .
  • The first node of a complete binary tree is always the root...
  • ...then the right child of the root...
  • So, a heap is a complete binary tree. Each node in a heap contains a key, and these keys must be organized in a particular manner. Notice that this is not a binary search tree, but the keys do follow some semblance of order. Can you see what rule is being enforced here?
  • We can add new elements to a heap whenever we like. Because the heap is a complete binary tree, we must add the new element at the next available location, filling in the levels from left to right. In this example, I have just added the new element with a key of 42. Of course, we now have a problem: The heap property is no longer valid. The 42 is bigger than its parent 27. To fix the problem, we will push the new node upwards until it reaches an acceptable location.
  • In general, there are two conditions that can stop the pushing upward: 1. We reach a spot where the parent is >= the new node, or 2. We reach the root. This process is called reheapification upward (I didn't just make up that name, really).
  • We'll fix the problem by pushing the out-of-place node downward. Perhaps you can guess what the downward pushing is called.... reheapification downward .
  • Reheapification downward can stop under two circumstances: 1. The children all have keys that are <= the out-of-place node. 2. The out-of-place node reaches a leaf.
  • Following the usual technique for implementing a complete binary tree, the data from the root is stored in the first entry of the array.
  • As with any partially-filled array, we are only concerned with the front part of the array. If the tree has five nodes, then we are only concerned with the entries in the first five components of the array.
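The notes above on reheapification and on storing the tree in an array can be sketched together as follows. This is a minimal illustration, not the lecturer's code: the `MaxHeap` type and its `push` and `pop` names are assumptions, and the root is assumed to sit at index 0, so the children of index i are at 2*i + 1 and 2*i + 2.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical max-heap stored in a vector (a partially filled array).
// The root is at index 0; the children of index i are at 2*i+1 and 2*i+2.
struct MaxHeap {
    std::vector<int> data;

    // Add a key at the next available location, then push it upward
    // (reheapification upward) until its parent is >= it or it is the root.
    void push(int key) {
        data.push_back(key);
        std::size_t i = data.size() - 1;
        while (i > 0) {
            std::size_t parent = (i - 1) / 2;
            if (data[parent] >= data[i]) break;
            std::swap(data[parent], data[i]);
            i = parent;
        }
    }

    // Remove and return the largest key. The last entry moves to the root
    // and is pushed downward (reheapification downward) until both children
    // are <= it or it reaches a leaf. Precondition: the heap is not empty.
    int pop() {
        int top = data.front();
        data.front() = data.back();
        data.pop_back();
        const std::size_t n = data.size();
        std::size_t i = 0;
        while (true) {
            std::size_t left = 2 * i + 1, right = left + 1, largest = i;
            if (left < n && data[left] > data[largest]) largest = left;
            if (right < n && data[right] > data[largest]) largest = right;
            if (largest == i) break;
            std::swap(data[i], data[largest]);
            i = largest;
        }
        return top;
    }
};
```

Swapping with the larger child in `pop` is what keeps the heap property intact on the way down; swapping with the smaller child could leave a parent below its other child.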
  • A quick summary . . .

Transcript

  • 1. Data Structure and Algorithms. Lecturer: CHHAY Nuppakun. E-mail: nuppakunc@yahoo.com. Department of Computer Studies, Norton University, 2013
  • 2. Chapter 1 Fundamental ideas of data structure and algorithm
  • 3. Read Ahead. You are expected to read the lecture notes before the lecture. This will facilitate more productive discussion during class. Also, please proofread assignments and tests, as you would in an English class.
  • 4. Programs and programming What is a program?  A set of instructions working with data designed to accomplish a specific task  The “recipe” analogy  Ingredients are the Data  Directions are the Program Statements What is programming  The art and craft of writing programs  The art to control these “idiot servants” and “naïve children” 4
  • 5. Introduction to Programming Programming is to solve problems using computers  How to do it at all ?  How to do it robustly ?  How to do it effectively ? Programming consists of two steps:  Algorithmic design (the architects)  Coding (the construction workers) Programming requires:  A programming language (C/C++/C#) to express your ideas  A set of tools to design, edit, and debug your code  A compiler to translate your programs into machine code  A machine to run the executable code 5
  • 6. Crafting Programs Effectively Program design  design process  stepwise refinement & top-down design  bottom-up design  modularization, interfaces  use of abstractions Programming style  structured programming  readable code  effective use of language constructs  “formatting”  software organization Documentation and comments 6
  • 7. Good Programs There are a number of facets to good programs: they must  run correctly  run efficiently  be easy to read and understand  be easy to debug and  be easy to modify better running times will generally be obtained from use of the most appropriate data structures and algorithms 7
  • 8. Why Data Structure and Algorithms Computer is becoming ubiquitous …  programming gets you more out of computer  learn how to solve problems  dealing with abstractions  be more precise Unfortunately, most people  know little about Computer Science  know little about Programming  write bad or buggy programs  become lost when writing large programs 8
  • 9. Algorithms and Data Structures Algorithm: a strategy for computing something, e.g.,  sorting: putting data in order by key  searching: finding data in some kind of index  finding primes and generating random numbers  string processing  graphics: drawing lines, arcs, and other geometric objects Data structure: a way to store data, e.g.,  arrays and vectors  linked lists Two are related:  data structures organize data  algorithms use that organization 9
  • 10. What are computers? “idiot servants” that can do simple operations incredibly fast if you tell them every step to do like little children in their need for specific and detailed instruction computers are not “brains” & are not “smart” - they only as good as the program they are running 10
  • 11. Computer Environment: Hardware. Hardware: the physical, tangible parts of a computer, e.g., CPU, storage, keyboard, monitor. Central Processing Unit: the chip that executes program commands, e.g., Intel Pentium IV, Sun Sparc, Transmeta. Main Memory: the primary storage area for programs and data, also called RAM. [Figure: Monitor, Keyboard, Central Processing Unit, Main Memory, Hard Disk, CD ROM]
  • 12. Computer Environment: Software Operating System  E.g., Linux, Mac OS X, Windows 2000, Windows XP  manages resources such as CPU, memory, and disk  controls all machine activities Application programs  generic term for any other kind of software  compiler, word processors, missile control systems, games 12
  • 13. Operating System. What does an OS do? It hides low-level details of the bare machine and arbitrates competing resource demands. Useful attributes: multi-user, multi-tasking. [Figure: User Program, Operating System, CPU, Disk, Network]
  • 14. Chapter 2 Review of C++ Essentials
  • 15. Main Program and Library Files
    <preprocessor directives>
    <global data and function declarations>
    int main( )
    {
        <local data declarations>
        <statements>
        return 0;
    }
    <main program function implementation>
  • 16. Program Comments
    /* <multiline comments> */
    // <end-of-line comments>
  • 17. C++ Data Types
    simple: integral, enum, floating (float, double, long double)
    address: pointer, reference
    structured
  • 18. Simple Data Types: char, int, long, double, float, unsigned
    Variables:
    <data type> <list of identifiers>
    <data type> <identifier> = <initial value>
  • 19. Symbolic Constants
    const float PI = 3.141592653589793238;
    const int UPPER_BOUND = 100;
    const char BLANK = ' ';
  • 20. Expressions and Assignment
    Operators: +, -, *, /, %, =, <, <=, >=, ==, !=, &&, ||, !, ( )
    Examples:
    a = b = c = 5;
    ((a = b) = c) = 5; //?
    a == 0;   a = 0
  • 21. Type Conversion
    <type name> (<expression>) or (<type name>) <expression>
    Example: int (3.14) returns 3; (float) 3 returns 3.0
  • 22. Interactive I/O
    cout << "Enter an int, a float, and a string, " << "separated by spaces";
    cin >> int_value >> float_value >> string;
  • 23. Functions
    double pow(double base, double exponent);
    cout << pow(5,2) << endl; // function call
    void hello_world( )
    {
        cout << "Hello World" << endl;
    }
    hello_world( ); // call to a void function
  • 24. Selection: if, if … else, switch
    Iteration:
    for (<initialization>; <termination>; <update>)
    while (<condition>) { <statements> }
    do { <statements> } while (<condition>);
  • 25. User Defined Type
    Using typedef:
    typedef int boolean;
    Using enum:
    enum weekday {MON, TUE, WED, THUR, FRI};
    enum primary_color {RED, YELLOW, BLUE};
    weekday day = MON;
    primary_color color = RED;
  • 26. Structured Data Types: Arrays, Strings, Structs, Files
  • 27. Why do we need an array?
    #include <iostream.h>
    int value0;
    int value1;
    int value2;
    …
    int value999;
    cin >> value0;
    cin >> value1;
    …
    cin >> value999;
    cout << value0;
    cout << value1;
    cout << value2;
    …
    cout << value999;
  • 28. Array Declaration: <type> <ArrayName>[Size]; Example: int value[1000]; Multidimensional Array Declaration: <type> <ArrayName>[index0][...][indexN]; Examples: int hiTemp[52][7]; int ThreeD[10][10][5];
  • 29. Accessing an Array. Array initialization: for (I = 0; I <= 999; I++) value[I] = 2 * I - 1; Each of an array's elements can be accessed in sequence by varying an array index variable within a loop. Multidimensional arrays can be accessed with nested loops.
  • 30. Chapter 3 Algorithms
  • 31. AlgorithmDefinition A step-by-step procedure for solving a problem in a finite amount of time Pseudo-code  is a compact and informal high-level description of a computer programming algorithm that uses the structural conventions of a programming language 31
  • 32. Algorithms (Continue) The term algorithm is used in computer science to describe a problem-solving method suitable for implementation as a computer program:
    1. Most algorithms of interest involve methods of organizing the data involved in the computation. Objects created in this way are called data structures, so algorithms and data structures go hand in hand.
    2. When we use a computer to help us solve a problem, whether small or huge, we quickly become motivated to devise methods that use time or space as efficiently as possible.
    3. Careful algorithm design is an extremely effective part of the process of solving a huge problem, whatever the applications area.
  • 33. Algorithms (Continue)
    4. When a huge or complex computer program is to be developed, a great deal of effort must go into understanding and defining the problem to be solved. In most cases, however, there are a few algorithms whose choice is critical because most of the system resources will be spent running those algorithms.
    5. The sharing of programs in computer systems is becoming more widespread, yet the need to reimplement basic algorithms arises frequently: we are often faced with completely new computing environments (hardware and software) with new features that old implementations may not use to best advantage. Reimplementing also makes our solutions more portable and longer lasting.
    6. The choice of the best algorithm for a particular task can be a complicated process, perhaps involving sophisticated mathematical analysis. The branch of computer science that comprises the study of such questions is called analysis of algorithms.
  • 34. Analysis of Algorithms. Analysis is the key to understanding algorithms well enough to use them effectively. Analysis plays a role at every point in the process of designing and implementing algorithms, and mathematical analysis can play a role in comparing the performance of algorithms. The following are among the reasons that we perform mathematical analysis of algorithms:
    • To compare different algorithms for the same task
    • To predict performance in a new environment
    • To set values of algorithm parameters
  • 35. Growth of Functions. Most algorithms have a primary parameter N that affects the running time most significantly. The parameter N might be the degree of a polynomial, the size of a file to be sorted or searched, the number of characters in a text string, or some other abstract measure of the size of the problem being considered. The goal is to express the running time in terms of N, using mathematical formulas that are as simple as possible and that are accurate for large values of the parameters.
  • 36. Growth of Functions (Continue) Algorithms typically have running times proportional to one of the following functions:
    • 1: Most instructions of most programs are executed once or at most only a few times. If all the instructions of a program have this property, the program's running time is constant.
    • log N: When the running time of a program is logarithmic, the program gets slightly slower as N grows. This running time commonly occurs in programs that solve a big problem by transformation into a series of smaller problems.
    • N: When the running time of a program is linear, it is generally the case that a small amount of processing is done on each input element.
    • N log N: The N log N running time arises when algorithms solve a problem by breaking it up into smaller subproblems, solving them independently, and then combining the solutions.
  • 37. Growth of Functions (Continue)
    • N^2: When the running time of an algorithm is quadratic, that algorithm is practical for use on only relatively small problems.
    • N^3: Similarly, an algorithm that processes triples of data items (perhaps in a triple nested loop) has a cubic running time and is practical for use on only small problems.
    • 2^N: Few algorithms with exponential running time are likely to be appropriate for practical use, even though such algorithms arise naturally as brute-force solutions to problems.
    The running time of a particular program is likely to be some constant multiplied by one of these terms (the leading term) plus some smaller terms.
  • 39. Running Time Most algorithms transform input objects into output objects. The running time of an algorithm typically grows with the input size. Average case time is often difficult to determine. We focus on the worst case running time.  Easier to analyze  Crucial to applications such as games, finance and robotics 39
  • 40. Experimental Studies Write a program implementing the algorithm Run the program with inputs of varying size and composition Use a function, like the built-in clock() function, to get an accurate measure of the actual running time Plot the results 40
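The experimental procedure above can be sketched with the standard `std::clock()` function from `<ctime>`. The function under test, `sum_to`, and the helper `time_sum_to` are hypothetical names chosen for this illustration; a real study would substitute the algorithm being measured and repeat the run over several input sizes before plotting.

```cpp
#include <ctime>

// Hypothetical algorithm under test: add up the integers 1..n with a loop.
unsigned long long sum_to(unsigned long long n) {
    unsigned long long result = 0;
    for (unsigned long long i = 1; i <= n; ++i) result += i;
    return result;
}

// Returns the processor time, in seconds, consumed by one call to sum_to(n),
// measured with std::clock() before and after the call.
double time_sum_to(unsigned long long n) {
    std::clock_t start = std::clock();
    // Storing into a volatile variable means the result cannot simply be dropped.
    volatile unsigned long long sink = sum_to(n);
    (void)sink;
    std::clock_t stop = std::clock();
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

Calling `time_sum_to` for n = 1000, 10000, 100000 and so on gives the data points to plot. Note that `std::clock()` reports processor time at a fairly coarse resolution, so very short runs may measure as zero.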
  • 41. Limitations of Experiments It is necessary to implement the algorithm, which may be difficult Results may not be indicative of the running time on other inputs not included in the experiment. In order to compare two algorithms, the same hardware and software environments must be used 41
  • 42. Algorithm Analysis
    c = a + b;
    Operands: c, a, b
    Operators: +, =
    Simple model of computation steps:
    - load operands (fetch time for c, a, b)
    - perform operations (operation time for + and =)
    - so the above instruction needs 3Tfetch + 1T+ + 1Tstore
  • 43. Algorithm Analysis
    int num = 25;   Operands: num; constant: 25; operator: =.   Time needed: 1Tfetch + 1Tstore
    n >= i;   Operands: n, i; operator: >=.   Time needed: 2Tfetch + 1T>=
    ++i;   (i = i + 1;)   Time needed: 2Tfetch + 1T+ + 1Tstore
  • 44. Algorithm Analysis Exercises
    1- cout << i;
    2- area = l * w;
    3- C = 5/9 * (F-32);
    4- return i;
    5- *p = &a;
  • 45. Computing running time: arithmetic series summation (e.g.)
    1- unsigned int Sum (unsigned int n)
    2- {
    3-     unsigned int result = 0;
    4-     for (int i = 1; i <= n; i++)
    5-         result += i;
    6-     return result;
    7- }
    Statement   Time                              Code
    3           Tfetch + Tstore                   result = 0;
    4a          Tfetch + Tstore                   i = 1;
    4b          (2Tfetch + T<) * (n+1)            i <= n;
    4c          (2Tfetch + T+ + Tstore) * n       i++;
    5           (3Tfetch + T+ + Tstore) * n       result += i;
    6           Tfetch + Treturn                  return result;
    Total       (7Tfetch + 2T+ + 2Tstore + T<) * n + (5Tfetch + 2Tstore + T< + Treturn)
    Computing running time of the program
  • 46. Big-Oh Notation. The mathematical artifact that allows us to suppress detail when we are analyzing algorithms is called the O-notation, or "big-Oh notation". Definition 1: A function g(N) is said to be O(f(N)) if there exist constants c0 and N0 such that g(N) < c0 f(N) for all N > N0. We use the O-notation for three distinct purposes:
    • To bound the error that we make when we ignore small terms in mathematical formulas
    • To bound the error that we make when we ignore parts of a program that contribute a small amount to the total being analyzed
    • To allow us to classify algorithms according to upper bounds on their total running times
  • 47. Big-Oh Notation (Continue) Often, the results of a mathematical analysis are not exact, but rather are approximate in a precise technical sense. The O-notation allows us to keep track of the leading terms while ignoring smaller terms when manipulating approximate mathematical expressions. For example, if we expand the expression (N + O(1)) (N + O(log N) + O(1)), we get six terms: N^2 + O(N) + O(N log N) + O(log N) + O(N) + O(1), but can drop all but the largest O-term, leaving the approximation N^2 + O(N log N). That is, N^2 is a good approximation to this expression when N is large.
  • 48.  Another Example  What if the input size is 10,000  Algorithm 1: 1,000,000  Algorithm 2: 100,000,000  Conclusion  Algorithm 1 is better! Question:  Who is REALLY better?  Confused! Reason  Too precise! Solution  Big-O notation – Order of the algorithm  Rougher measurement  Measure the increasing speed, ignoring the constants and smaller items  Better algorithms have lower increasing speed Remember  The order of an algorithm generally is more important than the speed of the processor (CPU)  Why? 48
  • 50. Chapter 4 Data Structure
  • 51. Data Structure Definition  A data structure is a collection of data, generally organized so that items can be stored and retrieved by some fixed techniques Example  An array  Stored and retrieved based on an index assigned to each item 51
  • 52. Data Structures vs. Software How They Related  Software is designed to help people solve problems in reality  To solve the problems, there are some THINGS, or INFOs in reality to be processed  Those THINGS or INFOs are called DATA  DATA and their RELATIONS can be complicated 52
  • 53. Data Structures vs. Software: How They Are Related. Reasonable organization of DATA helps improve software efficiency and decrease software design difficulty. Experience accumulated in the past will be learned in this course in the form of certain DATA STRUCTURES, such as the linked list and the binary tree. A DATA STRUCTURE is a smart way to organize DATA; it depends on the features of the DATA and on how the DATA are processed.
  • 54. Phases of Software Development  Phases  Specification of the task  Design of a solution  Implementation of the solution  Analysis of the solution  Testing and debugging  Maintenance and evolution of the system  Obsolescence 54
  • 55. Phases of Software Development. Features of the Phases: They are NOT a fixed sequence. For example, in a widely used OO DESIGN method, Unified Process (UP), there are many iterations, and each iteration involves specification, design, implementation and test. Feedback from the previous iteration helps improve the next iteration. You can find other examples in the textbook. Most phases are independent of programming languages. We will use Java for IMPLEMENTATION. However, most of what we learn in this course applies to other languages.
  • 56. Arrays The most fundamental data structure is the array An array is a fixed number of data items that are stored contiguously and that are accessible by an index A simple example of the use of an array, which prints out all the prime numbers less than 1000. const int N = 1000; main( ) { int i, j, a[N+1]; for (a[1] = 0, i = 2; i <= N; i++) a[i]=1; for (i = 2; i <= N/2; i++) for (j = 2; j <= N/i; j++) a[i*j] = 0; for (i = 1; i <= N; i++) if (a[i]) cout << i << ‘ ‘ ; cout << ‘n’; } 56
  • 57. Arrays (Continue) The primary feature of arrays is that if the index is known, any item can be accessed in constant time The size of the array must be known beforehand, it is possible to declare the size of an array at execution time Arrays are fundamental data structures in that they have a direct correspondence with memory systems on virtually all computers The entire computer memory as an array, with the memory addresses corresponding to array indices 57
  • 58. Linked Lists The second elementary data structure to consider is the linked list The primary advantage of linked lists over arrays is that:  linked lists can grow and shrink in size during their lifetime  their maximum size need not be known in advance  it possible to have several data structures share the same space 58
  • 59. Linked Lists (Continue) A second advantage of linked lists is that:  they provide flexibility in allowing the items to be rearranged efficiently  This flexibility is gained at the expense of quick access to any arbitrary item in the list A linked list is a set of items organized sequentially, just like an array A L I S T A linked list 59
  • 60. Linked Lists (Continue) Flexible space use  Dynamically allocate space for each element as needed  Include a pointer to the next item Linked list Data Next  Each node of the list contains  the data item (an object pointer in our ADT)  a pointer to the next node object 60
  • 61. Linked Lists (Continue) Collection structure has a pointer to the list head  Initially NULL Collection Add first item Head  Allocate space for node node  Set its data pointer to object Data Next  Set Next to NULL  Set Head to point to new node object 61
  • 62. Linked Lists (Continue) Add second item  Allocate space for node  Set its data pointer to object  Set Next to current Head  Set Head to point to new node Collection Head node node Data Next Data Next object2 object 62
  • 63. Linked Lists (Continue) [Figure: the list A L I S T between the dummy nodes head and z. Caption: A linked list with its dummy nodes.] [Figure: the list A L I S T rearranged to T A L I S. Caption: Rearranging a linked list.]
  • 64. Linked Lists (Continue) [Figure: the list A L I S T and the list A L I X S T, with the node X between I and S. Caption: Insertion into and deletion from a linked list.]
  • 65. Linked Lists - LIFO and FIFO Single Linked List  One-way cursor  Only can move forward Simplest implementation  Add to head Last-In-First-Out (LIFO) semantics Modifications  First-In-First-Out (FIFO)  Keep a tail pointer head tail 65
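The tail-pointer modification mentioned above can be sketched like this. The names `QNode`, `FifoList`, `add_last` and `remove_first` are hypothetical, and the items are plain `int` values for brevity. Adding at the tail and removing at the head gives first-in-first-out order while the cursor still only moves forward.

```cpp
// Hypothetical node for a singly linked list.
struct QNode {
    int data;
    QNode* next;
};

// Hypothetical list that keeps both a head and a tail pointer.
struct FifoList {
    QNode* head = nullptr;
    QNode* tail = nullptr;
};

// Append at the tail. With the tail pointer this needs no traversal.
void add_last(FifoList& q, int item) {
    QNode* node = new QNode{item, nullptr};
    if (q.tail != nullptr) {
        q.tail->next = node;   // link after the current last node
    } else {
        q.head = node;         // list was empty: new node is also the head
    }
    q.tail = node;
}

// Remove and return the item at the head. Precondition: list is not empty.
int remove_first(FifoList& q) {
    QNode* node = q.head;
    int item = node->data;
    q.head = node->next;
    if (q.head == nullptr) q.tail = nullptr;  // list became empty
    delete node;
    return item;
}
```

Without the tail pointer, appending would require walking the whole list first, which is why the simplest implementation adds at the head and gets LIFO behaviour instead.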
  • 66. Linked Lists - Doubly linked Doubly linked lists  Can be scanned in both directions  Two-way cursor  Can move forward and backward head prev prev prev tail 66
  • 67. Linked List vs. Array. Arrays are better at random access (what is the 4th element in the list?): arrays need O(1) (constant) time, while linked lists need O(n) time in the worst case. Linked lists are better at additions and removals at a cursor: operations at the cursor need O(1) (constant) time. Arrays don't have a cursor, so addition and removal operations need O(n) time in the worst case.
  • 68. Linked List vs. Array Resizing can be inefficient for an array  For arrays, capacity must be maintained in an inefficient way  For linked lists, no problem Summary  Array  Frequent random access operations  Linked lists  Operations occur at a cursor  Frequent capacity changes  Operations occur at a two-way cursor (DLL) 68
  • 69. Storage Allocation. Arrays are a rather direct representation of the memory of the computer. A direct-array representation of linked lists is to use "parallel arrays". The advantage of using parallel arrays is that the structure can be built on top of the data: the array key contains data and only data, all the structure is in the parallel array next, and more data can be added with more parallel arrays.
  • 70. Pushdown Stacks. The most important restricted-access data structure is the pushdown stack. Items are added in a Last In First Out (LIFO) approach. Two basic operations are involved: one can push an item onto the stack (insert it at the beginning) and pop an item (remove it from the beginning). Pushdown stacks appear as the fundamental data structure for many algorithms. The stack is represented with an array stack and a pointer p to the top of the stack. The functions push, pop, and empty are straightforward implementations of the basic stack operations.
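A minimal sketch of the array representation described above, using the names `stack`, `p`, `push`, `pop` and `empty` from the slide. The wrapper type `Stack`, the capacity `MAX`, and the choice to let `p` index the next free slot are assumptions of this illustration.

```cpp
#include <cassert>

// Hypothetical capacity for the illustration.
const int MAX = 100;

struct Stack {
    int stack[MAX];  // the items, bottom of the stack at index 0
    int p = 0;       // index of the next free slot; the top item is stack[p-1]

    // Put an item on top. Precondition: the stack is not full.
    void push(int v) {
        assert(p < MAX);
        stack[p++] = v;
    }

    // Remove and return the top item. Precondition: the stack is not empty.
    int pop() {
        assert(p > 0);
        return stack[--p];
    }

    // True when no items are stored.
    bool empty() const { return p == 0; }
};
```

Pushing 1, 2, 3 and then popping three times returns 3, 2, 1, which is the LIFO order.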
  • 71. Stack Example: Math Parser. Define Parser: 9 * ( 3 + 5 ) * (4 + 2) = ? Why not 10? The expression is in INFIX notation. Convert to Postfix using a STACK: 9 3 5 + * 4 2 + *. Then compute using a STACK. Answer:
  • 72. Infix -> Postfix Algorithm. 9 * ( 3 + 5 ) * (4 + 2) = ? Only worrying about +, *, and ( ).
    • Initialize Stack
    • If you get a #, output it
    • If you get a '(', push it
    • If you get an operator, entries are popped and output until we get a lower priority, then the operator is pushed
    • If you get a ')', pop and output operators until you clear a '('
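The conversion rules above can be sketched as follows, under simplifying assumptions: operands are single digits, only `+`, `*` and parentheses appear, and the input is well formed. The names `priority` and `to_postfix` are hypothetical. Both operators are treated as left-associative, so an operator also pops stacked operators of equal priority.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Priority of a stacked symbol. '(' gets the lowest value so that an
// incoming operator never pops it.
int priority(char op) {
    if (op == '*') return 2;
    if (op == '+') return 1;
    return 0;
}

// Convert an infix expression with single-digit operands to postfix.
std::string to_postfix(const std::string& infix) {
    std::string output;
    std::vector<char> stack;
    for (char c : infix) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            output += c;                       // a number goes straight to the output
        } else if (c == '(') {
            stack.push_back(c);
        } else if (c == ')') {
            // pop and output operators until the matching '(' is cleared
            while (!stack.empty() && stack.back() != '(') {
                output += stack.back();
                stack.pop_back();
            }
            if (!stack.empty()) stack.pop_back();  // discard the '('
        } else if (c == '+' || c == '*') {
            // pop entries until one of lower priority is on top
            while (!stack.empty() && priority(stack.back()) >= priority(c)) {
                output += stack.back();
                stack.pop_back();
            }
            stack.push_back(c);
        }
        // any other character (such as a space) is skipped
    }
    while (!stack.empty()) {                   // pop until the stack is empty
        output += stack.back();
        stack.pop_back();
    }
    return output;
}
```

For the slide's expression, `to_postfix("9 * ( 3 + 5 ) * (4 + 2)")` yields `935+*42+*`.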
  • 73. Infix -> Postfix. Start: 9 * ( 3 + 5 ) * (4 + 2) = ? Output: 9
  • 74. Infix -> Postfix. End: 9 * ( 3 + 5 ) * (4 + 2) = ? Pop until stack is empty. Output: 9 3 5 + * 4 2 + *
  • 75. Calculate Postfix 935+42+**
    • Given a #, push it
    • Given an operator: pop the top two #s, apply the operator, and push the result back onto the stack
  • 76. Calculate Postfix 935+42+**. Push 9. Stack (top first): 9
  • 77. Calculate Postfix 935+42+**. Push 3. Stack (top first): 3 9
  • 78. Calculate Postfix 935+42+**. Push 5. Stack (top first): 5 3 9
  • 79. Calculate Postfix 935+42+**. Pop two numbers: 5 and 3. Stack (top first): 9
  • 80. Calculate Postfix 935+42+**. Apply +: 5 + 3. Stack (top first): 9
  • 81. Calculate Postfix 935+42+**. Push result (8). Stack (top first): 8 9
  • 82. Calculate Postfix 935+42+**. Push 4. Stack (top first): 4 8 9
  • 83. Calculate Postfix 935+42+**. Push 2. Stack (top first): 2 4 8 9
  • 84. Calculate Postfix 935+42+**. Pop 2 and 4, add: 2 + 4. Stack (top first): 8 9
  • 85. Calculate Postfix 935+42+**. Push result (6). Stack (top first): 6 8 9
  • 86. Calculate Postfix 935+42+**. Pop 6 and pop 8 and multiply: 6 * 8. Stack (top first): 9
  • 87. Calculate Postfix 935+42+**. Push result (48). Stack (top first): 48 9
  • 88. Calculate Postfix 935+42+**. Pop 48 and pop 9, multiply: 48 * 9. Answer: 432. Stack: empty
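The step-by-step evaluation above can be sketched as a short function. It assumes a well-formed postfix string with single-digit operands and only the `+` and `*` operators; the name `eval_postfix` is hypothetical.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Evaluate a postfix expression with single-digit operands and + or *.
// Precondition: the expression is well formed and not empty.
int eval_postfix(const std::string& postfix) {
    std::vector<int> stack;
    for (char c : postfix) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            stack.push_back(c - '0');          // given a number, push it
        } else if (c == '+' || c == '*') {
            int right = stack.back();          // pop the top two numbers
            stack.pop_back();
            int left = stack.back();
            stack.pop_back();
            // apply the operator and push the result back
            stack.push_back(c == '+' ? left + right : left * right);
        }
        // any other character (such as a space) is skipped
    }
    return stack.back();                       // the single remaining value
}
```

`eval_postfix("935+42+**")` follows exactly the pushes and pops listed in the slides and returns 432. The other postfix form of the same expression, `935+*42+*`, gives the same value.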
  • 89. Using Stacks Computer Architecture Operating Systems Event Planning (Networking, OS) Computer Graphics (Scene graphs) Compilers, Parsers 89
  • 90. Queues Another fundamental restricted-access data structure is called the queue two basic operations are involved: one can insert (add) an item into the queue at the beginning and remove an item from the end queues obey a "first in, first out” (FIFO) discipline There is three class variables: the size of the queue and two indices, one to the beginning of the queue (head) and one to the end (tail) If head and tail are equal, then the queue is defined to be empty; but if put would make them equal, then it is defined to be full 90
  • 91. Applications of Queues Direct applications  Waiting lines  Access to shared resources (e.g., printer)  Multiprogramming Indirect applications  Auxiliary data structure for algorithms  Component of other data structures 91
  • 92. Queue Example You: Bank of America employee Boss: How many tellers do I need? How do you go about solving this problem? Simulations!  What are the parameters? 92
  • 93. Bank Teller Example Classes Data structures Input  Time step = 5 sec  Transaction = 2 minutes  Customer Frequency = 50% chance every 15 seconds What questions do we want to know?  Average wait time  Average line length How a simulation would work 93
  • 94. More Queue examples Networking: Router Computer Architecture: Execution Units Printer queues File systems Wal-Mart checkout lines Disney entrance 94
  • 95. Recursion Two Necessary Parts  Recursive calls  Stopping or base cases Infinite recursion  Every recursive call produces another recursive call  Stopping case not well defined, or not reached Very useful technique  Definition of mathematical functions  Definition of data structures  Recursive structures are naturally processed by recursive functions! Recursively defined functions  factorial  Fibonacci  GCD by Euclid’s algorithm  Games  Towers of Hanoi 95
  • 96. Recurrences. The factorial function is defined by the formula N! = N · (N - 1)!, for N >= 1 with 0! = 1. This corresponds directly to the following simple recursive program:
    int factorial(int N)
    {
        if (N == 0) return 1;
        return N * factorial(N-1);
    }
    This program illustrates the basic features of a recursive program: it calls itself and it has a termination condition in which it directly computes its result.
  • 97. Recurrences (Continue) A well-known recurrence relation is the one that defines the Fibonacci numbers: F(N) = F(N-1) + F(N-2), for N >= 2 with F(0) = F(1) = 1. The recurrence corresponds directly to the simple recursive program:
    int fibonacci(int N)
    {
        if (N <= 1) return 1;
        return fibonacci(N-1) + fibonacci(N-2);
    }
    This is an even less convincing example of the "power" of recursion: the recursive calls indicate that F(N-1) and F(N-2) should be computed independently, even though computing F(N-1) already requires F(N-2).
  • 98. Recurrences (Continue) The relationship between recursive programs and recursively defined functions is often more philosophical than practical. The factorial function really could be implemented with a loop, and the Fibonacci function is better handled by storing all precomputed values in an array.
  • 99. Divide-and-Conquer. Most of the recursive programs use two recursive calls, each operating on about half the input. This is called the "divide and conquer" paradigm for algorithm design. Divide-and-conquer is a general algorithm design paradigm:
    • Divide: divide the input data S into two or more disjoint subsets S1, S2, …
    • Recur: solve the subproblems recursively
    • Conquer: combine the solutions for S1, S2, …, into a solution for S
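The two alternatives mentioned above can be sketched as follows. The function names `factorial_loop` and `fibonacci_table` are hypothetical, the Fibonacci numbers use the slides' convention F(0) = F(1) = 1, and `unsigned long long` is an assumption made to delay overflow.

```cpp
#include <vector>

// Factorial with a loop instead of recursion. Precondition: n >= 0.
unsigned long long factorial_loop(int n) {
    unsigned long long result = 1;
    for (int i = 2; i <= n; ++i) result *= i;
    return result;
}

// Fibonacci number F(n) with F(0) = F(1) = 1, computed by filling an array
// so that every value is calculated only once. Precondition: n >= 0.
unsigned long long fibonacci_table(int n) {
    // n + 2 entries so that index 1 exists even when n is 0; all start at 1.
    std::vector<unsigned long long> f(n + 2, 1);
    for (int i = 2; i <= n; ++i) f[i] = f[i - 1] + f[i - 2];
    return f[n];
}
```

The table version does a number of additions proportional to n, whereas the recursive program recomputes the same values many times.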
  • 100. Divide-and-Conquer (Continue)divide-and-conquer recursive program is a straightforwardWay to accomplish our objective: void rule (int l, int r, int h) { int m = (l+r) /2; if (h > 0) { rule (l,m,h-1); mark (m, h) ; rule (m,r,h-1); } }The idea behind the method is the following: to make the marksin an interval, first make the long mark in the middle 100
  • 101. rule (0,8,3) mark (4,3) rule (0,4,2) mark (2,2) rule (0,2,1) Drawing a ruler (Preorder) mark (1,1) rule (0,1,0) in detail, giving the list of procedure calls rule (1,2,0) rule (2,4,1) and marks resulting from the call mark (3,1) rule (0, 8, 3). We mark the middle and call rule (2,3,0) rule (3,4,0) rule for the left half, then do the same for rule (4,8,2) the left half, and so forth, until a mark of mark (6,2) rule (4,6,1) length 0 is called for. Eventually we return mark (5,1) from rule and mark right halves in the rule (4,5,0) rule (5,6,0) same way. rule (6,8,1) mark (7,1) rule (6,7,0) rule (7,8,0) 101
  • 102. rule (0,8,3) rule (0,4,2) rule (0,2,1) rule (0,1,0) mark (1,1) rule (1,2,0) Drawing a ruler (Inorder version) mark (2,2) rule (2,4,1) In general, divide-and-conquer rule (2,3,0) algorithms involve doing some work mark (3,1) to split the input into two pieces, or rule (3,4,0) mark (4,3) to merge the results of processing rule (4,8,2) two independent "solved" portions rule (4,6,1) of the input, or to help things along rule (4,5,0) mark (5,1) after half of the input has been rule (5,6,0) processed. mark (6,2) rule (6,8,1) rule (6,7,0) mark( 7,1) 102
  • 103. Divide-and-Conquer (Continue) A nonrecursive algorithm, which does not correspond to any recursive implementation, is to draw the shortest marks first, then the next shortest, etc.
      void rule(int l, int r, int h)
      {
          int i, j, t;
          /* i is the mark height; j is the offset of the first mark of that height */
          for (i = 1, j = 1; i <= h; i++, j += j)
              for (t = 0; l + j + t * (j + j) < r; t++)   /* stay inside the interval */
                  mark(l + j + t * (j + j), i);
      }
    Combine and conquer is a method of algorithm design where we solve a problem by first solving trivial subproblems, then combining those solutions to solve slightly bigger subproblems, etc., until the whole problem is solved.
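The bottom-up ("combine and conquer") ruler can be sketched as runnable C++ as well. As before, marks are recorded rather than drawn; `Marks` and `rule_bottom_up` are hypothetical names, and stopping each pass at the right end of the interval (`pos < r`) is an assumption about the intended loop bound.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for drawing: each mark is stored as (position, height).
using Marks = std::vector<std::pair<int, int>>;

// Draw all marks of height 1 first, then height 2, and so on.
// In pass i the marks are 2*j apart, starting at offset j = 2^(i-1).
void rule_bottom_up(int l, int r, int h, Marks& out) {
    assert(l <= r && h >= 0);
    for (int i = 1, j = 1; i <= h; ++i, j += j)
        for (int pos = l + j; pos < r; pos += j + j)
            out.push_back({pos, i});
}
```

For rule_bottom_up(0, 8, 3, marks) the marks come out as the four height-1 marks at 1, 3, 5, 7, then the height-2 marks at 2 and 6, then the single height-3 mark at 4.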
  • 104. Chapter 5 TREES
  • 105. TREES GLOSSARY The structures discussed so far are one-dimensional: one item follows the other. Here we consider two-dimensional linked structures called trees. Trees are encountered frequently in everyday life. A tree is a nonempty collection of vertices and edges:
      - A vertex is a simple object (also referred to as a node)
      - An edge is a connection between two vertices
    A path in a tree is a list of distinct vertices in which successive vertices are connected by edges in the tree. One node in the tree is designated as the root. The defining property of a tree is that there is exactly one path between the root and each of the other nodes. If there is more than one path between the root and some node, or if there is no path between the root and some node, then what we have is a graph, not a tree.
  • 106. TREES In computer science, a tree is an abstract model of a hierarchical structure Nodes with no children are sometimes called leaves, or terminal nodes Nodes with at least one child are sometimes called nonterminal nodes nonterminal nodes refer as internal nodes and terminal nodes as external nodes E Applications: A R E  Organization charts File systems  A S T  Programming environments M P L E A sample tree 106
  • 107. TREES (Continue) The nodes in a tree divide themselves into levels: the level of a node is the number of nodes on the path from the node to the root (not counting the node itself). The height of a tree is the maximum level among all nodes in the tree (or the maximum distance to the root from any node). The path length of a tree is the sum of the levels of all the nodes in the tree (or the sum of the lengths of the paths from each node to the root). The tree in the figure of slide No 3 has height 3 and path length 21.
  • 108. Binary Trees A binary tree has nodes, similar to nodes in a linked list structure. Data of one sort or another may be stored at each node. But it is the connections between the nodes which characterize a binary tree.
  • 111. A Binary Tree of States In this example, the data contained at each node is one of the 50 states. Each tree has a special node called its root, usually drawn at the top.
  • 112. A Binary Tree of States Each node is permitted to have two links to other nodes, called the left child and the right child. Some nodes have only one child: Arkansas has a left child, but no right child.
  • 113. A Binary Tree of States Each node is called the parent of its children: Washington is the parent of Arkansas and Colorado. A node with no children is called a leaf.
  • 114. A Binary Tree of States Two rules about parents: the root has no parent; every other node has exactly one parent.
  • 115. A Binary Tree of States Two nodes with the same parent are called siblings. Arkansas and Colorado are siblings.
  • 116. Complete Binary Trees A complete binary tree is a special kind of binary tree which will be useful to us. When a complete binary tree is built, its first node must be the root. The second node of a complete binary tree is always the left child of the root...
  • 117. Complete Binary Trees The second node of a complete binary tree is always the left child of the root, and the third node is always the right child of the root. The next nodes must always fill the next level from left to right.
  • 118. Binary Tree A binary tree consists of a node together with left and right sub-trees. Both sub-trees are binary trees: each sub-tree is itself a binary tree.
  • 119. Trees - Performance Find, complete tree:
      - Height, h: nodes traversed in a path from the root to a leaf
      - Number of nodes, n: $n = 1 + 2^1 + 2^2 + \dots + 2^h = 2^{h+1} - 1$
      - $h = \lfloor \log_2 n \rfloor$
  • 120. Trees - Performance Find, complete tree: since we need at most h+1 comparisons, find is O(h+1), that is, O(log n). Same as binary search.
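A short worked version of the node count and height relation, assuming (as the formula on the slide implies) that every level of the tree is full:

```latex
n = \sum_{i=0}^{h} 2^{i} = 2^{h+1} - 1
\quad\Longrightarrow\quad
h = \log_2 (n + 1) - 1 = \lfloor \log_2 n \rfloor .
% Example: h = 3 gives n = 2^4 - 1 = 15 nodes, and
% \lfloor \log_2 15 \rfloor = 3, so a find needs at most h + 1 = 4 comparisons.
```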
  • 121. Summary Binary trees contain nodes. Each node may have a left child and a right child. If you start from any node and move upward, you will eventually reach the root. Every node except the root has one parent. The root has no parent. Complete binary trees require the nodes to fill in each level from left-to-right before starting the next level. 121
  • 122. PROPERTIES Property 1 - There is exactly one path connecting any two nodes in a tree :  Any two nodes have a least common ancestor  that any node can be the root: each node in a tree has the property that there is exactly one path connecting that node with every other node in the tree Property 2 - A tree with N codes has N - 1 edges  each node, except the root, has a unique parent, and every edge connects a node to its parent Property 3 - A binary tree with N internal nodes has N + 1 external nodes  A binary tree with no internal nodes has one external node  the left subtree has k + 1 external nodes and the right subtree has N - k external nodes, for a total of N + 1 122
  • 123. PROPERTIES (Continue)
      Property 4 - The external path length of any binary tree with N internal nodes is 2N greater than the internal path length:
      - Any binary tree can be built by starting with the binary tree consisting of one external node, then N times replacing an external node with a new internal node that has two external children
      - The process starts with a tree with internal and external path length both 0 and, for each of N steps, increases the external path length by 2 more than the internal path length
      Property 5 - The height of a full binary tree with N internal nodes is about $\log_2 N$:
      - If the height is n, then we must have $2^{n-1} < N + 1 \le 2^n$, since there are N + 1 external nodes
  • 124. Representing Binary Trees The most prevalent representation of binary trees is a straightforward use of records with two links per node. For some applications, it may be appropriate to have two different types of records, one for internal nodes, one for external nodes; for others, it may be appropriate to use just one type of node and to use the links in external nodes for some other purpose. The parse tree for an expression is defined by the simple recursive rule: "put the operator at the root and then put the tree for the expression corresponding to the first operand on the left and the tree corresponding to the expression for the second operand on the right."
  • 125. Representing Binary Trees (Continue) The parse tree for A B C + D E * * F + * (the same expression in postfix). Infix and postfix are two ways to represent arithmetic expressions; parse trees are a third. [Figure: Parse tree for A * ( ( ( B + C ) * ( D * E ) ) + F )]
  • 126. Representing Binary Trees (Continue) There are two other commonly used solutions. One option is to use a different type of node for external nodes, one with no links. Another option is to mark the links in some way (to distinguish them from other links in the tree), then have them point elsewhere in the tree. [Figure: Building the parse tree for A B C + D E * * F + *]
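One way the parse tree for the postfix expression might be built is with a stack of subtrees: operands become leaves, and each operator pops its two operands and becomes their parent. The sketch below assumes single-letter operands and only the binary operators + and *; `Node`, `build_parse_tree`, and `to_infix` are hypothetical names, and null links stand in for external nodes.

```cpp
#include <cassert>
#include <cctype>
#include <memory>
#include <stack>
#include <string>

// A record with two links per node; null links play the role of external nodes.
struct Node {
    char info;
    std::unique_ptr<Node> l, r;
};

// Build the parse tree for a postfix expression (single-letter operands, + and *).
std::unique_ptr<Node> build_parse_tree(const std::string& postfix) {
    std::stack<std::unique_ptr<Node>> st;
    for (char c : postfix) {
        if (std::isspace(static_cast<unsigned char>(c))) continue;
        auto x = std::make_unique<Node>();
        x->info = c;
        if (c == '+' || c == '*') {
            assert(st.size() >= 2);               // an operator needs two operands
            x->r = std::move(st.top()); st.pop(); // second operand goes on the right
            x->l = std::move(st.top()); st.pop(); // first operand goes on the left
        }
        st.push(std::move(x));
    }
    assert(st.size() == 1);                       // well-formed input leaves one tree
    return std::move(st.top());
}

// Print the tree back as a fully parenthesized infix expression.
std::string to_infix(const Node* t) {
    if (!t->l) return std::string(1, t->info);
    return "(" + to_infix(t->l.get()) + " " + t->info + " " + to_infix(t->r.get()) + ")";
}
```

For the expression on the slide, `to_infix(build_parse_tree("A B C + D E * * F + *").get())` yields `(A * (((B + C) * (D * E)) + F))`, the same expression in infix.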
  • 127. TRAVERSING TREES How do we traverse a tree, that is, systematically visit every node? There are a number of different ways to proceed. The first method to consider is preorder traversal, defined by the simple recursive rule "visit the root, then visit the left subtree, then visit the right subtree." A nonrecursive implementation uses an explicit stack:
      void traverse(struct node *t)
      {
          /* z is the dummy node that stands for every external node */
          stack.push(t);
          while (!stack.empty()) {
              t = stack.pop(); visit(t);
              if (t->r != z) stack.push(t->r);   /* right first, so left is popped first */
              if (t->l != z) stack.push(t->l);
          }
      }
  • 128. TRAVERSING TREES (Continue) Preorder traversal 128
  • 129. TRAVERSING TREES (Continue) The second method to consider is inorder traversal, sometimes called symmetric order. It is defined by the recursive rule "visit the left subtree, then visit the root, then visit the right subtree." The implementation of a stack-based program for inorder is almost identical to the program for preorder. This method of traversal is probably the most widely used.
  • 130. TRAVERSING TREES (Continue) Inorder traversal 130
  • 131. TRAVERSING TREES (Continue) The third method to consider is postorder traversal, defined by the recursive rule "visit the left subtree, then visit the right subtree, then visit the root." Implementation of a stack-based program for postorder is more complicated than for the other two because one must arrange for the root and the right subtree to be saved while the left subtree is visited and for the root to be saved while the right subtree is visited.
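The three recursive rules translate almost word for word into code. A minimal sketch, assuming a hypothetical `Node` record with a one-character label, null pointers for missing children, and "visiting" a node meaning appending its label to a string:

```cpp
#include <cassert>
#include <string>

struct Node {
    char info;
    Node* l;
    Node* r;
};

// Visit the root, then the left subtree, then the right subtree.
void preorder(const Node* t, std::string& out) {
    if (!t) return;
    out += t->info;
    preorder(t->l, out);
    preorder(t->r, out);
}

// Visit the left subtree, then the root, then the right subtree.
void inorder(const Node* t, std::string& out) {
    if (!t) return;
    inorder(t->l, out);
    out += t->info;
    inorder(t->r, out);
}

// Visit the left subtree, then the right subtree, then the root.
void postorder(const Node* t, std::string& out) {
    if (!t) return;
    postorder(t->l, out);
    postorder(t->r, out);
    out += t->info;
}
```

For a tree with root A, whose left child B has children D and E and whose right child is C, the three functions produce ABDEC, DBEAC, and DEBCA respectively.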
  • 132. TRAVERSING TREES (Continue) Postorder traversal 132
  • 133. TRAVERSING TREES (Continue) The fourth method to consider is level-order traversal, which is not recursive at all: simply visit the nodes as they appear on the page, reading down from top to bottom and from left to right, so that all the nodes on each level appear together. Level-order traversal can be achieved by using the stack-based program for preorder, with a queue instead of a stack:
      void traverse(struct node *t)
      {
          /* z is the dummy node that stands for every external node */
          queue.put(t);
          while (!queue.empty()) {
              t = queue.get(); visit(t);
              if (t->l != z) queue.put(t->l);
              if (t->r != z) queue.put(t->r);
          }
      }
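The stack-versus-queue observation can be tried out with the standard containers. This is a sketch under assumptions: `std::stack` and `std::queue` replace the slides' own stack and queue objects, null pointers replace the dummy node z, and the function names are hypothetical.

```cpp
#include <cassert>
#include <queue>
#include <stack>
#include <string>

struct Node {
    char info;
    Node* l;
    Node* r;
};

// Preorder with an explicit stack: push right before left so left is visited first.
std::string preorder_with_stack(Node* root) {
    std::string out;
    if (!root) return out;
    std::stack<Node*> s;
    s.push(root);
    while (!s.empty()) {
        Node* t = s.top(); s.pop();
        out += t->info;
        if (t->r) s.push(t->r);
        if (t->l) s.push(t->l);
    }
    return out;
}

// Same loop with a queue: nodes come out level by level, left to right.
std::string level_order(Node* root) {
    std::string out;
    if (!root) return out;
    std::queue<Node*> q;
    q.push(root);
    while (!q.empty()) {
        Node* t = q.front(); q.pop();
        out += t->info;
        if (t->l) q.push(t->l);
        if (t->r) q.push(t->r);
    }
    return out;
}
```

On a tree with root A, children B and C, and B's children D and E, the stack version gives ABDEC and the queue version gives ABCDE.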
  • 134. TRAVERSING TREES (Continue) Level order traversal 134
  • 135. Heaps A heap is a certain kind of complete binary tree. When a complete binary tree is built, its first node must be the root.
  • 136. Heaps Complete binary tree. The second node is always the left child of the root. The third node is always the right child of the root. The next nodes always fill the next level from left to right.
  • 137. Heaps A heap is a certain kind of complete binary tree. Each node in a heap contains a key that can be compared to other nodes' keys. The "heap property" requires that each node's key is >= the keys of its children. [Figure: a heap with levels 45; 35, 23; 27, 21, 22, 4; 19]
  • 138. Adding a Node to a Heap Put the new node in the next available spot. Push the new node upward, swapping with its parent until the new node reaches an acceptable location. [Figure: the new key 42 is added to the heap 45; 35, 23; 27, 21, 22, 4; 19 and moves upward]
  • 139. Adding a Node to a Heap The upward push stops when the parent has a key that is >= the new node's key, or the node reaches the root. The process of pushing the new node upward is called reheapification upward. [Figure: resulting heap with levels 45; 42, 23; 35, 21, 22, 4; 19, 27]
  • 140. Removing the Top of a Heap Move the last node onto the root. Push the out-of-place node downward, swapping with its larger child until the new node reaches an acceptable location. [Figure: 27 moved onto the root, giving levels 27; 42, 23; 35, 21, 22, 4; 19]
  • 141. Removing the Top of a Heap The downward push stops when the children all have keys <= the out-of-place node's key, or the node reaches a leaf. The process of pushing the new node downward is called reheapification downward. [Figure: resulting heap with levels 42; 35, 23; 27, 21, 22, 4; 19]
  • 142. Implementing a Heap Data from the root goes in the first location of the array. Data from the next row goes in the next two array locations. [Figure: heap with levels 42; 35, 23; 27, 21 above an array of data beginning 42, 35, 23]
  • 143. Implementing a Heap Data from the next row goes in the next two array locations. We don't care what's in the unused part of the array beyond these entries. [Figure: heap with levels 42; 35, 23; 27, 21 above an array of data holding 42, 35, 23, 27, 21]
  • 144. Summary A heap is a complete binary tree, where the entry at each node is greater than or equal to the entries in its children. To add an entry to a heap, place the new entry at the next available spot, and perform a reheapification upward. To remove the biggest entry, move the last node onto the root, and perform a reheapification downward. 144
  • 145. Chapter 6 SORTING
  • 146. Sorting In numerous sorting applications, a simple algorithm may be the method of choice  often use a sorting program only once, or just a few times  elementary methods are always suitable for small files As a rule, the elementary methods - take time proportional to N2 to sort N randomly arranged items. If N is small, this running time may be perfectly adequate 146
  • 147. SELECTION SORT Find the smallest element in the array and exchange it with the element in the first position, then find the second smallest element and exchange it with the element in the second position. Continue in this way until the entire array is sorted.
      - It works by repeatedly selecting the smallest remaining element.
      - A disadvantage of selection sort is that its running time depends only slightly on the amount of order already in the file.
  • 148. Selection sort For each i from l to r-1, exchange a[i] with the minimum element in a[i], . . . , a[r]. As the index i travels from left to right, the elements to its left are in their final position in the array (and will not be touched again), so the array is fully sorted when i reaches the right end.
      template <class Item>
      void selection(Item a[], int l, int r)
      {
          for (int i = l; i < r; i++) {
              int min = i;                     /* index of the smallest item seen so far */
              for (int j = i+1; j <= r; j++)
                  if (a[j] < a[min]) min = j;
              exch(a[i], a[min]);              /* exch swaps its two arguments */
          }
      }
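A self-contained variant of the selection sort above, with two assumptions: `std::swap` stands in for `exch`, and the function returns the number of key comparisons so the claim about insensitivity to input order can be checked.

```cpp
#include <cassert>
#include <utility>

// Selection sort on a[l..r]; returns how many key comparisons were made.
template <class Item>
long selection(Item a[], int l, int r) {
    assert(l <= r);
    long comparisons = 0;
    for (int i = l; i < r; i++) {
        int min = i;
        for (int j = i + 1; j <= r; j++) {
            ++comparisons;
            if (a[j] < a[min]) min = j;
        }
        std::swap(a[i], a[min]);  // assumed equivalent of exch
    }
    return comparisons;
}
```

For N items the count is always N(N-1)/2, whether or not the input is already sorted: 15 comparisons for six items in either case.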
  • 149. INSERTION SORT The method people often use to sort bridge hands is to consider the elements one at a time, inserting each into its proper place among those already considered. We need to make space for the element being inserted by moving larger elements one position to the right, then inserting the element into the vacated position.
  • 150. Insertion sort example During the first pass of insertion sort, the S in the second position is larger than the A, so it does not have to be moved. On the second pass, when the O in the third position is encountered, it is exchanged with the S to put A O S in sorted order, and so forth. Unshaded elements that are not circled are those that were moved one position to the right. The running time of insertion sort primarily depends on the initial order of the keys in the input. For example, if the file is large and the keys are already in order (or even are nearly in order), then insertion sort is quick and selection sort is slow.
  • 151. Insertion sort First puts the smallest element in the array into the first position, so that that element can serve as a sentinel. Then, for each i, it sorts the elements a[l], . . . , a[i] by moving one position to the right the elements in the sorted list a[l], . . . , a[i-1] that are larger than a[i], then putting a[i] into its proper position.
      template <class Item>
      void insertion(Item a[], int l, int r)
      {
          int i;
          /* move the smallest element to a[l], where it acts as a sentinel */
          for (i = r; i > l; i--) compexch(a[i-1], a[i]);
          for (i = l+2; i <= r; i++) {
              int j = i; Item v = a[i];
              /* shift larger elements right; the sentinel stops the loop */
              while (v < a[j-1]) { a[j] = a[j-1]; j--; }
              a[j] = v;
          }
      }
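A self-contained version of the sentinel insertion sort, under two assumptions: `compexch` is taken to mean "exchange the two items if they are out of order", and the function returns the number of one-position shifts made in the main loop so the effect of input order is visible.

```cpp
#include <cassert>
#include <utility>

// Assumed meaning of compexch: leave the smaller item in a, the larger in b.
template <class Item>
void compexch(Item& a, Item& b) {
    if (b < a) std::swap(a, b);
}

// Insertion sort on a[l..r]; returns the number of one-position shifts.
template <class Item>
long insertion(Item a[], int l, int r) {
    assert(l <= r);
    long shifts = 0;
    // One right-to-left pass brings the smallest item to a[l] as a sentinel.
    for (int i = r; i > l; i--) compexch(a[i - 1], a[i]);
    for (int i = l + 2; i <= r; i++) {
        int j = i;
        Item v = a[i];
        while (v < a[j - 1]) {  // cannot run past a[l]: the sentinel is <= v
            a[j] = a[j - 1];
            j--;
            ++shifts;
        }
        a[j] = v;
    }
    return shifts;
}
```

On input that is already sorted the main loop performs no shifts at all, which is why insertion sort is quick on ordered or nearly ordered files.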
  • 152. BUBBLE SORT Keep passing through the file, exchanging adjacent elements that are out of order, and continuing until the file is sorted. Whether it is actually easier to implement than insertion or selection sort is arguable. Bubble sort generally will be slower than the other two methods.
  • 153. Bubble Sort (Continue)
      /* Bubble sort for integers */
      #define SWAP(a,b) { int t; t=a; a=b; b=t; }
      void bubble( int a[], int n )
      {
          int i, j;
          for (i = 0; i < n; i++) {            /* n passes thru the array */
              /* From start to the end of unsorted part */
              for (j = 1; j < (n-i); j++) {
                  /* If adjacent items out of order, swap */
                  if ( a[j-1] > a[j] ) SWAP(a[j-1],a[j]);
              }
          }
      }
  • 154. Bubble sort example Small keys percolate over to the left in bubble sort. As the sort moves from right to left, each key is exchanged with the one on its left until a smaller one is encountered. On the first pass, the E is exchanged with the L, the P, and the M before stopping at the A on the right; then the A moves to the beginning of the file, stopping at the other A, which is already in position. The ith smallest key reaches its final position after the ith pass, just as in selection sort, but other keys are moved closer to their final position, as well.
      - Bubble sort: $O(n^2)$; very simple code
      - Insertion sort: slightly better than bubble sort; fewer comparisons; also $O(n^2)$
  • 158. Chapter 7 SEARCHING
  • 159. Searching The goal of the search is to find all records with keys matching a given search key Applications of searching are widespread, and involve a variety of different operations Two common terms often used to describe data structures for searching are dictionaries and symbol tables In searching have programs that are in widespread and frequent use to study a variety of methods that store records in arrays that are either searched with key comparisons or indexed by key value. 159
  • 160. Searching (Continue) It is best to think of search algorithms as belonging to packages implementing a variety of generic operations that can be separated from particular implementations, so that alternate implementations can be substituted easily. The operations of interest include:
      - Initialize the data structure.
      - Search for a record (or records) having a given key.
      - Insert a new record.
      - Delete a specified record.
      - Join two dictionaries to make a larger one.
      - Sort the dictionary; output all the records in sorted order.
  • 161. Searching (Continue) A combined search-and-insert operation is often included for efficiency in situations where records with duplicate keys are not to be kept within the data structure. Records with duplicate keys can be handled in several ways:
      - insist that the primary searching data structure contain only records with distinct keys
      - leave records with equal keys in the primary searching data structure and return any record with the given key for a search
      - assume that each record has a unique identifier (apart from the key) and require that a search find the record with a given identifier, given the key
      - arrange for the search program to call a specified function for each record with the given key
  • 162. Sequential Searching The simplest method for searching is simply to store the records in an array:
      - When a new record is to be inserted, we put it at the end of the array
      - When a search is to be performed, we look through the array sequentially
  • 163. Sequential Searching (Continue) Property 1 - Sequential search (array implementation) uses N + 1 comparisons for an unsuccessful search (always) and about N/2 comparisons for a successful search (on the average).
      - For unsuccessful search, this property follows directly from the code: each record must be examined to decide that a record with any particular key is absent.
      - For successful search, if we assume that each record is equally likely to be sought, then the average number of comparisons is $(1 + 2 + \dots + N)/N = (N + 1)/2$, exactly half the cost of unsuccessful search.
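The comparison counts in Property 1 can be reproduced with a small sketch. The slides do not show the search code at this point, so the sentinel-based loop below is an assumption about the array implementation being described; `SearchResult` and `sequential_search` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct SearchResult {
    bool found;
    std::size_t comparisons;
};

// Sequential search over N keys with a sentinel copy of the target placed
// after the last key, so the loop needs no separate bounds test.
// (The vector is taken by value only to keep the caller's data unchanged.)
SearchResult sequential_search(std::vector<int> keys, int target) {
    keys.push_back(target);                // sentinel at position N
    std::size_t i = 0;
    std::size_t comparisons = 1;           // the comparison against keys[0]
    while (keys[i] != target) {
        ++i;
        ++comparisons;
    }
    return {i + 1 < keys.size(), comparisons};
}
```

With four keys, a search for the third key uses 3 comparisons, while a search for an absent key runs into the sentinel after N + 1 = 5 comparisons. Summing over all four successful searches gives 1 + 2 + 3 + 4 = 10 comparisons, an average of (N + 1)/2 = 2.5.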
  • 164. Sequential Searching (Continue) Property 2 - Sequential search (sorted list implementation) uses about N/2 comparisons for both successful and unsuccessful search (on the average).
      - For successful search, the situation is the same as before.
      - For unsuccessful search, if we assume that the search is equally likely to be terminated by the tail node z or by each of the elements in the list (which is the case for a number of "random" search models), then the average number of comparisons is the same as for successful search in a table of size N + 1, or $(N + 2)/2$.
  • 165. Binary Search Binary search is an incredibly powerful technique for searching an ordered list. The basic algorithm is to find the middle element of the list, compare it against the key, decide which half of the list must contain the key, and repeat with that half. Two requirements to support binary search:
      - Random access of the list elements, so we need arrays instead of linked lists.
      - The array must contain elements in sorted order by the search key.
  • 166. Binary Search (Continue) Property 3 - Binary search never uses more than lg N + 1 comparisons for either successful or unsuccessful search.
      - This follows from the fact that the subfile size is at least halved at each step: an upper bound on the number of comparisons satisfies the recurrence $C_N = C_{N/2} + 1$ with $C_1 = 1$, which implies the stated result.
      - It is important to note that the time required to insert new records is high for binary search.
    Property 4 - Interpolation search uses fewer than lg lg N + 1 comparisons for both successful and unsuccessful search, in files of random keys.
      - This function is a very slowly growing one, which can be thought of as a constant for practical purposes: if N is one billion, lg lg N < 5. Thus, any record can be found using only a few accesses (on the average), a substantial improvement over binary search.
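Property 3 can be checked empirically with a binary search that counts its probes. This is a sketch under assumptions: each probe of the middle element is counted as one (three-way) comparison, and `SearchResult` and `binary_search_count` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

struct SearchResult {
    int index;        // position of the key, or -1 if absent
    int comparisons;  // number of middle elements examined
};

// Binary search in a sorted array, counting one comparison per probe.
SearchResult binary_search_count(const std::vector<int>& a, int key) {
    int lo = 0, hi = static_cast<int>(a.size()) - 1, steps = 0;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   // middle of the current subfile
        ++steps;
        if (key == a[mid]) return {mid, steps};
        if (key < a[mid]) hi = mid - 1; // key can only be in the left half
        else lo = mid + 1;              // key can only be in the right half
    }
    return {-1, steps};
}
```

For N = 1000 sorted keys, no search (successful or not) needs more than 10 probes, in line with the lg N + 1 bound.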
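  • 167. Binary Tree Search Binary tree search is a simple, efficient dynamic searching method that qualifies as one of the most fundamental algorithms in computer science. The defining property of a binary tree is that each node has left and right links. In a binary search tree, each node also holds a key, with smaller keys in its left subtree and larger (or equal) keys in its right subtree. [Figure: A binary search tree]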
  • 168. Binary Tree Search (Continue) Property 5 - A search or insertion in a binary search tree requires about 2 ln N comparisons, on the average, in a tree built from N random keys.
      - For each node in the tree, the number of comparisons used for a successful search to that node is the distance to the root. The sum of these distances for all nodes is called the internal path length of the tree. Dividing the internal path length by N, we get the average number of comparisons for successful search. But if $C_N$ denotes the average internal path length of a binary search tree of N nodes, we have the recurrence $C_N = N - 1 + \frac{1}{N} \sum_{1 \le k \le N} (C_{k-1} + C_{N-k})$, since the root adds 1 to the path length of each of the other N - 1 nodes and is equally likely to be the kth smallest key.
    Property 6 - In the worst case, a search in a binary search tree with N keys can require N comparisons.
      - For example, when the keys are inserted in order (or in reverse order), the binary-tree search method is no better than the sequential search method that we saw at the beginning of this chapter.
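A minimal binary search tree sketch that makes Property 6 concrete. The class `BST`, its members, and the choice to send equal keys to the right are assumptions made for illustration; `search` reports how many nodes it examined.

```cpp
#include <cassert>
#include <memory>

class BST {
    struct Node {
        explicit Node(int k) : key(k) {}
        int key;
        std::unique_ptr<Node> l, r;
    };
    std::unique_ptr<Node> root_;

public:
    // Walk down from the root and attach the new key where the search falls off.
    void insert(int key) {
        std::unique_ptr<Node>* link = &root_;
        while (*link)
            link = (key < (*link)->key) ? &(*link)->l : &(*link)->r;
        *link = std::make_unique<Node>(key);
    }

    // Returns the number of nodes examined; found reports whether key is present.
    int search(int key, bool& found) const {
        int examined = 0;
        const Node* t = root_.get();
        while (t) {
            ++examined;
            if (key == t->key) { found = true; return examined; }
            t = (key < t->key) ? t->l.get() : t->r.get();
        }
        found = false;
        return examined;
    }
};
```

Inserting the keys 1 through 100 in increasing order produces a tree that is a single chain of right links, so searching for 100 examines all 100 nodes. Inserting 4, 2, 6, 1, 3, 5, 7 instead gives a balanced tree in which any search examines at most 3 nodes.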