Data Structures 2004

Lecture notes of Data Structures by Sanjay Goel, July-Dec, 2004, JIIT Noida

Published in: Technology, Education
Lecture Notes, Data Structures, Sanjay Goel, JIIT, 2004

Data Structures (2004-2005)

Lecture 1 (2 hr.) (22.07.04) (Class strength : Approx. 230-240 students)
1. Question : What is Engineering?
2. Question : Is Engineering an Applied Science? Is it enough to say so?
3. Question : Which is older, Engineering or Science?
4. Feedback forms on lecture format.
5. Design and execute an algorithm for collecting these forms.
6. Macro analysis : Nobody raised a hand when asked whether anybody had rated textbook explanation as the most important property of the lecture. Announced that I will not follow or explain any textbook in the class. Instead, I will do Engineering and create something new with the participation of approx. 300 students.
7. Announced programming assignments.
8. Course objectives and description.
9. Question : Give examples of clear and precise expressions.
10. Tools of expression : natural language, examples, diagrams, graphs, mathematical expressions, formal expressions, word definitions, counter examples. Engineers have to create new tools for enhancing the clarity and precision of expression. We will create some for our use in this course and perhaps also for the future.
11. In-class exercise : Everybody write one example; some sharing and critique by the Instructor. Critique your buddy's example.
12. Design an algorithm : Everybody think of a number. Design and execute an algorithm for adding these numbers. Identify the sequential and parallel phases.
13. In-class exercise : Everybody write this algorithm clearly and precisely. Critique your buddy's algorithm for its clarity and precision. Continue at home. Use examples, counter examples, diagrams and/or mathematical formal expressions for bringing clarity and precision.
14. Question : What does creativity require?
15. Question : What kind of material/media has been used by man for expressing creativity?
16. Question : What is our material for creativity? Computer memory.
17. In-class exercise : Think of any existing or imaginary thing. Think how you can represent it in computer memory. Write it down. Show it to your buddy. Critique your buddy's work.
18. A structured representation of data for computer memory is a data structure.
19. What is Computer Science? Some definitions.

Lecture 2 (1 hr.) (27.07.04) (Class strength : Approx. 200 students)
1. Write down what new thing, if any, you learnt in the last class.
2. Write down what new question, if any, came to your mind for which you have not yet found the answer.
3. Share your thoughts with your buddy.
4. Think of some real or imaginary thing (object of interest, OI). How can it be represented in computer memory? Can you write a clear and precise representation?
5. Question : How to represent __________ in the computer memory? Identify your OI to fill in the blank.
6. Write five different OIs for this template.
7. Pick any one of these and start representing it for computer memory.
8. The key to finding the answer to a question is to ask yourself another, more pinpointed question.
9. Question : What question should you ask to answer these questions on "How to represent ...."?
10. Question : What is the difference between a thing and its representation?
11. A representation represents only some aspects of the real thing, for some specific purpose.
12. Question : What is the purpose of representation for each OI (real or imaginary) chosen by you?
13. Question : What aspects of the chosen OI do you need to represent in order to meet the chosen purpose?
14. Refer to Lect_1_01 and Lect_1_02 of 2002.

Lecture 3 (2 hr.) (29.07.04) (Class strength : Approx. 200 students)
1. Firm up your group of six.
2. Group exercise : Share your OIs. Pick an OI for further work.
3. Develop the purpose for its representation. Class sharing.
4. Make it clear and precise with examples, counter examples, graphs, drawings, formulas, scenarios and so on.
5. Re-examine : Is engineering an applied science? That is a scientist's perspective. The engineer's perspective is that engineering uses science as a tool. E.g. carpentry is not applied hammering; instead, carpentry uses hammers.
6. Find out those scientific concepts that were discovered as a result of solving some engineering problem.
7. Keep purely numerical computational problems out of this course. Work on non-numeric or mixed problems.
8. Use a generic checklist of computational processes (like the Rasas in a drama) and try to have more than one of these in your story. Enrich your story by checking whether your design has such elements in it.
9. Write individual design stories, clearly and precisely.
10. Refer to Lect_2_01 of 2002.

Lecture 4 (1 hr.) (03.08.04) (Class strength : Approx. 200 students)
1. WAP (write a program) to shift up the elements of an array after deleting one element from in between. Make the last element blank.
2. WAP to shift up the elements of a structure after deleting one element from in between.
3. Draw a lesson from the above two examples. Well-structured data can help in one or more of the following areas :
   i. Run-time memory saving
   ii. Run-time time saving
   iii. Cleaner and simpler processing logic
4. WAP to circularly shift up the elements of an array after deleting one element from in between.
5. WAP to shift up the elements of an array, where each element is a structure, after deleting one array element from in between. Make the fields of the last array element blank.

Lecture 5 (2 hr.) (05.08.04) (Class strength : Approx. 120 students)
1. Design your memory representation for an OI (single variable polynomial). The purpose of the representation is to evaluate and display a single variable polynomial. Convert your representation into a data structure. Use this data structure to WAP to display and evaluate any user-inputted single variable polynomial.
2. Evaluate the different options for these representations (data structures) in terms of memory requirement, run time and simplicity of logic (of algorithm) design.
3. Well-structured data can help in one or more of the following areas :
   i. Run-time memory saving
   ii. Run-time time saving
   iii. Cleaner and simpler processing logic
4. Use this data structure to WAP for adding, subtracting, multiplying, differentiating and integrating single variable polynomials.

Lecture 6 (1 hr.) (10.08.04) (Class strength : Approx. 220 students)
1. WAP for deleting an element from a linked list of integers.
2. Writing good programs :
   i. Give useful names to all variables.
   ii. Prefix variable names with type symbols, e.g. all integer variables with i_, all character variables with c_ and so on.
   iii. Initialize all variables.
   iv. Write readable programs rather than tricky programs.
   v. Pair every malloc with a corresponding free.
3. Array elements are directly addressable, i.e. it takes the same amount of time to access any element of an array, whereas access time in a linked list varies with the position of the element.
4. Compare the run time costs of element deletion from an array with element deletion from a linked list.
5. You can use an additional variable, i.e. a counter, to keep the count of elements in the array and linked list. You can also consider initialising the array with values outside the permitted range, e.g. if the values are expected to be within -20 to 20, you can use -100 or 100 to indicate empty elements. Zero in this case is a legal value within the range and does not indicate emptiness.
6. Design a data storage scheme for storing polynomial functions and series of numbers. Design an algorithm to test if the given series is a Taylor series approximation (at x) for a given polynomial function and x0. The Taylor series approximation of a given function f about x0 is : $f(x) \approx \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k$
7. While comparing real numbers for equality do not use ==; instead, check whether abs(n1-n2) <= e1, where e1 is a very small real number whose value depends on the application.
8. Design a dynamic data structure for storing a randomly ordered collection of single variable polynomial functions and another static data structure for storing a randomly ordered collection of real number-sequences. The records in both collections should also have additional provision for storing the indices of all the matching entries (if any) in the other collection. One entry in either collection may match with none, one or many entries in the other. A number-sequence is declared as matching a polynomial if all the numbers in the sequence match the corresponding terms of the Taylor series expansion of the function for given x and x0, within the limits of a user-defined 'permitted-mismatch'. Design an algorithm for updating the matching indices in both collections for a given user-defined input of 'permitted-mismatch', x and x0.

Lecture 7 (2 hr.) (12.08.04) (Class strength : Approx. 200 students)
Post-class assignment :
1. Modify the last assignment of the previous lecture to take input from a text file and also output the result into a file.
2. WAP to shift up the elements of an array stored in a file after deleting one element from in between.
3. WAP to circularly shift up the elements of an array stored in a file after deleting one element from in between.

Lecture 8 (1 hr.) (17.08.04) (Class strength : Approx. 230 students)
1. Write an algorithm for reversing the order of elements in an array.
2. What does the following algorithm do? Illustrate your answer with an example.

   // NodePtr points to a list node with fields data and next.
   void WhatdoIdo(NodePtr p) {
       if (p) {
           WhatdoIdo(p->next);
           cout << p->data;
       }
   }

   2.1 What if the order of the statements in the inner block is exchanged?
   2.2 Transform this algorithm into an equivalent non-recursive algorithm. What are your options? Evaluate the options.
3. Option 1 : Do not use any extra memory for doing the task.
   Option 2 : Use an array to copy the elements and then print the elements of the array in reverse order, as array elements can easily be printed in any order.
   Option 3 : Use a linked list to copy the elements in reverse order, i.e. insert each new node in front of the existing list (rather than at the back).
4. WAP based on each option.
5. As software designers, we use additional memory for designing more efficient algorithms. This additional memory can be used in various forms. Each form has some merits and demerits.
6. Unless specifically permitted or required, changing the input (its form or content) is not acceptable in most cases.

Lecture 9 (2 hr.) (19.08.04) (Class strength : Approx. 180 students)
1. Pick one of the following, or any other similar popular family game, and propose a design for a software version :
   - Snakes and Ladders
   - Ludo
   - Chess
   - Any cards-based game
   - Make-a-word
   - Rubik's Cube
   You can design your software on the model of either 'play with computer' or 'play through computer'.
2. A design team has conceived the following initial specifications of a search engine for a large company's internal Digital Library :
   - Only specially authorised users can upload new documents or new versions of old documents.
   - All employees can look at the documents.
   - The information systems department will create, update and maintain a list of keywords for a faster search facility.
   - The search engine users can also search by entering any word through the keyboard.
   - Searched documents are to be listed as follows :
     Case A, faster search on a listed keyword : As per the frequency of occurrence of the word, i.e. the documents having a higher "density" of the chosen keyword will be listed before the documents having a lower density, where
       density[k, d] = (occurrence count of the word k in d) / (word count in d)
     Case B, search by entering a word through the keyboard : As per the frequency of usage of a document, where usage is defined as the number of times a document is opened by users through the search engine.
   Draw a design diagram (concept map) for this problem.
   Refer to ADS (DSII 2002-03) Lecture #6 slides 6 to 27 and Lecture #9 slides 12-13.
3. Progressive development of software. Programming is only a small part of the process. All engineering design requires and creates drawings. It is possible to create software through its architectural drawings. This makes for a much simpler process that is easy to follow and monitor.
   Stage 1. Write a design story.
   Stage 2. Prepare a checklist of nouns and verbs in the design story.
   Stage 3. Draw the 1st level concept map of the software. Draw a graph of nouns and verbs.
   Stage 4. Draw the 2nd level concept map of the software. Put verbs inside oval shapes and nouns in rectangular boxes.
   Stage 5. Draw the 3rd level concept map of the software. Put examples of data in the noun boxes.
   Stage 6. Draw the 4th level concept map of the software. Identify data tanks (rectangular boxes with multiple records in the same format but with different values) and put them inside double-lined rectangular boxes. Suitably label the data tanks.
   Iterate through the above cycle and critique in a group. Check for inconsistencies and incompleteness at each iterative stage and update your concept map. Coding will follow in later stages.
4. While a flowchart gives a process-centric view of the software, a concept map gives a data-centric view.
5. The data tanks will be realised through data structures and the verbs will be realised through algorithms.
6. Draw the concept map for your chosen family game as per the first exercise of the day.
7. Draw the concept maps for the design stories of your earlier assignments.

Lecture 10 (1 hr.) (24.08.04) (Class strength : Approx. 180 students)
1. Elements of a design story :
   - Plot / Theme
   - Characters with defined roles (anything that has a stimulus-response behaviour is a character; identify the stimulus-response pairs for your characters)
   - Events
   - Sequence
   - Time-based events/objects
   - Interactive events/objects
   - Characters take actions
   - Actions have consequences
   - Consequences are actions by other characters
2. Design
   - Regard engineering design practice as a process of "story telling".
   - Design is story telling : Stories -> Design Concept -> Prototype -> ……… -> Product
3. Some tips on story writing
   - Experience lots of stories.
   - Start with the whole and move to the parts : Present the big picture within a whole-system global context and connect to local initiatives.
4. Some guidelines and hints on construction of concept maps
   - The concept map will provide a bird's-eye view of a collection of interacting and collaborating data tanks and data items.
   - The concept map will be a diagram of data tanks inter-connected via processing units, with labelled boxes and arrows.
   - Indicate what data moves or changes in any data tank, and when and how.
   - Use double-line boxes for data tanks containing several homogeneous data items and single-line boxes for single data items/packets, if any. Give examples of the data inside each data tank.
   - Put a small circle on the top right corner of a box if it represents dynamic data, i.e. the data can change as a result of valid operations. Put another circle on the top left corner if the data population size can change during processing. This dynamic data is not to be confused with a dynamic data structure, as this higher level of dynamism can be implemented with dynamic or static data structures at the lower layer.
   - Use oval-shaped boxes for processing units.
   - Your concept map should be hierarchical, i.e. it should gradually show more details in different diagrams rather than showing all the details in one diagram. Initially focus on the most critical aspects.
5. Write a detailed story (half a page to a full page) for an automation problem. Create a hierarchical concept map for it. Document all the versions.

Lecture 11 (2 hr.) (26.08.04) (Class strength : Approx. 100 students)
1. Some more guidelines and hints on construction of concept maps :
   - Mention the strategic positions and attributes of a typical data item for every data tank. Draw two dotted horizontal lines and write the strategic positions in the central portion and the attributes in the lower portion.
   - If your data tank is compound, i.e. you need some additional, smaller ancillary data tanks (e.g. indices and so on) to support efficient searching of appropriate data items in the principal data tank, include the names of these ancillary data tanks in the principal data tank itself by dividing your principal data tank into two units with a single horizontal line and listing these names in the lower portion. The principal data tank in the upper unit will have three portions separated by two dotted horizontal lines as mentioned in '8'.
   - Later expand this compound data tank into the principal data tank interconnected with ancillary data tanks in a different diagram.
   - Name your data tanks, which contain many homogeneous items only, as plural nouns (e.g. trains, passengers, books, users and so on).
   - Name your ancillary data tanks using the name of the principal data tank, followed by an underscore and then by another plural noun, e.g. the ancillary data tanks for data tank 'trains' could be named 'trains_train-nos', 'trains_destinations' and so on.
   - Name your processing units as verbs only.
   - Put the name of the data that flows in/out of data tanks.
   - Write the properties and functional behaviour of each data tank by giving a clear description of the content and also the permitted legal operations on each data tank and data item.
   - Do not mix or confuse this concept mapping technique with any other diagramming technique like DFD, ER diagram and so on. There may be some similarities with some of them, but it is different.
2. Some guidelines for identifying generic data tanks :
   - Look at each data tank as a collection of homogeneous (similarly structured) data items.
   - These individual data items could be atomic or compound.
   - This collection may or may not require some inter-data-item organisational constraints.
   - Perform a mathematical reduction by replacing problem-specific nouns in each problem with generic variables like x, y, z and so on.
   - Examine these data tanks and look for structural similarities, e.g. all linear equations are structurally similar to each other irrespective of the number of terms, all polynomials are also structurally similar, and all first order differential equations are also structurally similar to each other.
   - Represent compound data items by a single generic macro variable in capital letters like X, Y and so on. Repeat the process as long as there is scope for grouping more variables into a single macro variable, i.e. reduce the data tank to a collection of homogeneous single macro variables.
   - See if individual data items in such an abstracted data tank are required to be arranged in some specific order or not. If yes, define the order. All the data tanks that require the same definition of ordering can possibly be termed as similar.
   - If the data tank is an ordered collection of X, see how the relative position of a specific data item is defined with respect to other similar data items.
   - Some possible arrangements of X relative to its neighbouring data items :
     a. Linear
     b. Non linear : Descending Tree
     c. Non linear : Ascending Tree
     d. Non linear : Graph
   - Relative position could be in terms of the order of insertion or the relative value of some data item.
   - Examine relative positional eligibility for retrieval : all / only some strategic relative positions.
   - Examine relative positional eligibility for manipulation : all / none / only some strategic relative positions.
   - Examine relative positional eligibility for insertion : any empty slot / only some strategic relative positions.
   - Examine relative positional eligibility for deletion : all / none / only some strategic positions.
   - If retrieval, manipulation, insertion and deletion are dependent on some well defined strategic relative position within the data tank, then observe, identify and define these positions.
   - Some examples of strategic relative positions are as follows. Based on order of insertion : earliest, latest, after the latest insertion, before the earliest insertion, 3rd earliest, 4th latest, next relative to the current position as per insertion order, previous relative to the current position as per insertion order and so on. Based on value : minimum, maximum, 3rd minimum, in between a given range of values, in between an appropriate range of values.
   - Find out the similarities based on these observations.
   - Create the detailed concept map of your design story.
   - Do the group exercise as per the Project I details.
   - Create the concept maps for the data tank representation of a book, dictionary, thesaurus, Eicher road map, periodic table, atlas, railway timetable.

Lecture 12 (1 hr.) (31.08.04) (Class strength : Approx. 120 students)
1. What are the core competencies for engineering professionals? Excerpts from a research project (SPINE) in USA and Europe : Problem solving is the most important skill for engineers.
2. What is a problem? What is problem solving?
   Option A : Applying well defined and known algorithms to well structured problems, e.g. given F and A, find M; find the integral of x^2. (We don't talk of such problems while talking of problem solving.)
   Option B : Finding/designing/choosing and then applying a new algorithm (may be an adaptation of an existing one) to a well structured (new type) or not so well structured problem, e.g. program development.
   Option C : Applying your heuristics (thumb rules) to solve not so well structured problems, e.g. playing tic-tac-toe, Rubik's Cube, chess playing, management, design.
   In this course, we are mainly learning the second class (Option B) of problem solving. We are also learning how to define a problem as a more structured one, starting from the initial not so structured problem definition. We also intend to develop some heuristics to efficiently solve problems of this class.
3. Program = Algorithm + Data Structure + User Interface. The design starts from the design of the UI, then the data structures and finally the algorithm. This process is open to iteration.
4. Software is developed through progressive "zooming" of the initial problem statement. The solution lies in the problem statement. The problem is "zoomed" into sub-problems and this "zooming" process continues till each final sub-problem can be directly translated into a finite set of precise instructions (which can be represented through a flow chart or pseudo-code and finally translated into a machine readable programming language).
5. Insertion sort : evaluation of several options.
6. Examine how creation and disciplined management of some data can help in more efficient algorithms.
7. Refer to Lect_2_02 and Lect_2_03 of 2002.
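The "zooming" idea in item 4 and the insertion sort of item 5 can be illustrated together. The following is a minimal sketch of one possible option only (sorting a `std::vector<int>` in place); the function name `insertionSort` and the sub-problem labels in the comments are assumptions made for illustration and are not taken from the referenced 2002 slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Problem: sort v in ascending order.
// Zoom level 1: grow a sorted prefix v[0..i-1] by one element at a time.
void insertionSort(std::vector<int>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        // Sub-problem a: pick the next unsorted element.
        int key = v[i];
        std::size_t j = i;
        // Sub-problem b: make room by shifting larger elements
        // of the sorted prefix one slot to the right.
        while (j > 0 && v[j - 1] > key) {
            v[j] = v[j - 1];
            --j;
        }
        // Sub-problem c: place the element in the freed slot.
        v[j] = key;
    }
    // Postcondition check (active only in debug builds).
    for (std::size_t i = 1; i < v.size(); ++i) {
        assert(v[i - 1] <= v[i]);
    }
}
```

Each commented sub-problem is small enough to translate directly into a few precise instructions, which is where the zooming stops. Other options (for example, sorting a linked list or using a second array) would zoom into different sub-problems.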
Lecture 13 (2 hr.) (02.09.04) (Class strength : Approx. 120 students)
1. Software is developed through progressive "zooming" of the initial problem statement. Case study of Insertion sort. Refer to Lect_2_02 and Lect_2_03 of 2002.
2. Defining an algorithmic problem in terms of input characterisation and output specification. Refer to Lect_2_06 of 2002.
3. Issues involved in analysis and design of algorithms.
4. Are algorithms scalable?
5. Algorithm correctness.
6. Time complexity and space complexity.
7. Comparing the run time to execute Linear search and Binary search algorithms.

Lecture 14 (1 hr.) (14.09.04) (Class strength : Approx. 110 students)
1. Case study on concept map design for Homoeopathy reference by Kanisha (B6).
   - Boger's Materia Medica has several parts that are interlinked.
2. Comparing the run time to execute Linear search and Binary search algorithms.
3. Mathematical analysis of step count for estimation of run time for iterative algorithms, using the example of the Insertion sort algorithm. Refer to Lect_2_05 of 2002.
4. Assignment : Using this method, estimate the step count for more iterative algorithms known to you, e.g. linear search through an unsorted array, linear search through a sorted array, binary search, matrix multiplication, bubble sort, selection sort and all the iterative algorithms studied by you in the Numerical Analysis course like Gauss Seidel SOR, Power method for finding Eigen values, Jacobi method for finding Eigen values.
5. Assignment : Write programs for insertion sort and any other two algorithms (at least one from the Numerical Analysis set) mentioned in the above assignment, with the modification of introducing the logic for step counting within each program itself. Run the programs with this added logic for different data sizes. Generate data for step count vs data size for each algorithm. Draw graphs showing this function using Excel. Members of the same group should write programs for the step count of different algorithms.

Lecture 15 (2 hr.) (16.09.04) (Class strength : Approx. 100 students)
1. Run time of program = C x S, where S = step count of all statements and C depends on computer speed; and S = K x S1, where S1 = step count of only the key statement and K >= 1 depends on the algorithm.
2. Worst case, best case and average case time complexity of iterative and recursive algorithms, using the examples of the Insertion sort, factorial and Fibonacci number algorithms. Refer to Lect_2_04 and 2_06a of 2002.
3. Assignment : Analyse the time complexity of the Ackermann function.

Lecture 16 (1 hr.) (21.09.04) (Class strength : Approx. 120 students)
1. Review of best case and worst case time complexity analysis.
2. Average case time complexity analysis : examples of Linear Search through an unordered array and an ordered array; Binary Search.
3. Assignment : Best case, worst case and average case time complexity analysis for the following algorithms :
   i. Linear Search through unordered array
   ii. Linear Search through ordered array
   iii. Binary Search
   iv. Insertion Sort
   v. Bubble Sort
   vi. Selection Sort
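The step-counting assignments of Lectures 14 to 16 can be sketched as follows for linear and binary search. This is an illustrative sketch under stated assumptions: the "key statement" is taken to be the comparison of an array element with the target, and the names `SearchResult`, `linearSearch` and `binarySearch` are hypothetical, not from the course material.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Result of a search: position of the target (-1 if absent)
// and the number of key-statement executions (S1).
struct SearchResult {
    long index;
    std::size_t steps;
};

// Linear search: best case 1 step, worst case a.size() steps.
SearchResult linearSearch(const std::vector<int>& a, int target) {
    std::size_t steps = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        ++steps;  // one probe of a[i]
        if (a[i] == target) return {static_cast<long>(i), steps};
    }
    return {-1, steps};
}

// Binary search over a sorted array: the half-open range [lo, hi)
// is halved after every probe, so steps grow with log2(a.size()).
SearchResult binarySearch(const std::vector<int>& a, int target) {
    assert(std::is_sorted(a.begin(), a.end()));
    std::size_t steps = 0, lo = 0, hi = a.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        ++steps;  // one probe of a[mid]
        if (a[mid] == target) return {static_cast<long>(mid), steps};
        if (a[mid] < target) lo = mid + 1; else hi = mid;
    }
    return {-1, steps};
}
```

Calling both functions for several array sizes and recording `steps` gives the step count vs data size data asked for in the Lecture 14 assignment; for 1024 sorted elements the worst case of the linear search is 1024 probes, while the binary search needs at most 11.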
Lecture 17 (2 hr.) (22.09.04) (Class strength : Approx. 100 students)
1. Selection and Bubble Sort algorithms. Refer to ADS (DS II 2002-03) Lecture #6 Slides 32-37.
2. Average time complexity analysis of Selection and Bubble Sort.
3. Project demonstration on visualization of wave files by P. Singh & Group. This software is recommended for use by all for simulation of digital communication in the ADC course.
4. Demonstration of Graphics project by Saransh.
5. Space complexity. Refer to DS 2002 Lecture #2-6.
6. Algorithm visualisation : examples of Bubble, Selection and Insertion sort algorithms. Refer to ADS (DS II 2002-03) Lecture #8 Slides 8-19.
7. Impact of hardware speedup on the solvable problem size for any given algorithm. Refer to DS 2002 Lecture #2-7.
8. Assignment : Write Bubble, Selection and Insertion sort programs with dynamic algorithmic visualization.

Lecture 18 (1 hr.) (28.09.04) (Class strength : Approx. 130 students)
1. Relationship of data tanks and algorithms. Data tanks have functions, like the buttons on the tanks. These functions are realized by software (algorithms). Discrete data flows in/out of the data tanks at discrete times.
2. Categories of data tanks :
   - Problem specific : Part of the problem (explicitly occur in the detailed statement of the problem).
   - Solution specific : Part of the solution (not mentioned in the problem, but the designer conceives these data tanks to create designs to solve the problem. Different design solutions may have different data tanks of this category).
   - Often, higher level complex processing units in the concept maps may zoom into a sub concept map having lower level, simpler processing units and additional solution specific data tanks. This may also require further expansion in some cases until all processing units are at the simplest level.
3. Consolidation of the taxonomy of data tank types (as needed by students' applications) on the basis of structural properties.
4. Results of last year's consolidation :

   Legend
   Structure : Linear, Non Linear
   Ordered by : None, Value, Time of insertion (TOI), Value+Time (VT)
   Positions : Anywhere (aw), Earliest Insertion (EI), After the last insertion (ALI), After Last Low Priority Insertion (ALLPI), Before Last High Priority Insertion (BLHPI), Last Low Priority Insertion (LLPI), Last High Priority Insertion (LHPI)

   Data tank types needed for modelling student designs

   No. | Structure  | Ordered by   | Insertion at | Deletion at   | Interrogation at | Manipulation at | ADT name
   ----+------------+--------------+--------------+---------------+------------------+-----------------+--------------------------
   1   | Linear     | Value        | ap           | aw            | aw               | aw              |
   2   | Linear     | TOI          | ALI          | aw            | aw               | None            | Semi-Open Queue
   3   | Linear     | Value        | ap           | aw            | aw               | None            |
   4   | Linear     | None         | aw           | aw            | aw               | aw              |
   5   | Linear     | None         | None         | None          | aw               | aw              |
   6   | Linear     | TOI          | ALI          | None          | aw               | None            |
   7   | Linear     | TOI          | ALI          | EI            | LI               | None            | Queue with semi-open back
   8   | Linear     | TOI          | ALI          | aw            | aw               | aw              | Open Queue
   9   | Linear     | Value        | None         | None          | aw               | None            |
   10  | Linear     | VT           | ap           | None          | aw               | None            |
   11  | Linear     | None         | aw           | aw            | aw               | aw              |
   12  | Linear     | None         | aw           | aw            | aw               | None            |
   13  | Linear     | Value        | ap           | aw            | aw               | aw              |
   14  | Linear     | Value        | ap           | None          | aw               | aw              |
   15  | Linear     | TOI          | None         | None          | aw               | None            |
   16  | Non Linear | VT           | ap           | sublists last | aw               | None            |
   17  | Linear     | TOI          | ALI          | None          | aw               | aw              |
   18  | Linear     | None         | aw           | None          | aw               | None            |

   Additional types (not needed) for conceived problems in 2003

   No. | Structure  | Ordered by   | Insertion at | Deletion at   | Interrogation at | Manipulation at | ADT name
   ----+------------+--------------+--------------+---------------+------------------+-----------------+--------------------------
   19  | Linear     | TOI          | ALI          | LI            | LI               | None            | Stack
   20  | Linear     | TOI          | ALI          | EI            | LI, EI           | None            | Queue
   21  | Linear     | TOI          | ALI          | LI, EI        | LI, EI           | None            | Shelf
   22  | Linear     | TOI+Priority | ALLPI, BLHPI | LLPI          | LLPI, LHPI       | None            | Scroll/Roll
   23  | Linear     | TOI+Priority | ALLPI, BLHPI | LHPI          | LLPI, LHPI       | None            | Scroll/Roll
   24  | Linear     | TOI+Priority | ALLPI, BLHPI | LLPI, LHPI    | LLPI, LHPI       | None            | Deque

5. Assignment : Label your concept maps with the data tank type id as per the above list. Identify additional structurally different data tanks for your application that are not part of the above list. Expand this list.
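The Stack and Queue rows of the table above differ only in the strategic position at which deletion and interrogation are permitted. The sketch below illustrates this with one underlying time-ordered data tank and two thin wrappers. It is an illustration under assumptions: the class and method names are hypothetical, LI is read here as "latest insertion" (the legend above does not define it), and `std::deque` is just one convenient way to realise the tank.

```cpp
#include <cassert>
#include <deque>

// Linear data tank ordered by time of insertion (TOI).
// front = earliest insertion (EI), back = latest insertion (assumed LI).
class TimeOrderedTank {
    std::deque<int> items_;
public:
    void insertAfterLast(int x) { items_.push_back(x); }  // insertion at ALI
    int latest() const { assert(!items_.empty()); return items_.back(); }
    int earliest() const { assert(!items_.empty()); return items_.front(); }
    void deleteLatest() { assert(!items_.empty()); items_.pop_back(); }
    void deleteEarliest() { assert(!items_.empty()); items_.pop_front(); }
    bool empty() const { return items_.empty(); }
};

// Stack row: insertion ALI, deletion LI, interrogation LI, no manipulation.
class Stack {
    TimeOrderedTank tank_;
public:
    void push(int x) { tank_.insertAfterLast(x); }
    void pop() { tank_.deleteLatest(); }
    int top() const { return tank_.latest(); }
    bool empty() const { return tank_.empty(); }
};

// Queue row: insertion ALI, deletion EI, interrogation LI and EI,
// no manipulation.
class Queue {
    TimeOrderedTank tank_;
public:
    void enqueue(int x) { tank_.insertAfterLast(x); }
    void dequeue() { tank_.deleteEarliest(); }
    int front() const { return tank_.earliest(); }
    int back() const { return tank_.latest(); }
    bool empty() const { return tank_.empty(); }
};
```

A Shelf, by the same reading of the table, would expose both `deleteLatest` and `deleteEarliest`. None of the wrappers offers a way to modify a stored item, which corresponds to the "None" entry in the manipulation column.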
Lecture 19 (2 hr.) (30.09.04) (Class strength : Approx. 100 students)
1. Result of the new consolidation incorporating the requirements of the new 2004 batch :

   Legend
   Structure : Linear, Non Linear
   Ordered by : None, Value, Time of insertion (TOI), Value+Time (VT)
   Positions : Anywhere (aw), Earliest Insertion (EI), After the last insertion (ALI), Before Last Insertion (BLI), After Last Low Priority Insertion (ALLPI), Before Last High Priority Insertion (BLHPI), Last Low Priority Insertion (LLPI), Last High Priority Insertion (LHPI)

   Data tank types (templates) needed for modelling student designs (2003 batch)

   No. | Structure  | Ordered by   | Insertion at | Deletion at   | Interrogation at | Manipulation at | ADT name
   ----+------------+--------------+--------------+---------------+------------------+-----------------+--------------------------
   1   | Linear     | Value        | ap           | aw            | aw               | aw              |
   2   | Linear     | TOI          | ALI          | aw            | aw               | None            | Semi-Open Queue
   3   | Linear     | Value        | ap           | aw            | aw               | None            |
   4   | Linear     | None         | aw           | aw            | aw               | aw              |
   5   | Linear     | None         | None         | None          | aw               | aw              |
   6   | Linear     | TOI          | ALI          | None          | aw               | None            |
   7   | Linear     | TOI          | ALI          | EI            | LI               | None            | Queue with semi-open back
   8   | Linear     | TOI          | ALI          | aw            | aw               | aw              | Open Queue
   9   | Linear     | Value        | None         | None          | aw               | None            |
   10  | Linear     | VT           | ap           | None          | aw               | None            |
   11  | Linear     | None         | aw           | aw            | aw               | aw              |
   12  | Linear     | None         | aw           | aw            | aw               | None            |
   13  | Linear     | Value        | ap           | aw            | aw               | aw              |
   14  | Linear     | Value        | ap           | None          | aw               | aw              |
   15  | Linear     | TOI          | None         | None          | aw               | None            |
   16  | Non Linear | VT           | ap           | sublists last | aw               | None            |
   17  | Linear     | TOI          | ALI          | None          | aw               | aw              |
   18  | Linear     | None         | aw           | None          | aw               | None            |

   Additional types (templates) required by the 2004 batch

   No. | Structure  | Ordered by   | Insertion at | Deletion at   | Interrogation at | Manipulation at | ADT name
   ----+------------+--------------+--------------+---------------+------------------+-----------------+--------------------------
   19  | Linear     | None         | None         | None          | aw               | None            |
   20  | Linear     | VT           | ap+aw        | aw            | aw               | aw              |
   21  | Linear     | VT           | ALI          | aw            | aw               | aw              |
   22  | Linear     | TOI          | BLI          | EI            | aw               | None            |
   23  | Linear     | TOI          | None         | None          | aw               | aw              |
   24  | Non Linear | VT           | ap           | aw            | aw               | None            |
   25  | Linear     | Value        | ap           | None          | aw               | None            |

   Additional important types (templates), not needed for students' conceived problems (2003 and 2004 batches)

   No. | Structure  | Ordered by   | Insertion at | Deletion at   | Interrogation at | Manipulation at | ADT name
   ----+------------+--------------+--------------+---------------+------------------+-----------------+--------------------------
   26  | Linear     | TOI          | ALI          | LI            | LI               | None            | Stack
   27  | Linear     | TOI          | ALI          | EI            | LI, EI           | None            | Queue
   28  | Linear     | TOI          | ALI          | LI, EI        | LI, EI           | None            | Shelf
   29  | Linear     | TOI+Priority | ALLPI, BLHPI | LLPI          | LLPI, LHPI       | None            | Scroll/Roll
   30  | Linear     | TOI+Priority | ALLPI, BLHPI | LHPI          | LLPI, LHPI       | None            | Scroll/Roll
   31  | Linear     | TOI+Priority | ALLPI, BLHPI | LLPI, LHPI    | LLPI, LHPI       | None            | Deque

2. Data storage options : amorphous or structured.
3. Amorphous storage (collapsed structure) makes insertion of data very easy, but retrieval of required data is very inefficient.
4. Structured data storage requires more discipline and effort at the time of insertion of data so that retrieval becomes more efficient.
5. Several primary data types and data structuring facilities are offered by programming languages.
6. All application specific data for any data tank needs to be represented and stored in terms of language specified primary data types, using structuring facilities, for future usage and processing.
7. Data of one data tank can be stored in primary, secondary or mixed memory.
8. For storing structured data, addresses of individual records within a data tank can be realised through the following optional addressing mechanisms :
   i. Formula based (direct addressing)
   ii. Linked list
   iii. Indirect addressing (using a directly addressable index)
   iv. Simulated pointer
9. All data tanks can be realised through any of the five mechanisms : amorphous storage or any of the four addressing mechanisms.
10. Refer to 2002 DS Lectures Lect 02-08, Lect 02-09, Lect 02-10, Lect 02-11 and Lect 03-01.
11. Assignment : Implement a data tank from your individual design story. Store the data on file using amorphous (collapsed structure) storage.
Implement the functions for Insertion, deletion, interrogation and modification operations as per the requirement of your design.Lecture 20 (1 hr.) (5.10.04) (Class strength : Approx. 110 students) 1. Review of software design artefacts – design story, concept map, data tanks (collection of records), algorithm, data tank template, data structure. 2. Taxonomy of data tanks : Perspective 1 : Static Vs Dynamic Data Tank.Dynamic data tanks allow changes in the data values and/or adition/deletion of record. (not to be confused with dynamic data structures) Perspective 2 : Structural feature (Linear/Non linear and strategic positions for access, insertion, modification and deletion) as per above table Perspective 3 : Problem specific Vs Solution specific 3. Discussion of last class’s assignment: Implement a data tank from your individual design story. Store the data on file using amorphous (completely collapsed structure i.e. no predictable relative ordering of the records within the data tank and no predictable relative
  13. 13. Lecture Notes, Data Structures Sanjay Goel, JIIT, 2004 ordering of the fields with a record) storage. Implement the functions for Insertion, deletion, interrogation and modification operations as per the requirement of your design. 4. Complete this assignment before next class.Lecture 21 (2 hr.) (7.10.04) (Class strength : Approx. 70 students) 1. Review of the last Assignment. 2. Program demonstration by Mayank (MCA). 3. Learn about file modes & file operations and string.h. 4. WAP to create a test file of 10,000 records for your individual application for a data tank as identified by you in your individual concept map in following different styles of data storage: 1. Completely amorphous: unordered collection of variable length strings for records with variable number of fields. 2. Semi amorphous : 1. unordered collection of fixed length strings for records with fixed number fields. 2. unordered collection of variable length strings for records with fixed number fields. 5. Formula based : Ordered collection of fixed length records without any missing keys. 6. Indirect addressing : Using Array based Index structure on multiple fields. Use binary search over the index array. Store index in a separate file and make it RAM resident at the run time of application. 5. Write programs for following operations on all the above files: 1. Insertion 2. Deletion 3. Retrieval 4. Updation 6. Integrate the above collection of functions in one application with a common UI. 7. Compare the time performance of the four types of storage formats for your application on the following parameters: 1. Average insertion time as a function of file size (in terms of number of records at the files sizes of 10, 100, 1000, 4000, 6000, 8000,10,000 records). 2. Average deletion time as a function of file size (in terms of number of records at the sizes of 10,100, 1000, 4000, 6000, 8000, 10,000 records). 3. 
Average retrieval time as a function of file size (in terms of number of records at the sizes of 10, 100, 1000, 4000 , 6000, 8000,10,000 records). 4. Average updation time as a function of file size (in terms of number of records at the sizes of 10, 100, 1000, 4000, 6000, 8000, 10,000 records). Draw four graphs for each comparison. You are encouraged to write a program for drawing your graphs. 8. Compare the results across the different applications within every group. Note: Group Members are encouraged to collaborate but finally every group member has to submit a different application individually created by every student. Lecture 22 (1 hr.) (12.10.04) (Class strength : Approx. 70 students) 1. Formula based addressing: if formula function is 1:1 , we need N memory slots for N potential keys, even if there are only n << N keys under usage e.g. database of records with 6 digit roll numbers will require 106 memory slots even if there are only 2000 record in the database. This results in huge wastage of memory as we need to keep reserved memory for all possible keys even if there is no record with that key in the database. 2. Hashing provides a solution of this problem and gives huge memory saving. We use a address calculation formula function which is M:1 rather than 1:1. So many keys contend for one slot.
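The contrast between the 1:1 and the M:1 address formula can be sketched in C as below. This is only an illustration under assumptions: the names direct_address and hash_address, the table size of 4001 slots, and the use of the remainder operation as the hash function are all hypothetical choices, not something prescribed in these notes.

```c
#include <assert.h>

/* Assumed sizes, for illustration only. */
#define KEY_SPACE   1000000UL /* 6 digit roll numbers: 10^6 possible keys */
#define TABLE_SLOTS 4001UL    /* hypothetical table size for about 2000 records */

/* 1:1 formula: the key itself is the slot number,
   so KEY_SPACE slots must be reserved. */
unsigned long direct_address(unsigned long roll)
{
    assert(roll < KEY_SPACE);
    return roll;
}

/* M:1 formula: many keys map to the same slot,
   so only TABLE_SLOTS slots must be reserved. */
unsigned long hash_address(unsigned long roll)
{
    assert(roll < KEY_SPACE);
    return roll % TABLE_SLOTS;
}
```

With these assumed sizes, roll numbers 003426 and 007427 both map to slot 3426, which is exactly the contention for one slot mentioned above; some scheme for handling such clashes is then required.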
3. Discussion on Hashing (invented in the mid fifties) as a formula based addressing scheme for databases. Hash Table, Hash function, Synonym, Collision Detection, Collision Resolution, Open Hashing, Bucket Hashing, Closed Hashing. Refer 2002 DS-II Lect 34 slide 54 to 74.
4. Load factor = Number of records / Number of slots. To limit collisions, keep the load factor < 1. A load factor of 0.5 to 0.7 results in good performance.
5. The worst hash function maps all keys to the same address.
6. The best hash function maps all keys to distinct addresses. It is difficult to design.
7. Data files and index files can also be stored as hash files.
8. Assignment : WAP to copy your structured data file of the earlier assignment into a RAM based Hash Table, using an appropriate hashing function and the technique given by the following criteria:
   Enrollment number % 3 = 0 : Open Hashing
   Enrollment number % 3 = 1 : Bucket Hashing
   Enrollment number % 3 = 2 : Closed Hashing
9. Assignment : Analyze the best case, worst case and average case time complexity of the three hashing techniques for insertion, deletion, search of a record, modification of a non key field and modification of the key field.

Lecture 23 (1 hr.) (26.10.04) (Class strength : Approx. 45 students)
1. Data Storage using Linked Addressing (RAM based storage).
2. Data Storage using Simulated Pointers (file as well as RAM based storage).
3. Both these approaches efficiently facilitate 'logically ordered (sorted) storage' of data elements without expensive data movement at insertion and deletion operations. Data files as well as index files can be stored using simulated pointer addressing.
4. Forward traversal and reverse order traversal (iterative and recursive) through linked as well as simulated pointer based data storage.
5. Cost of Recursion : very costly in terms of run time as well as memory requirement. Refer 2002 DS Lect 3-17 slide 10 to 18.
6. Recursive algorithms can be converted to iterative algorithms.
7. Simple recursive functions, e.g. factorial, forward linked list printing and so on (the recursive call appears in the last executable statement of the function), can be easily converted to more efficient (in terms of run time and also run time memory) iterative algorithms. Other recursive algorithms require more sophisticated approaches, using additional solution specific data tanks (usually Stacks or Queues) for buffering the accessed but unprocessed data elements from the problem specific data tanks.
8. Revised Individual Assignment (expanded and consolidated from the assignments given in lecture 21 and lecture 22) : (last date 5th Nov)
   8.1 Identify a data tank in your individual design story and implement it using the following storage techniques: Completely amorphous file, Semi amorphous file, Formula based file, RAM based Hash table (as per the criteria declared in lecture # 22), Indirect addressing using Array based Index, Indirect addressing using Hash Table based Index, Indirect addressing using Linked list based Index, Simulated pointer based ordered file. Write programs for the following operations:
      1. Insertion of one record at a time.
      2. Deletion of one record at a time.
      3. Retrieval of one record at a time.
      4. Modification of non key fields of one record at a time.
      5. Modification of the key field of one record at a time.
      6. Ordered List Display of all records as per alphabetical ordering of key.
      7. Reverse Ordered List Display of all records as per alphabetical ordering of key.
      8. Ordered Range Display of all records having a key value within the range Key1 to Key2, as per alphabetical ordering of key.
   8.2 Compare the time performance of the above storage formats for your application for all the above operations by measuring average case performance as a function of file size (in terms of number of records, at file sizes of 10, 100, 1000, 4000, 6000, 8000, 10,000 records). Generate data for this experimental comparison by running your program with different data sizes.
   8.3 Draw graphs for each comparison. You are encouraged to write a program for drawing your graphs. You are also encouraged to use Lagrange interpolation for drawing smooth graphs.
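The simulated pointer storage of this lecture can be sketched as a pool of records in an array, where each record holds the array index of its logical successor instead of a memory address. The sketch below is a minimal illustration under assumptions: the names sp_list, sp_insert and sp_traverse, the integer key and the pool size of 8 are hypothetical, and deletion with slot reuse is left out. Because the links are plain indices, the same layout could in principle be written to a file of fixed length records.

```c
#include <assert.h>

#define CAPACITY 8   /* assumed small pool size, for illustration */
#define NIL (-1)     /* simulated null pointer */

struct node {
    int key;
    int next;        /* index of the next record in sorted order, or NIL */
};

struct sp_list {
    struct node pool[CAPACITY];
    int head;        /* index of the record with the smallest key, or NIL */
    int used;        /* number of pool slots handed out so far */
};

void sp_init(struct sp_list *l)
{
    l->head = NIL;
    l->used = 0;
}

/* Store key in the next free slot and relink so that following the
   next indices visits keys in ascending order. No record is moved.
   Returns 0 on success, -1 if the pool is full. */
int sp_insert(struct sp_list *l, int key)
{
    int prev = NIL, cur = l->head, slot;

    if (l->used == CAPACITY)
        return -1;
    slot = l->used++;
    l->pool[slot].key = key;

    /* Find the first record whose key is not smaller than the new key. */
    while (cur != NIL && l->pool[cur].key < key) {
        prev = cur;
        cur = l->pool[cur].next;
    }
    l->pool[slot].next = cur;
    if (prev == NIL)
        l->head = slot;
    else
        l->pool[prev].next = slot;
    return 0;
}

/* Copy the keys in logical (sorted) order into out; returns the count. */
int sp_traverse(const struct sp_list *l, int *out)
{
    int n = 0, cur;

    for (cur = l->head; cur != NIL; cur = l->pool[cur].next)
        out[n++] = l->pool[cur].key;
    return n;
}
```

Inserting 30, 10, 20 leaves the records physically in arrival order (30 in slot 0, 10 in slot 1, 20 in slot 2), while traversal through the simulated pointers yields 10, 20, 30.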
9. Group Assignment : (last date 5th Nov)
   9.1 Compare the results of the above individual assignment across the different applications within every group.
   9.2 Analyze the best case, worst case and average case time performance of these storage structures for all the operations listed in the individual assignment.
10. Individual Assignment in lieu of Minor II : (last date 10th Nov, 15 marks) Implement the program for your individual design story (or part of it) with at least 4 data tanks. Use different data storage formats for different data tanks. If your story does not have at least four data tanks, enhance your story so that it has at least 4 data tanks.
11. Complete your group project as per your group's original design story (or part of it) before 15th Nov. It will have a weightage of 20 marks for each student.
12. Complete all your Lab Assignments in time. They carry a total weightage of 100 marks for each student. This does not include the marks mentioned above at 10 and 11. The total course marks inclusive of all components are 200, which includes 15 marks for Minor I, 35 marks for Major and 15 marks for Tutorials and regularity.

Lecture 24 (1 hr.) (1.11.04) (Class strength : Approx. 50 students)
1. Software Design Process : Design story → Concept map of Problem → Concept map of Solution (with simpler verbs and new data tanks, if needed) → data structures and detailed algorithms → performance analysis (using time and space complexity analysis techniques) → program → Software Testing and evaluation.
2. Usually software design is an iterative process, and the design and development team may decide to go back to an earlier stage after analysing the results of some later stage before proceeding further. For example, if performance analysis gives an impression that space utilisation or time performance is not satisfactory, then a new data structure/algorithm needs to be created before writing the program. If this also does not improve the situation, then a new solution (with a new concept map) has to be designed with a different approach.
3. Popular data structures like Stack, Queue and Deque are not used so much for modeling the problems themselves; however, they are very useful for designing solutions for a variety of problems. They are used like buffers for temporary storage of produced/arrived/accessed/seen but unprocessed data.
4. Many applications can be modeled as data producer(s), buffer(s) and server(s).
5. Usually buffers are non-empty for most of the run time because
   - often, producers are more in number than servers, or
   - they produce faster than the server can serve, or
   - the server cannot immediately process certain type(s) of data produced by the producer(s).
6. Buffers are temporary data tanks that get data only during the run time of the application. They are empty at the beginning and at the end of the application. In many algorithms an emptied buffer is a terminating condition for the algorithm.
7. Producer is that processing module which continuously/periodically inputs a single data item into the buffer for processing by the server.
8. Server is that processing module which continuously/periodically takes a single data item out of the buffer and processes it (the processing could be as simple as 'print', or a very comprehensive detailed algorithm could be executed in some applications).
9. Stack is used when the server needs to access the buffer as LIFO. Many applications require such a buffer, e.g. undo in an editor, back in internet browsers and so on.
10. Queue is used when the server needs to access the data as FIFO. Many applications require such a buffer, e.g. print server, internet server, job scheduler, process scheduler, file server, database server, Mother Dairy booth and so on.
11. Deque is used when the server needs to access the data as LIFO and also as FIFO depending on certain conditions. Few applications require such a buffer, e.g. certain special types of process schedulers.
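The producer, buffer and server roles with a FIFO buffer can be sketched as a small circular array queue. This is a minimal sketch under assumptions: the names q_put, q_get and serve_all, the capacity of 4 and the integer data items are hypothetical, and "processing" is reduced to copying each item into an output array.

```c
#include <assert.h>

#define BUF_SIZE 4   /* assumed buffer capacity, for illustration */

struct queue {
    int data[BUF_SIZE];
    int front;       /* index of the earliest inserted item */
    int count;       /* number of items currently buffered */
};

void q_init(struct queue *q)            { q->front = 0; q->count = 0; }
int  q_is_empty(const struct queue *q)  { return q->count == 0; }
int  q_is_full(const struct queue *q)   { return q->count == BUF_SIZE; }

/* Producer side: add one item after the last insertion.
   Returns 0 on success, -1 if the buffer is full. */
int q_put(struct queue *q, int item)
{
    if (q_is_full(q))
        return -1;
    q->data[(q->front + q->count) % BUF_SIZE] = item;
    q->count++;
    return 0;
}

/* Server side: remove and return the earliest inserted item.
   Precondition: the buffer is not empty. */
int q_get(struct queue *q)
{
    int item;

    assert(!q_is_empty(q));
    item = q->data[q->front];
    q->front = (q->front + 1) % BUF_SIZE;
    q->count--;
    return item;
}

/* Server loop: the emptied buffer is the terminating condition.
   Served items are copied to out in service order; returns the count. */
int serve_all(struct queue *q, int *out)
{
    int n = 0;

    while (!q_is_empty(q))
        out[n++] = q_get(q);
    return n;
}
```

Replacing q_get with a function that removes the last inserted item instead of the earliest one would turn the same buffer into a LIFO stack.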
12. Performance comparison of different storage schemes.
13. Implementing an Index for Indexed storage: simple ordered array based storage offers good search performance using Binary search, but makes insertion and deletion slow as it requires expensive movement of index records. An ordered linear linked list gives flexibility of insertion and deletion, but search will be slow as it requires linear search. We want to have both of the following:
   i. Good search time (similar to Binary search over an ordered array), and
   ii. Good insertion and deletion time (similar to a singly linked list).
14. Essentially, use a special non-linear linked structure rather than an ordered array, such that it also follows (or at least closely approximates) a search strategy like Binary search over an ordered array.
15. Binary Search Tree (BST) is a special Binary tree that facilitates such a search mechanism. It follows some constraints for positioning the nodes within the tree, i.e. values on the left side are less than the parent and values on the right side are more than the parent.
16. BST search requires no computation at run time to find the next location for comparison; instead, the two next candidate locations are pointed to by the current location (as in a linked list, in which each location points to one next location).
17. The new node insertion algorithm in a BST can be fast and very simple, i.e. it attaches newly arriving nodes at the leaf level only; the appropriate leaf has to be searched within the current BST for linking the new node as its left/right child. This simple insertion may make the tree unbalanced, resulting in a deeper tree that requires more comparisons on average than binary search through an ordered array.
18. It is possible to design a more sophisticated algorithm for insertion into a BST which keeps the BST balanced after every insertion by reorganizing part of the BST. This helps in optimizing the average number of comparisons during search.
19. More sophisticated search tree structures are used for storing index structures in real applications like DBMS and so on. They essentially follow multi-way search through multiway trees with multiple pointers (even 100 or more) rather than just 2 (left and right) pointers.
20. Indexing facilitates faster retrieval. The key to improving retrieval time often lies in designing better indexing structures. Index Structure Design has been a very active CS research area for several decades, and it continues to see new breakthroughs for specialized and newer applications.
21. Ref DS 2002 lect 4_01, 4_02_03 and DS-II 2002 lect 18.
22. Assignment : WAP to convert your linear index structure into a BST index.

Lecture 25 (1 hr.) (9.11.04) (Class strength : Approx. 50 students)
1. Software Design Process : Design story → Concept map of Problem → Concept map of Solution (with simpler verbs and new data tanks, if needed) → data structures and detailed algorithms → performance analysis (using time and space complexity analysis techniques) → program → Software Testing and evaluation.
2. Data Tanks will be
   i. Problem specific (usually persistent), usually stored on well formatted files.
   ii. Solution specific (often temporary):
      a. Ancillary data tanks for the problem specific primary data tanks, i.e. index structures. Index structures can be RAM based or can also be on file (if the index is also large). Index structures are also often persistent and are stored on files. At run time the index (or part of it) is loaded into RAM. An index can be stored as a sorted list, a hashed list or, more often, as some kind of search tree. BST is the simplest form of such a search tree. There are more sophisticated single as well as multi-dimensional index structures. Refer 2002 DS II Lect 18.
      b. Buffer data tanks, e.g. stack, queue, deque and so on, for temporary buffering of data.
3. Detailed process for the 3rd stage, i.e. Concept map of Solution (with simpler verbs and new data tanks, if needed) → data structures and detailed algorithms :
   i. Detailed modeling of each data tank with examples and identification of fields.
   ii. Structural abstraction of each Data Tank → Abstract Data Type (ADT).
   iii. ADT specification : public view of the ADT in terms of the list of permitted and required operations for construction, destruction, manipulation and access. Have sufficient access functions to test the preconditions and post conditions of the other operations.
   iv. Interface design for each operation in terms of function name, parameter list and their description.
   v. Pre-conditions (in terms of access functions) for each operation. Refer 2002 DS lect 3_01 to 3_03.
   vi. Post conditions (in terms of access functions) for each operation.
   vii. Case specific Data Structure for each ADT.
   viii. Algorithm for each operation → sub program (function) for each operation.
4. Array storage, single and multi-dimensional arrays : (refer 2002 DS lect 3_04 and 3_05)
   i. Single dimensional array : space overhead (starting location, array length).
   ii. Multi dimensional array storage :
      a. Row major : requires a single large contiguous chunk of memory. Used by C, C++. More memory overhead per chunk. The overhead does not depend on the size; it only depends on the number of dimensions.
      b. Array of arrays : stores a multidimensional array as a hierarchically organized collection of several single dimensional arrays (e.g. array of planes within a 3d matrix, array of rows within a plane, array of elements within a row). Requires multiple small contiguous chunks of memory. Used by Java. Less overhead for each such chunk, but more overhead overall as there are several such chunks. Hence, the overhead depends on the size and also on the number of dimensions.

Lecture 26 (2 hr.) (11.11.04) (Class strength : 21 students)
1. Implementation and performance analysis of ADT Matrix for diagonal, lower triangular, upper triangular, tridiagonal and sparse matrices using the following realization strategies :
   i. Amorphous, semi amorphous
   ii. Direct addressing (formula based addressing in a single dimensional array)
   iii. Linked
   iv. Indexed
   v. Hash table : open hashing, bucket hashing, closed hashing
   vi. Simulated pointer
   Refer 2002 DS lect 3_05 to 3_14.
2. Implementing sparse matrices with orthogonal lists.
3. Implementation and performance analysis of ADT Stack using the following realization strategies :
   i. Amorphous, semi amorphous
   ii. Direct addressing (formula based addressing in a single dimensional array)
   iii. Linked
   iv. Indexed
   v. Simulated pointer
   Refer 2002 DS lect 3_15 and 3_19.
4. Implementing 2 stacks in a single array. Refer 2002 DS lect 3_16.
5. Implementation and performance analysis of ADT Queue using the following realization strategies :
   i. Amorphous, semi amorphous
   ii. Direct addressing (formula based addressing in a single dimensional array)
   iii. Linked
   iv. Indexed
   v. Simulated pointer
   Refer 2002 DS lect 3_21 to 3_24.
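One common way of implementing 2 stacks in a single array is to let them grow towards each other from the two ends, so that neither overflows until the whole array is used. The sketch below assumes that layout; the names two_stacks, ts_push and ts_pop and the array size of 6 are hypothetical, and the referenced 2002 lecture may present the idea differently.

```c
#include <assert.h>

#define ARRAY_SIZE 6   /* assumed shared capacity, for illustration */

struct two_stacks {
    int data[ARRAY_SIZE];
    int top1;   /* top of stack 1; -1 when empty; grows upward */
    int top2;   /* top of stack 2; ARRAY_SIZE when empty; grows downward */
};

void ts_init(struct two_stacks *s)
{
    s->top1 = -1;
    s->top2 = ARRAY_SIZE;
}

/* The array is full when the two tops are adjacent. */
int ts_is_full(const struct two_stacks *s)
{
    return s->top1 + 1 == s->top2;
}

/* which selects the stack: 1 for stack 1, any other value for stack 2. */
int ts_is_empty(const struct two_stacks *s, int which)
{
    return which == 1 ? s->top1 == -1 : s->top2 == ARRAY_SIZE;
}

/* Returns 0 on success, -1 if no free slot is left in the shared array. */
int ts_push(struct two_stacks *s, int which, int item)
{
    if (ts_is_full(s))
        return -1;
    if (which == 1)
        s->data[++s->top1] = item;
    else
        s->data[--s->top2] = item;
    return 0;
}

/* Precondition: the selected stack is not empty. */
int ts_pop(struct two_stacks *s, int which)
{
    assert(!ts_is_empty(s, which));
    return which == 1 ? s->data[s->top1--] : s->data[s->top2++];
}
```

With this layout a push fails only when both stacks together hold ARRAY_SIZE items, whereas splitting the array into two fixed halves would reject a push as soon as one half is full.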
6. Implementation and performance analysis of ADT 'Double Ended Queue' using the following realization strategies :
   i. Amorphous, semi amorphous
   ii. Direct addressing (formula based addressing in a single dimensional array)
   iii. Linked
   iv. Indexed
   v. Simulated pointer
   Refer 2002 DS lect 3_31.
7. Evaluating the buffer requirements (matrix, stack, queue, deque and so on) depending on the application. Examples of print server, parenthesis matching, palindrome checking, infix to postfix. Refer 2002 DS lect 3_16 to 3_20 and 3_31.
8. Assignment : WAP to implement two stacks in a single array.
9. Assignment : WAP to implement a deque using indexed and linked implementations.

Lecture 27 (1 hr.) (16.11.04) (Class strength : 40 students)
1. Design and performance analysis of ADT 'Binary Tree' using the following realization strategies :
   a. Amorphous, semi amorphous
   b. Direct addressing (formula based addressing in a single dimensional array)
   c. Linked
   d. Indexed
   e. Simulated pointer
   Refer 2002 DS lect 4_01 and 4_02.
2. BST. Refer 2002 DS II lect 18.
3. Non-evaluative Assignment : WAP to implement a binary tree using Indexed and Hashed storage.

Lecture 28 (3 hr.) (18.11.04) (Class strength : 22 students)
1. n-ary tree representation, forest of n-ary trees representation, operations on binary trees, operations on forests, Expression Tree.
2. Modeling 2d/3d/nd spaces connected through neighbours/relationships/.. as n-ary trees.
3. Tree traversals : pre-order, in-order, post-order, level order, non recursive pre-order, non recursive in-order, non recursive post-order. Refer 2002 DS Lect 4_01 to 4_02_03.
4. Tower of Hanoi. Refer 2002 DS Lect 3_18.
5. Rat in the Maze, depth first, breadth first, shortest path, recursion tree. Refer 2002 DS Lect 3_18 and Lect 3_25.
6. Binary and n-ary tree traversals using recursion, stack or queue.
7. Designing simulations as a producer(s)-server(s)-buffer(s) model. Refer 2002 DS Lect 3_26 to Lect 3_29.
8. More sorting algorithms using advanced techniques : radix sort, merge sort, non recursive merge sort, shell sort and quick sort. Refer 2002 DS II Lect 6 to Lect 12.
9. Non-evaluative Assignment : Send me an email about your learning experiences in this course. I would also like to know about the learning that happened by doing the individual and group assignments. What are the most important things that you have learnt? What was missing? And so on… (sanjay.goel@jiit.ac.in).

Best of luck!!!!!!!!
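The simple BST index discussed in Lecture 24, where a new node is always attached at the leaf level, can be sketched as follows. This is a minimal sketch under assumptions: the names bst_insert, bst_search and bst_free and the integer keys are hypothetical, duplicates are ignored, and no balancing is attempted. The search returns the number of nodes visited so that the effect of an unbalanced tree can be observed.

```c
#include <assert.h>
#include <stdlib.h>

struct bst_node {
    int key;
    struct bst_node *left, *right;
};

/* Walk down from the root to the empty link where key belongs and attach
   a new leaf there. Duplicate keys are ignored. Returns the root, which
   changes only when the tree was empty. */
struct bst_node *bst_insert(struct bst_node *root, int key)
{
    struct bst_node **link = &root;

    while (*link != NULL) {
        if (key == (*link)->key)
            return root;
        link = key < (*link)->key ? &(*link)->left : &(*link)->right;
    }
    *link = malloc(sizeof **link);
    if (*link != NULL) {
        (*link)->key = key;
        (*link)->left = (*link)->right = NULL;
    }
    return root;
}

/* Follow left/right pointers until key is found or a NULL link is reached.
   Sets *found to 1 or 0 and returns the number of nodes visited. */
int bst_search(const struct bst_node *root, int key, int *found)
{
    int visited = 0;

    *found = 0;
    while (root != NULL) {
        visited++;
        if (key == root->key) {
            *found = 1;
            break;
        }
        root = key < root->key ? root->left : root->right;
    }
    return visited;
}

/* Release all nodes (post-order). */
void bst_free(struct bst_node *root)
{
    if (root != NULL) {
        bst_free(root->left);
        bst_free(root->right);
        free(root);
    }
}
```

Inserting 50, 30, 70, 20, 40 gives a tree in which 40 is found after visiting 3 nodes. Inserting 1, 2, 3, 4, 5 in that order degenerates into a chain, and finding 5 visits all 5 nodes, which is the unbalanced case mentioned in Lecture 24.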
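For the row major storage of Lecture 25, the position of an element inside the single contiguous chunk follows from a formula over the dimension sizes. The sketch below assumes a three dimensional array; the function name row_major_offset and its parameter names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Offset, counted in elements, of element [i][j][k] in a row major array
   whose second and third dimensions have sizes d2 and d3. The size of the
   first dimension is not needed by the formula. */
size_t row_major_offset(size_t i, size_t j, size_t k, size_t d2, size_t d3)
{
    assert(j < d2 && k < d3);   /* indices must lie within their dimensions */
    return (i * d2 + j) * d3 + k;
}
```

For an array declared in C as int a[2][3][4], element a[1][2][3] lies (1*3 + 2)*4 + 3 = 23 elements after a[0][0][0], which agrees with the layout the C compiler itself uses.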
