# Data Structures 2005

Lecture Notes on Data Structures by Sanjay Goel, July-Dec 2005, JIIT, Noida

Published in: Education, Technology, Business
### Data Structures 2005

Lecture Notes, Sanjay Goel, DS, 2005

Lecture Notes: Data Structures 2005

1. 22 July (1 hr.)
    1. Maintain the enthusiasm and a notebook for this course. The notebook will carry some marks.
    2. Discussion about course objectives.
    3. What is problem solving?
        - Recollection and narration of personal experiences of problem solving.
        - Identification of characteristics of real problems and solutions.
        - Some identified characteristics of real problems:
            i. There are some missing resources or knowledge.
            ii. They are difficult, and often do not have straight answers.
            iii. They are unexpected.
            iv. Often people are involved.
        - Some identified characteristics of solutions of real problems:
            i. Many alternate correct solutions are possible.
            ii. They require information/knowledge acquisition during problem solving.
            iii. They require a high level of motivation.
            iv. Real-life problems are solvable.
            v. Often help from others is needed.
    4. Knowledge is constructed during problem-solving experiences.
    5. The best strategy to acquire knowledge is to solve problems.
    6. Course strategy: Confront and challenge students with complex problems.
    7. Assignment: Write about 5 incidents of real-life problems faced by you; at least two of the five should be from academic experiences. How did you resolve these problems? Identify some common characteristics of these problems and your solutions. Discuss your analysis in a group of three and prepare a set of properties to characterize problems and solutions.
2. 26 July (1 hr.)
    1. Group-wise summarization (1 mark for all participants) of:
        - Common characteristics of problems
        - Common characteristics of solutions
    2. Solution generation begins by describing the problem clearly and precisely.
    3. Instruments for bringing clarity and precision to our expressions:
        - Give examples and analogies.
        - Give measurements.
        - Give references.
        - Avoid jargon.
        - Avoid unnecessary descriptions and details.
        - Use graphical symbols.
        - Use mathematical symbols and expressions.
    4. We often have to learn, adapt and also invent new symbols for clear and precise expression. All engineering disciplines, including CSE and software engineering, have discipline-specific notations. It is a formal requirement to use these notations in technical communications.
    5. Group Assignment (3 students): Pick any software, machine or other such artifact and describe its functionality in a clear and precise manner.
    6. Programming Assignment: Practice pointers, files and structures. Request your lab instructors and tutors to give you simple assignments for this practice. This is a non-evaluative exercise, but you must do it to develop confidence.
3. 28 July (1 hr.)
    1. Counterexamples help in bringing precision.
    2. Test the clarity and precision of another group's description of the functionality of some software/machine.
    3. Prepare a list of the characteristics of the gaps.
    4. Assignment: Take the two best computer programs written by you so far and describe both in a clear and precise manner without using the syntax of a programming language. Make a group of ten students and develop a common checklist and style
    4. Format for developing a Test Plan. Be extra careful about planning to test the program at the corners and edges.
    5. Corners and edges of a program:
        i. Loop entry and exits.
        ii. Worst-case inputs.
        iii. Function calls.
        iv. Return from functions.
    6. Develop the program to pass testing without knowing the test plan, and under an unfriendly testing process.
    7. Assignment (2 marks): Write a program (WAP) to calculate the factorial of any positive integer up to 9999. Continue your earlier assignment.
7. 6 August (1 hr.)
    1. Because of no progress on the last assignment by any student, all students get a (-1) mark.
    2. Individual Assignment (3 marks): Use this training in clearly and precisely describing problems and solutions to propose a computational system in the assigned domain. The last digit of your enrollment number determines the domain as follows:
        - 0: Polynomial
        - 1: Genealogical chart
        - 2: Time table
        - 3: Periodic table
        - 4: Maps
        - 5: Analog circuits
        - 6: Digital circuits
        - 7: Musical compositions/Sketches
        - 8: Dictionary and thesaurus
        - 9: Newspaper

        If you want to work with some other domain outside the above list, you can. However, you are not permitted to work on a domain already assigned to other students. You have to propose the functional requirements of software that stores (and also creates) the mentioned object(s) in soft form and allows users to do useful operations with the stored soft form of your object. Identify these operations, their effects, their requirements and the method of activating them. Propose the data structure for storing the main objects for your software. If you sincerely complete all the assignments mentioned in the lecture notes till this date within one week, and show good performance, the negative marks assigned to you till this date will be removed.
8. 9 August (1 hr.)
    1. Some common operations that computers are used for with any object:
        - Create/Store
        - Search/Retrieve/Pattern Matching/Traversal
        - Process/Modify
        - Sort/Arrange/Rearrange
        - Generate/Scheduling
        - Render
    2. Group workout on the last assignment.
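The factorial assignment above (any positive integer up to 9999) cannot be solved with C's built-in integer types, because the result runs to tens of thousands of decimal digits. One possible approach, sketched here under assumptions, stores the number as an array of decimal digits and multiplies it digit by digit. The function name `big_factorial` and the buffer bound `MAX_DIGITS` are illustrative choices, not part of the original notes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DIGITS 40000  /* assumed bound: large enough for 9999! */

/* Writes n! into out as a decimal string.
   Returns the number of digits, or -1 if a buffer is too small.
   Digits are held least-significant first while multiplying. */
int big_factorial(int n, char *out, size_t out_size)
{
    static unsigned char digits[MAX_DIGITS];
    int len = 1;

    assert(n >= 0 && n <= 9999);
    memset(digits, 0, sizeof digits);
    digits[0] = 1;                      /* 0! = 1! = 1 */

    for (int k = 2; k <= n; k++) {
        int carry = 0;
        /* Multiply every stored digit by k, propagating the carry. */
        for (int i = 0; i < len; i++) {
            int prod = digits[i] * k + carry;
            digits[i] = (unsigned char)(prod % 10);
            carry = prod / 10;
        }
        /* Append whatever carry is left as new high-order digits. */
        while (carry > 0) {
            if (len >= MAX_DIGITS) return -1;
            digits[len++] = (unsigned char)(carry % 10);
            carry /= 10;
        }
    }

    if ((size_t)len + 1 > out_size) return -1;
    for (int i = 0; i < len; i++)       /* reverse into printable order */
        out[i] = (char)('0' + digits[len - 1 - i]);
    out[len] = '\0';
    return len;
}
```

In line with the notes on corners and edges, n = 0, n = 1 and n = 9999 are natural test inputs; for the last one the caller needs an output buffer of at least `MAX_DIGITS + 1` bytes.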
9. 11.8.05
    1. 1st version of Conceptual Design of Data Storage: case study of the Genealogical chart by Abhinav (B9).
    2. Issues: memory consumption and others.
    3. 2nd version of Conceptual Design of Data Storage: usage of codes and a lookup table to decrease the memory requirement.
10. 13.8.05
    1. 1st version of Physical Design of Data Storage for the Genealogical chart using facilities offered by the C language.
    2. 3rd version of Conceptual Design to address the constraints of the physical design facilities.
    3. 2nd version of Physical Design (Table as an array of records; Graph as a matrix, or Graph as an array of records [i, j, type of relationship]).
    4. More versions of Conceptual Design and Physical Design to reduce memory space requirements.
    5. Algorithms for information retrieval (explicitly stored): requires accessing many data tanks.
    6. Generating possibilities for retrieving information that is not stored explicitly and requires some processing: Algorithms.
    7. Generating possibilities for automated updating of information on some events: Algorithms.
    8. Assignment: Apply this cyclic process for incremental improvement of the design already assigned to you.
    9. Assignment [4 marks]: Write a C program to store up to 50000 persons and their relationships. Enter the data for 50. WAP to address the following queries:
        i. How is the ith person related to the jth person?
        ii. To how many persons is the ith person related?
        iii. Who are the common persons somehow related to both the ith and jth persons?
        iv. Who are the common persons somehow related to all of the ith, jth and kth persons?
    10. Assignment [2 marks]: Think about how two different family charts can be merged and automatically updated in the event of a marriage between persons belonging to different family charts.
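The "codes plus lookup table" idea and the matrix form of the graph can be sketched in C as follows. This is a minimal illustration under assumptions, not the design developed in class: the relationship codes, the names `relate`, `how_related`, `count_related` and `common_related`, and the small `MAX_PERSONS` are all hypothetical, and the queries consider direct relationships only rather than "somehow related" chains.

```c
#include <assert.h>

#define MAX_PERSONS 50  /* demo size; an N x N byte matrix for 50000 persons would need gigabytes */

/* Hypothetical relationship codes. Each name is stored once in a
   lookup table, so a matrix cell costs one byte instead of a string. */
enum { REL_NONE = 0, REL_PARENT, REL_CHILD, REL_SPOUSE, REL_SIBLING, REL_COUNT };
static const char *rel_name[REL_COUNT] = { "none", "parent", "child", "spouse", "sibling" };

/* rel[i][j] == c means "person i is the <c> of person j". */
static unsigned char rel[MAX_PERSONS][MAX_PERSONS];

static unsigned char inverse_of(unsigned char code)
{
    if (code == REL_PARENT) return REL_CHILD;
    if (code == REL_CHILD)  return REL_PARENT;
    return code;                        /* spouse, sibling, none are symmetric */
}

/* Records a relationship and its inverse in one step. */
void relate(int i, int j, unsigned char code)
{
    assert(i >= 0 && i < MAX_PERSONS && j >= 0 && j < MAX_PERSONS && i != j);
    assert(code < REL_COUNT);
    rel[i][j] = code;
    rel[j][i] = inverse_of(code);
}

/* Query i: how is person i related to person j? */
const char *how_related(int i, int j)
{
    return rel_name[rel[i][j]];
}

/* Query ii: to how many persons is person i directly related? */
int count_related(int i)
{
    int n = 0;
    for (int j = 0; j < MAX_PERSONS; j++)
        if (rel[i][j] != REL_NONE) n++;
    return n;
}

/* Query iii: persons directly related to both i and j; returns the count. */
int common_related(int i, int j, int out[MAX_PERSONS])
{
    int n = 0;
    for (int k = 0; k < MAX_PERSONS; k++)
        if (k != i && k != j && rel[k][i] != REL_NONE && rel[k][j] != REL_NONE)
            out[n++] = k;
    return n;
}
```

Every query is a scan of one or two rows, so it costs time proportional to the number of persons, while the matrix itself grows with the square of that number. That quadratic growth is the memory issue that motivates the later design versions.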
15. 25.08.05
    1. More discussion on recursion and its analysis.
    2. Assignment [2 marks]: Write and analyze (tabular analysis) recursive programs for:
        a. Finding the sum of the elements of an array.
        b. Finding the count of the nodes in a linked list.
        c. Finding the maximum number and its index in an array.
        d. Finding the maximum number in a linked list.
16. 30.08.05
    1. More discussion on recursion and its tabular analysis.
    2. Comparison of recursive and non-recursive solutions: a priori performance estimation in terms of computation time and memory requirement.
    3. Analysis of call-return statements: control transfer, memory demand/release.
    4. Identification of data structures for a given application. A design team has conceived the following initial specifications of a search engine for a large company's internal Digital Library:
        - Only specially authorized users can upload new documents or new versions of old documents.
        - All employees can look at the documents.
        - The information systems department will create, update and maintain a list of keywords for a faster search facility.
        - Search engine users can also search by entering any word through the keyboard.
        - Searched documents are to be listed as follows:
            - Case A, faster search on a listed keyword: by frequency of occurrence of the word, i.e. documents having a higher "density" of the chosen keyword are listed before documents having a lower density, where density[k, d] = (occurrence count of the word k in d) / (word count in d).
            - Case B, search by entering a word through the keyboard: by frequency of usage of a document, where usage is defined as the number of times a document is opened by users through the search engine.
    5. Assignment (group of three, 4 marks each student): Design the DS and a program for the above application.
17. 01.09.05
    1. Review of Genealogical database storage.
    2. Array of persons, array of relationship codes, matrix of interpersonal relationships.
    3. Queries for which the algorithms were discussed:
        i. What is the relationship between x and y?
        ii. Who all are related to x?
        iii. Who all are related to x and how?
        iv. Who all are related to x as well as y?
        v. Who all are related to x or y?
        vi. Who is at the root of the family tree?
        vii. Find the tree position (e.g. 1 (root), 11, 12, 111, 112, 1111, …) for all persons in the person array by processing the relationship matrix.
        viii. Draw the family tree using the stored tree positions for all persons in the database, as computed by the above query.
    4. Assignment: WAP for processing all the above queries. Also WAP for listing the members of the last generation of a family. Generate more queries.
18. 03.09.05
    1. Review of Genealogical database storage: person_i [relationship code for person_1, relationship code for person_2, …, relationship code for person_N]
    2. The matrix of interpersonal relationships consumes a lot of space and hence is not sufficiently scalable. What is the solution if N is large?
    3. Several options:
        a. Store only non-null relationships in a limited, same-sized array for each person: person_i [(person_j, relationship code), (person_k, relationship code), …]; store only n elements in this array, n << N.
    4. Algorithm design and analysis (time and space) for some of the queries discussed in the last class, with this modified data structure:
        i. What is the relationship between x and y?
        ii. Who all are related to x?
        iii. Who all are related to x and how?
        iv. Who all are related to x as well as y?
        v. Who all are related to x, y, and z?
    5. Assignment (group of three, 1 mark for each student): WAP for processing all the above queries with the modified DS for the family database. Generate more queries.
    6. Other possible applications of the relationship matrix (relationships between homogeneous objects) as a data storage structure:
        i. Inter-city railway/bus/air route availability
        ii. Inter-city fares
        iii. Inter-chemical affinities
        iv. Inter-city railway/bus/air time table
        v. Inter-state border sharing
        vi. Inter-currency exchange rates
        vii. Inter-team tournament schedule
        viii. Inter-country business
    7. Assignment: Generate problems for two of the above or other similar applications.
19. 06.09.05
    1. Review of inter-homogeneous object relationship matrix storage and query algorithms.
    2. More contexts for the relationship matrix.
    3. Inter-city distance matrix, example queries:
        i. What is the distance between x and y?
        ii. Which cities are directly connected to x?
        iii. Which cities are directly connected to x as well as y?
        iv. Which cities are directly connected to x, y, and z?
    4. Design algorithms for a new query: for two cities i and j that are not directly connected, find the city k that connects them with the minimum total distance between i and j, if they can be connected through one in-between city.
    5. Usage of temporary buffers for designing algorithms. Simple buffer: array (for storing partially processed data).
    6. Limitations of an array of arrays of structures: limited nodes, limited scalability.
    7. Consider an array of linked lists of structures for enhancing scalability, with a variable-length linked list for each person.
    8. Redesign of the algorithms for all queries with the further modified DS.
    9. Key tradeoff issues: scalability, memory size and execution time.
    10. Consider the option of a linked list of linked lists of structures. Compare this DS with the earlier two structures.
    11. Assignment: Generate three more problems for more contexts of inter-homogeneous object relationships and propose their storage and algorithmic solutions.
    12. Are there more alternate storage possibilities? What if pointers are not available or are not to be used for some reason? What if the objects are not homogeneous but belong to two different categories?
20. 13.09.05
    1. Using the data structures defined below, the database has been populated with the data as shown in TABLE 1.

            struct data {
                char info;
                int index1;
                int index2;
            };
            struct data database[12];

        | info | index1 | index2 |
        |------|--------|--------|
        | Y    | 1      | 11     |
        | X    | 7      | 10     |
        | A    | 6      | -1     |
        | M    | 8      | -1     |
        | F    | -1     | 9      |
        | N    | 4      | -1     |
        | S    | -1     | -1     |
        | C    | 3      | 5      |
        | G    | -1     | -1     |
        | E    | -1     | -1     |
        | W    | 0      | 2      |
        | D    | -1     | -1     |

        TABLE 1

            What_is_it (int index)
            {
                if index <> -1
                {
                    What_is_it (database[index].index2);
                    What_is_it (database[index].index1);
                    Output database[index].info;
                }
            }

        Analyze the above recursive function using the tabular analysis technique and draw the recursion analysis table. Also show the output if the "What_is_it" function is initially called with index = 0. Recursive algorithms carry a risk of hidden infinite recursion for some specific cases of data, as in the above case.
    2. A database stores information about hyperlinks across Websites. There are 10 Websites, namely A, B, C, … J, each having links to others as shown in TABLE 2. A 'y' indicates the presence of a hyperlink. For example, there is a hyperlink from Website 'A' to Website 'E' and not vice versa.

        A B C D E F G H I J A y y B y y y C y y y D y y E y y F y y G y y y H y y I y y J y y

        TABLE 2

        QUERY: If it is possible to move from the xth Website to the yth Website with up to two intermediate Websites in between them, then display "link exists between xth and yth Websites" and also give the sequence of names of the intermediate Websites that are visited while moving from the xth to the yth Website; otherwise display "No links Possible". Propose the data structures and an appropriate algorithm for processing the above query. Also demonstrate the working of your algorithm by simulating the key steps of the algorithm for two cases, e.g. A E C B.
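One possible answer to the hyperlink query above stores the links in a 10 x 10 adjacency matrix and tries routes in order of length: direct, then one intermediate Website, then two. The sketch below is an illustration under assumptions, not a model solution from the course. The names `link_to`, `add_link`, `find_route` and `report` are hypothetical, and the matrix starts empty because the actual 'y' positions of TABLE 2 have to be entered by the caller.

```c
#include <assert.h>
#include <stdio.h>

#define SITES 10                         /* Websites A..J map to indices 0..9 */

/* link_to[x][y] == 1 means there is a hyperlink from site x to site y. */
static unsigned char link_to[SITES][SITES];

void add_link(char from, char to)
{
    assert(from >= 'A' && from <= 'J' && to >= 'A' && to <= 'J');
    link_to[from - 'A'][to - 'A'] = 1;
}

/* Returns the number of intermediate sites on the shortest route found
   (0, 1 or 2) and stores them in mids, or returns -1 if no route with
   at most two intermediates exists. */
int find_route(int x, int y, int mids[2])
{
    if (link_to[x][y]) return 0;                         /* direct link */

    for (int a = 0; a < SITES; a++)                      /* x -> a -> y */
        if (a != x && a != y && link_to[x][a] && link_to[a][y]) {
            mids[0] = a;
            return 1;
        }

    for (int a = 0; a < SITES; a++) {                    /* x -> a -> b -> y */
        if (a == x || a == y || !link_to[x][a]) continue;
        for (int b = 0; b < SITES; b++)
            if (b != x && b != y && b != a && link_to[a][b] && link_to[b][y]) {
                mids[0] = a;
                mids[1] = b;
                return 2;
            }
    }
    return -1;
}

/* Prints the answer in the wording required by the query. */
void report(char x, char y)
{
    int mids[2];
    int n = find_route(x - 'A', y - 'A', mids);
    if (n < 0) {
        printf("No links Possible\n");
        return;
    }
    printf("link exists between %c and %c Websites, via:", x, y);
    for (int i = 0; i < n; i++)
        printf(" %c", 'A' + mids[i]);
    printf("\n");
}
```

As a usage illustration, entering the hypothetical links A to E, E to C and C to B (only A to E is confirmed by the question text) makes `find_route` return two intermediates, E and C, for the pair A, B. The two-intermediate search costs at most about 100 matrix lookups for 10 sites, so no extra buffer or recursion is needed at this size.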