CS215 - Lec 9 indexing and reclaiming space in files



  1. Maintain indexes. Adding a data record with indexing. Deleting a data record with indexing. Reclaiming space. Multilevel index. Dr. Hussien M. Sharaf
  2. Structure of indexes: Indexes must be sorted in ascending or descending order with respect to one or more fields.
       CompanyName   Offset
       Google        211     (Record 1)
       IBM           0       (Record 2)
       ITE           643     (Record 3)
       Microsoft     462     (Record 4)
       Apple Mac     985     (new record)
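A minimal sketch of why the sorted order matters, using the company names and offsets from the table above. It assumes the index is held as a Python list of (key, offset) tuples; the function names `find_offset` and `insert_entry` are illustrative, not from the slides.

```python
import bisect

# Index entries from the slide's example, kept sorted by key: (key, offset).
index = [("Google", 211), ("IBM", 0), ("ITE", 643), ("Microsoft", 462)]

def find_offset(index, key):
    """Return the byte offset stored for key, or None if the key is absent."""
    keys = [k for k, _ in index]           # keys are already in ascending order
    i = bisect.bisect_left(keys, key)      # binary search for the key
    if i < len(index) and index[i][0] == key:
        return index[i][1]
    return None

def insert_entry(index, key, offset):
    """Insert (key, offset) so that the list stays sorted by key."""
    bisect.insort(index, (key, offset))
```

Because the list is sorted, a lookup is a binary search rather than a scan, and the new "Apple Mac" entry lands at the front of the list instead of at the end.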
  3. Operations needed for an index:
       1. Create the index in memory by looping over all records in the original data file.
       2. If an index file already exists, load it into memory before using it.
       3. Write the index to a file when the program closes.
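The three operations above can be sketched as follows. The slides do not specify a file format, so this sketch assumes one record per line in the form `key|payload` for the data file and one `key<TAB>offset` line per entry for the index file; all function names are hypothetical.

```python
import os

def build_index(data_path):
    """Scan every record in the data file; return a sorted list of (key, offset)."""
    index = []
    with open(data_path, "rb") as f:
        while True:
            offset = f.tell()              # byte offset where this record starts
            line = f.readline()
            if not line:                   # end of file
                break
            key = line.decode("utf-8").split("|", 1)[0]
            index.append((key, offset))
    index.sort()
    return index

def save_index(index, index_path):
    """Write the in-memory index to a file (done when the program closes)."""
    with open(index_path, "w", encoding="utf-8") as f:
        for key, offset in index:
            f.write(f"{key}\t{offset}\n")

def load_index(index_path):
    """Read an index file written by save_index back into memory."""
    with open(index_path, encoding="utf-8") as f:
        pairs = (line.rstrip("\n").split("\t") for line in f)
        return [(key, int(offset)) for key, offset in pairs]

def open_index(data_path, index_path):
    """Load the index file if it exists; otherwise build it from the data file."""
    if os.path.exists(index_path):
        return load_index(index_path)
    return build_index(data_path)
```

Offsets are taken in binary mode so that they are true byte positions that can later be passed to `seek`.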
  4. Once the index is loaded into memory, the following operations are needed:
       1. Add: add the data record to the data file and insert an index record at the correct position.
       2. Delete: mark the record in the data file as deleted and delete the related record from the index.
       3. Deleting and updating data records requires updating the offsets of all index records. Is the same true when adding a data record?
  5. Data records:    R1 R2 R3 R4 R5
     Index on Name:   R4 R3 R2 R5 R1
     Index on Phone:  R2 R3 R1 R4 R5
  6. Data records on disk:  R1 R2 R3 R4 R5
     Name index in RAM:     R4 R3 R2 R5 R1
     Phone index in RAM:    R2 R3 R1 R4 R5
     The slide also shows R6 three times, presumably the new record being added to the data file and to each index.
  7. Adding a data record with indexing:
       1. Go to the end of the data file and get the current offset.
       2. Append the data record to the end of the data file.
       3. Build an index entry from the offset and key of the new data record: (offset, key).
       4. Insert the new index entry at its correct position in the sorted index list.
       5. At the end of the program, save the index list to disk.
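Steps 1 to 4 above can be sketched as one function. This assumes the same hypothetical `key|payload` line format for records and stores each index entry as (key, offset) rather than (offset, key) so that the list sorts by key; step 5, saving the index at program exit, is left out.

```python
import bisect
import os

def add_record(data_path, index, key, payload):
    """Append a record to the data file and insert its index entry in sorted order."""
    record = f"{key}|{payload}\n".encode("utf-8")
    with open(data_path, "ab") as f:
        f.seek(0, os.SEEK_END)             # step 1: go to the end of the file
        offset = f.tell()                  #         and get the current offset
        f.write(record)                    # step 2: append the data record
    bisect.insort(index, (key, offset))    # steps 3-4: build entry, insert sorted
    return offset
```

Appending never moves existing records, so the offsets already stored in the index stay valid after an add.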
  8. Deleting a data record with indexing:
       1. Search for the index entry by comparing the target value with the key field value.
       2. Mark the index entry as deleted.
       3. Get the offset of the target data record.
       4. Seek to the target offset and mark the data record as deleted.
     NOTE: The data record is not actually deleted immediately. A space-reclaiming function must be run later.
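A minimal sketch of the deletion steps above. The slides do not say how a record is marked; this sketch assumes the first byte of the record is overwritten with `*`, which only works if no live key starts with that character. As a simplification, it removes the entry from the in-memory index list instead of flagging it.

```python
import bisect

DELETED = b"*"   # assumed marker: overwrites the first byte of a deleted record

def delete_record(data_path, index, key):
    """Mark the record for key as deleted; return True if the key was found."""
    keys = [k for k, _ in index]
    i = bisect.bisect_left(keys, key)      # step 1: search the index by key
    if i == len(index) or index[i][0] != key:
        return False
    _, offset = index.pop(i)               # steps 2-3: drop the entry, keep its offset
    with open(data_path, "r+b") as f:
        f.seek(offset)                     # step 4: seek to the data record
        f.write(DELETED)                   #         and overwrite its first byte
    return True
```

The file keeps its size after this call; the marked record still occupies space until the file is compacted.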
  9. Data records on disk:  R1 R2 R3 R4 R5
     Name index in RAM:     R4 R6 R2 R5 R1
     Phone index in RAM:    R2 R6 R1 R4 R5
     The slide also shows the labels R6, R3, R3 outside these rows.
  10. Reclaiming space:
       A. Create a new file stream.
       B. While not at the end of the records:
            1. Read a collection of records into a buffer.
            2. For each record in the buffer: if the record is marked deleted, go to the next record; otherwise copy the record to the new file stream.
       C. End While.
       D. Rebuild all indexes based on the new data file.
      NOTE: Buffering is used while copying data to the new stream.
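Steps A to D above can be sketched as follows, under the same assumptions as before: one `key|payload` record per line, and a leading `*` marking a deleted record. The temporary file name and the `buffer_bytes` parameter are illustrative choices, not part of the slides.

```python
import os

DELETED = b"*"   # assumed marker in the first byte of a deleted record

def reclaim_space(data_path, buffer_bytes=64 * 1024):
    """Copy live records to a new file, swap it in, and return a rebuilt index."""
    tmp_path = data_path + ".compact"
    with open(data_path, "rb") as src, open(tmp_path, "wb") as dst:   # A
        while True:                                                   # B
            batch = src.readlines(buffer_bytes)   # B.1: read a batch of records
            if not batch:
                break
            for record in batch:                  # B.2: skip deleted, copy the rest
                if record.startswith(DELETED):
                    continue
                dst.write(record)
    os.replace(tmp_path, data_path)               # the new file replaces the old one

    index = []                                    # D: rebuild the index
    offset = 0
    with open(data_path, "rb") as f:
        for record in f:
            key = record.decode("utf-8").split("|", 1)[0]
            index.append((key, offset))
            offset += len(record)
    index.sort()
    return index
```

The index has to be rebuilt because every record that followed a deleted one now starts at a smaller offset.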
  11. When an index gets too big to be stored in RAM, it must be stored in a file, and another level of index that can be loaded into memory is required on top of it. Hence we need multilevel indexing.
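A two-level version of this idea can be sketched as below. It assumes the full index is written to a file in fixed-size blocks of `key<TAB>offset` lines, and that only a small top level, one (first key, block offset, block length) entry per block, is kept in memory. The block size and function names are hypothetical; further levels would repeat the same construction over the top level.

```python
import bisect

def write_index_blocks(index, index_path, entries_per_block=2):
    """Write the sorted full index to a file in blocks; return the top-level index."""
    top = []
    with open(index_path, "wb") as f:
        for start in range(0, len(index), entries_per_block):
            block = index[start:start + entries_per_block]
            data = "".join(f"{k}\t{o}\n" for k, o in block).encode("utf-8")
            top.append((block[0][0], f.tell(), len(data)))   # first key of the block
            f.write(data)
    return top

def lookup(top, index_path, key):
    """Find key's data-file offset using the top level plus one block read."""
    first_keys = [t[0] for t in top]
    i = bisect.bisect_right(first_keys, key) - 1   # last block whose first key <= key
    if i < 0:
        return None                                # key sorts before every block
    _, block_offset, block_length = top[i]
    with open(index_path, "rb") as f:
        f.seek(block_offset)
        block = f.read(block_length).decode("utf-8")
    for line in block.splitlines():                # search inside the single block
        k, o = line.split("\t")
        if k == key:
            return int(o)
    return None
```

Only the top level has to fit in memory; each lookup reads one block of the lower level from disk.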
  12. The level #4 index can be loaded into memory.