File Organization


Published in: Technology

  2. FILE ORGANIZATION
     Terms to understand: File/Table, Record/Row, Field/Column/Attribute
  3. 3. THE PILE A form of file organization where data are collected inthe same order they arrived This organization simply accumulate mass of data andsave it Each field is self-describing, includes a field name and avalue. Length of each field is determined as shown Implicitly indicated by delimiters Explicitly included as subfield or Done by default for that particular field type Records may have different fields and there is no visiblestructure. Hence, record access is through exhaustivesearch where all record or the entire file is examined This type of file uses space efficiently and is updatedeasily. However, retrieval of a single record could betedious
  4. THE PILE ORGANIZATION
     [Figure: variable-length records, with fields separated by delimiters]
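
The pile described above can be sketched in a few lines of Python. This is a minimal illustration, not a real file format: the `name=value` field syntax, the `;` delimiter, and the sample records are all assumptions made for the example.

```python
# Hypothetical pile: each record is a string of self-describing
# name=value fields, with ';' as the (assumed) field delimiter.
PILE = [
    "name=Ali;dept=Sales",
    "name=Mei;age=31;city=Ipoh",
    "id=7;name=Raj",
]

def parse_record(line):
    """Split one variable-length record into a {field name: value} dict."""
    return dict(field.split("=", 1) for field in line.split(";"))

def search(pile, field, value):
    """Exhaustive search: every record is examined, in arrival order."""
    return [rec for rec in map(parse_record, pile) if rec.get(field) == value]

def append(pile, record):
    """Updating is cheap: a new record is simply added at the end."""
    pile.append(";".join(f"{name}={value}" for name, value in record.items()))
```

Appending never reorganizes the file, which is why updates are easy, while `search` has to parse every record because nothing about the layout says where a given field lives.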
  5. 5. THE SEQUENTIAL FILE Most common form of file structure Fixed format are used for records and records are of thesame length Field names and lengths are attributes of the file One particular field (usually first field) is referred to asthe key field :- Uniquely identifies the record Records are stored in key sequence Alphabetical order or numerical order depending on the key field It is the simplest structure and easy to implement in anydevice. However, in the worst case scenario, accessinga single record takes a long time.
  6. THE SEQUENTIAL FILE ORGANIZATION
     [Figure: records 1, 2, ..., n-1, n, each consisting of a key field followed by attributes]
  7. 7. THE SEQUENTIAL FILE ORGANIZATION To enable a sequential form of records, newrecords are placed in a log file or transaction file.Then, a batch update is performed to merge the logfile with the master file to produce a new file withthe correct key sequence1 2 n-1 n…RecordTerminators
  8. 8. THE INDEXED SEQUENTIAL FILE A file management system that allows records to beaccessed either sequentially (in the order they wereentered) or randomly (with an index) A secondary set of hash tables known as indexes iscreated that contains pointers to the main file In indexed sequential file, records are organized insequence based on key fields Each file has an index to support random search Overflow file is added such as each record inoverflow file is located by following a pointer fromits predecessor record in main file
  9. 9. DESCRIPTION As an example:- Firstly, a single level of indexing is used. Hence, theindex is a simple sequential file. Each record in the index file consists of:- Key field(same as key field in the main file) Pointer to the main file To find a specific field, the index file is searched formatching key values Then, the pointer indicates the record having thematching key values as depicted by figure below This reduces the average search length
  10. There is an addition to the organization: each record in the main file contains an additional field, which is a pointer to the overflow file.
      - When a new record is added, it is placed in the overflow file. The record in the main file that immediately precedes the new record in logical sequence is updated to contain a pointer to the new record in the overflow file.
      - However, if the record that precedes the new record in logical sequence is itself in the overflow file, then the pointer in that record is updated instead.
      - Processing the entire file sequentially involves processing the records of the main file in order until a pointer to the overflow file is found, then continuing in the overflow file until a null pointer is encountered, and then resuming in the main file.
      - As a second example, we visualize the organization using multiple levels of indexes.
      - The lowest level of index points to the main file, whereas each index file on the level above points to the index file below it.
      - Hence, the efficiency of access is greatly increased and the average search length is greatly reduced, as conveyed in the figure below.
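
The overflow mechanism can be sketched as a linked chain hanging off each main-file record. This is an illustration under assumptions: records are dicts with `key`, `data`, and `next` fields, `next` holds a position in the overflow list (or `None` for a null pointer), and a new key smaller than every main-file key is rejected to keep the example short.

```python
def insert(main, overflow, key, data):
    """Add a record to the overflow file and link it from its predecessor."""
    pred = None
    for rec in main:  # last main-file record whose key precedes the new key
        if rec["key"] < key:
            pred = rec
        else:
            break
    if pred is None:
        raise ValueError("no predecessor in main file (not handled here)")
    # If the logical predecessor is itself in the overflow file,
    # follow the chain until it is reached.
    while pred["next"] is not None and overflow[pred["next"]]["key"] < key:
        pred = overflow[pred["next"]]
    overflow.append({"key": key, "data": data, "next": pred["next"]})
    pred["next"] = len(overflow) - 1

def scan(main, overflow):
    """Yield all keys in logical sequence, detouring through overflow chains."""
    for rec in main:
        yield rec["key"]
        ptr = rec["next"]
        while ptr is not None:  # a null pointer ends the detour
            yield overflow[ptr]["key"]
            ptr = overflow[ptr]["next"]

MAIN_FILE = [{"key": k, "data": d, "next": None}
             for k, d in [(10, "a"), (20, "b"), (30, "c")]]
OVERFLOW = []
```

Inserting keys 25, 22, and 27 in that order leaves the main file untouched apart from one pointer, yet a sequential scan still visits every record in key order.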
  11. 11. THE INDEXED FILE Uses multiple indexes for different key fields whichmay be the subject of a search Record accessibility are through their indexes May contain an exhaustive index that contains oneentry for every record in the main file. The indexitself is organized in the sequential form for ease ofsearching May contain a partial index. It contains entries torecords where the desired field exists
  12. THE INDEXED FILE ORGANIZATION
      [Figure: a primary file referenced by exhaustive and partial indexes; labels: index, attributes, pointer, indicator]
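
The difference between an exhaustive and a partial index can be sketched as follows. The sample records and the `build_index` helper are hypothetical; the index here maps each field value to the positions of the records that hold it.

```python
from collections import defaultdict

RECORDS = [
    {"id": 1, "name": "Ali", "city": "Ipoh"},
    {"id": 2, "name": "Mei"},  # this record has no 'city' field
    {"id": 3, "name": "Raj", "city": "Ipoh"},
]

def build_index(records, field):
    """Map each value of `field` to the positions of records that hold it.

    Records lacking the field get no entry, so the index is exhaustive
    when every record has the field and partial otherwise.
    """
    index = defaultdict(list)
    for pos, rec in enumerate(records):
        if field in rec:
            index[rec[field]].append(pos)
    return dict(sorted(index.items()))  # entries kept in key sequence

by_id = build_index(RECORDS, "id")      # exhaustive: one entry per record
by_city = build_index(RECORDS, "city")  # partial: record 2 is absent
```

A search on `city` goes straight to `by_city` and never touches records that lack the field.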
  13. 13. THE DIRECT/HASHED FILE This file management system that access directly anyblock of a known address. A key field is required for each record. Uses hashing on the key value. No concept of sequential ordering implemented. Such examples are:- Directories Pricing table Schedules Name lists It is used where rapid access is required, where fixed-length records are used and where records areaccessed one-at-a-time. The concept of hashing can be shown as below.
  15. 15. HASHING METHODS Direct method The key is the address a key value might be between 1-100. The address of a certainrecords will be the same as the key. Subtraction method Subtract certain amount of numbers from the key a key value might start with 1001 and ends with 1100. A simplehashing function could subtract 1000 from the key to determinethe address. Modulo-division method Key value is divided by the size of records. the remainder becomes the address of the key. Digit-extraction method Selected digits are extracted from the key As an example:- a field has 10 digits. The hashing function onlyextracts the first 2 digits and the last digit and use them asaddress.