WHAT ARE FILES? A computer file is a block of arbitrary information, or a resource for storing information, that is available to a computer program and is usually based on some kind of durable storage.
FILE CONTENTS <ul><li>On most modern operating systems, files are organized into one-dimensional arrays of bytes. The format of a file is defined by its content, since a file is solely a container for data, although on some platforms the format is usually indicated by its file extension, which specifies the rules for how the bytes must be organized and interpreted meaningfully. For example, the bytes of a plain text file (.txt in Windows) are associated with either ASCII or UTF-8 characters, while the bytes of image, video, and audio files are interpreted otherwise. Most files also allocate a few bytes for metadata, which allows a file to carry some basic information about itself. </li></ul>
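To illustrate that the format, not the bytes themselves, decides the meaning, here is a minimal Python sketch. The three byte values are made up for the illustration; the same bytes are read once as ASCII text and once as plain numbers.

```python
# The same bytes, interpreted under two different sets of rules.
raw = bytes([72, 105, 33])       # three arbitrary byte values (illustrative only)

as_text = raw.decode("ascii")    # interpret the bytes as ASCII characters
as_numbers = list(raw)           # interpret the bytes as integers 0-255

print(as_text)      # Hi!
print(as_numbers)   # [72, 105, 33]
```

A program that opened these bytes as an image or audio file would apply yet another set of rules to the same data.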
Organising the data in the file <ul><li>Information in a computer file can consist of smaller packets of information (often called "records" or "lines") that are individually different but share some trait in common. For example, a payroll file might contain information concerning all the employees in a company and their payroll details; each record in the payroll file concerns just one employee, and all the records have the common trait of being related to payroll. This is very similar to placing all payroll information into a specific filing cabinet in an office that does not have a computer. A text file may contain lines of text, corresponding to printed lines on a piece of paper. Alternatively, a file may contain an arbitrary binary image (a BLOB) or it may contain an executable. </li></ul>
FILE MANAGEMENT <ul><li>WHAT EXACTLY IS FILE MANAGEMENT? </li></ul><ul><li>File management is the process of placing, naming, and organising files and folders in a logical manner. </li></ul>
Sequential access vs random access <ul><li>Sequential access means that a group of elements (e.g. data in a memory array, in a disk file, or on magnetic tape data storage) is accessed in a predetermined, ordered sequence. </li></ul>
Sequential access is sometimes the only way of accessing the data, for example if it is on a tape. It may also be the access method of choice, for example if we simply want to process a sequence of data elements in order. In data structures, a data structure is said to have sequential access if one can only visit the values it contains in one particular order. The canonical example is the linked list. Indexing into a list that has sequential access requires O(k) time, where k is the index. As a result, many algorithms such as quicksort and binary search degenerate into bad algorithms that are even less efficient than their naïve alternatives; these algorithms are impractical without random access. On the other hand, some algorithms, typically those that don't index, such as mergesort, require only sequential access and face no penalty.
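The O(k) cost of indexing can be seen in a small Python sketch of a singly linked list. The names `Node`, `build_list` and `get_at` are invented for this illustration; `get_at` also returns how many links it had to follow.

```python
class Node:
    """One element of a singly linked list."""
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def build_list(values):
    """Build a linked list holding the values in the given order."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def get_at(head, k):
    """Return (value, steps) for index k; the only way in is from the head."""
    node = head
    steps = 0
    for _ in range(k):
        if node is None:
            raise IndexError(k)
        node = node.next      # follow one link per step
        steps += 1
    if node is None:
        raise IndexError(k)
    return node.value, steps


head = build_list([10, 20, 30, 40])
print(get_at(head, 0))   # (10, 0)
print(get_at(head, 3))   # (40, 3)
```

Reaching index k always takes k link-follows, which is why algorithms that jump around by index suffer on such a structure.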
Example of Random Access File Storage. How you use it is up to you: you could, for example, have a file full of equal-size records. If these records have a size of 100 bytes, then record 0 occupies bytes 0-99, record 1 occupies bytes 100-199, and so on. More generally, record n starts at byte n*sizeof(record).
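A minimal Python sketch of this layout, assuming records are numbered from 0 and padded with spaces to exactly 100 bytes, so that record n begins at byte n * 100. The helper names `write_record` and `read_record` are hypothetical, and an in-memory `io.BytesIO` stands in for a real file opened in binary mode.

```python
import io

RECORD_SIZE = 100  # bytes per record, as in the example above


def write_record(f, n, text):
    """Store text as record n, padded with spaces to RECORD_SIZE bytes."""
    data = text.encode("ascii")
    if len(data) > RECORD_SIZE:
        raise ValueError("record too long")
    f.seek(n * RECORD_SIZE)               # jump straight to the record's first byte
    f.write(data.ljust(RECORD_SIZE, b" "))


def read_record(f, n):
    """Read record n without touching any of the records before it."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE).decode("ascii").rstrip(" ")


f = io.BytesIO()   # stand-in for open("some_file", "r+b")
for n, name in enumerate(["alpha", "beta", "gamma"]):
    write_record(f, n, name)

print(read_record(f, 2))   # gamma
print(read_record(f, 0))   # alpha
```

Because every record has the same size, the position of any record is a single multiplication, and `seek` moves there directly.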
In data structures, random access implies the ability to access any entry in a list of numbers in constant time, i.e. O(1), independent of its position in the list and of the list's size. Very few data structures can guarantee this, other than arrays (and related structures like dynamic arrays). Random access is critical to many algorithms such as binary search, integer sorting or the sieve of Eratosthenes. Other data structures, such as linked lists, sacrifice random access to allow efficient inserts, deletes, or reordering of data. Self-balancing binary search trees may provide an acceptable compromise, where access time is similar for any member of a collection and grows only logarithmically with its size.
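As an illustration of an algorithm that depends on random access, here is a short Python sketch of binary search over a sorted list. The function name and the sample data are chosen for this example only; the function also counts how many entries it inspects.

```python
def binary_search(items, target):
    """Return (index, probes); index is -1 if target is not in the sorted list."""
    lo, hi = 0, len(items) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1                  # items[mid] is a constant-time access
        if items[mid] == target:
            return mid, probes
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes


data = list(range(0, 2000, 2))       # 1000 sorted even numbers
index, probes = binary_search(data, 1234)
print(index)                         # 617
print(probes <= 10)                  # True: at most about log2(1000) probes
```

Each probe jumps to the middle of the remaining range, which is only cheap when any position can be reached in constant time.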
DIFFERENCES <ul><li>A sequential file is read and written in order, from the beginning to the end. A random access file is read and written in pieces, at whatever positions the program asks for. Reading a file sequentially may be faster, and jumping between positions in a random access file may take more time, but in normal cases you won't notice the difference. </li></ul>
INDEX FILE <ul><li>The index table is stored in a separate file called the index file, as shown in Figure 2. This file serves as the fast meta-data index into an encoded file. For a given data file F, we create an index file called F.idx. Many file systems separate data and meta-data; this is done for efficiency and reliability. Meta-data is considered more important, so it is cached, stored, and updated differently from regular data. The index file is kept separate from the encoded file data for the same reasons, and to allow us to manage each part separately and simply. </li></ul>
The Importance of Index Files <ul><li>When the Web server gets a request for something that is the name of a folder, it looks for a list of files with special names. If it finds one of these files, it shows the file; otherwise it shows a directory listing. </li></ul><ul><li>By putting an index file in every directory you create, you not only save having to type in a filename when you create links to the main file, but you also make it easier for people to get to your site. </li></ul>
<ul><li>For example, let's say you want to create a homepage for the Office of Communications, in the /offices/comm/ folder. If you start with an index.shtml file as the homepage, its URL will be http://www.mtholyoke.edu/offices/comm/index.shtml, but people will be able to see it by going to http://www.mtholyoke.edu/offices/comm/ as well. If you did not create an index file, anyone who used the second address would be able to see a listing of all the files in that directory. </li></ul><ul><li>The Web server searches for these files, in the order listed: index.menu, index.shtml, index.shm, index.html, index.htm, default.shtml, default.htm. If any of these is found, it is displayed instead of the directory listing. </li></ul>
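The lookup order can be sketched in Python as follows. This is only an illustration of the rule described above, not the Web server's actual code; the function name `pick_index_file` and the sample directory contents are hypothetical.

```python
# Special names, in the order the slide says the server searches for them.
INDEX_NAMES = [
    "index.menu", "index.shtml", "index.shm", "index.html",
    "index.htm", "default.shtml", "default.htm",
]


def pick_index_file(filenames):
    """Return the first special name present in the folder, or None.

    None stands for the case where the server would show a directory listing.
    """
    present = set(filenames)
    for name in INDEX_NAMES:
        if name in present:
            return name
    return None


print(pick_index_file(["logo.gif", "index.html", "index.shtml"]))  # index.shtml
print(pick_index_file(["notes.txt"]))                              # None
```

When several index files exist in the same folder, the one earliest in the search order wins.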
How does an Index work? <ul><li>In random access files, each record has a number, starting at 1. If you have 0 records then your file length will be zero (or there will be no file at all). What an index file does is keep track of all these record numbers, like in an array, but with the numbers arranged in sorted order. For example, let's say you have a random access file with one string field, and you wish to index it alphabetically: </li></ul>
Data file
Rec num: 1 2 3 4
Value: Rick Jim Bill Jill

Index file
Rec num: 1 2 3 4
Value: 3 4 2 1
The index file is made up only of numbers, and its records are arranged so that each one gives the number of the data file record that belongs in that position. For instance, to find the first person alphabetically in our data file, we go to the index file and get the first index record. We find the index value to be 3, so we then get the 3rd record from the data file and determine that the first record alphabetically is "Bill." The index record value 3 can be thought of as a pointer to the data file record.
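The same lookup can be sketched in Python, with two lists standing in for the data file and the index file. Record numbers start at 1 as in the slides; the function names are hypothetical, and the sort is assumed to ignore upper and lower case.

```python
# Data file: record numbers 1..4 hold these values.
data_file = ["Rick", "Jim", "Bill", "Jill"]


def build_index(records):
    """Return record numbers (1-based) arranged in alphabetical order of value."""
    numbers = range(1, len(records) + 1)
    return sorted(numbers, key=lambda rec: records[rec - 1].lower())


def nth_alphabetical(records, index, position):
    """Follow index record `position` (1-based) to the data record it points to."""
    rec_num = index[position - 1]     # the index value acts as a pointer
    return records[rec_num - 1]


index_file = build_index(data_file)
print(index_file)                                    # [3, 4, 2, 1]
print(nth_alphabetical(data_file, index_file, 1))    # Bill
```

The data file itself is never reordered; only the small list of record numbers is sorted.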