File Structure Concepts


Published in: Education
  • File Structure Concepts

    2. One trip
    3. Minimum trips
    4. Sending related data
    5. Why not data structures?
    6. When we have a huge amount of data, we can use file structures to access it quickly, which makes them more efficient than data structures.
    7. Data Management in Files
       USN: 1MS10ISO34
       Name: Dileep Kodira
       College: MSRIT
       Place: Bangalore
    8. The four most common ways of structuring fields are:
       • FIXED-LENGTH FIELDS: force the fields into a predictable (fixed) length
       • LENGTH-INDICATOR FIELDS: begin each field with a length indicator
       • DELIMITED FIELDS: place a delimiter at the end of each field to separate the fields
       • SELF-DESCRIBING FIELDS: use a “keyword=value” expression to identify each field and its contents
    9. Fixing the Length of Fields
       • This method relies on creating fields of a predictable, fixed size.
       • E.g., one may define the following class:
           class Person {
           public:
               char last[11];
               char first[11];
               char address[16];
               char city[16];
               char state[3];
               char zip[10];
           };
    10. Fixing the Length of Fields
       • Disadvantages:
         – A lot of space is wasted by “padding” fields with blanks.
         – Data values may not fit into the field sizes:
           » e.g. “Michalopoulos” is too long to fit in the array char last[11].
       • Thus the fixed-size field approach is inappropriate for data that inherently varies widely in field length, such as names or addresses.
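Both disadvantages can be seen in a few lines of C++. This is only a minimal sketch: `packFixed` is a hypothetical helper that is not part of the slides, and it assumes blank padding on the right and a 10-character limit for `last` (the eleventh byte of `char last[11]` being reserved for the terminating null).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the slides): copy `value` into a field of
// exactly `width` bytes, padding with blanks on the right.
// Returns false, leaving `out` untouched, when the value does not fit.
bool packFixed(const std::string& value, std::size_t width, std::string& out) {
    if (value.size() > width) {
        return false;                       // e.g. "Michalopoulos" in 10 bytes
    }
    out = value;
    out.append(width - value.size(), ' ');  // the padding is the wasted space
    return true;
}
```

Packing "Ames" into a 10-byte field stores 4 useful bytes and 6 blanks, while "Michalopoulos" (13 characters) is rejected.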
    11. Beginning Each Field with a Length Indicator
       • This method requires that each field be preceded by an indicator of its length (in bytes).
       • E.g. 04Ames04Mary09123 Maple10StillWater02OK0574075
       • One disadvantage of this method is that it is more complex, since numbers and strings must be extracted from a single string representing the record.
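The extraction step mentioned above can be sketched as follows. The two-digit decimal prefix is read off the slide's example; the function names and the 99-byte limit per field are assumptions made for illustration, and the unpacking code assumes a well-formed record.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Pack fields, each preceded by a two-digit decimal length (assumed format).
// Returns an empty string if a field is too long for a two-digit prefix.
std::string packLengthPrefixed(const std::vector<std::string>& fields) {
    std::string record;
    for (const std::string& f : fields) {
        if (f.size() > 99) {
            return std::string();
        }
        record += static_cast<char>('0' + f.size() / 10);
        record += static_cast<char>('0' + f.size() % 10);
        record += f;
    }
    return record;
}

// Unpack a well-formed record: read two digits, then that many bytes.
std::vector<std::string> unpackLengthPrefixed(const std::string& record) {
    std::vector<std::string> fields;
    std::size_t pos = 0;
    while (pos + 2 <= record.size()) {
        const std::size_t len = (record[pos] - '0') * 10 + (record[pos + 1] - '0');
        pos += 2;
        fields.push_back(record.substr(pos, len));
        pos += len;
    }
    return fields;
}
```

Unpacking the slide's sample record yields six fields, the last of which is the five-character zip code 74075.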
    12. Separating Fields with Delimiters
       • This method requires that the fields be separated by a selected special character, or sequence of characters, called a delimiter.
       • E.g., if “|” is used as the delimiter, a sample record looks like this: Ames|Mary|123 Maple|StillWater|OK|74075|
    13. Separating Fields with Delimiters
       • Separating fields with a delimiter is a widely used method. However, choosing the right delimiter is very important.
       • In many cases white-space characters (blanks) are excellent delimiters because they provide a clean separation between fields when we list them on the console.
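Reading a delimited record back amounts to scanning for the delimiter. A minimal sketch, assuming every field is terminated by the delimiter and that the delimiter never occurs inside the data (`splitFields` is a hypothetical name):

```cpp
#include <string>
#include <vector>

// Split a record whose fields are each terminated by `delim`.
// Text after the last delimiter, if any, is treated as incomplete and dropped.
std::vector<std::string> splitFields(const std::string& record, char delim = '|') {
    std::vector<std::string> fields;
    std::string current;
    for (char c : record) {
        if (c == delim) {
            fields.push_back(current);  // delimiter closes the current field
            current.clear();
        } else {
            current += c;
        }
    }
    return fields;
}
```

Note that a blank would be a poor delimiter for a record containing the address "123 Maple", which is why the choice of delimiter matters.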
    14. Using a “keyword = value” Expression
       • This method requires that each field be preceded by its field identifier (keyword).
       • E.g. last=Amesfirst=Maryaddress=123 Maplecity=StillWaterstate=OKzip=74075
       • It can be combined with the delimiter method to mark the field ends: last=Ames|first=Mary|address=123 Maple|city=StillWater|state=OK|zip=74075
    15. Using a “keyword = value” Expression
       • Advantages:
         – each field provides information about itself
         – good format for dealing with missing fields
       • Disadvantages:
         – in some applications a lot of space may be wasted on field keywords (up to 50%)
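A self-describing record combined with a delimiter can be parsed into a keyword-to-value map, which also shows why missing fields are easy to handle. This is a sketch under the assumption that "|" ends each field and "=" separates keyword from value; `parseKeywordFields` is a hypothetical name.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Parse "keyword=value" items separated by `delim` into a map.
// Items without '=' are skipped; a missing keyword is simply absent.
std::map<std::string, std::string> parseKeywordFields(const std::string& record,
                                                      char delim = '|') {
    std::map<std::string, std::string> fields;
    std::size_t start = 0;
    while (start < record.size()) {
        std::size_t end = record.find(delim, start);
        if (end == std::string::npos) {
            end = record.size();        // last item has no trailing delimiter
        }
        const std::string item = record.substr(start, end - start);
        const std::size_t eq = item.find('=');
        if (eq != std::string::npos) {
            fields[item.substr(0, eq)] = item.substr(eq + 1);
        }
        start = end + 1;
    }
    return fields;
}
```

A record that omits the zip code parses without any special handling; the caller just finds no "zip" entry in the map.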
    16. Record Structures
       • Files may be viewed as collections of records, which are sets of fields.
       • Some of the most often used methods for organizing the records of a file are:
         – require that the records be a predictable (fixed) number of bytes in length
         – require that the records be a predictable number of fields in length
    17. Organizing the Records of a File
         – begin each record with a length indicator (a count of the bytes in the record)
         – use a second file to keep track of the beginning byte address of each record
         – place a delimiter at the end of each record to separate it from the next record
    18. Fixed-Length Records
       • This method is the counterpart of the method for organizing files with fixed-length fields.
       • Fixing the sizes of the fields in a record produces a fixed-size record.
    19. Fixed-Length Records
       • E.g.
           class Person {
           public:
               char last[11];
               char first[11];
               char address[16];
               char city[16];
               char state[3];
               char zip[10];
           };
         will produce a fixed-size record of 67 bytes.
    20. Fixed-Length Records
       • The fixed-length record structure, however, does NOT imply the fixed-length field structure.
       • Fixed-length records are frequently used as “containers” to hold variable numbers of variable-length fields.
       • Fixed-length record structures are among the most commonly used methods for organizing files.
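The 67-byte figure is simply 11 + 11 + 16 + 16 + 3 + 10, and a fixed record size makes the position of any record a single multiplication. The sketch below checks the size at compile time; `recordOffset` is a hypothetical helper, and the absence of padding between `char` arrays is what mainstream compilers do rather than a guarantee of the language standard.

```cpp
#include <cstddef>

struct Person {
    char last[11];
    char first[11];
    char address[16];
    char city[16];
    char state[3];
    char zip[10];
};

// char arrays need no alignment padding on mainstream compilers,
// so the record size is the sum of the field sizes.
static_assert(sizeof(Person) == 67, "11 + 11 + 16 + 16 + 3 + 10 = 67 bytes");

// Byte offset of the record with 0-based record number `rrn`.
constexpr std::size_t recordOffset(std::size_t rrn) {
    return rrn * sizeof(Person);
}
```

For example, the fourth record (number 3) starts at byte 201, so it can be reached with one seek instead of reading the three records before it.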
    21. Records with a Predictable Number of Fields
       • This method specifies the number of fields in each record.
       • Regardless of the method used for storing fields, this approach provides a relatively easy means of calculating record boundaries.
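As one illustration of calculating record boundaries, the sketch below counts fields in a stream of delimited fields and reports where each record ends. Delimited fields are only one possible field format, and `recordEnds` is a hypothetical name chosen for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// For data made of delimiter-terminated fields, return the byte position
// just past the end of every complete record of `fieldsPerRecord` fields.
std::vector<std::size_t> recordEnds(const std::string& data,
                                    std::size_t fieldsPerRecord,
                                    char delim = '|') {
    std::vector<std::size_t> ends;
    if (fieldsPerRecord == 0) {
        return ends;
    }
    std::size_t seen = 0;  // fields seen so far in the current record
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] == delim && ++seen == fieldsPerRecord) {
            ends.push_back(i + 1);
            seen = 0;
        }
    }
    return ends;
}
```

With two fields per record, the data `a|bb|c|dd|` contains two records, ending at byte positions 5 and 10.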
    22. Records with a Length Indicator
       • This method requires that each record begin with a length indicator.
       • This method is commonly used for handling variable-length records.
    23. Index File to Keep Track of Record Addresses
       • This method uses an index file (or an index block) to keep a byte offset for each record in the original data file.
       • The byte offsets (record addresses) allow us to find the beginning of each successive record and to compute the length of each record.
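Computing record lengths from the index is a subtraction of neighbouring offsets. A minimal sketch, assuming the offsets are sorted in ascending order and that the total size of the data file is known (needed for the last record); `recordLengths` is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Given the starting byte offset of every record (ascending) and the size of
// the data file, return the length of each record.
std::vector<std::size_t> recordLengths(const std::vector<std::size_t>& offsets,
                                       std::size_t fileSize) {
    std::vector<std::size_t> lengths;
    for (std::size_t i = 0; i < offsets.size(); ++i) {
        // A record ends where the next one begins, or at the end of the file.
        const std::size_t next = (i + 1 < offsets.size()) ? offsets[i + 1] : fileSize;
        lengths.push_back(next - offsets[i]);
    }
    return lengths;
}
```

For offsets 0, 40 and 76 in a 100-byte file, the records are 40, 36 and 24 bytes long.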
    24. Records Separated with Delimiters
       • This method is analogous to the use of delimiters to separate fields.
       • As with fields, the delimiter must be well chosen, and it cannot be part of the data.
       • A common delimiter is the end-of-line character ‘\n’, since records are often listed directly on the console.
    25. A Record Structure that Uses a Length Indicator
       • Use a memory buffer to store the data that is going to be written to the disk.
       • Write the size of the record at the beginning of it.
       • Write the buffer contents after writing the size.
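The three steps above can be sketched with C++ streams. The slides do not say how the size is stored, so the two-byte little-endian binary length used here is an assumption, as are the names `writeRecord` and `readRecord`.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>  // std::stringstream, handy for trying this without a real file
#include <string>

// Write the record size first (2 bytes, little-endian; assumed format),
// then the buffer contents.
bool writeRecord(std::ostream& out, const std::string& buffer) {
    if (buffer.size() > 0xFFFF) {
        return false;  // does not fit a 2-byte length indicator
    }
    const std::uint16_t size = static_cast<std::uint16_t>(buffer.size());
    const char header[2] = {static_cast<char>(size & 0xFF),
                            static_cast<char>(size >> 8)};
    out.write(header, 2);
    out.write(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    return static_cast<bool>(out);
}

// Read the size, then exactly that many bytes into `buffer`.
bool readRecord(std::istream& in, std::string& buffer) {
    unsigned char header[2];
    if (!in.read(reinterpret_cast<char*>(header), 2)) {
        return false;  // no more records
    }
    const std::size_t size = header[0] | (header[1] << 8);
    buffer.assign(size, '\0');
    if (size > 0 && !in.read(&buffer[0], static_cast<std::streamsize>(size))) {
        return false;  // truncated record
    }
    return true;
}
```

Writing two records of different lengths to a `std::stringstream` and reading them back returns them unchanged, after which `readRecord` reports that no records remain.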
    26. FIELDS / RECORDS (diagram): USN: 1MS10IS034 · Name: Dileep Kodira · College: MSRIT · Place: Bangalore
    27. USN: 1MS10IS034 · Name: Dileep Kodira · College: MSRIT · Place: Bangalore
    28. USN: 1MS10IS034 · Name: Dileep Kodira · College: MSRIT · Place: Bangalore
    29. UNPACKING (diagram): USN: 1MS10IS034 · Name: Dileep · College: MSRIT · Place: Bangalore
    30. RUN LENGTH ENCODING
       – Represents data using values and run lengths
       – A run length is defined as the number of consecutive equal values
       – E.g. 1110011111 is encoded as 130215 (value 1 with run length 3, value 0 with run length 2, value 1 with run length 5)
    31. RUN LENGTH ENCODING: Applications
       • Useful for compressing data that contains repeated values
         – e.g. in the output from a filter, many consecutive values are 0
       • Very simple compared with other compression techniques
       • Reversible (lossless) compression
         – decompression is just as easy
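Both directions fit in a few lines, which illustrates the simplicity claimed above. This sketch keeps the runs as (value, run length) pairs rather than the digit string shown on the slide; the function names are chosen here for illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Run = std::pair<char, std::size_t>;  // (value, run length)

// Collapse consecutive equal values into (value, run length) pairs.
std::vector<Run> rleEncode(const std::string& data) {
    std::vector<Run> runs;
    for (char c : data) {
        if (!runs.empty() && runs.back().first == c) {
            ++runs.back().second;       // extend the current run
        } else {
            runs.emplace_back(c, 1);    // start a new run
        }
    }
    return runs;
}

// Expand every run back into its repeated value (lossless).
std::string rleDecode(const std::vector<Run>& runs) {
    std::string data;
    for (const Run& run : runs) {
        data.append(run.second, run.first);
    }
    return data;
}
```

Encoding `1110011111` gives the pairs (1, 3), (0, 2), (1, 5), and decoding those pairs restores the original ten values.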
    33. HUFFMAN CODING
       • Suppose we have a message consisting of 5 symbols, e.g. [ ]
       • How can we code this message using 0/1 so that the coded message has minimum length (for transmission or saving)?
       • 5 symbols → at least 3 bits per symbol
       • For a simple encoding, the length of the code is 10*3 = 30 bits
    34. HUFFMAN CODING
       • Intuition: symbols that are more frequent should have shorter codes; yet since the code lengths are not all the same, there must be a way of distinguishing each code.
       • For the Huffman code, the length of the encoded message will be 3*2 + 3*2 + 2*2 + 3 + 3 = 22 bits
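The encoded length can be computed from the symbol frequencies alone, without building the codes. The sketch below uses the standard greedy construction (repeatedly merge the two least frequent nodes); the frequencies 3, 3, 2, 1, 1 are an assumption read off the terms of the sum above, since the message itself is not preserved in the transcript, and `huffmanEncodedBits` is a hypothetical name.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Total number of bits needed to encode a message with the given symbol
// frequencies using a Huffman code.
std::size_t huffmanEncodedBits(const std::vector<std::size_t>& freqs) {
    // Min-heap: the two smallest weights are always merged first.
    std::priority_queue<std::size_t, std::vector<std::size_t>,
                        std::greater<std::size_t>>
        heap(freqs.begin(), freqs.end());
    std::size_t total = 0;
    while (heap.size() > 1) {
        const std::size_t a = heap.top();
        heap.pop();
        const std::size_t b = heap.top();
        heap.pop();
        total += a + b;    // each merge adds one bit to every symbol beneath it
        heap.push(a + b);
    }
    return total;
}
```

For the assumed frequencies the merges cost 2, 4, 6 and 10, for a total of 22 bits: 2-bit codes for the three most frequent symbols and 3-bit codes for the other two, compared with 30 bits for the fixed 3-bit encoding.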
    35. Thank you. Dileep Kodira