12/11/2008 SEG 3550



  1. 1. SEG3550 Fundamentals of Information System Tutorial 8 Storage and File Structure
  2. 2. Overview <ul><li>Storage Hierarchy </li></ul><ul><li>Physical Storage Media </li></ul><ul><li>RAID </li></ul><ul><li>RAID Levels </li></ul><ul><li>File Organization </li></ul>
  3. 3. Why do we need to know about storage/file structure? <ul><ul><li>Many database technologies are developed to exploit the storage architecture/hierarchy </li></ul></ul><ul><ul><li>Data in the database needs to be organized so that it can be stored and retrieved efficiently </li></ul></ul>
  4. 4. Physical Storage Media <ul><li>Physical storage media classified according to: </li></ul><ul><li>Data access speed </li></ul><ul><li>Cost per unit of data </li></ul><ul><li>Reliability </li></ul><ul><li>Can differentiate storage into: </li></ul><ul><li>Volatile storage: Loses contents when power is switched off </li></ul><ul><li>Non-volatile storage: Contents persist even when power is switched off </li></ul>
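  5. 5. Storage Hierarchy [Figure: hierarchy ordered from fastest and highest unit price to slowest and lowest unit price. Primary storage (volatile): cache, main memory. Secondary storage (non-volatile): flash memory, magnetic disk. Tertiary storage (non-volatile): optical disk, magnetic tape.]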
  6. 6. Primary Storage <ul><li>Cache </li></ul><ul><li>Volatile - managed by hardware </li></ul><ul><ul><li>Speed: 7 to 20 ns (1 nanosecond = 10^-9 seconds) </li></ul></ul><ul><ul><li>Capacity: </li></ul></ul><ul><ul><ul><li>A typical PC level 2 cache is 64 KB to 2 MB. </li></ul></ul></ul><ul><ul><ul><li>Within processors, level 1 cache usually ranges in size from 8 KB to 64 KB. </li></ul></ul></ul><ul><li>Main memory </li></ul><ul><li>Volatile </li></ul><ul><ul><li>Speed: 10s to 100s of nanoseconds </li></ul></ul><ul><ul><li>Capacity: up to a few gigabytes widely used currently </li></ul></ul><ul><ul><ul><li>Per-byte costs have decreased by roughly a factor of 2 every 2 to 3 years </li></ul></ul></ul>
  7. 7. Secondary Storage <ul><li>Flash memory </li></ul><ul><li>Non-volatile </li></ul><ul><ul><li>Speed: read speed is similar to main memory, but writes are slow (a few microseconds) and erase is slower. </li></ul></ul><ul><ul><li>Capacity: 32 MB to a few gigabytes currently </li></ul></ul><ul><ul><li>Forms: SmartMedia, memory stick, secure digital, BIOS </li></ul></ul><ul><ul><li>Cost: roughly the same as main memory </li></ul></ul><ul><li>Magnetic disk </li></ul><ul><li>Non-volatile </li></ul><ul><ul><li>Capacities: up to roughly 1 TB (1000 GB) currently </li></ul></ul><ul><ul><ul><li>Growing constantly and rapidly with technology improvements (factor of 2 to 3 every 2 years) </li></ul></ul></ul><ul><ul><li>Data must be moved from disk to main memory for access, and written back for storage. </li></ul></ul>
  8. 8. Tertiary Storage (Non-volatile) <ul><li>Optical storage </li></ul><ul><ul><li>CD-ROM (640 MB) and DVD (4.7 to 17 GB) most popular forms </li></ul></ul><ul><ul><li>Reads and writes are slower than with magnetic disk </li></ul></ul><ul><li>Tape storage </li></ul><ul><ul><li>used primarily for backup (to recover from disk failure), and for archival data </li></ul></ul><ul><ul><li>sequential-access – much slower than disk </li></ul></ul><ul><ul><li>very high capacity (40 to 300 GB tapes available) </li></ul></ul>
  9. 9. Between Memory and Disk <ul><li>The database resides permanently on disk for the most part </li></ul><ul><li>In a database, cost is usually measured by the number of disk I/Os </li></ul><ul><li>But disks are too slow, so we need memory to act as a buffer … but memory is volatile </li></ul><ul><ul><li>this introduces a number of issues </li></ul></ul>
  10. 10. RAID <ul><li>RAID: Redundant Arrays of Independent Disks </li></ul><ul><li>Disk organization techniques that manage a large number of disks. </li></ul><ul><ul><li>high capacity and high speed by using multiple disks in parallel, and </li></ul></ul><ul><ul><li>high reliability by storing data redundantly </li></ul></ul>
  11. 11. Mean time to failure (MTTF) <ul><li>Average time the disk is expected to run continuously without any failure. </li></ul><ul><ul><li>Typically 3 to 5 years (1 year = 8,760 hours) </li></ul></ul><ul><ul><li>MTTF = 30,000 to 1,200,000 hours for a new disk </li></ul></ul><ul><ul><ul><li>An MTTF of 1,200,000 hours for a new disk means that, given 1000 relatively new disks, on average one will fail every 1200 hours </li></ul></ul></ul><ul><ul><ul><li>(assuming failures follow an exponential distribution) </li></ul></ul></ul><ul><li>When the number of disks increases, the chance that some disk fails increases proportionally </li></ul>
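The 1000-disk figure on this slide can be checked with a short calculation. This is only a sketch: it assumes independent disks with exponentially distributed lifetimes (the assumption the slide names), and the function name is made up for illustration.

```python
def expected_hours_between_failures(mttf_hours: float, num_disks: int) -> float:
    """Mean time until some disk in the array fails.

    Assumes independent disks with exponentially distributed lifetimes,
    so the failure rates of the individual disks simply add up.
    """
    if num_disks <= 0:
        raise ValueError("num_disks must be positive")
    return mttf_hours / num_disks


# The slide's example: 1000 disks, each with an MTTF of 1,200,000 hours.
print(expected_hours_between_failures(1_200_000, 1000))  # 1200.0
```

With 8,760 hours in a year, 1200 hours is about 50 days, which is why large arrays need redundancy.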
  12. 12. Parallelism <ul><li>Two main goals of parallelism in a disk system: </li></ul><ul><ul><li>1. Load balance multiple small accesses to increase throughput </li></ul></ul><ul><ul><li>2. Parallelize large accesses to reduce response time. </li></ul></ul><ul><li>Basic strategy: Striping </li></ul><ul><ul><li>Compare and contrast bit striping and byte striping </li></ul></ul>
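As a rough illustration of striping, the sketch below spreads consecutive bytes round-robin across disks, so byte i lands on disk i mod N. The function names and the in-memory "disks" are assumptions for illustration only. Bit striping would apply the same idea one level lower, splitting the bits of each byte across the disks.

```python
def stripe_bytes(data: bytes, num_disks: int) -> list[bytes]:
    """Byte-level striping: byte i is placed on disk i % num_disks."""
    return [data[d::num_disks] for d in range(num_disks)]


def unstripe_bytes(stripes: list[bytes]) -> bytes:
    """Reassemble the original byte string from the per-disk stripes."""
    n = len(stripes)
    out = bytearray(sum(len(s) for s in stripes))
    for d, stripe in enumerate(stripes):
        out[d::n] = stripe  # put disk d's bytes back at positions d, d+n, ...
    return bytes(out)


stripes = stripe_bytes(b"ABCDEFGH", 4)
print(stripes)                  # [b'AE', b'BF', b'CG', b'DH']
print(unstripe_bytes(stripes))  # b'ABCDEFGH'
```

A large read can fetch all stripes at once, which is the "parallelize large accesses" goal listed on the slide.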
  13. 13. Redundancy <ul><li>Store extra information that can be used to rebuild information lost in a disk failure </li></ul><ul><li>Basic strategy: mirroring, parity </li></ul><ul><li>Mean time to data loss depends on mean time to failure, and mean time to repair </li></ul>[Figure: data 10010010, parity bit 1]
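The parity idea can be sketched in a few lines. The example first reproduces the parity bit from the figure (10010010 has three 1 bits, so an even-parity bit of 1), then shows how XOR parity across blocks allows one lost block to be rebuilt. Function names and block contents are illustrative assumptions.

```python
from functools import reduce


def parity_bit(value: int) -> int:
    """Even-parity bit: 1 if the number of 1 bits in value is odd."""
    return bin(value).count("1") % 2


def parity_block(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the single lost block by XOR-ing the survivors with the parity."""
    return parity_block(surviving + [parity])


print(parity_bit(0b10010010))  # 1

blocks = [b"\x92\x0f", b"\x01\xf0", b"\xaa\x55"]  # three hypothetical data disks
parity = parity_block(blocks)
# Pretend the second disk failed and rebuild its block from the rest.
print(rebuild_missing([blocks[0], blocks[2]], parity) == blocks[1])  # True
```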
  14. 14. RAID Levels Twice the read transaction rate of single disks, same write transaction rate as single disks. Due to its cost and complexity, level 2 never really "caught on". Therefore, much of the information below is based upon theoretical analysis, not empirical evidence.
  15. 15. RAID Levels (cont) Very high read data transfer rate and very high write data transfer rate, but controller design is fairly complex. Very high read data transaction rate, but quite complex controller design. Difficult and inefficient data rebuild in the event of disk failure.
  16. 16. RAID Levels (cont) Highest read data transaction rate and medium write data transaction rate. Most complex controller design. Difficult to rebuild in the event of a disk failure (compared to RAID level 1). RAID 6 is essentially an extension of RAID level 5 which allows for additional fault tolerance by using a second independent distributed parity scheme (dual parity).
  17. 17. Choice of RAID Levels <ul><li>Level 1 provides much better write performance than level 5 </li></ul><ul><ul><li>Level 5 requires at least 2 block reads and 2 block writes to write a single block, whereas Level 1 only requires 2 block writes </li></ul></ul><ul><ul><li>Level 1 preferred for high update environments such as log disks </li></ul></ul><ul><li>Level 1 has higher storage cost than level 5 </li></ul><ul><ul><li>disk drive capacities are increasing rapidly (50%/year) whereas disk access times have decreased much less (x 3 in 10 years) </li></ul></ul><ul><ul><li>I/O requirements have increased greatly, e.g. for Web servers </li></ul></ul><ul><ul><li>When enough disks have been bought to satisfy the required rate of I/O, they often have spare storage capacity </li></ul></ul><ul><ul><ul><li>so there is often no extra monetary cost for Level 1! </li></ul></ul></ul><ul><li>Level 5 is preferred for applications with low update rate, and large amounts of data </li></ul><ul><li>Level 1 is preferred for all other applications </li></ul>
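The "2 block reads and 2 block writes" cost of a Level 5 write can be made concrete with a small sketch that counts I/Os. It is a simplification under stated assumptions: `CountingDisk` is a hypothetical in-memory disk, and one object stands in for whichever disk holds the parity block of the stripe (real Level 5 distributes parity across all disks).

```python
class CountingDisk:
    """Hypothetical in-memory disk that counts block reads and writes."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.reads = 0
        self.writes = 0

    def read(self, i):
        self.reads += 1
        return self.blocks[i]

    def write(self, i, block):
        self.writes += 1
        self.blocks[i] = block


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_write(data_disk, parity_disk, i, new_block):
    """Small write: read old data and old parity, then write both back."""
    old_block = data_disk.read(i)
    old_parity = parity_disk.read(i)
    data_disk.write(i, new_block)
    # new parity = old parity XOR old data XOR new data
    parity_disk.write(i, xor(xor(old_parity, old_block), new_block))


def raid1_write(primary, mirror, i, new_block):
    """Mirrored write: no reads, just two block writes."""
    primary.write(i, new_block)
    mirror.write(i, new_block)


d0, d1 = CountingDisk([b"\x0f"]), CountingDisk([b"\xf0"])
parity = CountingDisk([xor(b"\x0f", b"\xf0")])
raid5_write(d0, parity, 0, b"\x3c")
print(d0.reads + parity.reads, d0.writes + parity.writes)  # 2 2
```

The parity stays consistent with the untouched disk `d1` without reading it, which is why the cost is fixed at two reads and two writes regardless of how many data disks the stripe spans.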
  18. 18. Buffer Management <ul><li>The database cannot fit entirely in memory, so it uses memory as a buffer for speed reasons </li></ul><ul><li>LRU is used in many operating systems </li></ul><ul><ul><li>Spatial and temporal locality due to loops </li></ul></ul><ul><li>A database has more predictable behavior </li></ul><ul><ul><li>Example: join </li></ul></ul>
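A minimal sketch of an LRU buffer pool follows, to make the replacement policy concrete. The class name, the `read_block` callback, and the block ids are assumptions for illustration; a real buffer manager also handles pinned and dirty pages, which are omitted here.

```python
from collections import OrderedDict


class LRUBufferPool:
    """Keeps at most `capacity` blocks in memory, evicting the least recently used."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block   # function that fetches a block from disk
        self.frames = OrderedDict()    # block_id -> contents, oldest first
        self.disk_reads = 0

    def get(self, block_id):
        if block_id in self.frames:
            self.frames.move_to_end(block_id)  # mark as most recently used
            return self.frames[block_id]
        self.disk_reads += 1
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)    # evict the least recently used
        self.frames[block_id] = self.read_block(block_id)
        return self.frames[block_id]


pool = LRUBufferPool(capacity=2, read_block=lambda b: f"block-{b}")
for b in [1, 2, 1, 3, 2]:
    pool.get(b)
print(pool.disk_reads)  # 4

# A repeated scan over 3 blocks with only 2 frames misses on every access.
scan = LRUBufferPool(capacity=2, read_block=lambda b: f"block-{b}")
for b in [1, 2, 3] * 2:
    scan.get(b)
print(scan.disk_reads)  # 6
```

The second run hints at why the slide singles out joins: when a query rescans the same blocks in a loop, the access pattern is known in advance, and a database can pick a policy better suited to it than plain LRU.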
  19. 19. File Organization <ul><li>The database is stored as a collection of files . Each file is a sequence of records . A record is a sequence of fields </li></ul><ul><li>Approaches: </li></ul><ul><li>– assume record size is fixed </li></ul><ul><li>– each file has records of one particular type only </li></ul><ul><li>– different files are used for different relations </li></ul>
  20. 20. Fixed-Length Records <ul><li>Simple approach: </li></ul><ul><li>– Store record i starting from byte n *(i − 1), where n is the size of each record </li></ul><ul><li>– Record access is simple but records may cross blocks </li></ul><ul><li>Deletion of record i – Several alternatives: </li></ul><ul><li>– move records i + 1, . . . , n to i, . . . , n − 1 </li></ul><ul><li>– move record n to i </li></ul><ul><li>– link all free records on a free list </li></ul>
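The offset rule and one of the deletion alternatives can be sketched as follows. Records are numbered from 1 as on the slide; the 4-byte record size, the block size, and the function names are assumptions chosen to keep the example small.

```python
def record_offset(i: int, n: int) -> int:
    """Byte offset of record i (1-based) when every record is n bytes long."""
    return n * (i - 1)


def crosses_block(i: int, n: int, block_size: int) -> bool:
    """True if record i starts in one block and ends in the next."""
    start = record_offset(i, n)
    return start // block_size != (start + n - 1) // block_size


def delete_by_moving_last(buf: bytearray, i: int, n: int) -> None:
    """Delete record i by copying the last record into its slot, then truncating."""
    count = len(buf) // n
    last = record_offset(count, n)
    slot = record_offset(i, n)
    buf[slot:slot + n] = buf[last:last + n]
    del buf[last:]


data = bytearray(b"AAAABBBBCCCCDDDD")  # four 4-byte records
print(record_offset(3, 4))             # 8
delete_by_moving_last(data, 2, 4)
print(bytes(data))                     # b'AAAADDDDCCCC'
print(crosses_block(3, 6, 16))         # True: bytes 12..17 span two 16-byte blocks
```

Moving the last record touches only one record, whereas shifting records i + 1 onward moves all of them; the price is that the file no longer keeps its original record order.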
  21. 21. Free Lists <ul><li>Store the address of the first record whose contents are deleted in the file header </li></ul><ul><li>Use this first record to store the address of the second available record, and so on </li></ul><ul><li>Can think of these stored addresses as pointers since they “point” to the location of a record </li></ul><ul><li>More space efficient representation: reuse space for normal attributes of free records to store pointers. (No pointers stored in in-use records) </li></ul>
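A free list can be sketched with a header pointer plus pointers stored inside the deleted slots themselves. The class below is an illustrative assumption, not the tutorial's implementation: slots live in a Python list rather than on disk, and newly freed slots are pushed onto the front of the list.

```python
class FixedLengthFile:
    """Fixed-length record file with a free list threaded through deleted slots."""

    def __init__(self):
        self.slots = []        # each entry is ("used", record) or ("free", next_free)
        self.free_head = None  # file header: index of the first free slot, if any

    def insert(self, record) -> int:
        if self.free_head is not None:
            slot = self.free_head
            # The freed slot's own space holds the pointer to the next free slot.
            self.free_head = self.slots[slot][1]
            self.slots[slot] = ("used", record)
        else:
            self.slots.append(("used", record))
            slot = len(self.slots) - 1
        return slot

    def delete(self, slot: int) -> None:
        if self.slots[slot][0] == "free":
            raise ValueError("slot is already free")
        self.slots[slot] = ("free", self.free_head)
        self.free_head = slot


f = FixedLengthFile()
for name in ["a", "b", "c"]:
    f.insert(name)
f.delete(1)
f.delete(0)
print(f.free_head, f.slots[0])  # 0 ('free', 1)
print(f.insert("d"))            # 0 (reuses the most recently freed slot)
print(f.insert("e"))            # 1
print(f.insert("f"))            # 3 (free list empty, so the file grows)
```

In-use slots carry no pointer at all, which matches the space-efficient representation described on the slide.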
  22. 22. Variable-Length Records <ul><li>Variable-length records arise in database systems in several ways: </li></ul><ul><li>– Storage of multiple record types in a file </li></ul><ul><li>– Record types that allow variable lengths for one or more fields </li></ul><ul><li>– Record types that allow repeating fields (used in some older data models) </li></ul><ul><li>Byte string representation </li></ul><ul><li>– Attach an end-of-record control character to the end of each record </li></ul>
  23. 23. Organization of Records in Files <ul><li>Sequential File Organization </li></ul><ul><li>Suitable for applications that require sequential processing of the entire file. The records in the file are ordered by a search-key. Need to reorganize the file from time to time to restore sequential order. </li></ul><ul><li>Deletion – use pointer chains </li></ul><ul><li>Insertion – must locate the position in the file where the record is to be inserted </li></ul><ul><li>– If there is free space insert there </li></ul><ul><li>– If no free space, insert the record in an overflow block </li></ul><ul><li>– In either case, pointer chain must be updated </li></ul><ul><li>Clustering File Organization </li></ul><ul><li>Simple file structure stores each relation in a separate file </li></ul><ul><li>Can instead store several relations in one file using a clustering file organization </li></ul>
  24. 24. Exercises (1) <ul><li>Show the structure of the file below after each of the following steps: </li></ul><ul><li>a. Insert(Brighton, A-323,1600) b. Delete record 2 </li></ul><ul><li>c. Insert( Brighton, A-626,2000) </li></ul>
  25. 25. Exercises (2) <ul><li>Show the structure of the file below after each of the following steps: </li></ul><ul><li>a. Insert(Mianus, A-101, 2800) b. Insert( Brighton, A-323, 1600) c. Delete( Perryridge, A-102, 400) </li></ul>
  26. 26. <ul><li>Thanks! </li></ul>