  1. File Processing: Storage Media. Spring 2008, Pusan National University, Ki-Joune Li
  2. Major Functions of a Computer <ul><li>Computation</li><li>Storage</li><li>Communication</li><li>Presentation</li></ul>
  3. Storage of Data <ul><li>Major Challenges <ul><li>How to store and manage a large amount of data <ul><li>Example: more than 100 petabytes for the EOS Project</li></ul></li><li>How to represent sophisticated data</li></ul></li></ul>
  4. Modeling and Representation of the Real World <ul><li>Example <ul><li>Building a DB about Korean history</li><li>Very complicated, and dependent on viewpoint</li></ul></li><li>Database course: 2008 Fall semester</li></ul> (Figure labels: Real World, Computer World)
  5. Managing a Large Volume of Data <ul><li>Large Volume of Data <ul><li>Cost of Storage Media <ul><li>Not very important; negligible</li></ul></li><li>Processing Time <ul><li>Comparison between main memory and disk access time <ul><li>RAM: several nanoseconds (10<sup>-9</sup> sec)</li><li>Disk: several milliseconds (10<sup>-3</sup> sec)</li></ul></li><li>Time is the most valuable resource</li><li>Example <ul><li>Retrieving a piece of data from a 100-petabyte DB</li></ul></li></ul></li></ul></li></ul>
  6. Managing a Large Volume of Data <ul><li>Management of Data <ul><li>Secure Management <ul><li>Against hacking</li><li>Against any kind of disaster</li></ul></li><li>Consistency of Data <ul><li>Example <ul><li>Failure during a flight reservation transaction</li><li>Concurrent transactions</li></ul></li></ul></li></ul></li></ul>
  7. Goals of File Systems <ul><li>To provide <ul><li>1. Efficient data structures for storing large and complex data</li><li>2. Access methods for rapid search</li><li>3. Query processing methods</li><li>4. Robust management of transactions</li></ul></li></ul>
  8. Memory Hierarchy <ul><li>Large Data Volume <ul><li>Cannot be stored in main memory</li><li>But in secondary memory</li></ul></li><li>Memory Hierarchy (faster toward the top, cheaper per byte toward the bottom) <ul><li>Cache Memory: 256 KB</li><li>Main Memory: 1 GB</li><li>Secondary Memory: 100 GB</li><li>Tertiary Memory: 100 TB</li></ul></li></ul>
  9. Flash Memory <ul><li>Non-volatile <ul><li>Data survives power failure</li><li>Data can be written at a location only once, but the location can be erased and written to again <ul><li>Supports only a limited number of write/erase cycles</li><li>Erasing has to be done on an entire bank of memory</li></ul></li></ul></li><li>Speed <ul><li>Reads are roughly as fast as main memory</li><li>But writes are slow (a few microseconds), and erase is slower</li></ul></li><li>Cost per unit of storage is roughly similar to main memory</li><li>Widely used in embedded devices such as digital cameras</li></ul>
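The write-once-until-erased behavior on the flash slide can be sketched as a toy model. This is only an illustration under stated assumptions: `struct flash_bank`, `bank_erase`, `bank_write`, and the tiny `BANK_SIZE` are hypothetical names and values, not taken from the slides or from any real flash driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BANK_SIZE 4  /* locations per bank; deliberately tiny (assumption) */

/* Hypothetical model of one flash bank, for illustration only. */
struct flash_bank {
    unsigned char data[BANK_SIZE];
    bool written[BANK_SIZE];   /* true once a location holds data */
    unsigned erase_count;      /* real devices allow only a limited number */
};

/* Erase always works on the whole bank, never on a single location. */
void bank_erase(struct flash_bank *b)
{
    memset(b->data, 0xFF, sizeof b->data);
    memset(b->written, 0, sizeof b->written);
    b->erase_count++;
}

/* A write succeeds only if the location has not been written
 * since the last erase of its bank. */
bool bank_write(struct flash_bank *b, size_t loc, unsigned char value)
{
    if (loc >= BANK_SIZE || b->written[loc])
        return false;
    b->data[loc] = value;
    b->written[loc] = true;
    return true;
}
```

In this model, a second `bank_write` to the same location fails until `bank_erase` has been called, which is why updating one byte may cost a whole-bank erase.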
  10. Optical Storage <ul><li>Non-volatile <ul><li>Data is read optically from a spinning disk using a laser</li><li>Read-only and write-once: CD-ROM (800 MB), DVD (4.7 to 17 GB), CD-R, DVD-R</li><li>Rewritable: CD-RW, DVD-RW, and DVD-RAM</li></ul></li><li>Speed <ul><li>Reads and writes are slower than with magnetic disk</li></ul></li><li>Jukebox systems <ul><li>Large numbers of removable disks,</li><li>Few drives, and</li><li>A mechanism for automatic loading/unloading of disks</li><li>For storing large volumes of data</li></ul></li></ul>
  11. Tape <ul><li>Non-volatile <ul><li>Primarily used for backup</li></ul></li><li>Speed <ul><li>Sequential access: much slower than disk</li></ul></li><li>Cost <ul><li>Very high capacity (40 to 300 GB tapes available)</li><li>Tape can be removed from the drive</li><li>Drives are expensive</li></ul></li><li>Tape jukeboxes <ul><li>Hundreds of terabytes to even a petabyte</li></ul></li></ul>
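  12. Data Access with Secondary Memory <ul><li>Access request to main memory <ul><li>If the data is in main memory: get data</li><li>If not in main memory: access the disk, load into main memory, then get data</li></ul></li><li>Hit ratio: r<sub>h</sub> = n<sub>h</sub> / n<sub>a</sub></li><li>How to increase the hit ratio?</li></ul>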
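The access path and the hit ratio can be sketched in a few lines of C. The sketch assumes that n<sub>h</sub> counts accesses served from main memory and n<sub>a</sub> counts all accesses, and, purely for illustration, that main memory buffers exactly one disk block at a time. `count_hits` and `hit_ratio` are hypothetical helper names.

```c
#include <stddef.h>

/* Count accesses served from main memory (n_h), assuming a buffer that
 * holds exactly one disk block. Block numbers must be non-negative. */
size_t count_hits(const long *blocks, size_t n_a)
{
    size_t n_h = 0;
    long buffered = -1;            /* -1: nothing loaded yet */
    for (size_t i = 0; i < n_a; i++) {
        if (blocks[i] == buffered) {
            n_h++;                 /* hit: get data from main memory */
        } else {
            buffered = blocks[i];  /* miss: access disk, load the block */
        }
    }
    return n_h;
}

/* Hit ratio r_h = n_h / n_a; defined as 0.0 when there are no accesses. */
double hit_ratio(size_t n_h, size_t n_a)
{
    return n_a == 0 ? 0.0 : (double)n_h / (double)n_a;
}
```

For the request sequence of block numbers 0, 0, 0, 1, 1, 2, this model gives three hits out of six accesses, so the hit ratio is 0.5. Requests that stay within the same block raise the ratio.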
  13. Why is the Hit Ratio so important? <ul><li>Example <ul><li>for(int i=0;i<1000;i++) <ul><li>Nbytes=read(fd,buf,100);</li></ul></li><li>1000 disk accesses?</li><li>When r<sub>h</sub> = 0: 1000 * 10<sup>-2</sup> sec = 10 sec</li><li>When r<sub>h</sub> = 1: 1000 * 10<sup>-8</sup> sec = 10<sup>-5</sup> sec</li></ul></li></ul>
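The two extreme cases on this slide generalize to any hit ratio. The sketch below assumes a simple linear cost model in which every hit costs one main-memory access and every miss costs one disk access. The function name `total_access_time` and the model itself are assumptions for illustration; only the per-access times of 10<sup>-8</sup> sec and 10<sup>-2</sup> sec come from the slide.

```c
/* Total time in seconds for n_a accesses with hit ratio r_h, assuming
 * each hit costs t_mem seconds and each miss costs t_disk seconds. */
double total_access_time(unsigned long n_a, double r_h,
                         double t_mem, double t_disk)
{
    double hits = r_h * (double)n_a;
    double misses = (double)n_a - hits;
    return hits * t_mem + misses * t_disk;
}
```

With `n_a = 1000`, `t_mem = 1e-8`, and `t_disk = 1e-2`, a hit ratio of 0 gives about 10 sec and a hit ratio of 1 gives about 10<sup>-5</sup> sec, matching the slide. Under the same model, a hit ratio of 0.9 still costs about 1 sec, because the remaining 100 disk accesses dominate.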
  14. Physical Structure of Disk (figure labels: 512 bytes per sector; 200~400 sectors; 2 * n DF)
  15. Disk Access Time <ul><li>Disk Access Time <ul><li>t = t<sub>S</sub> + t<sub>R</sub> + t<sub>T</sub>, where</li><li>t<sub>S</sub>: Seek Time <ul><li>Time to reposition the head over the correct track</li><li>Average seek time is 1/2 the worst-case seek time</li><li>4 to 10 milliseconds on typical disks</li></ul></li><li>t<sub>R</sub>: Rotational Latency <ul><li>Time waiting for the correct sector to rotate under the head</li><li>Average rotational latency: ½ r (to find index point) + ½ r = r, where r is the time of one rotation</li><li>In case of 15000 rpm: r = 60 sec / 15000 = 4 msec</li></ul></li><li>t<sub>T</sub>: Transfer Time <ul><li>Time to transfer data from disk to main memory via channel</li><li>Proportional to the number of sectors to read</li><li>Real transfer time is negligible</li></ul></li></ul></li></ul>
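The formula t = t<sub>S</sub> + t<sub>R</sub> + t<sub>T</sub> and the rotation-time arithmetic can be checked with a small sketch. The function names are hypothetical. The example values follow the slide: a seek time at the low end of the 4 to 10 msec range, an average rotational latency of one rotation time r, and a transfer time taken as negligible.

```c
/* Time of one full rotation in milliseconds for a given spindle speed. */
double rotation_time_ms(double rpm)
{
    return 60.0 * 1000.0 / rpm;   /* 60 sec per minute, 1000 msec per sec */
}

/* Disk access time t = t_S + t_R + t_T, all arguments in milliseconds. */
double disk_access_time_ms(double t_seek, double t_rot, double t_transfer)
{
    return t_seek + t_rot + t_transfer;
}
```

For a 15000 rpm disk, `rotation_time_ms(15000.0)` is 4 msec, so `disk_access_time_ms(4.0, 4.0, 0.0)` gives about 8 msec per access under these assumptions.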
  16. Block-Oriented Disk Access <ul><li>Example <ul><li>for(int i=0;i<1000;i++) <ul><li>Nbytes=read(fd,buf,10);</li></ul></li></ul></li></ul> (Figure labels: 1000 times 10 bytes; buffer in main memory, 1024 bytes; 1 block (e.g. 1024 bytes); number of disk accesses: 10 times, 100 times)
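The number of disk accesses in this example can be estimated with a short calculation. It assumes that the reads are sequential from the start of the file and that every disk access loads exactly one block into the buffer. `disk_accesses` is a hypothetical helper, not a system call.

```c
#include <stddef.h>

/* Number of block loads needed to read n_reads records of rec_size bytes
 * sequentially from the start of a file, when the disk is accessed one
 * block of block_size bytes at a time. */
size_t disk_accesses(size_t n_reads, size_t rec_size, size_t block_size)
{
    size_t total = n_reads * rec_size;      /* bytes requested overall */
    if (total == 0 || block_size == 0)
        return 0;
    return (total + block_size - 1) / block_size;   /* ceiling division */
}
```

For 1000 reads of 10 bytes each and a 1024-byte block, `disk_accesses(1000, 10, 1024)` returns 10, whereas one disk access per read would cost 1000 accesses.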
  17. Disk Block <ul><li>Unit of disk access</li><li>Block Size <ul><li>Normally a multiple of the sector size</li><li>1K, 4K, 16K or 64K bytes depending on configuration</li></ul></li><li>Why not a large block? <ul><li>Limited by the size of available main memory</li><li>Too large: unnecessary accesses of sectors <ul><li>e.g. only 100 bytes are needed, but the block size is 64K <ul><li>1 block: 128 sectors (about ½ track, ½ rotation, 2 msec)</li><li>Too wasteful</li></ul></li></ul></li></ul></li></ul>
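The 64K example can be worked through numerically. The sketch assumes 512-byte sectors, 256 sectors per track (a value chosen here so that 128 sectors is half a track, within the 200~400 range mentioned earlier), and a 4 msec rotation time. All function names are hypothetical.

```c
#include <stddef.h>

#define SECTOR_SIZE 512u   /* bytes per sector (assumption) */

/* Number of sectors occupied by one block, rounded up. */
size_t sectors_per_block(size_t block_size)
{
    return (block_size + SECTOR_SIZE - 1) / SECTOR_SIZE;
}

/* Time to pass n_sectors under the head, as a fraction of one rotation. */
double transfer_time_ms(size_t n_sectors, size_t sectors_per_track,
                        double rotation_ms)
{
    return (double)n_sectors / (double)sectors_per_track * rotation_ms;
}

/* Fraction of a block that is read but not needed. */
double wasted_fraction(size_t needed_bytes, size_t block_size)
{
    if (block_size == 0 || needed_bytes >= block_size)
        return 0.0;
    return 1.0 - (double)needed_bytes / (double)block_size;
}
```

Under these assumptions, a 64K block spans 128 sectors, takes about 2 msec to pass under the head, and more than 99.8% of it is wasted when only 100 bytes are needed.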