Memory Systems: Presentation Transcript

• The Memory Hierarchy. Topics: storage technologies and trends; locality of reference; caching in the memory hierarchy.
• Main Memory. Main memory provides the main storage for a computer. Two CPU registers are used to interface the CPU to the main memory: the memory address register (MAR) and the memory data register (MDR). The MDR holds the data to be stored in, or retrieved from, the memory location whose address is held in the MAR.
• Traditional Architecture (figure: connection of the memory to the processor). The processor is connected to the memory through the MAR and MDR, a k-bit address bus, an n-bit data bus, and control lines (R/W, MFC, etc.). The memory has up to 2^k addressable locations, with a word length of n bits.
• Conceptual internal organization of a memory chip.
• Internal Organization of Memory Chips (figure: organization of bit cells in a memory chip). The chip consists of an array of bit cells (flip-flops), an address decoder that drives word lines W0 through W15, Sense/Write circuits, data input/output lines b7 ... b1 b0 (and their complements b'7 ... b'1 b'0), a chip-select input CS, and an R/W control line.
• Size. A 16x8 memory organization (16 words of 8 bits each) has 16 external connections: 4 address, 8 data, 2 control, 2 power/ground. 1K memory cells organized as 128x8 need 19 external connections (7+8+2+2). A 1Kx1 organization needs 15 (10+1+2+2).
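To make the pin-count pattern on this slide explicit, here is a small C sketch (not part of the original deck) that computes external connections as address pins + data pins + 2 control + 2 power/ground pins:

```c
#include <math.h>
#include <stdio.h>

/* Illustrative helper (not from the slides): external connections for a
 * words x bits_per_word memory chip, assuming 2 control and 2 power pins. */
static int pin_count(int words, int bits_per_word) {
    int addr = (int)ceil(log2(words));   /* address pins */
    return addr + bits_per_word + 2 + 2; /* addr + data + control + power */
}

int main(void) {
    printf("16x8  -> %d pins\n", pin_count(16, 8));   /* 16 */
    printf("128x8 -> %d pins\n", pin_count(128, 8));  /* 19 */
    printf("1Kx1  -> %d pins\n", pin_count(1024, 1)); /* 15 */
    return 0;
}
```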
• Main memory (cont'd). Visualize a typical internal main memory structure as consisting of rows and columns of basic cells, each cell storing one bit of information. Cells belonging to a given row can be assumed to form the bits of a given memory word. Address lines A(n-1) A(n-2) ... A1 A0 are used as inputs to the address decoder in order to generate the word select lines W(2^n - 1) ... W1 W0.
• Main memory (cont'd). A given word select line is common to all memory cells in the same row. At any given time, the address decoder activates only one word select line while deactivating the remaining lines. A word select line is used to enable all cells in a row for read or write. Data (bit) lines are used to input or output the contents of cells. Each memory cell is connected to two data lines; a given data line is common to all cells in a given column.
• What are these "cells"? They are just circuits capable of storing 1 bit of information. In static CMOS technology, each main memory cell consists of two inverters connected back to back (six transistors).
• How does the above circuit work? If A = 1, then transistor N2 will be ON and point B = 0, which in turn will cause transistor P1 to be ON, thus holding point A = 1. This represents a cell stable state; call it state 1. If A = 0, then transistor P2 will be ON and point B = 1, which in turn will cause transistor N1 to be ON, thus holding point A = 0. This represents the other stable state, state 2. Transistors N3 and N4 connect the cell to the two data (bit) lines. Normally (if the word select line is not activated) these transistors are turned off, thus protecting the cell from the signal values carried by the data lines. The two transistors (N3 and N4) are turned on when the word select line is activated. What takes place when the two transistors are turned on depends on the intended memory operation.
• Read operation: 1. Both bit lines b and b' are precharged high. 2. The word select line is activated, turning on both transistors N3 and N4. 3. Depending on the value stored in the cell, point A (B) will lead to the discharge of line b (b').
• Write operation: 1. The bit lines are precharged such that b (b') = 1 (0). 2. The word select line is activated, turning on both transistors N3 and N4. 3. The bit line precharged with 0 forces the point A (B) that holds 1 down to 0.
• Main drawback of this organization. Consider a 1K x 4 memory chip. Using the organization shown above, the memory array would be organized as 1K rows of cells, each row consisting of 4 cells. The chip would then need 10 pins for the address and 4 pins for the data. However, this does not lead to the best utilization of the chip area.
• A Memory Chip (figure: organization of a 1K x 1 memory chip). A 32 x 32 memory cell array; the 10-bit address is split into a 5-bit row address fed to a row decoder and a 5-bit column address fed to a 32-to-1 output multiplexer and input demultiplexer; Sense/Write circuitry, CS, and R/W complete the chip.
• Impact of using a different organization. Consider, for example, the design of a memory subsystem whose capacity is 4K bits. Different organizations of the same memory capacity lead to different chip pin requirements.
• Example. Consider, for example, the design of a 4M byte main memory subsystem using 1M bit chips. The number of required chips is 32. The number of address lines required for the 4M system is 22, while the number of data lines is 8. The memory subsystem can be arranged in four rows, each having eight chips. The least significant 20 address lines A19 to A0 are used to address any of the basic building-block 1M x 1-bit chips. The high-order two address lines A21 and A20 are used as inputs to a 2-to-4 decoder.
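A small C sketch (not from the original deck) that reproduces the arithmetic of this example: chips needed, address and data lines, and the row arrangement:

```c
#include <stdio.h>

int main(void) {
    long capacity_bits = 4L * 1024 * 1024 * 8; /* 4M bytes = 32M bits       */
    long chip_bits     = 1L * 1024 * 1024;     /* one 1M x 1-bit chip       */
    int  chips         = (int)(capacity_bits / chip_bits); /* 32            */
    int  addr_lines    = 22;  /* 4M byte-addressable locations -> 2^22      */
    int  data_lines    = 8;   /* one byte per access                        */
    int  rows          = 4;   /* selected by A21..A20 via the 2-to-4 decoder */
    int  chips_per_row = chips / rows;                       /* 8           */

    printf("chips=%d, address lines=%d, data lines=%d, %d rows x %d chips\n",
           chips, addr_lines, data_lines, rows, chips_per_row);
    return 0;
}
```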
• Example (cont'd): figure.
• Example (cont'd): figure.
• Static Memories. These circuits are capable of retaining their state as long as power is applied. (Figure: a static RAM cell with a word line, bit lines b and b', and access transistors T1 and T2.)
• Static Memories. CMOS cell: low power consumption.
• Asynchronous DRAMs. Static RAMs are fast, but they cost more area and are more expensive. Dynamic RAMs (DRAMs) are cheap and area-efficient, but they cannot retain their state indefinitely; they need to be refreshed periodically. (Figure: a single-transistor dynamic memory cell with transistor T, capacitor C, word line, and bit line.)
• Random-Access Memory (RAM). Key features: RAM is packaged as a chip; the basic storage unit is a cell (one bit per cell); multiple RAM chips form a memory. Static RAM (SRAM): each cell stores a bit with a six-transistor circuit; retains its value indefinitely as long as it is kept powered; relatively insensitive to disturbances such as electrical noise; faster and more expensive than DRAM. Dynamic RAM (DRAM): each cell stores a bit with a capacitor and a transistor; the value must be refreshed every 10-100 ms; sensitive to disturbances; slower and cheaper than SRAM.
• Non-Volatile RAM (NVRAM). Key feature: keeps data when power is lost. Several types exist; reading assignment (read from your textbook and from the web).
• Typical Bus Structure Connecting CPU and Memory. A bus is a collection of parallel wires that carry address, data, and control signals. Buses are typically shared by multiple devices. (Figure: a CPU chip with register file, ALU, and bus interface; the system bus connects through an I/O bridge to the memory bus and main memory.)
• Memory Read Transaction (1), load operation movl A, %eax: the CPU places address A on the memory bus.
• Memory Read Transaction (2): main memory reads A from the memory bus, retrieves word x, and places it on the bus.
• Memory Read Transaction (3): the CPU reads word x from the bus and copies it into register %eax.
• Memory Write Transaction (1), store operation movl %eax, A: the CPU places address A on the bus; main memory reads it and waits for the corresponding data word to arrive.
• Memory Write Transaction (2): the CPU places data word y on the bus.
• Memory Write Transaction (3): main memory reads data word y from the bus and stores it at address A.
• Locality. Principle of locality: programs tend to reuse data and instructions near those they have used recently, or that were recently referenced themselves. Temporal locality: recently referenced items are likely to be referenced in the near future. Spatial locality: items with nearby addresses tend to be referenced close together in time. Locality example:
      sum = 0;
      for (i = 0; i < n; i++)
          sum += a[i];
      return sum;
  - Data: array elements are referenced in succession (stride-1 reference pattern): spatial locality; sum is referenced on each iteration: temporal locality.
  - Instructions: instructions are referenced in sequence: spatial locality; the loop is cycled through repeatedly: temporal locality.
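As an added illustration (not part of the original slides), the effect of stride on spatial locality can be seen by summing a row-major 2D array in two different orders; a minimal C sketch:

```c
#include <stdio.h>

/* Good spatial locality: rows of a row-major array are traversed
 * contiguously (stride-1), so each fetched cache block is fully used. */
static long sum_rowwise(int n, int m, int a[n][m]) {
    long sum = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            sum += a[i][j];
    return sum;
}

/* Poor spatial locality: column-wise traversal jumps m elements at a
 * time (stride-m), touching a different cache block on almost every access. */
static long sum_colwise(int n, int m, int a[n][m]) {
    long sum = 0;
    for (int j = 0; j < m; j++)
        for (int i = 0; i < n; i++)
            sum += a[i][j];
    return sum;
}

int main(void) {
    static int a[4][8];   /* zero-initialized demo array */
    printf("row-wise sum = %ld, column-wise sum = %ld\n",
           sum_rowwise(4, 8, a), sum_colwise(4, 8, a));
    return 0;
}
```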
• Memory Hierarchy. Some fundamental and enduring properties of hardware and software: fast storage technologies cost more per byte and have less capacity; the gap between CPU and main memory speed is widening; well-written programs tend to exhibit good locality. These fundamental properties complement each other beautifully. They suggest an approach for organizing memory and storage systems known as a memory hierarchy.
• An Example Memory Hierarchy (figure). Moving from smaller, faster, costlier (per byte) storage devices to larger, slower, cheaper ones: L0: CPU registers hold words retrieved from the L1 cache; L1: on-chip L1 cache (SRAM) holds cache lines retrieved from the L2 cache; L2: off-chip L2 cache (SRAM) holds cache lines retrieved from main memory; L3: main memory (DRAM) holds disk blocks retrieved from local disks; L4: local secondary storage (local disks) holds files retrieved from disks on remote network servers; L5: remote secondary storage (distributed file systems, Web servers).
• Caches. A cache is a smaller, faster storage device that acts as a staging area for a subset of the data in a larger, slower device. Fundamental idea of a memory hierarchy: for each k, the faster, smaller device at level k serves as a cache for the larger, slower device at level k+1. Why do memory hierarchies work? Programs tend to access data at level k more often than they access data at level k+1; thus, storage at level k+1 can be slower, and therefore larger and cheaper per bit. Net effect: a large pool of memory that costs as little as the cheap storage near the bottom, but that serves data to programs at approximately the rate of the fast storage near the top.
• Caching in a Memory Hierarchy (figure). The larger, slower, cheaper storage device at level k+1 is partitioned into blocks (0-15 in the example). The smaller, faster, more expensive device at level k caches a subset of those blocks (e.g., blocks 8, 9, 14, and 3). Data is copied between levels in block-sized transfer units.
• General Caching Concepts. The program needs object d, which is stored in some block b. Cache hit: the program finds b in the cache at level k (e.g., block 14). Cache miss: b is not at level k, so the level k cache must fetch it from level k+1 (e.g., block 12). If the level k cache is full, then some current block must be replaced (evicted). Which one is the "victim"? Placement policy: where can the new block go? (e.g., b mod 4). Replacement policy: which block should be evicted?
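A minimal C sketch (not from the original slides) of these hit/miss and placement ideas, assuming a 4-block level-k cache and the b mod 4 placement policy used in the example:

```c
#include <stdbool.h>
#include <stdio.h>

#define CACHE_BLOCKS 4
#define EMPTY (-1)

/* cache[i] holds the number of the level-(k+1) block currently stored in
 * level-k slot i, or EMPTY. Placement policy: slot = b mod CACHE_BLOCKS. */
static int cache[CACHE_BLOCKS] = { EMPTY, EMPTY, EMPTY, EMPTY };

static bool access_block(int b) {
    int slot = b % CACHE_BLOCKS;   /* placement policy */
    if (cache[slot] == b)
        return true;               /* cache hit */
    cache[slot] = b;               /* miss: fetch from level k+1, evicting
                                      whatever previously occupied the slot */
    return false;
}

int main(void) {
    int refs[] = { 14, 12, 14, 8, 12 };
    for (int i = 0; i < 5; i++)
        printf("block %2d: %s\n", refs[i],
               access_block(refs[i]) ? "hit" : "miss");
    return 0;
}
```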
• General Caching Concepts (cont'd): figure.
• Cache Memory Principles. Cache memory is intended to give memory speed approaching that of the fastest memories available.
• Cache Memory Principles (cont'd). When the processor attempts to read a word of memory, a check is made to determine if the word is in the cache. If so, the word is delivered to the processor. If not, a block of main memory, consisting of some fixed number of words, is read into the cache and then the word is delivered to the processor. Because of the phenomenon of locality of reference, when a block of data is fetched into the cache to satisfy a single memory reference, it is likely that there will be future references to that same memory location or to other words in the block.
• Cache Memory Principles (cont'd): figure.
• Cache Memory Principles (cont'd). Main memory consists of up to 2^n addressable words, with each word having a unique n-bit address. For mapping purposes, this memory is considered to consist of a number of fixed-length blocks of K words each; that is, there are M = 2^n / K blocks in main memory. The cache consists of m blocks, called lines. NOTE: in referring to the basic unit of the cache, the term line is used rather than the term block. Each line contains K words, plus a tag of a few bits. Each line also includes control bits (not shown), such as a bit to indicate whether the line has been modified since being loaded into the cache.
• Cache Memory Principles (cont'd). The number of lines is considerably less than the number of main memory blocks (m << M). If a word in a block of memory is read, that block is transferred to one of the lines of the cache. Because there are more blocks than lines, an individual line cannot be uniquely and permanently dedicated to a particular block. Thus, each line includes a tag that identifies which particular block is currently being stored. The tag is usually a portion of the main memory address.
• Cache Read Operation: figure.
• Elements of Cache Design.
• Cache Addresses (Logical). A logical cache, also known as a virtual cache, stores data using virtual addresses. The processor accesses the cache directly, without going through the MMU (Memory Management Unit). Advantage of the logical cache: cache access speed is faster than for a physical cache, because the cache can respond before the MMU performs an address translation. Disadvantage: the same virtual address in two different applications refers to two different physical addresses.
• Cache Addresses (Physical). A physical cache stores data using main memory physical addresses. The subject of logical versus physical cache is a complex one and beyond the scope of this book.
• Mapping Functions (figure). Example: memory block 12 can be placed anywhere in a fully associative cache; anywhere in set 0 (12 mod 4) of a 2-way set-associative cache with 4 sets; and only into block 4 (12 mod 8) of a direct-mapped cache with 8 blocks.
• Direct-Mapped Cache Organization (figure: a 16-block main memory and a 4-block cache). Each cache block carries a tag that indicates which memory block it holds. Destination cache block = (memory block number) modulo (total number of cache blocks); e.g., memory block 0 maps to cache block 0 (0 modulo 4 = 0). Tag size = log2((total number of memory blocks) / (total number of cache blocks)) = log2(16/4) = log2(4) = 2 bits.
• Direct mapping (cont'd) (figure). The (s+w)-bit address is divided into Tag (s-r bits), Block (r bits), and Word (w bits) fields, where s is the number of wires for the row (block) address, w is the number of wires for the column (word) address, and s-r = log2(memory blocks / cache blocks). The block field selects a cache line; the stored tag is compared with the address tag to signal hit or miss.
• Direct mapping (cont'd).
  Address length = (s + w) bits.
  Number of addressable units = 2^(s+w) words or bytes.
  Block size = line size = 2^w words or bytes.
  Number of blocks in main memory = 2^(s+w) / 2^w = 2^s.
  Number of lines in cache = m = 2^r.
  Size of cache = 2^(r+w) words or bytes.
  Size of tag = (s - r) bits.
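To make the address breakdown concrete, here is a small C sketch (not from the original deck) that extracts the tag, line, and word fields; the particular widths w = 2, r = 2, s = 4 are illustrative assumptions matching the 16-block, 4-line example:

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed illustrative parameters:
 * w = 2 -> 4 words per block/line
 * r = 2 -> 4 cache lines
 * s = 4 -> 16 main-memory blocks, so the tag is s - r = 2 bits. */
enum { W_BITS = 2, R_BITS = 2 };

static void split_address(uint32_t addr) {
    uint32_t word = addr & ((1u << W_BITS) - 1);
    uint32_t line = (addr >> W_BITS) & ((1u << R_BITS) - 1);
    uint32_t tag  = addr >> (W_BITS + R_BITS);   /* remaining s - r bits */
    printf("addr %2u -> tag %u, line %u, word %u\n", addr, tag, line, word);
}

int main(void) {
    split_address(0);   /* memory block 0  -> line 0, tag 0 */
    split_address(49);  /* memory block 12 -> line 0, tag 3 */
    return 0;
}
```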
• Fully Associative Cache Organization (figure: a 16-block main memory and a 4-block cache). A memory block can go to any cache block, so each cache block carries a tag that indicates which memory block it holds. Tag size = log2(number of memory blocks).
• Fully Associative Cache Organization (cont'd) (figure). The (s+w)-bit address is divided into a Tag of s = log2(number of memory blocks) bits and a Word field of w bits; the address tag is compared against the tags of all cache lines in parallel to signal hit or miss.
• Set-Associative Cache Organization (figure: a two-way set-associative cache whose sets hold blocks from a 16-block main memory). Destination set = (memory block number) modulo (number of sets).
• Set-Associative Cache Organization (cont'd) (figure). The (s+w)-bit address is divided into a Tag (s-d bits), a Set field (d bits), and a Word field (w bits), where d = log2(number of sets) and s-d is the tag size. The set field selects a set; the address tag is compared against the tags of all lines in that set to signal hit or miss.
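A small C sketch (not from the original deck) of the set-associative placement rule, assuming the 4-set, 2-way arrangement of the figure:

```c
#include <stdio.h>

enum { NUM_SETS = 4, WAYS = 2 };

/* Set-associative placement: a memory block may go into any of the WAYS
 * lines of set (block mod NUM_SETS); the tag is the rest of the block number. */
int main(void) {
    for (int block = 0; block < 16; block++) {
        int set = block % NUM_SETS;
        int tag = block / NUM_SETS;
        printf("memory block %2d -> set %d (tag %d), any of %d ways\n",
               block, set, tag, WAYS);
    }
    return 0;
}
```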
• Replacement Policy. In an associative cache, which block from a set should be evicted when the set becomes full?
  - Random.
  - Least-Recently Used (LRU): the LRU state must be updated on every access; a true implementation is only feasible for small sets (2-way); a pseudo-LRU binary tree is often used for 4-8 ways.
  - First-In, First-Out (FIFO): used in highly associative caches.
  - Not-Most-Recently Used (NMRU): FIFO with an exception for the most recently used block or blocks.
  This is a second-order effect. Why? Because replacement only happens on misses.
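A minimal C sketch (not from the original slides) of true LRU for a single 2-way set, using a last-used counter per way; as noted above, real hardware typically approximates this with a pseudo-LRU tree at higher associativity:

```c
#include <stdio.h>

#define WAYS 2
#define EMPTY (-1)

static int  tags[WAYS]      = { EMPTY, EMPTY };
static long last_used[WAYS] = { 0, 0 };
static long now = 0;

/* Access one set of a WAYS-way associative cache with true LRU replacement. */
static const char *access_set(int tag) {
    now++;
    int victim = 0;
    for (int w = 0; w < WAYS; w++) {
        if (tags[w] == tag) {            /* hit: refresh this way's LRU state */
            last_used[w] = now;
            return "hit";
        }
        if (last_used[w] < last_used[victim])
            victim = w;                  /* track the least-recently used way */
    }
    tags[victim] = tag;                  /* miss: evict the LRU way */
    last_used[victim] = now;
    return "miss";
}

int main(void) {
    int refs[] = { 3, 7, 3, 9, 7 };      /* expected: miss miss hit miss miss */
    for (int i = 0; i < 5; i++)
        printf("tag %d: %s\n", refs[i], access_set(refs[i]));
    return 0;
}
```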
• Secondary Memory.
• The Memory Hierarchy (revisited). Memory in a computer system is arranged in a hierarchy, as shown in the figure.
• At the top we have primary storage, which consists of cache and main memory and provides very fast access to data. Then comes secondary storage, which consists of slower devices such as magnetic disks. Tertiary storage is the slowest class of storage devices; for example, optical disks and tapes.
• Slower storage devices such as tapes and disks play an important role in database systems because the amount of data is typically very large. Since buying enough main memory to store all data is prohibitively expensive, we must store data on tapes and disks and build database systems that can retrieve data from lower levels of the memory hierarchy into main memory as needed for processing.
• Why store data on secondary memory? COST: the cost of a given amount of main memory is about 100 times the cost of the same amount of disk space, and tapes are even less expensive than disks. PRIMARY STORAGE IS USUALLY VOLATILE: main memory is comparatively small, and the amount of data may exceed what it can hold. Further, data must be maintained across program executions. This requires storage devices that retain information when the computer is restarted (after a shutdown or a crash); we call such storage nonvolatile.
• Primary storage is usually volatile (although it is possible to make it nonvolatile by adding a battery backup feature), whereas secondary and tertiary storage are nonvolatile. Tapes are relatively inexpensive and can store very large amounts of data. They are a good choice for archival storage, i.e., when we need to maintain data for a long period but do not expect to access it very often.
• Drawbacks of tapes: they are sequential access devices. We must essentially step through all the data in order and cannot directly access a given location on tape. This makes tapes unsuitable for storing operational data, or data that is frequently accessed. Tapes are mostly used to back up operational data periodically.
• Magnetic disks: support direct access to a desired location and are widely used for database applications. A DBMS provides seamless access to data on disk; applications need not worry about whether data is in main memory or on disk.
• Structure of a Disk: figure.
• Disks (cont'd). Data is stored on disk in units called disk blocks. A disk block is a contiguous sequence of bytes and is the unit in which data is written to and read from a disk. Blocks are arranged in concentric rings called tracks, on one or more platters. Tracks can be recorded on one or both surfaces of a platter; we refer to platters as single-sided or double-sided accordingly.
• Disks (cont'd). The set of all tracks with the same diameter is called a cylinder, because the space occupied by these tracks is shaped like a cylinder. Each track is divided into arcs called sectors, whose size is a characteristic of the disk and cannot be changed. An array of disk heads, one per recorded surface, is moved as a unit; when one head is positioned over a block, the other heads are in identical positions with respect to their platters.
• Disks (cont'd). To read or write a block, a disk head must be positioned on top of the block. As the size of a platter decreases, seek times also decrease, since the disk head has to move a smaller distance. Typical platter diameters are 3.5 inches and 5.25 inches. A disk controller interfaces a disk drive to the computer. It implements commands to read or write a sector by moving the arm assembly and transferring data to and from the disk surfaces.
• Access time to a disk. The time to access a disk block has several components. Seek time is the time taken to move the disk heads to the track on which a desired block is located. Rotational delay is the waiting time for the desired block to rotate under the disk head; it is the time required for half a rotation on average and is usually less than the seek time.
• Access time to a disk (cont'd). Transfer time is the time to actually read or write the data in the block once the head is positioned, that is, the time for the disk to rotate over the block. Example: the IBM Deskstar 14GPX is a 3.5-inch-diameter, 14.4 GB hard disk with an average seek time of 9.1 milliseconds (msec), an average rotational delay of 4.17 msec, and a track-to-track seek time of just 2.2 msec.
• Performance Implications of Disk Structure. 1. The unit of data transfer between disk and main memory is a block; if a single item on a block is needed, the entire block is transferred. Reading or writing a disk block is called an I/O (input/output) operation. 2. The time to read or write a block varies, depending on the location of the data: access time = seek time + rotational delay + transfer time.
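A small worked sketch (not from the original slides) applying this formula to the IBM Deskstar figures quoted above; the transfer time used here is an assumed value, since the slides do not give the track capacity needed to compute it:

```c
#include <stdio.h>

int main(void) {
    /* Figures quoted on the previous slide for the IBM Deskstar 14GPX. */
    double seek_ms       = 9.1;   /* average seek time        */
    double rotational_ms = 4.17;  /* average rotational delay */
    /* Assumed for illustration only; depends on block size and track capacity. */
    double transfer_ms   = 0.5;

    double access_ms = seek_ms + rotational_ms + transfer_ms;
    printf("access time = %.2f + %.2f + %.2f = %.2f ms\n",
           seek_ms, rotational_ms, transfer_ms, access_ms);
    return 0;
}
```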
• Virtual Memory. Use main memory as a "cache" for secondary (disk) storage, managed jointly by the CPU hardware and the operating system (OS). Programs share main memory: each gets a private virtual address space holding its frequently used code and data, protected from other programs. The CPU and OS translate virtual addresses to physical addresses. A VM "block" is called a page; a VM translation "miss" is called a page fault.
• Address Translation. Fixed-size pages (e.g., 4K). (Figure.)
• Page Fault Penalty. On a page fault, the page must be fetched from disk; this takes millions of clock cycles and is handled by OS code. Try to minimize the page fault rate: fully associative placement and smart replacement algorithms.
• Page Tables. A page table stores placement information: an array of page table entries, indexed by virtual page number; a page table register in the CPU points to the page table in physical memory. If the page is present in memory, the page table entry stores the physical page number, plus other status bits (referenced, dirty, ...). If the page is not present, the page table entry can refer to a location in swap space on disk.
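A minimal C sketch (not from the original slides) of a flat page-table lookup as described above, assuming 4 KB pages and a toy 16-entry table:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_BITS 12                 /* 4 KB pages */
#define NUM_PAGES 16                 /* toy virtual address space */

/* One page table entry: valid bit plus physical page number. A real entry
 * would also carry referenced/dirty/protection bits. */
struct pte { bool valid; uint32_t ppn; };

static struct pte page_table[NUM_PAGES];

/* Translate a virtual address, or report a page fault. */
static bool translate(uint32_t vaddr, uint32_t *paddr) {
    uint32_t vpn    = vaddr >> PAGE_BITS;
    uint32_t offset = vaddr & ((1u << PAGE_BITS) - 1);
    if (vpn >= NUM_PAGES || !page_table[vpn].valid)
        return false;                /* page fault: the OS must fetch the page */
    *paddr = (page_table[vpn].ppn << PAGE_BITS) | offset;
    return true;
}

int main(void) {
    page_table[2] = (struct pte){ .valid = true, .ppn = 7 };
    uint32_t pa;
    if (translate(0x2ABC, &pa))
        printf("0x2ABC -> 0x%X\n", (unsigned)pa);  /* VPN 2 -> PPN 7, offset 0xABC */
    if (!translate(0x5000, &pa))
        printf("0x5000 -> page fault\n");
    return 0;
}
```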
• Translation Using a Page Table: figure.
• Mapping Pages to Storage: figure.
• Replacement and Writes. To reduce the page fault rate, prefer least-recently used (LRU) replacement: a reference bit (aka use bit) in the PTE is set to 1 on access to the page and periodically cleared to 0 by the OS; a page with reference bit = 0 has not been used recently. Disk writes take millions of cycles, so pages are written a block at a time, not as individual locations: write-through is impractical, so write-back is used, with a dirty bit in the PTE set when the page is written.
• Fast Translation Using a TLB. Address translation would appear to require extra memory references: one to access the PTE, then the actual memory access. But access to page tables has good locality, so a fast cache of PTEs is kept within the CPU, called a Translation Look-aside Buffer (TLB). Typical: 16-512 PTEs, 0.5-1 cycle for a hit, 10-100 cycles for a miss, 0.01%-1% miss rate. Misses can be handled by hardware or software.
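A small worked sketch (not from the original slides) of the average translation cost implied by these typical figures; the specific hit cost, miss penalty, and miss rate below are assumptions chosen from within the quoted ranges:

```c
#include <stdio.h>

int main(void) {
    /* Values picked from the ranges quoted on the slide. */
    double hit_cycles  = 1.0;   /* 0.5-1 cycle per TLB hit    */
    double miss_cycles = 50.0;  /* 10-100 cycles per TLB miss */
    double miss_rate   = 0.01;  /* 0.01%-1%, here 1%          */

    double avg = hit_cycles + miss_rate * miss_cycles;
    printf("average translation cost ~= %.2f cycles per access\n", avg);
    return 0;
}
```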
• Fast Translation Using a TLB: figure.
• TLB Misses. If the page is in memory: load the PTE from memory and retry. This can be handled in hardware (it can get complex for more complicated page table structures) or in software (raise a special exception, with an optimized handler). If the page is not in memory (page fault): the OS handles fetching the page and updating the page table, then the faulting instruction is restarted.
• TLB Miss Handler. A TLB miss indicates either that the page is present but its PTE is not in the TLB, or that the page is not present. The TLB miss must be recognized before the destination register is overwritten: raise an exception. The handler copies the PTE from memory into the TLB and then restarts the instruction; if the page is not present, a page fault will occur.
• Page Fault Handler. Use the faulting virtual address to find the PTE. Locate the page on disk. Choose a page to replace; if it is dirty, write it to disk first. Read the page into memory and update the page table. Make the process runnable again and restart from the faulting instruction.
• TLB and Cache Interaction. If the cache tag uses the physical address, translation is needed before the cache lookup. Alternative: use a virtual address tag. Complications arise due to aliasing: different virtual addresses for a shared physical address.