  • Multics as fountainhead of OS: if you understand Multics, you understand OS. It dates from before you were born and is the fountainhead of everything going on in MPUs today
  • lecture_18_memory.ppt

    1. 1. Memory technology Lotzi Bölöni Fall 2003
    2. 2. Acknowledgements <ul><li>All the lecture slides were adapted from the slides of David Patterson (1998, 2001) and David E. Culler (2001), Copyright 1998-2002, University of California Berkeley </li></ul>
    3. 3. Standing on shoulders of giants <ul><li>“ Ideally one would desire an indefinitely large memory capacity such that any particular… word would be immediately available… We are… forced to recognize the possibility of constructing a hierarchy of memories, each of which has a greater capacity than the preceding but which is less quickly accessible.” </li></ul><ul><li>A.W.Burks, H.H.Goldstine and J. von Neumann </li></ul><ul><li>Preliminary Discussion of the Logical Design of an Electronic Computing Instrument ( 1946 ) </li></ul>
    4. 4. Elements of Memory Organization <ul><li>The technologies (SRAM, DRAM etc) </li></ul><ul><li>The components </li></ul><ul><ul><li>Cache (L1,L2) </li></ul></ul><ul><ul><li>Main memory </li></ul></ul><ul><ul><li>Virtual memory </li></ul></ul>
    5. 5. Main Memory Background <ul><li>Random Access Memory (vs. Serial Access Memory) </li></ul><ul><li>Different flavors at different levels </li></ul><ul><ul><li>Physical Makeup (CMOS, DRAM) </li></ul></ul><ul><ul><li>Low Level Architectures (FPM, EDO, BEDO, SDRAM) </li></ul></ul><ul><li>Cache uses SRAM: Static Random Access Memory </li></ul><ul><ul><li>No refresh (6 transistors/bit vs. 1 transistor/bit for DRAM) </li></ul></ul><ul><ul><li>Size: DRAM/SRAM about 4-8; Cost/Cycle time: SRAM/DRAM about 8-16 </li></ul></ul><ul><li>Main Memory is DRAM: Dynamic Random Access Memory </li></ul><ul><ul><li>Dynamic since it needs to be refreshed periodically (8 ms, 1% time) </li></ul></ul><ul><ul><li>Addresses divided into 2 halves (Memory as a 2D matrix): </li></ul></ul><ul><ul><ul><li>RAS or Row Address Strobe </li></ul></ul></ul><ul><ul><ul><li>CAS or Column Address Strobe </li></ul></ul></ul>
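The "8 ms, 1% time" refresh figure on this slide can be sanity-checked with a rough estimate. The sketch below assumes every row must be refreshed once per 8 ms interval; the row count (2,048, borrowed from the 4 Mbit organization shown later in the deck) and the 40 ns per-row refresh time are illustrative assumptions, not values stated on this slide.

```python
def refresh_overhead(rows: int, row_refresh_s: float, interval_s: float) -> float:
    """Fraction of time the DRAM spends refreshing instead of serving accesses."""
    return rows * row_refresh_s / interval_s


ROWS = 2048            # assumption: rows in a 2,048 x 2,048 (4 Mbit) array
INTERVAL_S = 8e-3      # refresh interval from the slide (8 ms)
ROW_REFRESH_S = 40e-9  # hypothetical time to refresh one row

overhead = refresh_overhead(ROWS, ROW_REFRESH_S, INTERVAL_S)
print(f"refresh overhead: {overhead:.2%}")  # roughly 1% under these assumptions
```

A slower per-row refresh (for example a full 110 ns row cycle) would push the overhead closer to 3%, so the 1% figure depends on the part being modeled.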
    6. 6. Static RAM (SRAM) <ul><li>Six transistors in cross connected fashion </li></ul><ul><ul><li>Provides regular AND inverted outputs </li></ul></ul><ul><ul><li>Implemented in CMOS process </li></ul></ul>Single Port 6-T SRAM Cell
    7. 7. Dynamic RAM <ul><li>SRAM cells exhibit high speed/poor density </li></ul><ul><li>DRAM: simple transistor/capacitor pairs in high density form </li></ul>[Figure: DRAM cell with word line, bit line, storage capacitor C, and sense amp]
    8. 8. DRAM Operations <ul><li>Write </li></ul><ul><ul><li>Charge the bit line HIGH or LOW and set the word line HIGH </li></ul></ul><ul><li>Read </li></ul><ul><ul><li>The bit line is precharged to a voltage halfway between HIGH and LOW, and then the word line is set HIGH. </li></ul></ul><ul><ul><li>Depending on the charge in the cap, the precharged bit line is pulled slightly higher or lower. </li></ul></ul><ul><ul><li>The sense amp detects the change </li></ul></ul><ul><li>This explains why the cap can’t shrink </li></ul><ul><ul><li>It needs to drive the bit line sufficiently </li></ul></ul><ul><ul><li>Increase density => increase parasitic capacitance </li></ul></ul>[Figure: DRAM cell with word line, bit line, storage capacitor C, and sense amp]
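The read operation above is a charge-sharing step, and a small calculation shows why the cell capacitor cannot shrink freely. The sketch below uses the standard charge-conservation relation between the cell capacitor and the precharged bit line; the supply voltage and both capacitance values are hypothetical numbers chosen for illustration.

```python
def bitline_swing(v_cell: float, v_dd: float, c_cell: float, c_bitline: float) -> float:
    """Voltage change on a bit line precharged to v_dd/2 after the cell is connected."""
    v_pre = v_dd / 2
    # Charge conservation: total charge before sharing equals total charge after.
    v_final = (c_cell * v_cell + c_bitline * v_pre) / (c_cell + c_bitline)
    return v_final - v_pre


V_DD = 3.3          # hypothetical supply voltage (volts)
C_CELL = 30e-15     # hypothetical storage capacitor (30 fF)
C_BITLINE = 300e-15 # hypothetical bit-line parasitic capacitance (300 fF)

swing_one = bitline_swing(V_DD, V_DD, C_CELL, C_BITLINE)   # cell stored HIGH
swing_zero = bitline_swing(0.0, V_DD, C_CELL, C_BITLINE)   # cell stored LOW
print(f"stored 1: {swing_one * 1000:+.0f} mV, stored 0: {swing_zero * 1000:+.0f} mV")
```

With these values the sense amp sees only about 150 mV in either direction. Halving the cell capacitor or doubling the bit-line capacitance shrinks that signal further, which is the point the slide makes about density.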
    9. 9. DRAM logical organization (4 Mbit) <ul><li>Square root of bits per RAS/CAS </li></ul>[Figure: 11 address lines A0…A10, row decoder, word line, storage cell, memory array (2,048 x 2,048), sense amps & I/O, column decoder, data pins D and Q]
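The "square root of bits per RAS/CAS" rule can be worked through for the 4 Mbit part. The sketch below assumes a square array whose side is a power of two and a row-then-column multiplexing order; the function names are made up for this example.

```python
import math


def array_geometry(total_bits: int) -> tuple[int, int]:
    """Return (side, address_bits_per_half) for a square DRAM array."""
    side = math.isqrt(total_bits)
    if side * side != total_bits or side & (side - 1):
        raise ValueError("expects a square array with a power-of-two side")
    return side, side.bit_length() - 1


def split_address(addr: int, half_bits: int) -> tuple[int, int]:
    """Split a cell address into the row half (sent with RAS) and column half (sent with CAS)."""
    row = addr >> half_bits
    col = addr & ((1 << half_bits) - 1)
    return row, col


side, half_bits = array_geometry(4 * 2**20)  # 4 Mbit
print(side, half_bits)                       # 2048 rows/columns, 11 address bits per half
print(split_address(2048, half_bits))        # (1, 0): second row, first column
```

Sending the two 11-bit halves over the same pins is why the figure shows only A0…A10 for a 22-bit cell address.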
    10. 10. So, Why do I freaking care? <ul><li>By its nature, DRAM isn’t built for speed </li></ul><ul><ul><li>Response times depend on capacitive circuit properties, which get worse as density increases </li></ul></ul><ul><li>The DRAM process isn’t easy to integrate into the CMOS process </li></ul><ul><ul><li>DRAM is off chip </li></ul></ul><ul><ul><li>Connectors, wires, etc. introduce slowness </li></ul></ul><ul><ul><li>IRAM efforts are looking to integrate the two </li></ul></ul><ul><li>Memory architectures are designed to minimize the impact of DRAM latency </li></ul><ul><ul><li>Low level: memory chips </li></ul></ul><ul><ul><li>High level: memory designs </li></ul></ul><ul><ul><li>You will pay $$$$$$ and then some $$$ for a good memory system. </li></ul></ul>
    11. 11. So, Why do I freaking care? <ul><li>1960-1985: Speed = ƒ(no. operations) </li></ul><ul><li>1990 </li></ul><ul><ul><li>Pipelined Execution & Fast Clock Rate </li></ul></ul><ul><ul><li>Out-of-Order execution </li></ul></ul><ul><ul><li>Superscalar Instruction Issue </li></ul></ul><ul><li>1998: Speed = ƒ(non-cached memory accesses) </li></ul><ul><li>What does this mean for </li></ul><ul><ul><li>Compilers? Operating Systems? Algorithms? Data Structures? </li></ul></ul>
    12. 12. DRAM Performance <ul><li>A 60 ns ( t RAC ) DRAM can </li></ul><ul><ul><li>perform a row access only every 110 ns ( t RC ) </li></ul></ul><ul><ul><li>perform a column access ( t CAC ) in 15 ns, but the time between column accesses is at least 35 ns ( t PC ). </li></ul></ul><ul><ul><ul><li>In practice, external address delays and turning around buses make it 40 to 50 ns </li></ul></ul></ul><ul><li>These times include neither the time to drive the addresses off the microprocessor nor the memory controller overhead! </li></ul><ul><li>Can it be made faster? </li></ul><ul><li>Many techniques gain higher bandwidth at the cost of higher latency </li></ul><ul><ul><li>The idea is that the latency will be taken care of by the cache. </li></ul></ul>
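The timing parameters on this slide can be turned into a back-of-the-envelope comparison between accesses that stay in one row and accesses that each open a new row. This is a simplified model, assuming the first word arrives after t RAC, later words in the same row arrive every t PC, and new row accesses can start only every t RC; the function names are hypothetical.

```python
T_RAC_NS = 60   # row access time (slide)
T_RC_NS = 110   # minimum time between row accesses (slide)
T_PC_NS = 35    # minimum time between column accesses in an open row (slide)


def same_row_ns(n_words: int) -> int:
    """Time until the last of n words arrives when all words sit in one open row."""
    return T_RAC_NS + (n_words - 1) * T_PC_NS


def different_rows_ns(n_words: int) -> int:
    """Time until the last of n words arrives when every word needs a new row access."""
    return (n_words - 1) * T_RC_NS + T_RAC_NS


for n in (1, 4, 8):
    print(n, same_row_ns(n), different_rows_ns(n))
```

For four words the model gives 165 ns within one row versus 390 ns across four rows, before adding the bus and controller overheads the slide warns about.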
    13. 13. Synchronous DRAM <ul><li>Has a clock input. </li></ul><ul><ul><li>Data output is in bursts w/ each element clocked </li></ul></ul><ul><li>Flavors: SDRAM, DDR </li></ul>PC100: Intel spec to meet 100 MHz memory bus designs. Introduced w/ i440BX chipset. [Figure: SDRAM write and read timing]
    14. 14. RAMBUS <ul><li>“Intellectual property company”. </li></ul><ul><ul><li>Located in Los Altos, CA </li></ul></ul><ul><ul><li>Designed a memory architecture </li></ul></ul><ul><ul><li>Licensed to manufacturers </li></ul></ul><ul><ul><li>They have no factories. </li></ul></ul><ul><li>Picked up by Intel, who signed an exclusive deal with them for Pentium 4 motherboards. </li></ul><ul><li>Litigation regarding the intellectual property. </li></ul>
    15. 15. RAMBUS (RDRAM) <ul><li>Protocol based RAM w/ narrow (16-bit) bus </li></ul><ul><ul><li>High clock rate (400 MHz), but long latency </li></ul></ul><ul><ul><li>Pipelined operation </li></ul></ul><ul><li>Multiple arrays w/ data transferred on both edges of clock </li></ul>[Figures: RAMBUS bank; RDRAM memory system]
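The narrow-but-fast trade-off can be made concrete by computing peak transfer rates from bus width, clock rate, and transfers per clock. The RDRAM numbers come from this slide; the 64-bit data width used for the PC100 SDRAM comparison is an assumption about a typical module and is not stated in the deck.

```python
def peak_bandwidth_bytes(bus_bits: int, clock_hz: int, transfers_per_clock: int) -> int:
    """Peak bytes per second, ignoring protocol overhead and latency."""
    return bus_bits * clock_hz * transfers_per_clock // 8


# RDRAM: 16-bit bus, 400 MHz, data on both clock edges (from the slide).
rdram = peak_bandwidth_bytes(16, 400_000_000, 2)

# PC100 SDRAM: 100 MHz bus; the 64-bit width is an assumption for this comparison.
pc100 = peak_bandwidth_bytes(64, 100_000_000, 1)

print(f"RDRAM: {rdram / 1e9:.1f} GB/s, PC100 SDRAM: {pc100 / 1e9:.1f} GB/s")
```

These are peak figures only; the long latency noted on the slide is not captured by this calculation.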
    16. 16. RDRAM Timing
    17. 17. DRAM History <ul><li>DRAMs: capacity +60%/yr, cost –30%/yr </li></ul><ul><ul><li>2.5X cells/area, 1.5X die size in about 3 years </li></ul></ul><ul><li>’98 DRAM fab line costs $2B </li></ul><ul><ul><li>DRAM only: density, leakage v. speed </li></ul></ul><ul><li>Rely on increasing no. of computers & memory per computer (60% market) </li></ul><ul><ul><li>SIMM or DIMM is replaceable unit => computers use any generation DRAM </li></ul></ul><ul><li>Commodity, second source industry => high volume, low profit, conservative </li></ul><ul><ul><li>Little organization innovation in 20 years </li></ul></ul><ul><ul><li>Don’t want to be chip foundries (bad for RDRAM) </li></ul></ul><ul><li>Order of importance: 1) Cost/bit 2) Capacity </li></ul><ul><ul><li>First RAMBUS: 10X BW, +30% cost => little impact </li></ul></ul>
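The growth rates on this slide are consistent with each other, which a quick compounding check shows. The sketch below only combines the slide's own figures; reading "3 years" as exactly three compounding periods is a simplifying assumption.

```python
YEARS = 3

capacity_growth = 1.60 ** YEARS   # +60% per year, compounded over three years
cost_factor = 0.70 ** YEARS       # -30% per year, compounded over three years
density_times_die = 2.5 * 1.5     # more cells per area times a larger die

print(f"capacity after {YEARS} years: {capacity_growth:.2f}x")
print(f"cells/area x die size:       {density_times_die:.2f}x")
print(f"cost per bit after {YEARS} years: {cost_factor:.2f}x")
```

Both routes land near a 4x capacity step per generation, while cost per bit falls to roughly a third over the same period.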
    18. 18. Read-only memory (ROM) <ul><li>Programmed at time of manufacture </li></ul><ul><ul><li>Cannot be written by the computer </li></ul></ul><ul><ul><li>Not erased by loss of power </li></ul></ul><ul><ul><li>Some variants can be erased and rewritten by special hardware (EEPROM) </li></ul></ul><ul><li>One transistor / bit. </li></ul><ul><li>Used in: </li></ul><ul><ul><li>BIOS of desktop computers </li></ul></ul><ul><ul><li>Embedded devices (also serves as a code protection device) </li></ul></ul>
    19. 19. FLASH Memory <ul><li>Floating gate transistor </li></ul><ul><ul><li>Presence of charge => “0” </li></ul></ul><ul><ul><li>Erased electrically or by UV (EPROM) </li></ul></ul><ul><li>Performance </li></ul><ul><ul><li>Reads like DRAM (~ns) </li></ul></ul><ul><ul><li>Writes like DISK (~ms). Write is a complex operation </li></ul></ul>