Your SlideShare is downloading. ×
Lecture2
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×

Introducing the official SlideShare app

Stunning, full-screen experience for iPhone and Android

Text the download link to your phone

Standard text messaging rates apply

Lecture2

312
views

Published on

Published in: Business, Technology

0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
312
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
5
Comments
0
Likes
2
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM Circuit and Architecture Basics • Overview • Terminology • Access Protocol • Architecture Storage element Switching element Bit Line Word Line (capacitor)
  • 2. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM Circuit Basics DRAM Cell ... Bit Lines... Memory Array RowDecoder ...WordLines... DRAM Storage element Switching element Bit Line Word Line Data In/Out Buffers Sense Amps Column Decoder (capacitor)
  • 3. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland Row, Bitlines and Wordlines DRAM Circuit Basics “Row” Defined Bit Lines Word Line “Row” of DRAM Row Size: 8 Kb @ 256 Mb SDRAM node 4 Kb @ 256 Mb RDRAM node
  • 4. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM Circuit Basics Sense Amplifier I Sense and Amplify 6 Rows shown 1 2 3 4 5 6
  • 5. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM Circuit Basics Sense Amplifier II : Precharged Vcc (logic 1) Gnd (logic 0) Vcc/2 precharged to Vcc/2 Sense and Amplify 1 2 3 4 5 6
  • 6. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM Circuit Basics Sense Amplifier III : Destructive Read Sense and Amplify 1 2 3 4 5 6 Vcc (logic 1) Gnd (logic 0) Vcc/2 Wordline Driven
  • 7. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM Access Protocol ROW ACCESS AKA: OPEN a DRAM Page/Row RAS (Row Address Strobe) or or ACT (Activate a DRAM Page/Row) BUS MEMORY CONTROLLERCPU ... Bit Lines... Memory Array RowDecoder ...WordLines... DRAM Data In/Out Buffers Sense Amps Column Decoder
  • 8. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland once the data is valid on ALL of the bit lines, you can select a subset of the bits and send them to the output buffers ... CAS picks one of the bits big point: cannot do another RAS or precharge of the lines until finished reading the column data ... can’t change the values on the bit lines or the output of the sense amps until it has been read by the memory controller DRAM Circuit Basics “Column” Defined “One Row” of DRAM Column: Smallest addressable quantity of DRAM on chip SDRAM*: column size == chip data bus width (4, 8,16, 32) RDRAM: column size != chip data bus width (128 bit fixed) 4 bit wide columns SDRAM*: get “n” columns per access. n = (1, 2, 4, 8) RDRAM: get 1 column per access. #2 #3 #4 #5#0 #1 * SDRAM means SDRAM and variants. i.e. DDR SDRAM
  • 9. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM Access Protocol COLUMN ACCESS I READ Command or CAS: Column Address Strobe BUS MEMORY CONTROLLERCPU ... Bit Lines... Memory Array RowDecoder ...WordLines... DRAM Data In/Out Buffers Sense Amps Column Decoder
  • 10. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland then the data is valid on the data bus ... depending on what you are using for in/out buffers, you might be able to overlap a litttle or a lot of the data transfer with the next CAS to the same page (this is PAGE MODE) DRAM Access Protocol Column Access II note: page mode enables overlap with CAS BUS MEMORY CONTROLLERCPU ... Bit Lines... Memory Array RowDecoder ...WordLines... DRAM Data In/Out Buffers Sense Amps Column Decoder ... with optional additional CAS: Column Address Strobe Data Out
  • 11. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland NOTE DRAM “Speed” Part I How fast can I move data from DRAM cell to sense amp? RCD (Row Command Delay) BUS MEMORY CONTROLLERCPU ... Bit Lines... Memory Array RowDecoder ...WordLines... DRAM Data In/Out Buffers Sense Amps Column Decoder tRCD
  • 12. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM “Speed” Part II How fast can I get data out of sense amps back into memory controller? BUS MEMORY CONTROLLERCPU ... Bit Lines... Memory Array RowDecoder ...WordLines... DRAM Data In/Out Buffers Sense Amps Column Decoder CAS: Column Address Strobe tCAS aka tCL tCASL aka CASL: Column Address Strobe Latency CL: Column Address Strobe Latency
  • 13. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM “Speed” Part III How fast can I move data from DRAM cell into memory controller? RAC (Random Access Delay) BUS MEMORY CONTROLLERCPU ... Bit Lines... Memory Array RowDecoder ...WordLines... DRAM Data In/Out Buffers Sense Amps Column Decoder tRAC = tRCD + tCAS
  • 14. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM “Speed” Part IV How fast can I precharge DRAM array so I can engage another RAS? RP (Row Precharge Delay) BUS MEMORY CONTROLLERCPU ... Bit Lines... Memory Array RowDecoder ...WordLines... DRAM Data In/Out Buffers Sense Amps Column Decoder tRP
  • 15. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM “Speed” Part V How fast can I read from different rows? RC (Row Cycle Time) BUS MEMORY CONTROLLERCPU ... Bit Lines... Memory Array RowDecoder ...WordLines... DRAM Data In/Out Buffers Sense Amps Column Decoder tRC = tRAS + tRP
  • 16. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM “Speed” Summary I What do I care about? RC :Row Cycle Time tRC = tRAS + tRP RP :Row Precharge Delay tRP RAC :Random Access Delay tRAC = tRCD + tCAS CAS: Column Address Strobe tCAS RAS: Row Address Strobe tRCD Seen in ads. Easy to explain Embedded systems designers DRAM manufactuers Computer Architect: Latency bound code i.e. linked list traversal Easy to sell RCD: Row Command Delay
  • 17. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM “Speed” Summary II DRAM Type Frequency Data Bus Width (per chip) Peak Data Bandwidth (per Chip) Random Access Time (tRAC) Row Cycle Time (tRC) PC133 SDRAM 133 16 200 MB/s 45 ns 60 ns DDR 266 133 * 2 16 532 MB/s 45 ns 60 ns PC800 RDRAM 400 * 2 16 1.6 GB/s 60 ns 70 ns FCRAM 200 * 2 16 0.8 GB/s 25 ns 25 ns RLDRAM 300 * 2 32 2.4 GB/s 25 ns 25 ns DRAM is “slow” But doesn’t have to be tRC < 10ns achievable Higher die cost Not commodity Expensive Not adopted in standard
  • 18. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM “latency” isn’t deterministic because of CAS or RAS+CAS, and there may be significant queuing delays within the CPU and the memory controller Each transaction has some overhead. Some types of overhead cannot be pipelined. This means that in general, longer bursts are more efficient. “DRAM latency” B C D DRAM E2/E3 E1 F A CPU Mem Controller A: Transaction request may be delayed in Queue B: Transaction request sent to Memory Controller C: Transaction converted to Command Sequences (may be queued) D: Command/s Sent to DRAM E1: Requires only a CAS or E2: Requires RAS + CAS or F: Transaction sent back to CPU “DRAM Latency” = A + B + C + D + E + F E3: Requires PRE + RAS + CAS
  • 19. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland NOTE DRAM Architecture Basics PHYSICAL ORGANIZATION This is per bank … Typical DRAMs have 2+ banks x2 DRAM x4 DRAM x8 DRAM ... Bit Lines... Sense Amps .... Data Buffers x8 DRAM RowDecoder Memory Array Column Decoder ... Bit Lines... Sense Amps ....Data Buffers x2 DRAM RowDecoder Memory Array Column Decoder ... Bit Lines... Sense Amps .... Data Buffers x4 DRAM RowDecoder Memory Array Column Decoder
  • 20. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland let’s look at the interface another way .. the say the data sheets portray it. [explain] main point: the RAS and CAS signals directly control the latches that hold the row and column addresses ... DRAM Architecture Basics Read Timing for Conventional DRAM Row Address Column Address Valid Dataout RAS CAS Address DQ Row Address Column Address Valid Dataout Data Transfer Column Access Row Access
  • 21. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland since DRAM’s inception, there have been a stream of changes to the design, from FPM to EDO to Burst EDO to SDRAM. the changes are largely structural modifications -- nimor -- that target THROUGHPUT. [discuss FPM up to SDRAM Everything up to and including SDRAM has been relatively inexpensive, especially when considering the pay-off (FPM was essentially free, EDO cost a latch, PBEDO cost a counter, SDRAM cost a slight re-design). however, we’re run out of “free” ideas, and now all changes are considered expensive ... thus there is no consensus on new directions and myriad of choices has appeared [ do LATENCY mods starting with ESDRAM ... and then the INTERFACE mods ] DRAM Evolutionary Tree (Mostly) Structural Modifications Interface Modifications Structural Conventional FPM EDO ESDRAM Rambus, DDR/2 Future Trends ........ . . . . . . MOSYS FCRAM VCDRAM $ Modifications Targeting Latency Targeting Throughput Targeting Throughput DRAM SDRAMP/BEDO
  • 22. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland NOTE DRAM Evolution Read Timing for Conventional DRAM Row Address Column Address Valid Dataout RAS CAS Address DQ Row Address Column Address Valid Dataout Data Transfer Column Access Transfer Overlap Row Access
  • 23. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland FPM aallows you to keep th esense amps actuve for multiple CAS commands ... much better throughput problem: cannot latch a new value in the column address buffer until the read-out of the data is complete DRAM Evolution Read Timing for Fast Page Mode Row Address Column Address Valid Dataout Column Address Column Address Valid Dataout Valid Dataout RAS CAS Address DQ Data Transfer Column Access Transfer Overlap Row Access
  • 24. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland solution to that problem -- instead of simple tri-state buffers, use a latch as well. by putting a latch after the column mux, the next column address command can begin sooner DRAM Evolution Read Timing for Extended Data Out Row Address Column Address Valid Dataout RAS CAS Address DQ Column Address Column Address Valid Dataout Valid Dataout Data Transfer Column Access Transfer Overlap Row Access
  • 25. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland by driving the col-addr latch from an internal counter rather than an external signal, the minimum cycle time for driving the output bus was reduced by roughly 30% DRAM Evolution Read Timing for Burst EDO Row Address Column Address RAS CAS Address DQ Data Transfer Column Access Transfer Overlap Row Access Valid Data Valid Data Valid Data Valid Data
  • 26. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland “pipeline” refers to the setting up of the read pipeline ... first CAS toggle latches the column address, all following CAS toggles drive data out onto the bus. therefore data stops coming when the memory controller stops toggling CAS DRAM Evolution Read Timing for Pipeline Burst EDO Row Address Column Address RAS CAS Address DQ Data Transfer Column Access Transfer Overlap Row Access Valid Data Valid Data Valid Data Valid Data
  • 27. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland main benefit: frees up the CPU or memory controller from having to control the DRAM’s internal latches directly ... the controller/CPU can go off and do other things during the idle cycles instead of “wait” ... even though the time-to-first-word latency actually gets worse, the scheme increases system throughput DRAM Evolution Read Timing for Synchronous DRAM (RAS + CAS + OE ... == Command Bus) Command Address DQ Clock Row Addr Col Addr Valid Data Valid Data Valid Data Valid Data ACT READ RAS CAS Data Transfer Column Access Transfer Overlap Row Access
  • 28. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland output latch on EDO allowed you to start CAS sooner for next accesss (to same row) latch whole row in ESDRAM -- allows you to start precharge & RAS sooner for thee next page access -- HIDE THE PRECHARGE OVERHEAD. DRAM Evolution Inter-Row Read Timing for ESDRAM Command Address DQ Clock Row Addr Col Addr Valid Data Valid Data Valid Data Valid Data ACT READ Row Addr Col Addr Valid Data Valid Data Valid Data Valid Data ACT READPRE “Regular” CAS-2 SDRAM, R/R to same bank Command Address DQ Clock Row Addr Col Addr Valid Data Valid Data Valid Data Valid Data ACT READ Row Addr Col Addr Valid Data Valid Data Valid Data Valid Data ACT READ ESDRAM, R/R to same bank PRE Bank Bank
  • 29. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland neat feature of this type of buffering: write-around DRAM Evolution Write-Around in ESDRAM (can second READ be this aggressive?) Command Address DQ Clock Row Addr Col Addr Valid Data Valid Data Valid Data Valid Data ACT READ Row Addr Col Addr Valid Data Valid Data Valid Data Valid Data ACT WRITEPRE “Regular” CAS-2 SDRAM, R/W/R to same bank, rows 0/1/0 Command Address DQ Clock Row Addr Col Addr Valid Data Valid Data Valid Data Valid Data ACT READ Row Addr Col Addr Valid Data Valid Data Valid Data Valid Data ACT WRITE ESDRAM, R/W/R to same bank, rows 0/1/0 PRE Bank Bank Row Addr Col Addr Valid Data Valid Data Valid Data ACT READPRE Bank Col Addr Valid Data Valid Data Valid Data Valid Data READ
  • 30. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland main thing ... it is like having a bunch of open row buffers (a la rambus), but the problem is that you must deal with the cache directly (move into and out of it), not the DRAM banks ... adds an extra couple of cycles of latency ... however, you get good bandwidth if the data you want is cache, and you can “prefetch” into cache ahead of when you want it ... originally targetted at reducing latency, now that SDRAM is CAS-2 and RCD-2, this make sense only in a throughput way DRAM Evolution Internal Structure of Virtual Channel Segment cache is software-managed, reduces energy $ Row Decoder 2Kb Segment 2Kb Segment 2Kb Segment 2Kb Segment Bank A Bank B 16 “Channels” Input/Output Buffer DQs Sel/Dec (segments) Sense Amps 2Kbit # DQs Activate Prefetch Restore Read Write
  • 31. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland FCRAM opts to break up the data array .. only activate a portion of the word line 8K rows requires 13 bits tto select ... FCRAM uses 15 (assuming the array is 8k x 1k ... the data sheet does not specify) DRAM Evolution Internal Structure of Fast Cycle RAM Reduces access time and energy/access tRCD = 15ns tRCD = 5ns 8M Array 13 bits Sense Amps 8M Array 15 bits Sense Amps SDRAM FCRAM (one clock)(two clocks) RowDecoder RowDecoder (8Kr x 1Kb) (?)
  • 32. DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland MoSys takes this one step further ... DRAM with an SRAM interface & speed but DRAM energy [physical partitioning: 72 banks] auto refresh -- how to do this transparently? the logic moves tthrough the arrays, refreshing them when not active. but what is one bank gets repeated access for a long duration? all other banks will be refreshed, but that one will not. solution: they have a bank-sized CACHE of lines ... in theory, should never have a problem (magic) DRAM Evolution Internal Structure of MoSys 1T-SRAM ........ . . . . . . addr Bank Select DQs $ Auto Refresh