Memory Technology and Optimization in Advanced Computer Architecture



    1. 1. Presented by: Trupti Diwan, Shweta Ghate, Sapana Vasave
    2. 2.  Basics of memory  Memory Technology  Memory optimization
    3. 3.  Main memory is RAM. (The words memory, buffer, and cache all refer to RAM.) RAM is nearly 11,000 times faster than secondary memory (hard disk) for random access.  Main memory is as vital as the processor chip to a computer system: fast systems have both a fast processor and a large memory.
    4. 4.  Here is a list of some characteristics of computer memory.  Closely connected to the processor.  Holds programs and data that the processor is actively working with.  Used for long-term storage.  The processor interacts with it millions of times per second.  Its contents are easily changed.  Usually its contents are organized into files.
    5. 5.  Main memory is the short-term memory of a computer: it retains data only while a program is running, and no longer.  Memory is used for running programs.
    6. 6.  Main memory satisfies the demands of caches and serves as the I/O interface.  Performance measures of main memory emphasize both latency and bandwidth.  Memory latency is the time delay required to obtain a specific item of data.  Memory bandwidth is the rate at which data can be read or written (e.g., bytes per second).  Bandwidth is often quoted per memory cycle (i.e., proportional to 1/cycle time).  This rate can be improved by concurrent access.
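The latency and bandwidth definitions above can be made concrete with a short sketch. All figures here (a 64-bit bus, a 200 MHz clock) are illustrative assumptions, not values from the slides:

```python
# Sketch: peak memory bandwidth = bytes per transfer x transfers per second.
# The figures below (64-bit bus, 200 MHz, one transfer per cycle) are
# assumed for illustration only.

bus_width_bits = 64
clock_hz = 200_000_000                    # one transfer per clock cycle

bytes_per_transfer = bus_width_bits // 8
peak_bandwidth = bytes_per_transfer * clock_hz   # bytes per second

print(peak_bandwidth)                     # 1600000000 bytes/s = 1.6 GB/s
```

Concurrent access (e.g., multiple banks) raises the transfer rate without changing the latency of any single access, which is why the slide treats the two measures separately.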
    7. 7.  Main memory latency affects the cache miss penalty, the primary concern of the cache.  Main memory bandwidth is the primary concern of I/O and multiprocessors.
    8. 8.  Although caches benefit from low-latency memory, it is generally easier to improve memory bandwidth with new organizations than it is to reduce latency.  Cache designers therefore increase block size to take advantage of the high memory bandwidth.
    9. 9.  Memory latency is traditionally quoted using two measures: access time and cycle time.  Access time is the time between when a read is requested and when the desired word arrives.  Cycle time is the minimum time between requests to memory.  Cycle time is greater than access time because the memory needs the address lines to be stable between accesses.
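The distinction matters because back-to-back requests are limited by cycle time, not access time. A small sketch with assumed (illustrative) timings:

```python
# Sketch: access time vs cycle time. Both values below are assumed
# illustrative figures, not numbers from the slides.

access_time_ns = 60    # delay from request until the word arrives
cycle_time_ns = 110    # minimum spacing between successive requests

# The sustained request rate is set by cycle time, even though each
# individual word arrives after only access_time_ns.
max_requests_per_second = 1e9 / cycle_time_ns

print(round(max_requests_per_second))   # 9090909
```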
    10. 10. Memory Hierarchy of a Modern Computer System • By taking advantage of the principle of locality: – Present the user with as much memory as is available in the cheapest technology. – Provide access at the speed offered by the fastest technology. [Fig: hierarchy from the processor (control, datapath, registers) through on-chip cache, second-level cache (SRAM), main memory (DRAM), and secondary storage (disk) to tertiary storage (tape). Speed (ns): 1s for registers, 10s for on-chip cache, 100s for SRAM and DRAM, 10,000,000s (10s ms) for disk, 10,000,000,000s (10s sec) for tape. Size (bytes): Ks for on-chip cache, Ms for SRAM, Gs for DRAM, Ts for tape.]
    11. 11.  Static Random Access Memory (SRAM) - used for caches.  Dynamic Random Access Memory (DRAM) - used for main memory.
    12. 12.  ‘S’ stands for static.  No need to refresh, so access time is close to cycle time.  Uses 6 transistors per bit.  Bits are stored as on/off switches.  No charges to leak.  More complex construction.  Larger per bit.  More expensive.  Faster.
    13. 13.  The transistor arrangement gives a stable logic state.  State 1: C1 high, C2 low; T1, T4 off; T2, T3 on.  State 0: C2 high, C1 low; T2, T3 off; T1, T4 on.  Address-line transistors T5 and T6 act as switches.  Write: apply the value to line B and its complement to line B̄.  Read: the value is on line B.
    14. 14.  Bits are stored as charge in capacitors.  Charges leak.  Needs refreshing even when powered.  Simpler construction.  Smaller per bit.  Less expensive.  Needs refresh circuits.  Slower.  Cycle time is longer than the access time.
    15. 15.  The address line is active when a bit is read or written.  The transistor switch closes (current flows).  Write:  Apply voltage to the bit line: high for 1, low for 0.  Then signal the address line.  Charge transfers to the capacitor.  Read:  Address line selected; transistor turns on.  Charge from the capacitor is fed via the bit line to a sense amplifier.  It compares with a reference value to determine 0 or 1.  The capacitor charge must then be restored.
    16. 16.  Addresses are divided into 2 halves (memory as a 2D matrix):  RAS, or Row Address Strobe  CAS, or Column Address Strobe Fig: Internal organization of a DRAM
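The row/column split can be sketched in a few lines. The geometry below (a 20-bit address split into 10 row bits and 10 column bits) is an assumed example configuration, not one from the slides:

```python
# Sketch: splitting a flat DRAM address into the row and column halves
# that are presented with RAS and CAS. ROW_BITS/COL_BITS are assumed
# example values for a 1024 x 1024 array.

ROW_BITS = 10
COL_BITS = 10

def split_address(addr):
    """Return (row, column) for a flat DRAM address."""
    col = addr & ((1 << COL_BITS) - 1)   # low bits select the column
    row = addr >> COL_BITS               # high bits select the row
    return row, col

print(split_address(0b1100110011_0101010101))   # (819, 341)
```

Multiplexing the address in two halves over the same pins is what lets a DRAM chip address a large array with few package pins.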
    17. 17.  DRAM originally had an asynchronous interface to the memory controller, so every transfer involved overhead to synchronize with the controller.  Adding a clock signal to the DRAM interface reduces this overhead; DRAMs with this optimization are called synchronous DRAMs, i.e., SDRAMs.
    18. 18.  Double Data Rate (DDR) was a later development of SDRAM, used in PC memory beginning in 2000.  DDR SDRAM internally performs double-width accesses at the clock rate and uses a double-data-rate interface to transfer one half on each clock edge.  Later versions: DDR (2 data transfers/cycle), DDR2 (4 data transfers/cycle), DDR3 (8 data transfers/cycle).
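The transfers-per-cycle progression above translates directly into peak transfer rates. A sketch, assuming an illustrative 200 MHz core clock and an 8-byte (64-bit) module; the numbers are examples, not vendor figures:

```python
# Sketch: peak data rate of SDRAM generations at the same core clock.
# clock_mhz and bytes_per_transfer are assumed illustrative values.

clock_mhz = 200
bytes_per_transfer = 8

# Data transfers per core clock cycle, as listed on the slide.
transfers_per_clock = {"SDRAM": 1, "DDR": 2, "DDR2": 4, "DDR3": 8}

rates = {}
for gen, mult in transfers_per_clock.items():
    rates[gen] = clock_mhz * mult * bytes_per_transfer   # MB/s
    print(gen, rates[gen], "MB/s")
```

Each generation doubles the internal prefetch width, so the external data rate scales while the core array speed stays roughly constant.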
    19. 19.  Review: 6 Basic Cache Optimizations  • Reducing hit time  1. Address translation during cache indexing  • Reducing miss penalty  2. Multilevel caches  3. Giving priority to read misses over write misses  • Reducing miss rate  4. Larger block size (compulsory misses)  5. Larger cache size (capacity misses)  6. Higher associativity (conflict misses)
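The miss-penalty benefit of multilevel caches (optimization 2 above) can be quantified with the standard average memory access time (AMAT) formula, AMAT = hit time + miss rate × miss penalty. All latencies and miss rates below are assumed example values:

```python
# Sketch: AMAT with and without an L2 cache. All cycle counts and miss
# rates are assumed illustrative values, not figures from the slides.

l1_hit, l1_miss_rate = 1, 0.05
mem_penalty = 100            # cycles to reach main memory

# Single-level cache: every L1 miss pays the full memory penalty.
amat_l1_only = l1_hit + l1_miss_rate * mem_penalty

# Two-level cache: L1 misses first try a faster L2.
l2_hit, l2_miss_rate = 10, 0.5   # half of L1 misses also miss in L2
amat_with_l2 = l1_hit + l1_miss_rate * (l2_hit + l2_miss_rate * mem_penalty)

print(amat_l1_only)   # 6.0 cycles
print(amat_with_l2)   # 4.0 cycles
```

Even a modest L2 cuts the average access time here by a third, because it converts many 100-cycle misses into 10-cycle L2 hits.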
    20. 20.  Reducing hit time  1. Small and simple caches  2. Way prediction  3. Trace caches  • Increasing cache bandwidth  4. Pipelined caches  5. Multibanked caches  6. Nonblocking caches  • Reducing miss penalty  7. Critical word first  8. Merging write buffers  • Reducing miss rate  9. Compiler optimizations  • Reducing miss penalty or miss rate via parallelism  10. Hardware prefetching  11. Compiler prefetching
    21. 21.  Advanced cache optimizations fall into the following categories:  Reducing the hit time: small and simple caches, way prediction, and trace caches  Increasing cache bandwidth: pipelined caches, multibanked caches, and nonblocking caches  Reducing the miss penalty: critical word first and merging write buffers  Reducing the miss rate: compiler optimizations  Reducing the miss penalty or miss rate via parallelism: hardware prefetching and compiler prefetching  We conclude with a summary of the implementation complexity and the performance benefit of each optimization.
    22. 22.  First Optimization: Small and Simple Caches to Reduce Hit Time  A time-consuming portion of a cache hit is using the index portion of the address to read the tag memory and then compare it to the address. Smaller hardware can be faster, so a small cache can help the hit time. It is also critical to keep an L2 cache small enough to fit on the same chip as the processor, to avoid the time penalty of going off chip.  The second suggestion is to keep the cache simple, such as using direct mapping.  One benefit of direct-mapped caches is that the designer can overlap the tag check with the transmission of the data. This effectively reduces hit time.
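The index/tag split described above can be sketched in code. The cache geometry here (64 sets of 64-byte blocks) is an assumed example, not taken from the text:

```python
# Sketch: how a direct-mapped cache splits an address into tag, index,
# and block offset. OFFSET_BITS/INDEX_BITS are assumed example values
# for a cache with 64-byte blocks and 64 sets.

OFFSET_BITS = 6   # 64-byte blocks
INDEX_BITS = 6    # 64 sets -> exactly one candidate line per address

def decompose(addr):
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

# Because direct mapping names a single line per address, the data can
# be read out in parallel with the tag comparison -- the overlap that
# lets direct-mapped caches reduce hit time.
print(decompose(0x12345))   # (18, 13, 5)
```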
    23. 23.  Compiler techniques to reduce cache misses: improves miss rate (+); complexity 0; software is a challenge, and some computers have a compiler option.  Hardware prefetching of instructions and data: improves miss penalty (+) and miss rate (+); complexity 2 (instructions), 3 (data); many CPUs have prefetch instructions; Opteron and Pentium 4 prefetch data.  Compiler-controlled prefetching: improves miss penalty (+) and miss rate (+); complexity 3; needs a nonblocking cache; possible instruction overhead; supported in many CPUs.
    24. 24.  Figure 5.11 Summary of 11 advanced cache optimizations showing impact on cache performance and complexity.  Although generally a technique helps only one factor, prefetching can reduce misses if done sufficiently early; if not, it can reduce the miss penalty. + means that the technique improves the factor, – means it hurts that factor, and blank means it has no impact. The complexity measure is subjective, with 0 being the easiest and 3 being a challenge.
    25. 25.  The techniques to improve hit time, bandwidth, miss penalty, and miss rate generally affect the other components of the average memory access time equation as well as the complexity of the memory hierarchy.  Figure 5.11 estimates the impact on complexity, with + meaning that the technique improves the factor, – meaning it hurts that factor, and blank meaning it has no impact.  Generally, no technique helps more than one category.
    26. 26.  Reference: Hennessy and Patterson, Computer Architecture: A Quantitative Approach.