SlideShare a Scribd company logo
Memory


Eri Prasetyo Wibowo

http://eri.staff.gunadarma.ac.id
Sem
       icon
       duc
        tor
        Me
       m or
         y
       Typ
        es


                             2
CompOrg - Memory Hierarchy
Sem
RAM                        icon
                            duc
  • Misnamed as all semiconduct or memory is random access
  • Read/ Writ e
  • Volat ile               tor
  • Temporary st orage      Me
  • St at ic or dynamic
                            m or
                              y




                                                             3
                      CompOrg - Memory Hierarchy
4
CompOrg - Memory Hierarchy
Random-Access Memory (RAM)
Key features
  • RAM adalah sebagai kemasan chip.
  • Dasar penyimpanan unit adalah sel (satu bit per sel).
  • Beberapa chip RAM membentuk memori


Static RAM (SRAM)
  • Setiap sel menyimpan bit dengan enam transistor-sirkuit.
  • Nilai tetap tentu, selama ini disimpan daya.
  • Relatif kebal untuk gangguan listrik seperti kebisingan.
  •
    Lebih cepat dan lebih mahal daripada DRAM.

Dynamic RAM (DRAM)
  • Setiap sel menyimpan bit dengan kapasitor dan transistor.
  • Nilai harus refresh setiap 10-100 ms.
  • Sensitif terhadap gangguan.
  • Lambat dan lebih murah dibandingkan SRAM.                   5
                         CompOrg - Memory Hierarchy
SRAM vs DRAM
                         summary


       Tran.     Access
       per bit   time   Persist? Sensitive?           Cost   Applications

SRAM   6         1X      Yes      No                  100x   cache memories
DRAM   1         10X     No       Yes                 1X     Main memories,
                                                             frame buffers




                                                                            6
                         CompOrg - Memory Hierarchy
Dyn
Bit s st ored as charge in capacit ors i
                                am
Charges leak
Need refreshing even when powered
                                   c
Simpler const ruct ion           RA
Smaller per bit                   M
Less ex pensive
Need refresh circuit s
Slower
Main memory
Essent ially analogue
    •   Level of charge det ermines value




                                                          7
                             CompOrg - Memory Hierarchy
8
CompOrg - Memory Hierarchy
DR
Address line active when bit read
                                  AM or writt en
                                 Ope
   • Transist or swit ch closed (current flows)
Write                             rati
   • Volt age t o bit line
      – High for 1 low for 0 on
   • Then signal address line
      – Transfers charge to capacitor
Read
   • Address line select ed
      – transistor turns on
   • Charge from capacit or fed via bit line t o sense amplifier
      – Compares with reference value to determine 0 or 1
   • Capacit or charge must be rest ored


                                                                   9
                        CompOrg - Memory Hierarchy
Conventional DRAM organization
d x w DRAM:
  • dw total bits organized as d supercells of size w bits


                                  16 x 8 DRAM chip
                                                          cols
                                             0        1          2   3
                         2 bits          0
                            /
                         addr
                                         1
                                  rows
             memory
                                         2                               supercell
            controller
 (to CPU)                                                                  (2,1)
                                         3
                         8 bits
                            /
                         data


                                                 internal row buffer
                                                                           10
                          CompOrg - Memory Hierarchy
Reading DRAM supercell
                        (2,1)
Step 1(a): Row access strobe (RAS) selects row 2.
Step 1(b): Row 2 copied from DRAM array to row
 buffer.
                               16 x 8 DRAM chip
                                                       cols
                                          0        1          2   3
                     RAS = 2
                        2
                        /             0
                       addr
                                      1
                               rows
         memory
        controller                    2

                        8             3
                        /
                      data
                                                   row 2

                                              internal row buffer
                                                                      11
                        CompOrg - Memory Hierarchy
Reading DRAM supercell
                        (2,1)
Step 2(a): Column access strobe (CAS) selects column 1.
Step 2(b): Supercell (2,1) copied from buffer to data lines,
 and eventually back to the CPU.
                                  16 x 8 DRAM chip
                                                           cols
                                              0        1          2   3
                     CAS = 1
                         2
                         /                0
                       addr
                                          1
                                   rows
         memory
        controller   supercell            2
                       (2,1)
                         8                3
                         /
                       data



                                                  internal row buffer
                                                                          12
                             CompOrg - Memory Hierarchy
Memory
            modules
  addr (row = i, col = j)
                                                                    : supercell (i,j)
                                                     DRAM 0
                                                                  64 MB
                                                                  memory module
                                                                  consisting of
DRAM 7
                                                                  eight 8Mx8 DRAMs
                                                           data


       bits bits bits    bits bits bits bits               bits
       56-63 48-55 40-47 32-39 24-31 16-23 8-15            0-7



  63   56 55   48 47   40 39   32 31   24 23 16 15   8 7      0
                                                                  Memory
                                                                  controller
64-bit doubleword at main memory address A


                                       64-bit doubleword to CPU chip
                                                                                   13
                 CompOrg - Memory Hierarchy
Enhanced
                   DRAMs
All enhanced DRAMs are built around the conventional
 DRAM core.
  • Fast page mode DRAM (FPM DRAM)
    – Access contents of row with [RAS, CAS, CAS, CAS, CAS] instead of
      [(RAS,CAS), (RAS,CAS), (RAS,CAS), (RAS,CAS)].
  • Extended data out DRAM (EDO DRAM)
    – Enhanced FPM DRAM with more closely spaced CAS signals.
  • Synchronous DRAM (SDRAM)
    – Driven with rising clock edge instead of asynchronous control signals.
  • Double data-rate synchronous DRAM (DDR SDRAM)
    – Enhancement of SDRAM that uses both clock edges as control signals.
  • Video RAM (VRAM)
    – Like FPM DRAM, but output is produced by shifting row buffer
    – Dual ported (allows concurrent reads and writes)

                                                                         14
                         CompOrg - Memory Hierarchy
Stat
                              ic
Bit s stored as on/ off swit ches
No charges t o leak          RA
No refreshing needed when powered
                              M
More complex construction
Larger per bit
More ex pensive
Does not need refresh circuits
Fast er
Cache
Digital
   • Uses flip-flops


                                                    15
                       CompOrg - Memory Hierarchy
16
CompOrg - Memory Hierarchy
Stat
        ing
         RA
         M
        Stru
        ctur
          e




                             17
CompOrg - Memory Hierarchy
Stat
                         ic stable logic state
Transistor arrangement gives
St ate 1                RA
    • C high, C low
      1       2         M
    • T T off, T T on
      1   4       2   3
                       Ope
St ate 0
    • C high, C low
      2       1
                       rati
    • T T off, T T on
      2   3       1   4
                        on
Address line transistors T5 T6 is switch
Write – apply value to B & compliment t o B
Read – value is on line B



                                                       18
                          CompOrg - Memory Hierarchy
SRA
Both volatile                    M
                                 vs
   • Power needed t o preserve dat a
Dynamic cell
   • Simpler t o build, smaller
                                DR
   • More dense                 AM
   • Less ex pensive
   • Needs refresh
   • Larger memory unit s
St atic
   • Fast er
   • Cache




                                                    19
                       CompOrg - Memory Hierarchy
Nonvolatile
                memories
DRAM and SRAM are volatile memories
  • Lose information if powered off.
Nonvolatile memories retain value even if powered off.
  • Generic name is read-only memory (ROM).
  • Misleading because some ROMs can be read and modified.
Types of ROMs
  • Programmable ROM (PROM)
  • Eraseable programmable ROM (EPROM)
  • Electrically eraseable PROM (EEPROM)
  • Flash memory
Firmware
  • Program stored in a ROM
     – Boot time code, BIOS (basic input/ouput system)
     – graphics cards, disk controllers.
                                                             20
                           CompOrg - Memory Hierarchy
Rea
Permanent storage       d
   • Nonvolat ile
                       Onl
Microprogramming (see later)
Library subroutines
                         y
Systems programs (BIOS) Me
Function tables       m or
                         y
                      (RO
                        M)


                                              21
                 CompOrg - Memory Hierarchy
Typ
Written during manufacture       es
                                 of
   • Very ex pensive for small runs
Programmable (once)
   • PROM
                                 RO
   • Needs special equipment t o Mprogram
Read “mostly”
   • Erasable Programmable (EPROM)
       – Erased by UV
   • Elect rically Erasable (EEPROM)
       – Takes much longer to write than read
   • Flash memory
       – Erase whole memory electrically



                                                      22
                         CompOrg - Memory Hierarchy
Org
                             anis
A 16Mbit chip can be organised as 1M of 16 bit words
                             atio
A bit per chip system has 16 lot s of 1Mbit chip wit h
  bit 1 of each word in chip 1 and so on
                             n in as a 2048 x 2048 x
A 16Mbit chip can be organised
  4bit array                 det
                              ail column address
   • Reduces number of address pins
       – Multiplex row address and
      – 11 pins to address (2 11 =2048)
      – Adding one more pin doubles range of values so x4
        capacity (2 12 x4 Capacity with 2 11 )




                                                            23
                       CompOrg - Memory Hierarchy
Bus structure connecting
                CPU and memory
A bus is a collection of parallel wires that carry
 address, data, and control signals.
Buses are typically shared by multiple devices.

  CPU chip

             register file


                              ALU

                                        system bus        memory bus



                                                I/O                     main
      bus interface
                                               bridge                  memory

                                                                          24
                             CompOrg - Memory Hierarchy
Memory read transaction
                        (1)
CPU places address A on the memory bus.



         register file                   Load operation: movl A, %eax

                         ALU
     %eax

                                                          main memory
                                     I/O bridge                     0
                                                      A
     bus interface                                             x    A




                                                                        25
                         CompOrg - Memory Hierarchy
Memory read transaction
                        (2)
Main memory reads A from the memory bus, retreives
 word x, and places it on the bus.


        register file
                                          Load operation: movl A, %eax

                        ALU
    %eax

                                                           main memory
                                   I/O bridge          x             0

    bus interface                                               x   A




                                                                         26
                          CompOrg - Memory Hierarchy
Memory read transaction
CPU read word x from the(3) and copies it into
                         bus
 register %eax.


         register file                   Load operation: movl A, %eax

                         ALU
     %eax     x

                                                        main memory
                                    I/O bridge                    0

     bus interface                                            x     A




                                                                        27
                         CompOrg - Memory Hierarchy
Memory write transaction
                       (1)
CPU places address A on bus. Main memory reads it
and waits for the corresponding data word to arrive.


        register file                    Store operation: movl %eax, A

                        ALU
    %eax     y

                                                          main memory
                                   I/O bridge                      0
                                                      A
    bus interface                                                   A




                                                                         28
                         CompOrg - Memory Hierarchy
Memory write transaction
                        (2)
CPU places data word y on the bus.



        register file                   Store operation: movl %eax, A

                        ALU
    %eax     y

                                                          main memory
                                    I/O bridge                     0
                                                     y
    bus interface                                                       A




                                                                            29
                        CompOrg - Memory Hierarchy
Memory write transaction
                       (3)
Main memory read data word y from the bus and
stores it at address A.


         register file
                                         Store operation: movl %eax, A

                         ALU
     %eax     y

                                                          main memory
                                    I/O bridge                     0

     bus interface                                              y    A




                                                                         30
                         CompOrg - Memory Hierarchy
Disk
                     geometry
Disks consist of platters, each with two surfaces.
Each surface consists of concentric rings called tracks.
Each track consists of sectors separated by gaps.

  tracks
              surface
                                                     track k   gaps




              spindle




                                                     sectors

                                                                      31
                        CompOrg - Memory Hierarchy
Disk geometry (muliple-platter view)
Aligned tracks form a cylinder.


                                 cylinder k


         surface 0
                                              platter 0
         surface 1
         surface 2
                                              platter 1
         surface 3
         surface 4
                                              platter 2
         surface 5

                           spindle




                                                          32
                     CompOrg - Memory Hierarchy
Disk
                   capacity
Capacity: maximum number of bits that can be stored.
  • Vendors express capacity in units of gigabytes (GB), where 1 GB =
    10^6.
Capacity is determined by these technology factors:
  • Recording density (bits/in): number of bits that can be squeezed into
    a 1 inch segment of a track.
  • Track density (tracks/in): number of tracks that can be squeezed into
    a 1 inch radial segment.
  • Areal density (bits/in2): product of recording and track density.
Modern disks partition tracks into disjoint subsets
 called recording zones
  • Each track in a zone has the same number of sectors, determined by
    the circumference of innermost track.
  • Each zone has a different number of sectors/track


                                                                        33
                         CompOrg - Memory Hierarchy
Computing disk capacity
Capacity = (# bytes/sector) x (avg. # sectors/track) x
           (# tracks/surface) x (# surfaces/platter) x
           (# platters/disk)
Example:
  • 512 bytes/sector
  • 300 sectors/track (on average)
  • 20,000 tracks/surface
  • 2 surfaces/platter
  • 5 platters/disk


Capacity = 512 x 300 x 20000 x 2 x 5
         = 30,720,000,000
         = 30.72 GB
                                                      34
                         CompOrg - Memory Hierarchy
Disk operation (single-platter view)

The disk
                                                The read/write head
surface
                                                is attached to the end
spins at a fixed
                                                of the arm and flies over
rotational rate
                                                 the disk surface on
                                                a thin cushion of air.


                    spindle




                                            By moving radially, the arm
                                            can position the read/write
                                            head over any track.




                                                                            35
                   CompOrg - Memory Hierarchy
Disk operation (multi-platter view)
                            read/write heads
                             move in unison
                         from cylinder to cylinder




                                        arm




                       spindle




                                                     36
           CompOrg - Memory Hierarchy
Disk access
                        time
Average time to access some target sector
 approximated by :
  • Taccess = Tavg seek + Tavg rotation + Tavg transfer


Seek time
  • Time to position heads over cylinder containing target sector.
  • Typical Tavg seek = 9 ms

Rotational latency
  • Time waiting for first bit of target sector to pass under r/w head.
  • Tavg rotation = 1/2 x 1/RPMs x 60 sec/1 min

Transfer time
  • Time to read the bits in the target sector.
  • Tavg transfer = 1/RPM x 1/(avg # sectors/track) x 60 secs/1 min.

                                                                          37
                                CompOrg - Memory Hierarchy
Disk access time
Given:                  example
  • Rotational rate = 7,200 RPM
  • Average seek time = 9 ms.
  • Avg # sectors/track = 400.
Derived:
  • Tavg rotation = 1/2 x (60 secs/7200 RPM) x 1000 ms/sec = 4 ms.
  • Tavg transfer = 60/7200 RPM x 1/400 secs/track x 1000 ms/sec = 0.02 ms
  • Taccess = 9 ms + 4 ms + 0.02 ms

Important points:
  • Access time dominated by seek time and rotational latency.
  • First bit in a sector is the most expensive, the rest are free.
  • SRAM access time is about 4ns/doubleword, DRAM about 60 ns
     – Disk is about 40,000 times slower than SRAM,
     – 2,500 times slower then DRAM.
                                                                        38
                          CompOrg - Memory Hierarchy
Logical disk blocks
Modern disks present a simpler abstract view of the
 complex sector geometry:
  • The set of available sectors is modeled as a sequence of b-sized
    logical blocks (0, 1, 2, ...)
Mapping between logical blocks and actual (physical)
 sectors
  • Maintained by hardware/firmware device called disk controller.
  • Converts requests for logical blocks into (surface,track,sector)
    triples.
Allows controller to set aside spare cylinders for each
 zone.
  • Accounts for the difference in “formatted capacity” and “maximum
    capacity”.



                                                                       39
                         CompOrg - Memory Hierarchy
Bus structure connecting I/O and CPU
  CPU chip
             register file

                             ALU
                                    system bus       memory bus


                                            I/O                     main
      bus interface
                                           bridge                  memory




                                           I/O bus                Expansion slots for
                                                                  other devices such
         USB                 graphics                  disk       as network adapters.
       controller            adapter                 controller

     mouse keyboard          monitor
                                                           disk
                                                                               40
                              CompOrg - Memory Hierarchy
Reading a disk sector (1)
CPU chip
                                         CPU initiates a disk read by writing a
           register file
                                         command, logical block number, and
                                         destination memory address to a port
                           ALU
                                         (address) associated with disk controller.



                                                                    main
    bus interface
                                                                   memory


                                                         I/O bus




       USB                 graphics                 disk
     controller            adapter                controller

   mouse keyboard          monitor
                                                               disk
                                                                                  41
                                 CompOrg - Memory Hierarchy
Reading a disk sector (2)
CPU chip
                                           Disk controller reads the sector and
        register file
                                           performs a direct memory access (DMA)
                                           transfer into main memory.
                        ALU




                                                                       main
    bus interface
                                                                      memory


                                                            I/O bus




       USB              graphics                    disk
     controller         adapter                   controller

   mouse keyboard       monitor
                                                     disk
                                                                               42
                                   CompOrg - Memory Hierarchy
Reading a disk sector (3)
CPU chip
                                      When the DMA transfer completes, the
        register file
                                      disk controller notifies the CPU with an
                                      interrupt (i.e., asserts a special “interrupt”
                        ALU
                                      pin on the CPU)



                                                                    main
    bus interface
                                                                   memory


                                                         I/O bus




       USB              graphics                 disk
     controller         adapter                controller

   mouse keyboard       monitor
                                                  disk
                                                                                  43
                              CompOrg - Memory Hierarchy
Storage trends
       metric          1980     1985    1990   1995    2000    2000:1980

SRAM   $/MB            19,200   2,900   320    256     100     190
       access (ns)     300      150     35     15      2       100


       metric          1980     1985    1990   1995    2000    2000:1980

DRAM   $/MB             8,000   880     100    30      1       8,000
       access (ns)      375     200     100    70      60      6
       typical size(MB) 0.064   0.256   4      16      64      1,000


       metric          1980     1985    1990   1995    2000    2000:1980

       $/MB             500     100     8      0.30    0.05    10,000
Disk   access (ms)      87      75      28     10      8       11
       typical size(MB) 1       10      160    1,000   9,000   9,000


                                                                   44
                        (Culled from back issues of Byte and PC Magazine)
                          CompOrg - Memory Hierarchy
CPU clock
                            rates
                  1980      1985      1990      1995   2000    2000:1980
processor          8080     286       386       Pent   P-III
clock rate(MHz)   1         6         20        150    750     750
cycle time(ns)    1,000     166       50        6      1.6     750




                                                                       45
                          CompOrg - Memory Hierarchy
The CPU-Memory
                            Gap
     The increasing gap between DRAM, disk, and CPU
      speeds.

     100,000,000
      10,000,000
       1,000,000
                                                                 Disk seek time
        100,000
                                                                 DRAM access time
         10,000
ns




                                                                 SRAM access time
           1,000
                                                                 CPU cycle time
            100
             10
              1
                   1980   1985    1990      1995          2000
                                  year



                                                                            46
                             CompOrg - Memory Hierarchy
Memory
                 hierarchies
Some fundamental and enduring properties of
 hardware and software:
  • Fast storage technologies cost more per byte and have less
    capacity.
  • The gap between CPU and main memory speed is widening.
  • Well-written programs tend to exhibit good locality.


These fundamental properties complement each other
 beautifully.

Suggest an approach for organizing memory and
 storage systems known as a “memory hierarchy”.


                                                                 47
                         CompOrg - Memory Hierarchy
An example memory
                          hierarchy
                                       L0:
 Smaller,
  faster,                                registers     CPU registers hold words retrieved
   and                                                 from cache memory.
 costlier                        L1:    on-chip L1
(per byte)                             cache (SRAM)
 storage                                                   L1 cache holds cache lines
 devices                                                   retrieved from the L2 cache.
                           L2:          off-chip L2
                                       cache (SRAM)               L2 cache holds cache lines
                                                                  retrieved from memory.

                     L3:               main memory
                                         (DRAM)
 Larger,                                                                  Main memory holds disk
 slower,                                                                  blocks retrieved from local
   and                                                                    disks.
 cheaper                     local secondary storage
(per byte)     L4:
 storage                           (local disks)
 devices                                                                         Local disks hold files
                                                                                 retrieved from disks on
                                                                                 remote network servers.

         L5:                 remote secondary storage
                      (distributed file systems, Web servers)
                                                                                               48
                                  CompOrg - Memory Hierarchy
Cache
                           s
Cache: A smaller, faster storage device that acts as a
 staging area for a subset of the data in a larger,
 slower device.

Fundamental idea of a memory hierarchy:
  • For each k, the faster, smaller device at level k serves as a cache for
    the larger, slower device at level k+1.


Why do memory hierarchies work?
  • Programs tend to access the data at level k more often than they
    access the data at level k+1.
  • Thus, the storage at level k+1 can be slower, and thus larger and
    cheaper per bit.
  • Net effect: A large pool of memory that costs as much as the cheap
    storage near the bottom, but that serves data to programs at the rate
    of the fast storage near the top.                                49
                          CompOrg - Memory Hierarchy
Cac
                        he
Small amount of fast memory
Sits between normal main memory and CPU
May be located on CPU chip or module




                                              50
                 CompOrg - Memory Hierarchy
Cac
       he/
       Mai
        n
       Me
       mor
        y
       Stru
       ctur
        e

                             51
CompOrg - Memory Hierarchy
Cac
CPU requests contents of he memory location
Check cache for this dataope
If present, get from cacherati
                            (fast)
If not present, read required block from main memory
   to cache               on –
                           ove
Then deliver from cache to CPU
                          rvie
Cache includes tags to identify which block of main
   memory is in each cache slot
                             w



                                                   52
                   CompOrg - Memory Hierarchy
Cac
         he
        Rea
         d
        Ope
        rati
        on -
        Flo
        wch
        art

                             53
CompOrg - Memory Hierarchy
Cac
Size                      he
Mapping Function         Des
Replacement Algorithm    ign
Write Policy
Block Size
Number of Caches




                                              54
                 CompOrg - Memory Hierarchy
Size
Cost                           doe
  • More cache is expensive
                                  s
Speed
                               mat
  • More cache is faster (up to a point)
                                ter
  • Checking cache for data takes time




                                                     55
                        CompOrg - Memory Hierarchy
Typi
        cal
       Cac
        he
       Org
       aniz
       atio
         n



                             56
CompOrg - Memory Hierarchy
Co
  Processor            Type
                                        Year ofmpa    L1 cachea         L2 cache      L3 cache
                                     Introduction
  IBM 360/85
  PDP-11/70
                    Mainframe
                   Minicomputer
                                         1968
                                        1975
                                               riso   16 to 32 KB
                                                         1 KB
                                                                          —
                                                                          —
                                                                                        —
                                                                                        —
 VAX 11/780
  IBM 3033
                   Minicomputer
                    Mainframe
                                        1978
                                        1978
                                               n of     16 KB
                                                        64 KB
                                                                          —
                                                                          —
                                                                                        —
                                                                                        —
  IBM 3090
  Intel 80486
                    Mainframe
                        PC
                                        1985
                                        1989
                                               Cac   128 to 256 KB
                                                         8 KB
                                                                          —
                                                                          —
                                                                                        —
                                                                                        —
   Pentium
 PowerPC 601
                        PC
                        PC
                                        1993
                                        1993
                                                he    8 KB/8 KB
                                                        32 KB
                                                                     256 to 512 KB
                                                                          —
                                                                                        —
                                                                                        —
 PowerPC 620
 PowerPC G4
                        PC
                     PC/server
                                        1996
                                        1999
                                               Size  32 KB/32 KB
                                                     32 KB/32 KB
                                                                          —
                                                                     256 KB to 1 MB
                                                                                        —
                                                                                       2 MB
IBM S/390 G4
IBM S/390 G6
                    Mainframe
                    Mainframe
                                        1997
                                        1999
                                                 s      32 KB
                                                        256 KB
                                                                        256 KB
                                                                         8 MB
                                                                                       2 MB
                                                                                        —
  Pentium 4          PC/server          2000          8 KB/8 KB         256 KB          —
                  High-end server/
   IBM SP                               2000         64 KB/32 KB         8 MB           —
                   supercomputer
 CRAY MTAb        Supercomputer         2000             8 KB            2 MB           —
    Itanium          PC/server          2001         16 KB/16 KB         96 KB         4 MB
SGI Origin 2001   High-end server       2001         32 KB/32 KB         4 MB           —
   Itanium 2         PC/server          2002            32 KB           256 KB         6 MB
IBM POWER5        High-end server       2003            64 KB           1.9 MB         36 MB
                                                                                       57
 CRAY XD-1        Supercomputer      CompOrg - Memory Hierarchy
                                       2004         64 KB/64 KB          1MB            —
Map
Cache of 64kByte                   pin
Cache block of 4 bytes               g
   • i.e. cache is 16k (2 ) lines of 4 bytes
                         14


16MBytes main memory
                                  Fun
24 bit address                    ctio
   • (2 =16M)
        24                           n




                                                       58
                          CompOrg - Memory Hierarchy
Dire
                                      ct
Each block of main memory maps to only one cache
  line
                                   Map in one specific place
   • i.e. if a block is in cache, it must be
Address is in two parts pin
                                      g
Least Significant w bits identify unique word
Most Significant s bits specify one memory block
The MSBs are split into a cache line field r and a tag of
  s-r (most significant)




                                                               59
                      CompOrg - Memory Hierarchy
Dire
                                       ct
 Tag s-r
                                     MapSlot r
                                     Line or               Word w
                                      pin 14                      2
        8
                                        g
24 bit address
2 bit word identifier (4 byte block) Add
22 bit block identifier
    •   8 bit tag (=22-14)
                                      res
    •   14 bit slot or line             s
No two blocks in the same line have the same Tag field
                                     Stru
Check contents of cache by finding line and checking Tag
                                     ctur
                                        e
                                                             60
                              CompOrg - Memory Hierarchy
Dire
Cache line            ct
             Main Memory blocks held
0            0, m, 2m, 3m…2s-m
1
                    Map
             1,m+1, 2m+1…2s-m+1
                     pin
m-1                   g
             m-1, 2m-1,3m-1…2s-1

                    Cac
                     he
                    Line
                    Tabl
                       e

                                          61
             CompOrg - Memory Hierarchy
Dire
        ct
       Map
       pin
        g
       Cac
        he
       Org
       aniz
       atio
        n
                             62
CompOrg - Memory Hierarchy
Direct Mapping Example




                                  63
     CompOrg - Memory Hierarchy
Dire
                            ct
Address length = (s + w) bits
                           Map
Number of addressable units = 2s+w words or bytes
Block size = line size = 2w words or bytes
                           pin
Number of blocks in main memory = 2s+ w/2w = 2s
                            g
Number of lines in cache = m = 2r
Size of tag = (s – r) bits
                           Su
                           mm
                           ary



                                                    64
                  CompOrg - Memory Hierarchy
Dire
Simple                           ct
Inexpensive                    Map
Fixed location for given block  pin
   • If a program accesses 2 blocks that map to the same line
     repeatedly, cache misses areg high
                                  very

                                pro
                                s&
                               con
                                 s


                                                                65
                         CompOrg - Memory Hierarchy
Ass
                         ocia
A main memory block can load into any line of cache
                          tive
Memory address is interpreted as tag and word
Tag uniquely identifies block of memory
                         Map
Every line’s tag is examined for a match
                          pin
Cache searching gets expensive
                            g




                                                  66
                  CompOrg - Memory Hierarchy
Full
         y
       Ass
       ocia
       tive
       Cac
        he
       Org
       aniz
       atio
         n
                             67
CompOrg - Memory Hierarchy
Assosiative Mapping Example




                                         68
            CompOrg - Memory Hierarchy
Ass
                                ocia
                                tive
                         Tag 22 bit                               Word
                                Map                               2 bit
                                pin of data
22 bit tag stored with each 32 bit block
                                  g
Compare tag field with tag entry in cache to check for hit
                                Add
Least significant 2 bits of address identify which 16 bit word is
   required from 32 bit data block
e.g.                            res
     • Address
     • FFFFFC
                        Tag
                                  s Data
                        FFFFFC 246824683FFF
                                                       Cache line


                                Stru
                                ctur
                                  e
                        CompOrg - Memory Hierarchy
                                                                 69
Ass
Address length = (s + w) ocia
                          bits
                          tive
Number of addressable units = 2s+w words or bytes
Block size = line size = 2w words or bytes
                          Map
Number of blocks in main memory = 2s+ w/2w = 2s
                           pin
Number of lines in cache = undetermined
Size of tag = s bits
                            g
                           Su
                          mm
                           ary


                                                    70
                  CompOrg - Memory Hierarchy
Set
                                Ass
Cache is divided into a number of sets
                               ocia
Each set contains a number of lines
A given block maps to any line in a given set
                                tive
   • e.g. Block B can be in any line of set i
e.g. 2 lines per set           Map
   • 2 way associative mapping pin
   • A given block can be in one of 2 lines in only one set
                                   g




                                                              71
                         CompOrg - Memory Hierarchy
Set
13 bit set number       Ass
                       ocia
Block number in main memory is modulo 2       13


000000, 00A000, 00B000, tive … map to same set
                        00C000

                       Map
                        pin
                          g
                        Exa
                        mpl
                          e

                                                   72
                 CompOrg - Memory Hierarchy
Two Way Set Associative Cache
        Organization




                                     73
        CompOrg - Memory Hierarchy
Set
                           Ass
                           ocia                       Word
 Tag 9 bit                    Set 13 bit
                           tive                       2 bit
                           Map
Use set field to determine cache set to look in
                            pin
Compare tag field to see if we have a hit
e.g                          g
    • Address       Tag    Data          Set number
    • 1FF 7FFC 1FF
                           Add 1FFF
                    12345678
    • 001 7FFC 001  11223344res 1FFF
                             s
                           Stru
                           ctur
                     CompOrg - Memory Hierarchy
                                                      74
Two Way set Assosiative Mapping Example


             Two Way
               Set
            Associative
             Mapping
             Example




                                          75
             CompOrg - Memory Hierarchy
Set
                           Ass
Address length = (s + w) bits
                           ocia
Number of addressable units = 2s+w words or bytes
Block size = line size = 2w words or bytes
                           tive
Number of blocks in main memory = 2d
                           Map
Number of lines in set = k
Number of sets = v = 2d
                           pin
Number of lines in cache = g = k * 2d
                             kv
Size of tag = (s – d) bits Su
                           mm
                           ary

                                                    76
                  CompOrg - Memory Hierarchy
Rep
No choice              lace
Each block only maps tomen
                        one line
Replace that line         t
                        Alg
                        orit
                       hms
                         (1)
                       Dire
                          ct
                       map
                        pin
                  CompOrg - Memory Hierarchy
                                               77
Replacement Algorithms (2)
            Associative & Set Associative
Hardware implemented algorithm (speed)
Least Recently used (LRU)
e.g. in 2 way set associative
   • Which of the 2 block is lru?
First in first out (FIFO)
   • replace block that has been in cache longest
Least frequently used
   • replace block which has had fewest hits
Random



                                                     78
                        CompOrg - Memory Hierarchy
Writ
Must not overwrite a cachee
                          block unless main memory
   is up to date
                        Poli
Multiple CPUs may have individual caches
                         cy directly
I/O may address main memory




                                                79
                 CompOrg - Memory Hierarchy
Writ
                           e
All writes go to main memory as well as cache
                        thro
Multiple CPUs can monitor main memory traffic to keep
  local (to CPU) cache up to date
Lots of traffic
                        ugh
Slows down writes

Remember bogus write through caches!




                                                  80
                    CompOrg - Memory Hierarchy
Writ
                             e
Updates initially made in cache only
                           bac
Update bit for cache slot is set when update occurs
If block is to be replaced, write to main memory only if
                             k
   update bit is set
Other caches get out of sync
I/O must access main memory through cache
N.B. 15% of memory references are writes




                                                      81
                    CompOrg - Memory Hierarchy
Caching in a memory hierarchy
                                                 Smaller, faster, more expensive
    Level k:   4     9     14        3           device at level k caches a
                                                 subset of the blocks from level k+1

                          Data is copied between
                          levels in block-sized transfer
                          units




                0    1      2        3

                4    5      6        7           Larger, slower, cheaper storage
Level k+1:
                                                 device at level k+1 is partitioned
                8    9      10       11          into blocks.

                12   13     14       15




                                                                              82
                          CompOrg - Memory Hierarchy
General caching concepts
  Level k:                        Program needs object d,
      4      9    14   3
                                   which is stored in some
                                   block b.

                                  Cache hit
                                     • Program finds b in the cache
Level k+1:                             at level k. E.g. block 14.
                                  Cache miss
       0     1     2   3
                                     • b is not at level k, so level k
       4     5     6   7               cache must fetch it from level
                                       k+1. E.g. block 12.
       8     9    10   11
                                     • If level k cache is full, then
      12     13   14   15              some current block must be
                                       replaced (evicted).
                                       • Which one? Determined by
                                         replacement policy. E.g. evict
                                         least recently used block. 83
                       CompOrg - Memory Hierarchy
General caching concepts
Types of cache misses:
  • Cold (compulsary) miss
     – Cold misses occur because the cache is empty.
  • Conflict miss
     – Most caches limit blocks at level k+1 to a small subset (sometimes a
       singleton) of the block positions at level k.
     – E.g. Block i at level k+1 must be placed in block (i mod 4) at level k+1.
     – Conflict misses occur when the level k cache is large enough, but
       multiple data objects all map to the same level k block.
     – E.g. Referencing blocks 0, 8, 0, 8, 0, 8, ... would miss every time.
  • Capacity miss
     – Occurs when the set of active cache blocks (working set) is larger than
       the cache.




                                                                              84
                           CompOrg - Memory Hierarchy
Examples of caching in the hierarchy
Cache Type       What Cached            Where Cached           Latency        Managed
                                                               (cycles)       By
Registers        4-byte word             CPU registers                     0 Compiler
TLB              Address                On-Chip TLB                        0 Hardware
                 translations
L1 cache         32-byte block          On-Chip L1                          1 Hardware
L2 cache         32-byte block          Off-Chip L2                        10 Hardware
Virtual Memory   4-KB page              Main memory                       100 Hardware+
                                                                              OS
Buffer cache     Parts of files         Main memory                       100 OS
Network buffer   Parts of files         Local disk                 10,000,000 AFS/NFS
cache                                                                         client
Browser cache    Web pages              Local disk                 10,000,000 Web
                                                                              browser
Web cache        Web pages              Remote server           1,000,000,000 Web proxy
                                        disks                                 server

                                                                                 85
                                  CompOrg - Memory Hierarchy
Pen
80386 – no on chip cache        tiu
80486 – 8k using 16 byte lines and four way set associative
  organization                m4
                              Cac
Pentium (all versions) – two on chip L1 caches
   • Data & instructions
                                he
Pentium III – L3 cache added off chip
Pentium 4
   •   L1 caches
        – 8k bytes
        – 64 byte lines
        – four way set associative
   •   L2 cache
        – Feeding both L1 caches
        – 256k
        – 128 byte lines
        – 8 way set associative
   •   L3 cache on chip
                                                              86
                             CompOrg - Memory Hierarchy
Intel
Problem
                                                               Cac            Solution
                                                                                                        Processor on which feature
                                                                                                               first appears


External memory slower than the system bus.
                                                                 he
                                                                Add external cache using faster                    386
                                                                memory technology.


Increased processor speed results in external bus becoming a
                                                               Evo
                                                                Move external cache on-chip,                       486
bottleneck for cache access.
                                                                luti
                                                                operating at the same speed as the
                                                                processor.

Internal cache is rather small, due to limited space on chip     on
                                                                Add external L2 cache using faster
                                                                technology than main memory
                                                                                                                   486


Contention occurs when both the Instruction Prefetcher and      Create separate data and instruction             Pentium
the Execution Unit simultaneously require access to the         caches.
cache. In that case, the Prefetcher is stalled while the
Execution Unit’s data access takes place.

                                                                Create separate back-side bus that             Pentium Pro
                                                                runs at higher speed than the main
Increased processor speed results in external bus becoming a    (front-side) external bus. The BSB is
bottleneck for L2 cache access.                                 dedicated to the L2 cache.

                                                                Move L2 cache on to the processor               Pentium II
                                                                chip.

Some applications deal with massive databases and must        Add external L3 cache.                           Pentium III
have rapid access to large amounts of data. The on-chip
caches are too small.
                                                                                                                         87
                                                    CompOrg - Memorycache on-chip.
                                                              Move L3 Hierarchy                                 Pentium 4
Pen
        tiu
        m4
        Blo
         ck
        Dia
        gra
         m



                             88
CompOrg - Memory Hierarchy
Pen
Fetch/Decode Unit                    tiu
    • Fetches instructions from L2 cache
    • Decode into micro-ops         m4
    • Store micro-ops in L1 cache
                                    Cor
Out of order execution logic
    • Schedules micro-ops             e
                                    Pro
    • Based on data dependence and resources
    • May speculatively execute
Execution units                     ces
    • Execute micro-ops
    • Data from L1 cache
                                    sor
    •   Results in registers
Memory subsystem
    •   L2 cache and systems bus



                                                            89
                               CompOrg - Memory Hierarchy
Pen
                                    tiu
Decodes instructions into RISC like micro-ops before L1 cache
Micro-ops fixed length
    •                              m4
        Superscalar pipelining and scheduling

                                  Des
Pentium instructions long & complex
Performance improved by separating decoding from scheduling &
   pipelining
    •   (More later – ch14)
                                   ign
Data cache is write back
    •
                                  Rea
        Can be configured to write through
                                  soni
L1 cache controlled by 2 bits in register
    •   CD = cache disable
    •   NW = not write through      ng
    •   2 instructions to invalidate (flush) cache and write back then invalidate
L2 and L3 8-way set-associative
    •   Line size 128 bytes




                                                                                    90
                                 CompOrg - Memory Hierarchy
Pow
                                erP
601 – single 32kb 8 way set associative
                                 C
603 – 16kb (2 x 8kb) two way set associative
604 – 32kb                      Cac
620 – 64kb
                                 he
G3 & G4
   • 64kb L1 cache              Org
      – 8 way set associative
   • 256k, 512k or 1M L2 cache
                                aniz
      – two way set associative atio
G5                               n
   • 32kB instruction cache
   • 64kB data cache



                                                    91
                       CompOrg - Memory Hierarchy
Pow
       erP
        C
        G5
       Blo
        ck
       Dia
       gra
        m


                             92
CompOrg - Memory Hierarchy

More Related Content

What's hot

Design and Simulation Low power SRAM Circuits
Design and Simulation Low power SRAM CircuitsDesign and Simulation Low power SRAM Circuits
Design and Simulation Low power SRAM Circuits
ijsrd.com
 
301378156 design-of-sram-in-verilog
301378156 design-of-sram-in-verilog301378156 design-of-sram-in-verilog
301378156 design-of-sram-in-verilog
Srinivas Naidu
 
SRAM Design
SRAM DesignSRAM Design
SRAM Design
Bharat Biyani
 
Static and Dynamic Read/Write memories
Static and Dynamic Read/Write memoriesStatic and Dynamic Read/Write memories
Static and Dynamic Read/Write memories
Abhilash Nair
 
Semiconductor Memories
Semiconductor MemoriesSemiconductor Memories
Semiconductor Memories
melisha monteiro
 
Project Report Of SRAM Design
Project Report Of SRAM DesignProject Report Of SRAM Design
Project Report Of SRAM Design
Aalay Kapadia
 
Memory map selection of real time sdram controller using verilog full project...
Memory map selection of real time sdram controller using verilog full project...Memory map selection of real time sdram controller using verilog full project...
Memory map selection of real time sdram controller using verilog full project...
rahul kumar verma
 
DRAM
DRAMDRAM
Project Report Multilevel Cache
Project Report Multilevel CacheProject Report Multilevel Cache
Project Report Multilevel Cache
guest136ed3
 
250nm Technology Based Low Power SRAM Memory
250nm Technology Based Low Power SRAM Memory250nm Technology Based Low Power SRAM Memory
250nm Technology Based Low Power SRAM Memory
iosrjce
 
SRAM
SRAMSRAM
Sram memory design
Sram memory designSram memory design
Sram memory design
NIT Goa
 
Time and Low Power Operation Using Embedded Dram to Gain Cell Data Retention
Time and Low Power Operation Using Embedded Dram to Gain Cell Data RetentionTime and Low Power Operation Using Embedded Dram to Gain Cell Data Retention
Time and Low Power Operation Using Embedded Dram to Gain Cell Data Retention
IJMTST Journal
 
SRAM DRAM
SRAM DRAMSRAM DRAM
SRAM DRAM
Tipu Sultan
 
Talk on Parallel Computing at IGWA
Talk on Parallel Computing at IGWATalk on Parallel Computing at IGWA
Talk on Parallel Computing at IGWA
Dishant Ailawadi
 
DRAM Cell - Working and Read and Write Operations
DRAM Cell - Working and Read and Write OperationsDRAM Cell - Working and Read and Write Operations
DRAM Cell - Working and Read and Write Operations
Naman Bhalla
 

What's hot (18)

Design and Simulation Low power SRAM Circuits
Design and Simulation Low power SRAM CircuitsDesign and Simulation Low power SRAM Circuits
Design and Simulation Low power SRAM Circuits
 
International Journal of Engineering Inventions (IJEI)
International Journal of Engineering Inventions (IJEI)International Journal of Engineering Inventions (IJEI)
International Journal of Engineering Inventions (IJEI)
 
301378156 design-of-sram-in-verilog
301378156 design-of-sram-in-verilog301378156 design-of-sram-in-verilog
301378156 design-of-sram-in-verilog
 
SRAM Design
SRAM DesignSRAM Design
SRAM Design
 
Static and Dynamic Read/Write memories
Static and Dynamic Read/Write memoriesStatic and Dynamic Read/Write memories
Static and Dynamic Read/Write memories
 
Semiconductor Memories
Semiconductor MemoriesSemiconductor Memories
Semiconductor Memories
 
Project Report Of SRAM Design
Project Report Of SRAM DesignProject Report Of SRAM Design
Project Report Of SRAM Design
 
Memory map selection of real time sdram controller using verilog full project...
Memory map selection of real time sdram controller using verilog full project...Memory map selection of real time sdram controller using verilog full project...
Memory map selection of real time sdram controller using verilog full project...
 
DRAM
DRAMDRAM
DRAM
 
Project Report Multilevel Cache
Project Report Multilevel CacheProject Report Multilevel Cache
Project Report Multilevel Cache
 
250nm Technology Based Low Power SRAM Memory
250nm Technology Based Low Power SRAM Memory250nm Technology Based Low Power SRAM Memory
250nm Technology Based Low Power SRAM Memory
 
SRAM
SRAMSRAM
SRAM
 
Sram memory design
Sram memory designSram memory design
Sram memory design
 
Time and Low Power Operation Using Embedded Dram to Gain Cell Data Retention
Time and Low Power Operation Using Embedded Dram to Gain Cell Data RetentionTime and Low Power Operation Using Embedded Dram to Gain Cell Data Retention
Time and Low Power Operation Using Embedded Dram to Gain Cell Data Retention
 
SRAM DRAM
SRAM DRAMSRAM DRAM
SRAM DRAM
 
Talk on Parallel Computing at IGWA
Talk on Parallel Computing at IGWATalk on Parallel Computing at IGWA
Talk on Parallel Computing at IGWA
 
Lecture2
Lecture2Lecture2
Lecture2
 
DRAM Cell - Working and Read and Write Operations
DRAM Cell - Working and Read and Write OperationsDRAM Cell - Working and Read and Write Operations
DRAM Cell - Working and Read and Write Operations
 

Similar to Memory hir

COMPUTER ORGANIZATION NOTES Unit 5
COMPUTER ORGANIZATION NOTES Unit 5COMPUTER ORGANIZATION NOTES Unit 5
COMPUTER ORGANIZATION NOTES Unit 5
Dr.MAYA NAYAK
 
Volatile memory
Volatile memoryVolatile memory
Volatile memory
Simon Paul
 
Digital electronics and logic design
Digital electronics and logic designDigital electronics and logic design
Digital electronics and logic designToufiq Akbar
 
Internal memory
Internal memoryInternal memory
Internal memory
Riya Choudhary
 
Memory (Part 3)
Memory (Part 3)Memory (Part 3)
Memory (Part 3)
Ajeng Savitri
 
ROM
ROMROM
Chapter5 the memory-system-jntuworld
Chapter5 the memory-system-jntuworldChapter5 the memory-system-jntuworld
Chapter5 the memory-system-jntuworld
Praveen Kumar
 
Memory Interfaces & Controllers - Sandeep Kulkarni, Lattice
Memory Interfaces & Controllers - Sandeep Kulkarni, LatticeMemory Interfaces & Controllers - Sandeep Kulkarni, Lattice
Memory Interfaces & Controllers - Sandeep Kulkarni, LatticeFPGA Central
 
Memory And Storages
Memory And StoragesMemory And Storages
Memory And Storages
Arsalan Qureshi
 
Ch05 coa9e
Ch05 coa9eCh05 coa9e
Ch05 coa9e
Thodoris Skylatos
 
IRJET- Design And VLSI Verification of DDR SDRAM Controller Using VHDL
IRJET- Design And VLSI Verification of DDR SDRAM Controller Using VHDLIRJET- Design And VLSI Verification of DDR SDRAM Controller Using VHDL
IRJET- Design And VLSI Verification of DDR SDRAM Controller Using VHDL
IRJET Journal
 
Random access memory
Random access memoryRandom access memory
Random access memory
TanvirHossain133
 
Cpu spec
Cpu specCpu spec
Cpu spec
Sinu Jose
 
Computer Organisation and Architecture
Computer Organisation and ArchitectureComputer Organisation and Architecture
Computer Organisation and Architecture
Subhasis Dash
 
Computer organization memory
Computer organization memoryComputer organization memory
Computer organization memory
Deepak John
 
Microelectronics U4.pptx.ppt
Microelectronics U4.pptx.pptMicroelectronics U4.pptx.ppt
Microelectronics U4.pptx.ppt
PavikaSharma3
 
Memory module
Memory moduleMemory module
Memory modulelemar12
 

Similar to Memory hir (20)

COMPUTER ORGANIZATION NOTES Unit 5
COMPUTER ORGANIZATION NOTES Unit 5COMPUTER ORGANIZATION NOTES Unit 5
COMPUTER ORGANIZATION NOTES Unit 5
 
Volatile memory
Volatile memoryVolatile memory
Volatile memory
 
Digital electronics and logic design
Digital electronics and logic designDigital electronics and logic design
Digital electronics and logic design
 
Internal memory
Internal memoryInternal memory
Internal memory
 
Memoryhierarchy
MemoryhierarchyMemoryhierarchy
Memoryhierarchy
 
Memory (Part 3)
Memory (Part 3)Memory (Part 3)
Memory (Part 3)
 
ROM
ROMROM
ROM
 
Chapter5 the memory-system-jntuworld
Chapter5 the memory-system-jntuworldChapter5 the memory-system-jntuworld
Chapter5 the memory-system-jntuworld
 
Memory Interfaces & Controllers - Sandeep Kulkarni, Lattice
Memory Interfaces & Controllers - Sandeep Kulkarni, LatticeMemory Interfaces & Controllers - Sandeep Kulkarni, Lattice
Memory Interfaces & Controllers - Sandeep Kulkarni, Lattice
 
Memory And Storages
Memory And StoragesMemory And Storages
Memory And Storages
 
Ch05 coa9e
Ch05 coa9eCh05 coa9e
Ch05 coa9e
 
IRJET- Design And VLSI Verification of DDR SDRAM Controller Using VHDL
IRJET- Design And VLSI Verification of DDR SDRAM Controller Using VHDLIRJET- Design And VLSI Verification of DDR SDRAM Controller Using VHDL
IRJET- Design And VLSI Verification of DDR SDRAM Controller Using VHDL
 
Random access memory
Random access memoryRandom access memory
Random access memory
 
Cpu spec
Cpu specCpu spec
Cpu spec
 
Computer Organisation and Architecture
Computer Organisation and ArchitectureComputer Organisation and Architecture
Computer Organisation and Architecture
 
Computer organization memory
Computer organization memoryComputer organization memory
Computer organization memory
 
Dram and its types
Dram and its typesDram and its types
Dram and its types
 
Microelectronics U4.pptx.ppt
Microelectronics U4.pptx.pptMicroelectronics U4.pptx.ppt
Microelectronics U4.pptx.ppt
 
Memory module
Memory moduleMemory module
Memory module
 
natchatra
natchatranatchatra
natchatra
 

Recently uploaded

Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
Multithreading_in_C++ - std::thread, race condition
Multithreading_in_C++ - std::thread, race conditionMultithreading_in_C++ - std::thread, race condition
Multithreading_in_C++ - std::thread, race condition
Mohammed Sikander
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
DeeptiGupta154
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
Atul Kumar Singh
 
Azure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHatAzure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHat
Scholarhat
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
Celine George
 
Introduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp NetworkIntroduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp Network
TechSoup
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
DhatriParmar
 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
Peter Windle
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
Jisc
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
Sandy Millin
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
Vikramjit Singh
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
Special education needs
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
Vivekanand Anglo Vedic Academy
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Pavel ( NSTU)
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Balvir Singh
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
Nguyen Thanh Tu Collection
 
Normal Labour/ Stages of Labour/ Mechanism of Labour
Normal Labour/ Stages of Labour/ Mechanism of LabourNormal Labour/ Stages of Labour/ Mechanism of Labour
Normal Labour/ Stages of Labour/ Mechanism of Labour
Wasim Ak
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
vaibhavrinwa19
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 

Recently uploaded (20)

Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
 
Multithreading_in_C++ - std::thread, race condition
Multithreading_in_C++ - std::thread, race conditionMultithreading_in_C++ - std::thread, race condition
Multithreading_in_C++ - std::thread, race condition
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
 
Azure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHatAzure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHat
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
 
Introduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp NetworkIntroduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp Network
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
 
Normal Labour/ Stages of Labour/ Mechanism of Labour
Normal Labour/ Stages of Labour/ Mechanism of LabourNormal Labour/ Stages of Labour/ Mechanism of Labour
Normal Labour/ Stages of Labour/ Mechanism of Labour
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 

Memory hir

  • 2. Sem icon duc tor Me m or y Typ es 2 CompOrg - Memory Hierarchy
  • 3. Sem RAM icon duc • Misnamed as all semiconduct or memory is random access • Read/ Writ e • Volat ile tor • Temporary st orage Me • St at ic or dynamic m or y 3 CompOrg - Memory Hierarchy
  • 4. 4 CompOrg - Memory Hierarchy
  • 5. Random-Access Memory (RAM) Key features • RAM adalah sebagai kemasan chip. • Dasar penyimpanan unit adalah sel (satu bit per sel). • Beberapa chip RAM membentuk memori Static RAM (SRAM) • Setiap sel menyimpan bit dengan enam transistor-sirkuit. • Nilai tetap tentu, selama ini disimpan daya. • Relatif kebal untuk gangguan listrik seperti kebisingan. • Lebih cepat dan lebih mahal daripada DRAM. Dynamic RAM (DRAM) • Setiap sel menyimpan bit dengan kapasitor dan transistor. • Nilai harus refresh setiap 10-100 ms. • Sensitif terhadap gangguan. • Lambat dan lebih murah dibandingkan SRAM. 5 CompOrg - Memory Hierarchy
  • 6. SRAM vs DRAM summary Tran. Access per bit time Persist? Sensitive? Cost Applications SRAM 6 1X Yes No 100x cache memories DRAM 1 10X No Yes 1X Main memories, frame buffers 6 CompOrg - Memory Hierarchy
  • 7. Dyn Bit s st ored as charge in capacit ors i am Charges leak Need refreshing even when powered c Simpler const ruct ion RA Smaller per bit M Less ex pensive Need refresh circuit s Slower Main memory Essent ially analogue • Level of charge det ermines value 7 CompOrg - Memory Hierarchy
  • 8. 8 CompOrg - Memory Hierarchy
  • 9. DR Address line active when bit read AM or writt en Ope • Transist or swit ch closed (current flows) Write rati • Volt age t o bit line – High for 1 low for 0 on • Then signal address line – Transfers charge to capacitor Read • Address line select ed – transistor turns on • Charge from capacit or fed via bit line t o sense amplifier – Compares with reference value to determine 0 or 1 • Capacit or charge must be rest ored 9 CompOrg - Memory Hierarchy
  • 10. Conventional DRAM organization d x w DRAM: • dw total bits organized as d supercells of size w bits 16 x 8 DRAM chip cols 0 1 2 3 2 bits 0 / addr 1 rows memory 2 supercell controller (to CPU) (2,1) 3 8 bits / data internal row buffer 10 CompOrg - Memory Hierarchy
  • 11. Reading DRAM supercell (2,1) Step 1(a): Row access strobe (RAS) selects row 2. Step 1(b): Row 2 copied from DRAM array to row buffer. 16 x 8 DRAM chip cols 0 1 2 3 RAS = 2 2 / 0 addr 1 rows memory controller 2 8 3 / data row 2 internal row buffer 11 CompOrg - Memory Hierarchy
  • 12. Reading DRAM supercell (2,1) Step 2(a): Column access strobe (CAS) selects column 1. Step 2(b): Supercell (2,1) copied from buffer to data lines, and eventually back to the CPU. 16 x 8 DRAM chip cols 0 1 2 3 CAS = 1 2 / 0 addr 1 rows memory controller supercell 2 (2,1) 8 3 / data internal row buffer 12 CompOrg - Memory Hierarchy
  • 13. Memory modules addr (row = i, col = j) : supercell (i,j) DRAM 0 64 MB memory module consisting of DRAM 7 eight 8Mx8 DRAMs data bits bits bits bits bits bits bits bits 56-63 48-55 40-47 32-39 24-31 16-23 8-15 0-7 63 56 55 48 47 40 39 32 31 24 23 16 15 8 7 0 Memory controller 64-bit doubleword at main memory address A 64-bit doubleword to CPU chip 13 CompOrg - Memory Hierarchy
  • 14. Enhanced DRAMs All enhanced DRAMs are built around the conventional DRAM core. • Fast page mode DRAM (FPM DRAM) – Access contents of row with [RAS, CAS, CAS, CAS, CAS] instead of [(RAS,CAS), (RAS,CAS), (RAS,CAS), (RAS,CAS)]. • Extended data out DRAM (EDO DRAM) – Enhanced FPM DRAM with more closely spaced CAS signals. • Synchronous DRAM (SDRAM) – Driven with rising clock edge instead of asynchronous control signals. • Double data-rate synchronous DRAM (DDR SDRAM) – Enhancement of SDRAM that uses both clock edges as control signals. • Video RAM (VRAM) – Like FPM DRAM, but output is produced by shifting row buffer – Dual ported (allows concurrent reads and writes) 14 CompOrg - Memory Hierarchy
  • 15. Stat ic Bit s stored as on/ off swit ches No charges t o leak RA No refreshing needed when powered M More complex construction Larger per bit More ex pensive Does not need refresh circuits Fast er Cache Digital • Uses flip-flops 15 CompOrg - Memory Hierarchy
  • 16. 16 CompOrg - Memory Hierarchy
  • 17. Stat ing RA M Stru ctur e 17 CompOrg - Memory Hierarchy
  • 18. Stat ic stable logic state Transistor arrangement gives St ate 1 RA • C high, C low 1 2 M • T T off, T T on 1 4 2 3 Ope St ate 0 • C high, C low 2 1 rati • T T off, T T on 2 3 1 4 on Address line transistors T5 T6 is switch Write – apply value to B & compliment t o B Read – value is on line B 18 CompOrg - Memory Hierarchy
  • 19. SRA Both volatile M vs • Power needed t o preserve dat a Dynamic cell • Simpler t o build, smaller DR • More dense AM • Less ex pensive • Needs refresh • Larger memory unit s St atic • Fast er • Cache 19 CompOrg - Memory Hierarchy
  • 20. Nonvolatile memories DRAM and SRAM are volatile memories • Lose information if powered off. Nonvolatile memories retain value even if powered off. • Generic name is read-only memory (ROM). • Misleading because some ROMs can be read and modified. Types of ROMs • Programmable ROM (PROM) • Eraseable programmable ROM (EPROM) • Electrically eraseable PROM (EEPROM) • Flash memory Firmware • Program stored in a ROM – Boot time code, BIOS (basic input/ouput system) – graphics cards, disk controllers. 20 CompOrg - Memory Hierarchy
  • 21. Rea Permanent storage d • Nonvolat ile Onl Microprogramming (see later) Library subroutines y Systems programs (BIOS) Me Function tables m or y (RO M) 21 CompOrg - Memory Hierarchy
  • 22. Typ Written during manufacture es of • Very ex pensive for small runs Programmable (once) • PROM RO • Needs special equipment t o Mprogram Read “mostly” • Erasable Programmable (EPROM) – Erased by UV • Elect rically Erasable (EEPROM) – Takes much longer to write than read • Flash memory – Erase whole memory electrically 22 CompOrg - Memory Hierarchy
  • 23. Org anis A 16Mbit chip can be organised as 1M of 16 bit words atio A bit per chip system has 16 lot s of 1Mbit chip wit h bit 1 of each word in chip 1 and so on n in as a 2048 x 2048 x A 16Mbit chip can be organised 4bit array det ail column address • Reduces number of address pins – Multiplex row address and – 11 pins to address (2 11 =2048) – Adding one more pin doubles range of values so x4 capacity (2 12 x4 Capacity with 2 11 ) 23 CompOrg - Memory Hierarchy
  • 24. Bus structure connecting CPU and memory A bus is a collection of parallel wires that carry address, data, and control signals. Buses are typically shared by multiple devices. CPU chip register file ALU system bus memory bus I/O main bus interface bridge memory 24 CompOrg - Memory Hierarchy
  • 25. Memory read transaction (1) CPU places address A on the memory bus. register file Load operation: movl A, %eax ALU %eax main memory I/O bridge 0 A bus interface x A 25 CompOrg - Memory Hierarchy
  • 26. Memory read transaction (2) Main memory reads A from the memory bus, retreives word x, and places it on the bus. register file Load operation: movl A, %eax ALU %eax main memory I/O bridge x 0 bus interface x A 26 CompOrg - Memory Hierarchy
  • 27. Memory read transaction CPU read word x from the(3) and copies it into bus register %eax. register file Load operation: movl A, %eax ALU %eax x main memory I/O bridge 0 bus interface x A 27 CompOrg - Memory Hierarchy
  • 28. Memory write transaction (1) CPU places address A on bus. Main memory reads it and waits for the corresponding data word to arrive. register file Store operation: movl %eax, A ALU %eax y main memory I/O bridge 0 A bus interface A 28 CompOrg - Memory Hierarchy
  • 29. Memory write transaction (2) CPU places data word y on the bus. register file Store operation: movl %eax, A ALU %eax y main memory I/O bridge 0 y bus interface A 29 CompOrg - Memory Hierarchy
  • 30. Memory write transaction (3) Main memory read data word y from the bus and stores it at address A. register file Store operation: movl %eax, A ALU %eax y main memory I/O bridge 0 bus interface y A 30 CompOrg - Memory Hierarchy
  • 31. Disk geometry Disks consist of platters, each with two surfaces. Each surface consists of concentric rings called tracks. Each track consists of sectors separated by gaps. tracks surface track k gaps spindle sectors 31 CompOrg - Memory Hierarchy
  • 32. Disk geometry (muliple-platter view) Aligned tracks form a cylinder. cylinder k surface 0 platter 0 surface 1 surface 2 platter 1 surface 3 surface 4 platter 2 surface 5 spindle 32 CompOrg - Memory Hierarchy
  • 33. Disk capacity Capacity: maximum number of bits that can be stored. • Vendors express capacity in units of gigabytes (GB), where 1 GB = 10^6. Capacity is determined by these technology factors: • Recording density (bits/in): number of bits that can be squeezed into a 1 inch segment of a track. • Track density (tracks/in): number of tracks that can be squeezed into a 1 inch radial segment. • Areal density (bits/in2): product of recording and track density. Modern disks partition tracks into disjoint subsets called recording zones • Each track in a zone has the same number of sectors, determined by the circumference of innermost track. • Each zone has a different number of sectors/track 33 CompOrg - Memory Hierarchy
  • 34. Computing disk capacity Capacity = (# bytes/sector) x (avg. # sectors/track) x (# tracks/surface) x (# surfaces/platter) x (# platters/disk) Example: • 512 bytes/sector • 300 sectors/track (on average) • 20,000 tracks/surface • 2 surfaces/platter • 5 platters/disk Capacity = 512 x 300 x 20000 x 2 x 5 = 30,720,000,000 = 30.72 GB 34 CompOrg - Memory Hierarchy
  • 35. Disk operation (single-platter view) The disk The read/write head surface is attached to the end spins at a fixed of the arm and flies over rotational rate the disk surface on a thin cushion of air. spindle By moving radially, the arm can position the read/write head over any track. 35 CompOrg - Memory Hierarchy
  • 36. Disk operation (multi-platter view) read/write heads move in unison from cylinder to cylinder arm spindle 36 CompOrg - Memory Hierarchy
  • 37. Disk access time Average time to access some target sector approximated by : • Taccess = Tavg seek + Tavg rotation + Tavg transfer Seek time • Time to position heads over cylinder containing target sector. • Typical Tavg seek = 9 ms Rotational latency • Time waiting for first bit of target sector to pass under r/w head. • Tavg rotation = 1/2 x 1/RPMs x 60 sec/1 min Transfer time • Time to read the bits in the target sector. • Tavg transfer = 1/RPM x 1/(avg # sectors/track) x 60 secs/1 min. 37 CompOrg - Memory Hierarchy
  • 38. Disk access time Given: example • Rotational rate = 7,200 RPM • Average seek time = 9 ms. • Avg # sectors/track = 400. Derived: • Tavg rotation = 1/2 x (60 secs/7200 RPM) x 1000 ms/sec = 4 ms. • Tavg transfer = 60/7200 RPM x 1/400 secs/track x 1000 ms/sec = 0.02 ms • Taccess = 9 ms + 4 ms + 0.02 ms Important points: • Access time dominated by seek time and rotational latency. • First bit in a sector is the most expensive, the rest are free. • SRAM access time is about 4ns/doubleword, DRAM about 60 ns – Disk is about 40,000 times slower than SRAM, – 2,500 times slower then DRAM. 38 CompOrg - Memory Hierarchy
  • 39. Logical disk blocks Modern disks present a simpler abstract view of the complex sector geometry: • The set of available sectors is modeled as a sequence of b-sized logical blocks (0, 1, 2, ...) Mapping between logical blocks and actual (physical) sectors • Maintained by hardware/firmware device called disk controller. • Converts requests for logical blocks into (surface,track,sector) triples. Allows controller to set aside spare cylinders for each zone. • Accounts for the difference in “formatted capacity” and “maximum capacity”. 39 CompOrg - Memory Hierarchy
  • 40. Bus structure connecting I/O and CPU CPU chip register file ALU system bus memory bus I/O main bus interface bridge memory I/O bus Expansion slots for other devices such USB graphics disk as network adapters. controller adapter controller mouse keyboard monitor disk 40 CompOrg - Memory Hierarchy
  • 41. Reading a disk sector (1) CPU chip CPU initiates a disk read by writing a register file command, logical block number, and destination memory address to a port ALU (address) associated with disk controller. main bus interface memory I/O bus USB graphics disk controller adapter controller mouse keyboard monitor disk 41 CompOrg - Memory Hierarchy
  • 42. Reading a disk sector (2) CPU chip Disk controller reads the sector and register file performs a direct memory access (DMA) transfer into main memory. ALU main bus interface memory I/O bus USB graphics disk controller adapter controller mouse keyboard monitor disk 42 CompOrg - Memory Hierarchy
  • 43. Reading a disk sector (3) CPU chip When the DMA transfer completes, the register file disk controller notifies the CPU with an interrupt (i.e., asserts a special “interrupt” ALU pin on the CPU) main bus interface memory I/O bus USB graphics disk controller adapter controller mouse keyboard monitor disk 43 CompOrg - Memory Hierarchy
  • 44. Storage trends metric 1980 1985 1990 1995 2000 2000:1980 SRAM $/MB 19,200 2,900 320 256 100 190 access (ns) 300 150 35 15 2 100 metric 1980 1985 1990 1995 2000 2000:1980 DRAM $/MB 8,000 880 100 30 1 8,000 access (ns) 375 200 100 70 60 6 typical size(MB) 0.064 0.256 4 16 64 1,000 metric 1980 1985 1990 1995 2000 2000:1980 $/MB 500 100 8 0.30 0.05 10,000 Disk access (ms) 87 75 28 10 8 11 typical size(MB) 1 10 160 1,000 9,000 9,000 44 (Culled from back issues of Byte and PC Magazine) CompOrg - Memory Hierarchy
  • 45. CPU clock rates 1980 1985 1990 1995 2000 2000:1980 processor 8080 286 386 Pent P-III clock rate(MHz) 1 6 20 150 750 750 cycle time(ns) 1,000 166 50 6 1.6 750 45 CompOrg - Memory Hierarchy
  • 46. The CPU-Memory Gap The increasing gap between DRAM, disk, and CPU speeds. 100,000,000 10,000,000 1,000,000 Disk seek time 100,000 DRAM access time 10,000 ns SRAM access time 1,000 CPU cycle time 100 10 1 1980 1985 1990 1995 2000 year 46 CompOrg - Memory Hierarchy
  • 47. Memory hierarchies Some fundamental and enduring properties of hardware and software: • Fast storage technologies cost more per byte and have less capacity. • The gap between CPU and main memory speed is widening. • Well-written programs tend to exhibit good locality. These fundamental properties complement each other beautifully. Suggest an approach for organizing memory and storage systems known as a “memory hierarchy”. 47 CompOrg - Memory Hierarchy
  • 48. An example memory hierarchy L0: Smaller, faster, registers CPU registers hold words retrieved and from cache memory. costlier L1: on-chip L1 (per byte) cache (SRAM) storage L1 cache holds cache lines devices retrieved from the L2 cache. L2: off-chip L2 cache (SRAM) L2 cache holds cache lines retrieved from memory. L3: main memory (DRAM) Larger, Main memory holds disk slower, blocks retrieved from local and disks. cheaper local secondary storage (per byte) L4: storage (local disks) devices Local disks hold files retrieved from disks on remote network servers. L5: remote secondary storage (distributed file systems, Web servers) 48 CompOrg - Memory Hierarchy
  • 49. Cache s Cache: A smaller, faster storage device that acts as a staging area for a subset of the data in a larger, slower device. Fundamental idea of a memory hierarchy: • For each k, the faster, smaller device at level k serves as a cache for the larger, slower device at level k+1. Why do memory hierarchies work? • Programs tend to access the data at level k more often than they access the data at level k+1. • Thus, the storage at level k+1 can be slower, and thus larger and cheaper per bit. • Net effect: A large pool of memory that costs as much as the cheap storage near the bottom, but that serves data to programs at the rate of the fast storage near the top. 49 CompOrg - Memory Hierarchy
  • 50. Cac he Small amount of fast memory Sits between normal main memory and CPU May be located on CPU chip or module 50 CompOrg - Memory Hierarchy
  • 51. Cac he/ Mai n Me mor y Stru ctur e 51 CompOrg - Memory Hierarchy
  • 52. Cac CPU requests contents of he memory location Check cache for this dataope If present, get from cacherati (fast) If not present, read required block from main memory to cache on – ove Then deliver from cache to CPU rvie Cache includes tags to identify which block of main memory is in each cache slot w 52 CompOrg - Memory Hierarchy
  • 53. Cac he Rea d Ope rati on - Flo wch art 53 CompOrg - Memory Hierarchy
  • 54. Cac Size he Mapping Function Des Replacement Algorithm ign Write Policy Block Size Number of Caches 54 CompOrg - Memory Hierarchy
  • 55. Size Cost doe • More cache is expensive s Speed mat • More cache is faster (up to a point) ter • Checking cache for data takes time 55 CompOrg - Memory Hierarchy
  • 56. Typi cal Cac he Org aniz atio n 56 CompOrg - Memory Hierarchy
  • 57. Co Processor Type Year ofmpa L1 cachea L2 cache L3 cache Introduction IBM 360/85 PDP-11/70 Mainframe Minicomputer 1968 1975 riso 16 to 32 KB 1 KB — — — — VAX 11/780 IBM 3033 Minicomputer Mainframe 1978 1978 n of 16 KB 64 KB — — — — IBM 3090 Intel 80486 Mainframe PC 1985 1989 Cac 128 to 256 KB 8 KB — — — — Pentium PowerPC 601 PC PC 1993 1993 he 8 KB/8 KB 32 KB 256 to 512 KB — — — PowerPC 620 PowerPC G4 PC PC/server 1996 1999 Size 32 KB/32 KB 32 KB/32 KB — 256 KB to 1 MB — 2 MB IBM S/390 G4 IBM S/390 G6 Mainframe Mainframe 1997 1999 s 32 KB 256 KB 256 KB 8 MB 2 MB — Pentium 4 PC/server 2000 8 KB/8 KB 256 KB — High-end server/ IBM SP 2000 64 KB/32 KB 8 MB — supercomputer CRAY MTAb Supercomputer 2000 8 KB 2 MB — Itanium PC/server 2001 16 KB/16 KB 96 KB 4 MB SGI Origin 2001 High-end server 2001 32 KB/32 KB 4 MB — Itanium 2 PC/server 2002 32 KB 256 KB 6 MB IBM POWER5 High-end server 2003 64 KB 1.9 MB 36 MB 57 CRAY XD-1 Supercomputer CompOrg - Memory Hierarchy 2004 64 KB/64 KB 1MB —
  • 58. Map Cache of 64kByte pin Cache block of 4 bytes g • i.e. cache is 16k (2 ) lines of 4 bytes 14 16MBytes main memory Fun 24 bit address ctio • (2 =16M) 24 n 58 CompOrg - Memory Hierarchy
  • 59. Dire ct Each block of main memory maps to only one cache line Map in one specific place • i.e. if a block is in cache, it must be Address is in two parts pin g Least Significant w bits identify unique word Most Significant s bits specify one memory block The MSBs are split into a cache line field r and a tag of s-r (most significant) 59 CompOrg - Memory Hierarchy
  • 60. Dire ct Tag s-r MapSlot r Line or Word w pin 14 2 8 g 24 bit address 2 bit word identifier (4 byte block) Add 22 bit block identifier • 8 bit tag (=22-14) res • 14 bit slot or line s No two blocks in the same line have the same Tag field Stru Check contents of cache by finding line and checking Tag ctur e 60 CompOrg - Memory Hierarchy
  • 61. Dire Cache line ct Main Memory blocks held 0 0, m, 2m, 3m…2s-m 1 Map 1,m+1, 2m+1…2s-m+1 pin m-1 g m-1, 2m-1,3m-1…2s-1 Cac he Line Tabl e 61 CompOrg - Memory Hierarchy
  • 62. Dire ct Map pin g Cac he Org aniz atio n 62 CompOrg - Memory Hierarchy
  • 63. Direct Mapping Example 63 CompOrg - Memory Hierarchy
  • 64. Dire ct Address length = (s + w) bits Map Number of addressable units = 2s+w words or bytes Block size = line size = 2w words or bytes pin Number of blocks in main memory = 2s+ w/2w = 2s g Number of lines in cache = m = 2r Size of tag = (s – r) bits Su mm ary 64 CompOrg - Memory Hierarchy
  • 65. Dire Simple ct Inexpensive Map Fixed location for given block pin • If a program accesses 2 blocks that map to the same line repeatedly, cache misses areg high very pro s& con s 65 CompOrg - Memory Hierarchy
  • 66. Ass ocia A main memory block can load into any line of cache tive Memory address is interpreted as tag and word Tag uniquely identifies block of memory Map Every line’s tag is examined for a match pin Cache searching gets expensive g 66 CompOrg - Memory Hierarchy
  • 67. Full y Ass ocia tive Cac he Org aniz atio n 67 CompOrg - Memory Hierarchy
  • 68. Assosiative Mapping Example 68 CompOrg - Memory Hierarchy
  • 69. Ass ocia tive Tag 22 bit Word Map 2 bit pin of data 22 bit tag stored with each 32 bit block g Compare tag field with tag entry in cache to check for hit Add Least significant 2 bits of address identify which 16 bit word is required from 32 bit data block e.g. res • Address • FFFFFC Tag s Data FFFFFC 246824683FFF Cache line Stru ctur e CompOrg - Memory Hierarchy 69
  • 70. Ass Address length = (s + w) ocia bits tive Number of addressable units = 2s+w words or bytes Block size = line size = 2w words or bytes Map Number of blocks in main memory = 2s+ w/2w = 2s pin Number of lines in cache = undetermined Size of tag = s bits g Su mm ary 70 CompOrg - Memory Hierarchy
  • 71. Set Ass Cache is divided into a number of sets ocia Each set contains a number of lines A given block maps to any line in a given set tive • e.g. Block B can be in any line of set i e.g. 2 lines per set Map • 2 way associative mapping pin • A given block can be in one of 2 lines in only one set g 71 CompOrg - Memory Hierarchy
  • 72. Set 13 bit set number Ass ocia Block number in main memory is modulo 2 13 000000, 00A000, 00B000, tive … map to same set 00C000 Map pin g Exa mpl e 72 CompOrg - Memory Hierarchy
  • 73. Two Way Set Associative Cache Organization 73 CompOrg - Memory Hierarchy
  • 74. Set Ass ocia Word Tag 9 bit Set 13 bit tive 2 bit Map Use set field to determine cache set to look in pin Compare tag field to see if we have a hit e.g g • Address Tag Data Set number • 1FF 7FFC 1FF Add 1FFF 12345678 • 001 7FFC 001 11223344res 1FFF s Stru ctur CompOrg - Memory Hierarchy 74
  • 75. Two Way set Assosiative Mapping Example Two Way Set Associative Mapping Example 75 CompOrg - Memory Hierarchy
  • 76. Set Ass Address length = (s + w) bits ocia Number of addressable units = 2s+w words or bytes Block size = line size = 2w words or bytes tive Number of blocks in main memory = 2d Map Number of lines in set = k Number of sets = v = 2d pin Number of lines in cache = g = k * 2d kv Size of tag = (s – d) bits Su mm ary 76 CompOrg - Memory Hierarchy
  • 77. Rep No choice lace Each block only maps tomen one line Replace that line t Alg orit hms (1) Dire ct map pin CompOrg - Memory Hierarchy 77
  • 78. Replacement Algorithms (2) Associative & Set Associative Hardware implemented algorithm (speed) Least Recently used (LRU) e.g. in 2 way set associative • Which of the 2 block is lru? First in first out (FIFO) • replace block that has been in cache longest Least frequently used • replace block which has had fewest hits Random 78 CompOrg - Memory Hierarchy
  • 79. Writ Must not overwrite a cachee block unless main memory is up to date Poli Multiple CPUs may have individual caches cy directly I/O may address main memory 79 CompOrg - Memory Hierarchy
  • 80. Writ e All writes go to main memory as well as cache thro Multiple CPUs can monitor main memory traffic to keep local (to CPU) cache up to date Lots of traffic ugh Slows down writes Remember bogus write through caches! 80 CompOrg - Memory Hierarchy
  • 81. Writ e Updates initially made in cache only bac Update bit for cache slot is set when update occurs If block is to be replaced, write to main memory only if k update bit is set Other caches get out of sync I/O must access main memory through cache N.B. 15% of memory references are writes 81 CompOrg - Memory Hierarchy
  • 82. Caching in a memory hierarchy Smaller, faster, more expensive Level k: 4 9 14 3 device at level k caches a subset of the blocks from level k+1 Data is copied between levels in block-sized transfer units 0 1 2 3 4 5 6 7 Larger, slower, cheaper storage Level k+1: device at level k+1 is partitioned 8 9 10 11 into blocks. 12 13 14 15 82 CompOrg - Memory Hierarchy
  • 83. General caching concepts Level k: Program needs object d, 4 9 14 3 which is stored in some block b. Cache hit • Program finds b in the cache Level k+1: at level k. E.g. block 14. Cache miss 0 1 2 3 • b is not at level k, so level k 4 5 6 7 cache must fetch it from level k+1. E.g. block 12. 8 9 10 11 • If level k cache is full, then 12 13 14 15 some current block must be replaced (evicted). • Which one? Determined by replacement policy. E.g. evict least recently used block. 83 CompOrg - Memory Hierarchy
  • 84. General caching concepts Types of cache misses: • Cold (compulsary) miss – Cold misses occur because the cache is empty. • Conflict miss – Most caches limit blocks at level k+1 to a small subset (sometimes a singleton) of the block positions at level k. – E.g. Block i at level k+1 must be placed in block (i mod 4) at level k+1. – Conflict misses occur when the level k cache is large enough, but multiple data objects all map to the same level k block. – E.g. Referencing blocks 0, 8, 0, 8, 0, 8, ... would miss every time. • Capacity miss – Occurs when the set of active cache blocks (working set) is larger than the cache. 84 CompOrg - Memory Hierarchy
  • 85. Examples of caching in the hierarchy Cache Type What Cached Where Cached Latency Managed (cycles) By Registers 4-byte word CPU registers 0 Compiler TLB Address On-Chip TLB 0 Hardware translations L1 cache 32-byte block On-Chip L1 1 Hardware L2 cache 32-byte block Off-Chip L2 10 Hardware Virtual Memory 4-KB page Main memory 100 Hardware+ OS Buffer cache Parts of files Main memory 100 OS Network buffer Parts of files Local disk 10,000,000 AFS/NFS cache client Browser cache Web pages Local disk 10,000,000 Web browser Web cache Web pages Remote server 1,000,000,000 Web proxy disks server 85 CompOrg - Memory Hierarchy
  • 86. Pen 80386 – no on chip cache tiu 80486 – 8k using 16 byte lines and four way set associative organization m4 Cac Pentium (all versions) – two on chip L1 caches • Data & instructions he Pentium III – L3 cache added off chip Pentium 4 • L1 caches – 8k bytes – 64 byte lines – four way set associative • L2 cache – Feeding both L1 caches – 256k – 128 byte lines – 8 way set associative • L3 cache on chip 86 CompOrg - Memory Hierarchy
  • 87. Intel Problem Cac Solution Processor on which feature first appears External memory slower than the system bus. he Add external cache using faster 386 memory technology. Increased processor speed results in external bus becoming a Evo Move external cache on-chip, 486 bottleneck for cache access. luti operating at the same speed as the processor. Internal cache is rather small, due to limited space on chip on Add external L2 cache using faster technology than main memory 486 Contention occurs when both the Instruction Prefetcher and Create separate data and instruction Pentium the Execution Unit simultaneously require access to the caches. cache. In that case, the Prefetcher is stalled while the Execution Unit’s data access takes place. Create separate back-side bus that Pentium Pro runs at higher speed than the main Increased processor speed results in external bus becoming a (front-side) external bus. The BSB is bottleneck for L2 cache access. dedicated to the L2 cache. Move L2 cache on to the processor Pentium II chip. Some applications deal with massive databases and must Add external L3 cache. Pentium III have rapid access to large amounts of data. The on-chip caches are too small. 87 CompOrg - Memorycache on-chip. Move L3 Hierarchy Pentium 4
  • 88. Pen tiu m4 Blo ck Dia gra m 88 CompOrg - Memory Hierarchy
  • 89. Pen Fetch/Decode Unit tiu • Fetches instructions from L2 cache • Decode into micro-ops m4 • Store micro-ops in L1 cache Cor Out of order execution logic • Schedules micro-ops e Pro • Based on data dependence and resources • May speculatively execute Execution units ces • Execute micro-ops • Data from L1 cache sor • Results in registers Memory subsystem • L2 cache and systems bus 89 CompOrg - Memory Hierarchy
  • 90. Pen tiu Decodes instructions into RISC like micro-ops before L1 cache Micro-ops fixed length • m4 Superscalar pipelining and scheduling Des Pentium instructions long & complex Performance improved by separating decoding from scheduling & pipelining • (More later – ch14) ign Data cache is write back • Rea Can be configured to write through soni L1 cache controlled by 2 bits in register • CD = cache disable • NW = not write through ng • 2 instructions to invalidate (flush) cache and write back then invalidate L2 and L3 8-way set-associative • Line size 128 bytes 90 CompOrg - Memory Hierarchy
  • 91. Pow erP 601 – single 32kb 8 way set associative C 603 – 16kb (2 x 8kb) two way set associative 604 – 32kb Cac 620 – 64kb he G3 & G4 • 64kb L1 cache Org – 8 way set associative • 256k, 512k or 1M L2 cache aniz – two way set associative atio G5 n • 32kB instruction cache • 64kB data cache 91 CompOrg - Memory Hierarchy
  • 92. Pow erP C G5 Blo ck Dia gra m 92 CompOrg - Memory Hierarchy