What is Microarchitecture?
Microarchitecture (µarch or uarch) is the way a given ISA is implemented on a processor.
Microarchitecture design focuses on these aspects:
 • Chip area/cost
 • Power consumption
 • Logic complexity
 • Ease of connectivity
 • Manufacturability
 • Ease of debugging
 • Testability
Nehalem?
 • Nehalem is the codename for the latest Intel processor microarchitecture.
 • The first processor released with the Nehalem architecture was the desktop Core i7 (Nov 2008).
 • It was then followed by several Xeon processors and by the Core i3 and i5.
 • The codename was originally assigned to a planned evolution of NetBurst; after that design was dropped, the name was reused for this microarchitecture.
 • Hyper-Threading (HT) is reintroduced, along with an L3 cache, and the design aims for higher clock speeds and better energy efficiency.
Technology
 • Wide range of two-, four-, six-, eight-, ten- or twelve-core processors.
 • Initial release on a 45 nm manufacturing process, followed by 32 nm variants. These processes allow more transistors on a single die (731 million in the quad-core variant).
 • Integrated memory controller supporting two or three memory channels of DDR3 SDRAM or four FB-DIMM2 channels.
 • Integrated graphics processor (IGP) located off-die, but in the same CPU package.
Technology (contd.)
 • A new point-to-point processor interconnect, the Intel QuickPath Interconnect, in high-end models, replacing the legacy front side bus.
 • Integration of PCI Express and Direct Media Interface into the processor in mid-range models, replacing the northbridge.
 • Simultaneous multithreading (Hyper-Threading) on every core, which enables two threads per core.
 • Native (monolithic) quad- and eight-core processors.
 • 33% more in-flight micro-operations than the Core 2 uarch.
Technology (contd.)
 • The following caches:
     • 32 KB L1 instruction and 32 KB L1 data cache per core
     • 256 KB L2 cache per core
     • 4–12 MB L3 cache (2 MB per core) shared by all cores
 • Second-level branch predictor and second-level translation lookaside buffer.
 • Modular blocks of components, such as cores, that can be added and subtracted for varying market segments.
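The cache figures above can be turned into a quick capacity calculation. This is only a sketch: the per-level sizes are the ones listed on the slide, while the function name `cache_totals_kb` and the four-core example are assumptions made for illustration.

```python
# Cache sizes as listed on the slide; the core count passed in is an assumed example.
L1I_KB = 32          # L1 instruction cache per core
L1D_KB = 32          # L1 data cache per core
L2_KB = 256          # L2 cache per core
L3_MB_PER_CORE = 2   # shared L3, sized at 2 MB per core


def cache_totals_kb(cores: int) -> dict:
    """Return private-per-core, shared L3, and chip-wide cache capacity in KB."""
    private_per_core = L1I_KB + L1D_KB + L2_KB
    shared_l3 = cores * L3_MB_PER_CORE * 1024
    return {
        "private_per_core_kb": private_per_core,
        "shared_l3_kb": shared_l3,
        "total_kb": cores * private_per_core + shared_l3,
    }


quad = cache_totals_kb(4)
print(quad)
```

The split matters because only the L3 is visible to every core; the L1 and L2 capacity of one core cannot be borrowed by another.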
Performance and Power Optimizations
 • 1.1× to 1.25× the single-threaded performance or 1.2× to 2× the multithreaded performance at the same power level.
 • 30% lower power usage for the same performance.
 • A preview from AnandTech projected a 20–30% overall advantage over Penryn with a 10% increase in power usage.
 • Per core, clock-for-clock, Nehalem provides a 15–20% increase in performance compared to Penryn.
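The figures above mix performance and power, so it can help to reduce them to one number, performance per watt relative to Penryn. The sketch below only does that arithmetic on the slide's percentages; the helper name `perf_per_watt_gain` is hypothetical.

```python
def perf_per_watt_gain(perf_ratio: float, power_ratio: float) -> float:
    """Relative performance per watt, given performance and power ratios vs. a baseline."""
    return perf_ratio / power_ratio


# AnandTech projection: 20-30% more performance for 10% more power.
low = perf_per_watt_gain(1.20, 1.10)
high = perf_per_watt_gain(1.30, 1.10)

# "30% lower power usage for the same performance".
iso_perf = perf_per_watt_gain(1.00, 0.70)

print(f"projection: {low:.2f}x to {high:.2f}x performance per watt")
print(f"same performance at 70% power: {iso_perf:.2f}x performance per watt")
```

Under these numbers the projected efficiency gain is more modest than the headline performance gain, because part of the extra performance is paid for with extra power.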
Processor Release Timeline
The successor to Nehalem and Westmere will be Sandy Bridge, scheduled for release in late 2010, according to statements by Intel. The successor to Sandy Bridge will be Haswell, scheduled for release in 2012. It will come with a new cache subsystem, an FMA (fused multiply-add) unit, and a vector coprocessor.
Related Definitions (1/4)
 • An instruction set, or instruction set architecture (ISA), is the part of the computer architecture related to programming, including the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O. An ISA includes a specification of the set of opcodes (machine language) and the native commands implemented by a particular processor.
Related Definitions (2/4)
 • Per the International Technology Roadmap for Semiconductors, the 45 nm technology node refers to the average half-pitch of a memory cell manufactured around the 2007–2008 time frame.
 • QuickPath is a point-to-point interconnect between processors (and between a processor and the I/O hub), not between cores on one die. It eases inter-processor data transfer with a bandwidth of up to 25.6 GB/s per link.
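The 25.6 GB/s figure can be reproduced with simple arithmetic. The link parameters used here are not on the slide; they are assumptions taken from Intel's commonly published QuickPath numbers (6.4 GT/s, 16 data bits per transfer in each direction, two directions per link), and the function name is hypothetical.

```python
def qpi_bandwidth_gb_s(
    transfers_per_s: float = 6.4e9,  # assumed: 6.4 GT/s link speed
    payload_bytes: int = 2,          # assumed: 16 data bits per transfer, per direction
    directions: int = 2,             # assumed: the link sends and receives at once
) -> float:
    """Total link bandwidth in GB/s (decimal gigabytes), counting both directions."""
    return transfers_per_s * payload_bytes * directions / 1e9


total = qpi_bandwidth_gb_s()
one_way = qpi_bandwidth_gb_s(directions=1)
print(f"per direction: {one_way:.1f} GB/s, both directions: {total:.1f} GB/s")
```

Read this way, the quoted bandwidth is a total for both directions of one link; a single direction carries half of it.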
Related Definitions (3/4)
 • The front side bus is the bus that carries data between the CPU and the northbridge.
 • The back side bus is the bus between the CPU and cache memory.
Related Definitions (4/4)
 • A branch predictor is a digital circuit that tries to guess which way a branch (e.g. an if-then-else structure) will go before this is known for sure.
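To make the definition concrete, here is a minimal sketch of a classic textbook scheme, the 2-bit saturating counter. It is not a description of Nehalem's actual predictor, whose internals the slides do not cover; the class name, table size, and branch address are all assumptions chosen for illustration.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters: 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, table_size: int = 16):
        self.table_size = table_size
        self.counters = [1] * table_size  # start at "weakly not taken"

    def _index(self, pc: int) -> int:
        # Map the branch address to a table slot (simple modulo hashing).
        return pc % self.table_size

    def predict(self, pc: int) -> bool:
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        # Move the counter toward the real outcome, saturating at 0 and 3.
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(predictor: TwoBitPredictor, pc: int, outcomes: list) -> float:
    """Fraction of outcomes guessed correctly, predicting before each update."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(outcomes)


# A loop branch: taken 9 times, then not taken once on exit, repeated 10 times.
loop_branch = ([True] * 9 + [False]) * 10
acc = accuracy(TwoBitPredictor(), 0x400, loop_branch)
print(f"accuracy on the loop branch: {acc:.2f}")
```

The two-bit counter is used here because a single loop exit only weakens the prediction instead of flipping it, so the predictor is right again on the first iteration of the next loop run.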