Intel's Nehalem Microprocessor Architecture Explained

Published in: Technology, Business

  1. What is microarchitecture?
     Microarchitecture (µarch or uarch) is the way a given ISA is implemented on a processor. Microarchitecture design focuses on these aspects:
     • Chip area/cost
     • Power consumption
     • Logic complexity
     • Ease of connectivity
     • Manufacturability
     • Ease of debugging
     • Testability
  2. Nehalem?
     • Nehalem is the codename for the latest Intel processor microarchitecture.
     • The first processor released with the Nehalem architecture was the desktop Core i7 (Nov 2008). It was followed by several Xeon processors and by the i3 and i5.
     • The codename was originally attached to a planned evolution of NetBurst; the name "Nehalem" was later reused for this design.
     • Hyper-Threading (HT) is reintroduced along with an L3 cache, and the design targets higher clock speeds and better energy efficiency.
  3. 3. Technology Wide range of two, four, six,  Integrated memory eight, ten or twelve core controller supporting two or processors. three memory channels Initial release with 45nm of DDR3 SDRAM or four FB- manufacturing process, followed DIMM2 channel. by 32nm variants.It allows more  Integrated graphics processor number of transistors in a single (IGP) located off-die, but in the die (731 million in quad core same CPU package. variant!).
  4. Technology (contd.)
     • A new point-to-point processor interconnect, the Intel QuickPath Interconnect, in high-end models, replacing the legacy front side bus.
     • Integration of PCI Express and Direct Media Interface into the processor in mid-range models, replacing the northbridge.
     • Simultaneous multithreading (Hyper-Threading), which lets each core run two threads.
     • Native (monolithic) quad- and octal-core processors.
     • 33% more in-flight micro-operations than the Core 2 uarch.
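
The thread count and the 33% figure on this slide can be checked with simple arithmetic. The sketch below assumes the commonly cited reorder-buffer sizes of 96 entries for Core 2 and 128 for Nehalem; those two numbers are not on the slides, and the function names are made up for illustration.

```python
# Illustrative arithmetic only. The reorder-buffer sizes below are commonly
# cited figures and are an assumption here; they do not appear on the slides.
CORE2_ROB_ENTRIES = 96
NEHALEM_ROB_ENTRIES = 128
THREADS_PER_CORE = 2  # Hyper-Threading: two threads per core, per the slide


def inflight_increase(old_entries: int, new_entries: int) -> float:
    """Relative increase in in-flight micro-operations, as a fraction."""
    return (new_entries - old_entries) / old_entries


def logical_processors(cores: int, threads_per_core: int = THREADS_PER_CORE) -> int:
    """Number of hardware threads the operating system sees."""
    return cores * threads_per_core


if __name__ == "__main__":
    gain = inflight_increase(CORE2_ROB_ENTRIES, NEHALEM_ROB_ENTRIES)
    print(f"in-flight micro-op increase: {gain:.0%}")
    print(f"logical processors on a quad-core part: {logical_processors(4)}")
```

Under these assumptions a quad-core part exposes eight logical processors, and the buffer growth matches the slide's one-third increase.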
  5. Technology (contd.)
     • The following caches:
       • 32 KB L1 instruction cache and 32 KB L1 data cache per core
       • 256 KB L2 cache per core
       • 4–12 MB L3 cache (2 MB per core) shared by all cores
     • Second-level branch predictor and second-level translation lookaside buffer.
     • Modular blocks of components, such as cores, that can be added and subtracted for varying market segments.
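
As a rough illustration of how the per-core cache sizes on this slide add up, the sketch below totals each cache level for a given core count. The helper name `cache_budget` is hypothetical, and applying the 2 MB-per-core L3 rule to every core count is an assumption taken from the slide, not a statement about every shipping part.

```python
KB = 1024
MB = 1024 * KB

# Per-core sizes as listed on the slide.
L1I_PER_CORE = 32 * KB
L1D_PER_CORE = 32 * KB
L2_PER_CORE = 256 * KB
L3_PER_CORE = 2 * MB  # L3 is one shared pool, sized here at 2 MB per core


def cache_budget(cores: int) -> dict:
    """Total bytes of cache at each level for a hypothetical part."""
    return {
        "l1_total": cores * (L1I_PER_CORE + L1D_PER_CORE),
        "l2_total": cores * L2_PER_CORE,
        "l3_shared": cores * L3_PER_CORE,
    }


if __name__ == "__main__":
    for cores in (2, 4, 6):
        budget = cache_budget(cores)
        print(cores, "cores:", {name: size // KB for name, size in budget.items()}, "KB")
```

With this rule, two cores give the 4 MB lower bound and six cores give the 12 MB upper bound quoted on the slide.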
  6. Performance and Power Optimizations
     • 1.1× to 1.25× the single-threaded performance, or 1.2× to 2× the multithreaded performance, at the same power level.
     • 30% lower power usage for the same performance.
     • According to a preview from AnandTech, "expect a 20–30% overall advantage over Penryn with a 10% increase in power usage."
     • Per core, clock-for-clock, Nehalem provides a 15–20% increase in performance compared to Penryn.
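
One way to read the AnandTech figures is as a performance-per-watt change. The sketch below is plain arithmetic on the quoted percentages; the function name is hypothetical, and treating the numbers as simple ratios is an assumption rather than anything stated in the preview.

```python
def perf_per_watt_gain(perf_gain: float, power_increase: float) -> float:
    """Relative change in performance per watt.

    Both arguments are fractions, e.g. 0.20 for a 20% performance gain.
    """
    return (1 + perf_gain) / (1 + power_increase) - 1


if __name__ == "__main__":
    low = perf_per_watt_gain(0.20, 0.10)   # 20% faster at 10% more power
    high = perf_per_watt_gain(0.30, 0.10)  # 30% faster at 10% more power
    print(f"perf/watt improvement: {low:.1%} to {high:.1%}")
```

Read this way, the quoted range corresponds to roughly a 9% to 18% improvement in performance per watt over Penryn.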
  7. Processor Release Timeline
     The successor to Nehalem and Westmere will be Sandy Bridge, scheduled for release in late 2010, according to statements by Intel. The successor to Sandy Bridge will be Haswell, scheduled for release in 2012. It will come with a new cache subsystem, an FMA (fused multiply-add) unit, and a vector coprocessor.
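
To make the FMA mention concrete: a fused multiply-add computes a×b+c with a single rounding at the end, instead of rounding after the multiply and again after the add. The sketch below emulates that in software with exact rational arithmetic; `fma_emulated` is a hypothetical helper written for this illustration, not a hardware instruction or a library API.

```python
from fractions import Fraction


def fma_emulated(a: float, b: float, c: float) -> float:
    """Compute a*b + c exactly, then round once to the nearest float."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Inputs chosen so that the exact product 1 - 2**-54 does not fit in a double.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

unfused = a * b + c            # the product is rounded to 1.0 before the add
fused = fma_emulated(a, b, c)  # the tiny residual survives the single rounding

if __name__ == "__main__":
    print("unfused:", unfused)
    print("fused:  ", fused)
```

The unfused result loses the residual entirely, while the single-rounding result keeps it, which is the accuracy benefit usually attributed to FMA units.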
  8. Related Definitions (1/4)
     An instruction set, or instruction set architecture (ISA), is the part of the computer architecture related to programming, including the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O. An ISA includes a specification of the set of opcodes (machine language) and the native commands implemented by a particular processor.
  9. Related Definitions (2/4)
     • Per the International Technology Roadmap for Semiconductors, the 45 nm technology node refers to the average half-pitch of a memory cell manufactured in around the 2007–2008 time frame.
     • QuickPath is a point-to-point interconnect between processors, and between a processor and the I/O hub. It eases inter-processor data transfer, with a bandwidth of up to 25.6 GB/s per link.
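
The 25.6 GB/s QuickPath figure can be reproduced from the link parameters. The sketch below assumes a 6.4 GT/s link carrying 2 bytes of data per transfer in each of two directions; those parameters are the commonly published ones for the fastest first-generation links and are not stated on the slide. The function name is hypothetical.

```python
def qpi_link_bandwidth_gb_s(
    transfers_gt_s: float = 6.4,   # assumed link speed in gigatransfers/s
    bytes_per_transfer: int = 2,   # assumed 16 data bits per direction
    directions: int = 2,           # the link sends and receives at once
) -> float:
    """Aggregate bandwidth of one link in GB/s under the stated assumptions."""
    return transfers_gt_s * bytes_per_transfer * directions


if __name__ == "__main__":
    print(f"{qpi_link_bandwidth_gb_s():.1f} GB/s per link")
```

Note that the total counts both directions together; each direction alone carries half of it.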
  10. Related Definitions (3/4)
     • The front side bus is the bus that carries data between the CPU and the northbridge.
     • The back side bus is the bus between the CPU and cache memory.
  11. Related Definitions (4/4)
     A branch predictor is a digital circuit that tries to guess which way a branch (e.g. an if-then-else structure) will go before this is known for sure.
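
As a minimal sketch of the idea, the code below models a textbook 2-bit saturating-counter predictor for a single branch. This is a classic teaching scheme, not a description of Nehalem's actual predictor, and the class and function names are made up for illustration.

```python
class TwoBitPredictor:
    """Textbook 2-bit saturating counter for one branch.

    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    """

    def __init__(self) -> None:
        self.state = 2  # start in "weakly taken"

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes: list) -> float:
    """Fraction of outcomes predicted correctly, training as we go."""
    predictor = TwoBitPredictor()
    hits = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            hits += 1
        predictor.update(taken)
    return hits / len(outcomes)


# A loop branch: taken nine times, then falls through once, repeated ten times.
loop_branch = ([True] * 9 + [False]) * 10

if __name__ == "__main__":
    print(f"accuracy on the loop pattern: {accuracy(loop_branch):.0%}")
```

On this pattern the only misprediction is the loop exit, because a single not-taken outcome is not enough to flip the counter's prediction for the next pass.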