Introduction Cell Processor

by Mansoor Mirza

  1. 1. Introduction Cell Processor
  2. 2. Why Cell Processor <ul><li>Performance has historically improved with increasing clock frequency </li></ul><ul><ul><li>Made possible by increasing transistor density </li></ul></ul><ul><ul><li>Clock frequency is the timing reference for a processor </li></ul></ul><ul><li>Power density </li></ul><ul><ul><li>Leakage currents increase as transistors shrink and transistor density rises </li></ul></ul><ul><ul><li>Leakage increases idle power consumption </li></ul></ul>
  3. 3. History of Cell Processor <ul><li>A powerful processor for the successor to the PS2 </li></ul><ul><ul><li>Powerful multimedia and broadband network interface </li></ul></ul><ul><li>IBM's contribution in shaping the concept of the Cell processor </li></ul><ul><li>Collaboration with Toshiba </li></ul><ul><li>STI (Sony, Toshiba, IBM) Alliance </li></ul>
  4. 4. History of Cell Processor <ul><li>Development of Cell </li></ul><ul><ul><li>1999: Sony proposes a partnership with IBM for the successor to the PS2 </li></ul></ul><ul><ul><li>2001: the STI alliance begins development of Cell </li></ul></ul><ul><ul><li>2004: first prototype of Cell </li></ul></ul><ul><ul><li>2005: Sony unveils the PS3 at E3 </li></ul></ul><ul><ul><li>2006: official release of the PS3; IBM releases the Cell SDK </li></ul></ul><ul><ul><li>2008: IBM Roadrunner becomes the fastest supercomputer in the world (1.026 PFLOPS) </li></ul></ul>
  5. 5. Overview of Cell Design and Animation Game Programming Graphics Programming Matthew Scarpino
  6. 6. Overview of Cell 6.189 IAP 2007 MIT
  7. 7. Cell components <ul><li>Memory Interface Controller (MIC) </li></ul><ul><li>Bus Interface Controller (BIC) </li></ul><ul><li>PowerPC Processor Element/Unit (PPE/PPU) </li></ul><ul><li>Synergistic Processing Element/Unit (SPE/SPU) </li></ul><ul><li>Element Interconnect Bus (EIB) </li></ul><ul><li>Input/Output Interface (IOIF) </li></ul>
  8. 8. Cell components <ul><li>MIC </li></ul><ul><ul><li>Connects the processor with system memory </li></ul></ul><ul><ul><li>Two channels to system memory </li></ul></ul><ul><ul><li>Extreme Data Rate Dynamic Random Access Memory (XDR DRAM) </li></ul></ul><ul><ul><ul><li>Can support 8 data transfers per clock cycle </li></ul></ul></ul><ul><ul><ul><li>Provides high data throughput at a low clock frequency </li></ul></ul></ul><ul><ul><li>The PS3 contains 256 MB of XDR DRAM </li></ul></ul>
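
To see why eight transfers per clock cycle matter, peak memory bandwidth can be estimated with simple arithmetic. This is a rough sketch only: the 400 MHz clock and the 32-bit channel width are assumed figures that do not appear in the slides, and the function name is made up for illustration.

```python
def xdr_bandwidth_gb_per_s(clock_mhz, transfers_per_cycle, bus_width_bits, channels):
    """Peak bandwidth in GB/s (10**9 bytes per second) of a multi-channel memory interface."""
    bits_per_second = clock_mhz * 1e6 * transfers_per_cycle * bus_width_bits * channels
    return bits_per_second / 8 / 1e9


# Assumed values (not from the slides): 400 MHz clock, 32-bit data path per channel.
peak = xdr_bandwidth_gb_per_s(clock_mhz=400, transfers_per_cycle=8,
                              bus_width_bits=32, channels=2)
print(f"{peak:.1f} GB/s")
```

Under these assumptions, a modest clock still yields tens of gigabytes per second, which is the point of the "high data flow at low frequency" bullet.
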
  9. 9. Cell components <ul><li>PPU </li></ul><ul><ul><li>Based on the IBM PowerPC architecture </li></ul></ul><ul><ul><li>RISC architecture </li></ul></ul><ul><ul><li>Control center of the Cell </li></ul></ul><ul><ul><ul><li>Runs the operating system </li></ul></ul></ul><ul><ul><ul><li>Manages interrupts </li></ul></ul></ul><ul><ul><ul><li>Manages the shared L2 cache </li></ul></ul></ul><ul><ul><ul><li>Issues work to the SPUs </li></ul></ul></ul>
  10. 10. Cell components <ul><li>PPU </li></ul>Design and Animation Game Programming Graphics Programming Matthew Scarpino
  11. 11. Cell components <ul><li>PPU </li></ul><ul><ul><li>64-bit architecture </li></ul></ul><ul><ul><li>Supports SIMD </li></ul></ul><ul><ul><li>Supports Cell-related functions </li></ul></ul><ul><ul><li>Dual-threaded processor </li></ul></ul><ul><ul><li>Computational power is deliberately limited </li></ul></ul><ul><ul><ul><li>The PPU is not the main computational element of Cell </li></ul></ul></ul><ul><ul><ul><li>Reduces power consumption </li></ul></ul></ul>
  12. 12. Cell components <ul><li>Functional units of PPU </li></ul>Design and Animation Game Programming Graphics Programming Matthew Scarpino
  13. 13. Cell components <ul><li>Instruction Unit (IU) </li></ul><ul><ul><li>Fetches and executes instructions </li></ul></ul><ul><li>Load and Store Unit (LSU) </li></ul><ul><ul><li>Receives memory access requests </li></ul></ul><ul><li>Vector/Scalar Unit (VSU) </li></ul><ul><ul><li>Contains the floating-point unit </li></ul></ul><ul><ul><li>Performs floating-point operations on single or multiple operands </li></ul></ul>
  14. 14. Cell components <ul><li>Fixed-Point Unit (FXU) </li></ul><ul><ul><li>Performs fixed-point (integer) operations </li></ul></ul><ul><ul><ul><li>Arithmetic and logical operations </li></ul></ul></ul><ul><li>Memory Management Unit (MMU) </li></ul><ul><ul><li>Performs virtual memory management </li></ul></ul><ul><li>PPU registers </li></ul><ul><ul><li>Provide quick access to operands </li></ul></ul><ul><ul><li>Some functional units can access only processor registers </li></ul></ul>
  15. 15. Cell components <ul><li>32 general-purpose registers </li></ul><ul><li>32 floating-point registers </li></ul><ul><li>Link register </li></ul><ul><ul><li>Holds the address of an upcoming branch target, typically a function return address </li></ul></ul><ul><li>Count register </li></ul><ul><ul><li>Holds the address of an upcoming branch target, or </li></ul></ul><ul><ul><li>Holds a loop counter </li></ul></ul><ul><li>Fixed-point exception register </li></ul><ul><ul><li>Holds carry and overflow bits for fixed-point operations </li></ul></ul>
  16. 16. Cell components <ul><li>Condition register </li></ul><ul><ul><li>Holds the status of arithmetic, logical, or comparison operations </li></ul></ul><ul><li>Floating-point status and control register </li></ul><ul><ul><li>Holds the status of scalar floating-point operations </li></ul></ul><ul><li>Vector registers </li></ul><ul><ul><li>Contain data for vector operations </li></ul></ul><ul><li>Vector status and control register </li></ul><ul><ul><li>Holds the saturation bit for vector operations </li></ul></ul><ul><li>Vector register save register (VRSAVE) </li></ul><ul><ul><li>Records which vector registers are in use, so they can be saved and restored on a context switch </li></ul></ul>
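
The saturation bit mentioned for the vector status and control register can be illustrated with a small model. This is a plain Python sketch of lane-wise saturating addition, not the PPU's vector instruction set; the function name and the choice of signed 8-bit lanes are assumptions made for the example.

```python
def saturating_add_i8(a, b):
    """Lane-wise signed 8-bit add.

    Returns (result, sat), where sat is True if any lane had to be clamped,
    mimicking a saturation status bit.
    """
    lo, hi = -128, 127
    result = []
    sat = False
    for x, y in zip(a, b):
        total = x + y
        clamped = max(lo, min(hi, total))
        if clamped != total:
            sat = True  # at least one lane overflowed its 8-bit range
        result.append(clamped)
    return result, sat


print(saturating_add_i8([100, -100, 5, 0], [100, -100, 5, 0]))
# ([127, -128, 10, 0], True)
```

Instead of wrapping around on overflow, each lane is pinned to the nearest representable value and the flag records that this happened.
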
  17. 17. Cell components <ul><li>SPU </li></ul><ul><ul><li>Basic workhorse of Cell </li></ul></ul><ul><ul><li>Designed to execute SIMD operations </li></ul></ul><ul><ul><li>Separate instruction set </li></ul></ul><ul><ul><li>Takes work from the PPU </li></ul></ul><ul><ul><li>Does not have any cache </li></ul></ul><ul><ul><li>No virtual memory </li></ul></ul><ul><ul><li>Each SPU has only 256 KB of local memory </li></ul></ul>
  18. 18. Cell components <ul><li>SPU </li></ul><ul><ul><li>An SPU can directly access only its own 256 KB memory </li></ul></ul><ul><ul><li>Direct Memory Access (DMA) is required to transfer data to the SPU </li></ul></ul><ul><ul><li>Data passed to the SPU must be properly aligned in memory </li></ul></ul><ul><ul><li>Several methods are available to communicate with the PPU and other memory </li></ul></ul>
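
The alignment requirement can be made concrete with a small validity check. The rule encoded below is an assumption based on the commonly documented DMA constraints for Cell (sizes of 1, 2, 4, or 8 bytes with natural alignment, or multiples of 16 bytes with 16-byte alignment, up to 16 KB); the function name is hypothetical and this is a sketch, not SDK code.

```python
MAX_DMA_BYTES = 16 * 1024  # largest single transfer


def is_valid_dma(ls_addr, ea, size):
    """Return True if a transfer of `size` bytes between local-store address
    `ls_addr` and main-memory effective address `ea` satisfies the assumed rules."""
    if size in (1, 2, 4, 8):
        # Small transfers: naturally aligned, and both addresses share the
        # same offset within a 16-byte quadword.
        return (ls_addr % size == 0 and ea % size == 0
                and ls_addr % 16 == ea % 16)
    if 0 < size <= MAX_DMA_BYTES and size % 16 == 0:
        # Larger transfers: both addresses must be 16-byte aligned.
        return ls_addr % 16 == 0 and ea % 16 == 0
    return False


print(is_valid_dma(0x100, 0x2000, 128))  # True
print(is_valid_dma(0x104, 0x2000, 128))  # False: local-store address misaligned
```

A check like this is useful during development, because a misaligned request typically shows up as a bus error at run time rather than as a compile-time diagnostic.
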
  19. 19. Cell components Design and Animation Game Programming Graphics Programming Matthew Scarpino
  20. 20. Cell components <ul><li>Purpose of SPU </li></ul><ul><ul><li>Load 128-bit data into a local register </li></ul></ul><ul><ul><li>Apply an operation to it </li></ul></ul><ul><ul><li>Save the result to local memory </li></ul></ul><ul><li>Two distinct pipelines </li></ul><ul><ul><li>Even pipeline handles mathematical operations </li></ul></ul><ul><ul><li>Odd pipeline handles everything else </li></ul></ul>
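
The load, operate, store cycle described above can be mimicked in a few lines. This sketch treats a 128-bit register as 16 bytes holding four 32-bit lanes and uses a `bytearray` as a stand-in for local memory; the names and the big-endian lane layout are illustrative assumptions, and real SPU code would use the SPU instruction set rather than Python.

```python
import struct


def simd_add_u32(reg_a: bytes, reg_b: bytes) -> bytes:
    """Add two 128-bit values as four independent unsigned 32-bit lanes (wrapping)."""
    if len(reg_a) != 16 or len(reg_b) != 16:
        raise ValueError("a 128-bit register holds exactly 16 bytes")
    a = struct.unpack(">4I", reg_a)
    b = struct.unpack(">4I", reg_b)
    return struct.pack(">4I", *((x + y) & 0xFFFFFFFF for x, y in zip(a, b)))


# Stand-in for local memory: two 16-byte slots.
local_store = bytearray(32)
local_store[0:16] = struct.pack(">4I", 1, 2, 3, 4)

# 1. Load 128 bits, 2. apply the operation, 3. store the result.
operand = bytes(local_store[0:16])
result = simd_add_u32(operand, struct.pack(">4I", 10, 20, 30, 40))
local_store[16:32] = result

print(struct.unpack(">4I", local_store[16:32]))  # (11, 22, 33, 44)
```

One operation updates all four lanes at once, which is what makes the 128-bit register width the natural unit of work on the SPU.
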
  21. 21. Cell components <ul><li>SPU Control Unit (SCN) </li></ul><ul><ul><li>Fetches and dispatches instructions </li></ul></ul><ul><ul><li>Performs branching and other control operations </li></ul></ul><ul><li>SPU even fixed-point unit </li></ul><ul><ul><li>Handles logic/arithmetic operations </li></ul></ul><ul><ul><li>Performs floating-point comparisons and reciprocal estimates </li></ul></ul><ul><li>SPU odd fixed-point unit </li></ul><ul><ul><li>Performs bit-level shifts, rotations, and shuffles </li></ul></ul>
  22. 22. Cell components <ul><li>SPU floating-point unit </li></ul><ul><ul><li>Performs floating-point operations </li></ul></ul><ul><li>SPU load/store unit </li></ul><ul><ul><li>Performs loads and stores </li></ul></ul><ul><ul><li>Manages branch targets and DMA requests to the local store </li></ul></ul><ul><li>SPU channel and DMA unit </li></ul><ul><ul><li>Communicates with the Memory Flow Controller (MFC) </li></ul></ul><ul><ul><li>Controls DMA transfers </li></ul></ul>
  23. 23. Cell components <ul><li>SPU registers </li></ul><ul><ul><li>128 general-purpose registers </li></ul></ul><ul><ul><li>Floating-point status and control registers </li></ul></ul><ul><ul><ul><li>Contain the status and results of floating-point operations </li></ul></ul></ul><ul><li>SPU Local Store (LS) </li></ul><ul><ul><li>Each SPU contains 256 KB of very low-latency memory </li></ul></ul><ul><ul><li>It plays the role of a local cache for the SPU, but is managed by software </li></ul></ul><ul><ul><li>All data transfer is the responsibility of the programmer </li></ul></ul>
  24. 24. Cell components <ul><li>SPU Local Store (LS) </li></ul><ul><ul><li>Not a cache, just an SRAM </li></ul></ul><ul><ul><li>Only one read or write operation per cycle </li></ul></ul><ul><ul><li>Operations accessing the LS </li></ul></ul><ul><ul><ul><li>DMA </li></ul></ul></ul><ul><ul><ul><ul><li>Transfers data from main memory to the LS </li></ul></ul></ul></ul><ul><ul><ul><li>SPU load/store </li></ul></ul></ul><ul><ul><ul><ul><li>Reads/writes 16 bytes at a time </li></ul></ul></ul></ul><ul><ul><ul><li>Instruction fetch </li></ul></ul></ul><ul><ul><ul><ul><li>Reads 128 bytes of the LS at once </li></ul></ul></ul></ul>
  25. 25. Cell components <ul><li>SPU Local Store (LS) </li></ul><ul><ul><li>Does not support virtual memory </li></ul></ul><ul><ul><li>Tradeoff between hardware cache coherence and explicitly fetching data into the LS </li></ul></ul><ul><ul><ul><li>The LS is low-latency memory </li></ul></ul></ul><ul><ul><ul><li>Other processors rely on cache coherence protocols </li></ul></ul></ul><ul><ul><ul><li>Data is transferred to the LS over the high-throughput EIB via DMA instead of through cache coherence protocols </li></ul></ul></ul><ul><ul><ul><li>This makes the hardware simpler </li></ul></ul></ul>
  26. 26. Cell components <ul><li>Communication between the SPU and the rest of the system </li></ul><ul><ul><li>DMA </li></ul></ul><ul><ul><li>Mailboxes </li></ul></ul><ul><ul><li>Events and signals </li></ul></ul>
  27. 27. Cell components <ul><li>DMA </li></ul><ul><ul><li>Transfers data to the LS </li></ul></ul><ul><ul><li>Asynchronous in nature </li></ul></ul><ul><ul><ul><li>The SPU continues its operation while the DMA transfer is in progress </li></ul></ul></ul><ul><ul><li>Transfers data in chunks of 1, 2, 4, or 8 bytes, or a multiple of 16 bytes </li></ul></ul><ul><ul><li>Provides controls to manage and synchronize data transfers </li></ul></ul><ul><ul><li>A single DMA transfer can move at most 16 KB </li></ul></ul>
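
Because a single DMA transfer is capped at 16 KB, a larger buffer has to be moved as a sequence of requests. The helper below is a hypothetical sketch of that bookkeeping only; it computes the (main-memory address, local-store address, size) triples and does not issue any transfer. On real hardware each triple would be handed to a DMA command such as `mfc_get` from the IBM SDK.

```python
MAX_DMA_BYTES = 16 * 1024  # per-transfer limit stated on the slide


def split_transfer(ea, ls_addr, total_bytes, max_chunk=MAX_DMA_BYTES):
    """Split one logical transfer into chunks of at most `max_chunk` bytes.

    Returns a list of (effective_address, local_store_address, size) tuples.
    """
    chunks = []
    offset = 0
    while offset < total_bytes:
        size = min(max_chunk, total_bytes - offset)
        chunks.append((ea + offset, ls_addr + offset, size))
        offset += size
    return chunks


for chunk in split_transfer(ea=0x10000, ls_addr=0x0, total_bytes=40 * 1024):
    print(chunk)
```

Since DMA is asynchronous, a common refinement is to start the next chunk while the SPU is still computing on the previous one.
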
  28. 28. Cell components Design and Animation Game Programming Graphics Programming Matthew Scarpino
  29. 29. Cell components <ul><li>EIB </li></ul><ul><ul><li>Connects all the system components </li></ul></ul><ul><ul><li>Consists of four data rings (two clockwise and two counter-clockwise) </li></ul></ul><ul><ul><li>Control signals travel on a separate command bus </li></ul></ul><ul><ul><li>Each bus cycle can transfer 16 bytes of data </li></ul></ul><ul><ul><li>Each ring can carry three DMA transfers simultaneously </li></ul></ul><ul><ul><li>Each DMA transfer takes at least 8 cycles to complete </li></ul></ul>
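
The EIB figures fit together arithmetically. The sketch below assumes that the "at least 8 cycles" figure corresponds to a 128-byte transfer moved at 16 bytes per bus cycle; that packet size is an assumption, not something stated on the slide, and the function names are illustrative.

```python
BYTES_PER_CYCLE = 16     # data moved per bus cycle
RINGS = 4                # two clockwise, two counter-clockwise
TRANSFERS_PER_RING = 3   # simultaneous transfers per ring


def transfer_cycles(num_bytes):
    """Bus cycles needed to move `num_bytes`, rounding up to whole cycles."""
    return (num_bytes + BYTES_PER_CYCLE - 1) // BYTES_PER_CYCLE


def max_concurrent_transfers():
    """Upper bound on transfers in flight across all data rings."""
    return RINGS * TRANSFERS_PER_RING


print(transfer_cycles(128))        # 8
print(max_concurrent_transfers())  # 12
```
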
  30. 30. Cell components <ul><li>MFC </li></ul><ul><ul><li>Coprocessor that handles communication between the SPU and the EIB </li></ul></ul><ul><ul><li>Processes data transfers without interrupting the SPU </li></ul></ul><ul><ul><li>The SPU requests the MFC to get the data </li></ul></ul><ul><ul><li>The MFC handles the rest of the data transfer </li></ul></ul>
  31. 31. Cell components <ul><li>Mailboxes </li></ul><ul><ul><li>Simplest way to transfer data between the PPU and an SPU </li></ul></ul><ul><ul><li>Each message can carry only 4 bytes of data </li></ul></ul><ul><ul><li>Provide one-to-one communication </li></ul></ul><ul><ul><li>Mailbox channels </li></ul></ul><ul><ul><ul><li>Outgoing mailbox </li></ul></ul></ul><ul><ul><ul><li>Outgoing interrupt mailbox </li></ul></ul></ul><ul><ul><ul><ul><li>Holds data for the outside world and causes an interrupt if applicable </li></ul></ul></ul></ul><ul><ul><ul><li>Incoming mailbox </li></ul></ul></ul>
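
A mailbox behaves like a very small queue of 32-bit messages. The class below is a toy model for illustration, not the SDK interface: the class name is invented, the capacities in the usage lines (four entries inbound, one outbound) are assumptions about the hardware, and a full mailbox simply rejects the write here, whereas real hardware may stall the writer or overwrite an entry.

```python
from collections import deque


class Mailbox:
    """Toy model of a fixed-capacity queue of 4-byte messages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._queue = deque()

    def write(self, value):
        """Try to enqueue a 32-bit message; return False if the mailbox is full."""
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError("mailbox messages are 4 bytes (32 bits)")
        if len(self._queue) >= self.capacity:
            return False
        self._queue.append(value)
        return True

    def read(self):
        """Return the oldest message, or None if the mailbox is empty."""
        return self._queue.popleft() if self._queue else None


inbound = Mailbox(capacity=4)    # PPU -> SPU (assumed depth)
outbound = Mailbox(capacity=1)   # SPU -> PPU (assumed depth)

inbound.write(0x1000)            # e.g. the PPU passes a small command word
print(hex(inbound.read()))       # 0x1000
```

Four bytes are enough for a status code or a command word, which is why bulk data goes through DMA while mailboxes are used for control messages.
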
  32. 32. Cell components <ul><li>Events and signals </li></ul><ul><ul><li>Commonly used for DMA notifications </li></ul></ul><ul><ul><li>Signals can be sent directly to the outside world </li></ul></ul><ul><ul><li>Signals can provide one-to-many communication </li></ul></ul>
  33. 33. Cell components <ul><li>Events and signals </li></ul><ul><ul><li>Commonly used for DMA notifications </li></ul></ul><ul><ul><li>Signals can be sent directly to the outside world </li></ul></ul><ul><ul><li>Signals can provide one-to-many communication </li></ul></ul>
  34. 34. Software development of Cell <ul><li>Different instruction sets for the SPU and PPU </li></ul><ul><li>Separate compilers are required for the PPU code and the SPU code </li></ul><ul><li>The SPU code is embedded in the PPU executable </li></ul>
  35. 35. Software development of Cell <ul><li>Tools to compile an application for Cell </li></ul><ul><ul><li>PPU compiler </li></ul></ul><ul><ul><ul><li>ppu-gcc </li></ul></ul></ul><ul><ul><li>SPU compiler </li></ul></ul><ul><ul><ul><li>spu-gcc </li></ul></ul></ul><ul><ul><li>Embed SPU code in the PPU executable </li></ul></ul><ul><ul><ul><li>ppu-embedspu </li></ul></ul></ul>
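
The three tools are typically run in sequence: compile the SPU program, wrap it into a PPU-linkable object, then compile and link the PPU program. The sketch below only builds the command lines and runs nothing. The file names, the handle name, and the output names are hypothetical, and the argument order for `ppu-embedspu` and the `-lspe2` link flag are assumptions based on common IBM SDK usage that should be checked against the SDK documentation.

```python
def build_commands(ppu_src, spu_src, handle, app="app"):
    """Return the assumed build steps as argument lists (nothing is executed)."""
    spu_elf = spu_src.rsplit(".", 1)[0]       # e.g. "spu_task.c" -> "spu_task"
    embedded_obj = f"{spu_elf}_embedded.o"
    return [
        # 1. Compile the SPU program with the SPU compiler.
        ["spu-gcc", spu_src, "-o", spu_elf],
        # 2. Wrap the SPU executable in an object file exposing `handle`.
        ["ppu-embedspu", handle, spu_elf, embedded_obj],
        # 3. Compile the PPU program and link the embedded SPU object.
        ["ppu-gcc", ppu_src, embedded_obj, "-lspe2", "-o", app],
    ]


for cmd in build_commands("main.c", "spu_task.c", "spu_task_handle"):
    print(" ".join(cmd))
```

Keeping the steps as data makes the dependency order explicit: the PPU link step cannot run until the SPU binary has been built and embedded.
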
  36. 36. Software development of Cell <ul><li>Cell simulator </li></ul><ul><ul><li>Full System Simulator </li></ul></ul><ul><ul><li>Emulates all system components </li></ul></ul><ul><ul><li>Can provide cycle-accurate information </li></ul></ul><ul><ul><li>Provides a graphical interface to see and interact with system components </li></ul></ul>
  37. 37. Software development of Cell IBM Full System Simulator user guide
  38. 38. Software development of Cell <ul><li>Three modes </li></ul><ul><ul><li>Fast mode </li></ul></ul><ul><ul><li>Simple mode </li></ul></ul><ul><ul><li>Cycle mode </li></ul></ul><ul><li>Graphical visualization of SPU and PPU </li></ul><ul><li>Provides debugging and profiling information </li></ul><ul><li>Provides system utilization information </li></ul>
  39. 39. Software development of Cell
  40. 40. Software development of Cell Design and Animation Game Programming Graphics Programming Matthew Scarpino
