Introduction Cell Processor
shared by Mansoor Mirza

Published in: Technology, Business
  1. 1. Introduction Cell Processor
  2. 2. Why Cell Processor <ul><li>Performance improvement with increase in frequency </li></ul><ul><ul><li>Possible due to increase in transistor density </li></ul></ul><ul><ul><li>Clock frequency is the timing reference for a processor </li></ul></ul><ul><li>Power density </li></ul><ul><ul><li>Leakage currents increase as transistors shrink and density rises </li></ul></ul><ul><ul><li>This increases idle power consumption </li></ul></ul>
  3. 3. History of Cell Processor <ul><li>A powerful processor for the successor to the PS2 </li></ul><ul><ul><li>Powerful multimedia and broadband network interface </li></ul></ul><ul><li>IBM's contribution in shaping the concept of the Cell processor </li></ul><ul><li>Collaboration with Toshiba </li></ul><ul><li>STI (Sony, Toshiba, IBM) Alliance </li></ul>
  4. 4. History of Cell Processor <ul><li>Development of Cell </li></ul><ul><ul><li>1999: Sony proposed a partnership with IBM for the successor of the PS2 </li></ul></ul><ul><ul><li>2001: STI alliance began development of Cell </li></ul></ul><ul><ul><li>2004: first prototype of Cell </li></ul></ul><ul><ul><li>2005: Sony unveiled the PS3 at E3 </li></ul></ul><ul><ul><li>2006: official release of the PS3; Cell SDK released by IBM </li></ul></ul><ul><ul><li>2008: IBM Roadrunner became the fastest supercomputer in the world (1.026 petaflops) </li></ul></ul>
  5. 5. Overview of Cell Design and Animation Game Programming Graphics Programming Matthew Scarpino
  6. 6. Overview of Cell 6.189 IAP 2007 MIT
  7. 7. Cell components <ul><li>Memory Interface Controller (MIC) </li></ul><ul><li>Bus Interface Controller (BIC) </li></ul><ul><li>PowerPC Processor Element/Unit (PPE/PPU) </li></ul><ul><li>Synergistic Processing Element/Unit (SPE/SPU) </li></ul><ul><li>Element Interconnect Bus (EIB) </li></ul><ul><li>Input/Output Interface (IOIF) </li></ul>
  8. 8. Cell components <ul><li>MIC </li></ul><ul><ul><li>Connects the processor with system memory </li></ul></ul><ul><ul><li>Two channels to system memory </li></ul></ul><ul><ul><li>Extreme Data Rate Dynamic Random Access Memory (XDR DRAM) </li></ul></ul><ul><ul><ul><li>Can support 8 data transfers per clock cycle </li></ul></ul></ul><ul><ul><ul><li>Provides high data throughput at a low clock frequency </li></ul></ul></ul><ul><ul><li>PS3 contains 256 MB of XDR DRAM </li></ul></ul>
  9. 9. Cell components <ul><li>PPU </li></ul><ul><ul><li>Based on IBM PowerPC architecture </li></ul></ul><ul><ul><li>RISC architecture </li></ul></ul><ul><ul><li>Cell control center </li></ul></ul><ul><ul><ul><li>Runs the operating system </li></ul></ul></ul><ul><ul><ul><li>Manages interrupts </li></ul></ul></ul><ul><ul><ul><li>Manages the shared L2 cache </li></ul></ul></ul><ul><ul><ul><li>Issues work to the SPUs </li></ul></ul></ul>
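The claim that XDR delivers high throughput at a low clock frequency can be checked with a little arithmetic. The sketch below reads the slide's figure of 8 as transfers per clock cycle. The 400 MHz clock and the two 32-bit channels are assumptions based on commonly quoted PS3 figures and do not come from the slides; the function name is hypothetical.

```python
def peak_bandwidth_gb_per_s(clock_hz, transfers_per_clock, bus_width_bits):
    """Peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bits_per_second = clock_hz * transfers_per_clock * bus_width_bits
    return bits_per_second / 8 / 1e9


# Assumed values, NOT taken from the slides.
XDR_CLOCK_HZ = 400e6          # assumed memory clock
BUS_WIDTH_BITS = 2 * 32       # assumed: two channels, 32 bits each

# From the slide: 8 data transfers per clock cycle.
TRANSFERS_PER_CLOCK = 8

xdr = peak_bandwidth_gb_per_s(XDR_CLOCK_HZ, TRANSFERS_PER_CLOCK, BUS_WIDTH_BITS)
# Same clock and width with only 2 transfers per clock, for comparison.
ddr_like = peak_bandwidth_gb_per_s(XDR_CLOCK_HZ, 2, BUS_WIDTH_BITS)

print(f"8 transfers/clock: {xdr:.1f} GB/s")
print(f"2 transfers/clock: {ddr_like:.1f} GB/s")
```

Under these assumptions, eight transfers per clock gives four times the bandwidth of a two-transfer scheme at the same clock.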
  10. 10. Cell components <ul><li>PPU </li></ul>[Figure; source: Matthew Scarpino]
  11. 11. Cell components <ul><li>PPU </li></ul><ul><ul><li>64-bit architecture </li></ul></ul><ul><ul><li>Supports SIMD </li></ul></ul><ul><ul><li>Supports Cell-related functions </li></ul></ul><ul><ul><li>Dual-threaded processor </li></ul></ul><ul><ul><li>Computation power is reduced </li></ul></ul><ul><ul><ul><li>PPU is not the main computational element in Cell </li></ul></ul></ul><ul><ul><ul><li>Reduces power consumption </li></ul></ul></ul>
  12. 12. Cell components <ul><li>Functional units of PPU </li></ul>[Figure; source: Matthew Scarpino]
  13. 13. Cell components <ul><li>Instruction unit (IU) </li></ul><ul><ul><li>Fetches, decodes, and issues instructions </li></ul></ul><ul><li>Load and Store Unit </li></ul><ul><ul><li>Receives memory access requests </li></ul></ul><ul><li>Vector/Scalar Unit (VSU) </li></ul><ul><ul><li>Contains the Floating Point Unit </li></ul></ul><ul><ul><li>Performs FP operations on individual or multiple operands </li></ul></ul>
  14. 14. Cell components <ul><li>Fixed point unit (FXU) </li></ul><ul><ul><li>Performs fixed-point (integer) operations </li></ul></ul><ul><ul><ul><li>Arithmetic and logical operations </li></ul></ul></ul><ul><li>Memory Management Unit (MMU) </li></ul><ul><ul><li>Performs virtual memory management </li></ul></ul><ul><li>PPU registers </li></ul><ul><ul><li>Provide quick access to operands </li></ul></ul><ul><ul><li>Some functional units can access only processor registers </li></ul></ul>
  15. 15. Cell components <ul><li>32 general purpose registers </li></ul><ul><li>32 floating point registers </li></ul><ul><li>Link register </li></ul><ul><ul><li>Holds the target address of an upcoming branch </li></ul></ul><ul><li>Count register </li></ul><ul><ul><li>Holds the target address of an upcoming branch (or) </li></ul></ul><ul><ul><li>Holds a loop counter </li></ul></ul><ul><li>Fixed point exception register </li></ul><ul><ul><li>Holds carry and overflow bits for fixed-point operations </li></ul></ul>
  16. 16. Cell components <ul><li>Condition register </li></ul><ul><ul><li>Holds the status of arithmetic, logical, or comparison operations </li></ul></ul><ul><li>Floating point status and control register </li></ul><ul><ul><li>Holds the status of scalar FP operations </li></ul></ul><ul><li>Vector registers </li></ul><ul><ul><li>Contain data for vector operations </li></ul></ul><ul><li>Vector status and control register </li></ul><ul><ul><li>Holds the saturation bit for vector operations </li></ul></ul><ul><li>Vector register save and restore register (VRSAVE) </li></ul><ul><ul><li>Indicates which vector registers must be saved on a context switch </li></ul></ul>
  17. 17. Cell components <ul><li>SPU </li></ul><ul><ul><li>Basic workhorse of Cell </li></ul></ul><ul><ul><li>Designed to execute SIMD operations </li></ul></ul><ul><ul><li>Separate instruction set </li></ul></ul><ul><ul><li>Takes work from the PPU </li></ul></ul><ul><ul><li>Does not have any cache </li></ul></ul><ul><ul><li>No virtual memory </li></ul></ul><ul><ul><li>Each SPU has only 256 KB of local memory </li></ul></ul>
  18. 18. Cell components <ul><li>SPU </li></ul><ul><ul><li>SPU can only access its own 256 KB memory directly </li></ul></ul><ul><ul><li>Direct Memory Access (DMA) is required to transfer data to the SPU </li></ul></ul><ul><ul><li>Memory alignment is required to pass data to the SPU </li></ul></ul><ul><ul><li>Different methods are available to communicate with the PPU and other memory </li></ul></ul>
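The alignment requirement can be illustrated with a small address check. This is a minimal sketch: the 16-byte (quadword) default reflects the usual DMA alignment rule for Cell as I understand it, and the 128-byte boundary often recommended for best performance is likewise an assumption rather than something the slides state. Both helper names are hypothetical.

```python
def is_aligned(address, boundary=16):
    """True if the address falls on the given byte boundary."""
    return address % boundary == 0


def align_up(address, boundary=16):
    """Round an address up to the next multiple of the boundary."""
    return (address + boundary - 1) // boundary * boundary


buffer_start = 0x1004
print(is_aligned(buffer_start))            # False: 0x1004 is not a multiple of 16
print(hex(align_up(buffer_start)))         # 0x1010
print(hex(align_up(buffer_start, 128)))    # 0x1080
```

In real Cell code the same effect is normally obtained at declaration time with compiler alignment attributes, so that buffers start on the right boundary without manual rounding.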
  19. 19. Cell components [Figure; source: Matthew Scarpino]
  20. 20. Cell components <ul><li>Purpose of SPU </li></ul><ul><ul><li>Load 128-bit data into a local register </li></ul></ul><ul><ul><li>Apply an operation to it </li></ul></ul><ul><ul><li>Save the result to local memory </li></ul></ul><ul><li>Two distinct pipelines </li></ul><ul><ul><li>Even pipeline handles mathematical operations </li></ul></ul><ul><ul><li>Odd pipeline handles everything else </li></ul></ul>
  21. 21. Cell components <ul><li>SPU Control Unit (SCN) </li></ul><ul><ul><li>Fetches and dispatches instructions </li></ul></ul><ul><ul><li>Performs branching and other control operations </li></ul></ul><ul><li>SPU even fixed point unit </li></ul><ul><ul><li>Handles logic/arithmetic operations </li></ul></ul><ul><ul><li>Performs comparisons and reciprocal operations for floating point </li></ul></ul><ul><li>SPU odd fixed point unit </li></ul><ul><ul><li>Performs bit-level shifts, rotations, and shuffling </li></ul></ul>
  22. 22. Cell components <ul><li>SPU floating point unit </li></ul><ul><ul><li>Performs floating point operations </li></ul></ul><ul><li>SPU load/store unit </li></ul><ul><ul><li>Performs loads and stores </li></ul></ul><ul><ul><li>Manages branch targets and DMA to the local store </li></ul></ul><ul><li>SPU channel and DMA unit </li></ul><ul><ul><li>Communicates with the Memory Flow Controller </li></ul></ul><ul><ul><li>Controls DMA transfers </li></ul></ul>
  23. 23. Cell components <ul><li>SPU registers </li></ul><ul><ul><li>128 general purpose registers </li></ul></ul><ul><ul><li>Floating point status and control registers </li></ul></ul><ul><ul><ul><li>Contain the status and results of floating point operations </li></ul></ul></ul><ul><li>SPU local store (LS) </li></ul><ul><ul><li>Each SPU contains a very low latency 256 KB memory </li></ul></ul><ul><ul><li>It takes the place of a cache for the SPU, but is managed by software </li></ul></ul><ul><ul><li>All data transfer is the responsibility of the programmer </li></ul></ul>
  24. 24. Cell components <ul><li>SPU local store (LS) </li></ul><ul><ul><li>Not a cache, just SRAM </li></ul></ul><ul><ul><li>Only one read or write operation per cycle </li></ul></ul><ul><ul><li>Operations accessing the LS </li></ul></ul><ul><ul><ul><li>DMA </li></ul></ul></ul><ul><ul><ul><ul><li>Transfers data from main memory to the LS </li></ul></ul></ul></ul><ul><ul><ul><li>SPU load/store </li></ul></ul></ul><ul><ul><ul><ul><li>Reads/writes 16 bytes at a time </li></ul></ul></ul></ul><ul><ul><ul><li>Instruction fetch </li></ul></ul></ul><ul><ul><ul><ul><li>Reads 128 bytes of the LS at once </li></ul></ul></ul></ul>
  25. 25. Cell components <ul><li>SPU local store (LS) </li></ul><ul><ul><li>Does not support virtual memory </li></ul></ul><ul><ul><li>Tradeoff between cache coherence and explicitly fetching data into the LS </li></ul></ul><ul><ul><ul><li>LS is low latency memory </li></ul></ul></ul><ul><ul><ul><li>Other processors rely on cache coherence protocols </li></ul></ul></ul><ul><ul><ul><li>Data is transferred to the LS over the high-throughput EIB via DMA instead of through cache coherence protocols </li></ul></ul></ul><ul><ul><ul><li>Keeps the hardware simple </li></ul></ul></ul>
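The load, operate, store pattern on 128-bit data can be modeled in a few lines. The sketch below treats a 128-bit register as four unsigned 32-bit lanes and adds two such registers lane by lane, with each lane wrapping independently. It is only a model of the idea, not SPU code; the lane layout, the byte order, and the function names are illustrative assumptions.

```python
import struct

LANE_FORMAT = ">4I"  # four unsigned 32-bit lanes, 16 bytes in total


def pack_lanes(values):
    """Pack four unsigned 32-bit integers into one 16-byte (128-bit) value."""
    return struct.pack(LANE_FORMAT, *values)


def unpack_lanes(register):
    """Unpack a 16-byte value back into a tuple of four 32-bit lanes."""
    return struct.unpack(LANE_FORMAT, register)


def simd_add(reg_a, reg_b):
    """Add two 128-bit values lane by lane; each lane wraps at 2**32."""
    lanes_a = unpack_lanes(reg_a)
    lanes_b = unpack_lanes(reg_b)
    sums = [(x + y) & 0xFFFFFFFF for x, y in zip(lanes_a, lanes_b)]
    return pack_lanes(sums)


a = pack_lanes([1, 2, 3, 0xFFFFFFFF])
b = pack_lanes([10, 20, 30, 1])
print(unpack_lanes(simd_add(a, b)))  # (11, 22, 33, 0): the last lane wraps around
```

One operation processes all four lanes, which is why data laid out in 16-byte groups suits the SPU well.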
  26. 26. Cell components <ul><li>Communication between the SPU and the rest of the system </li></ul><ul><ul><li>DMA </li></ul></ul><ul><ul><li>Mailboxes </li></ul></ul><ul><ul><li>Events and signals </li></ul></ul>
  27. 27. Cell components <ul><li>DMA </li></ul><ul><ul><li>Transfers data to the LS </li></ul></ul><ul><ul><li>Asynchronous in nature </li></ul></ul><ul><ul><ul><li>SPU continues its operation while the DMA transfer is in progress </li></ul></ul></ul><ul><ul><li>Transfers data in sizes of 1, 2, 4, or 8 bytes, or a multiple of 16 bytes </li></ul></ul><ul><ul><li>Provides controls to manage and synchronize the data transfer </li></ul></ul><ul><ul><li>A single DMA transfer can move at most 16 KB </li></ul></ul>
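Because a single DMA transfer is capped at 16 KB, a larger buffer has to be moved as a series of transfers. The sketch below shows one way to plan such a series. The size rule it checks (1, 2, 4, or 8 bytes, or a multiple of 16 bytes up to the cap) reflects my understanding of the Cell DMA rules and should be treated as an assumption; the function names are hypothetical, and nothing here issues real DMA commands.

```python
MAX_DMA_BYTES = 16 * 1024  # per-transfer limit stated on the slide


def is_valid_dma_size(size):
    """Check a single transfer size against the assumed DMA size rule."""
    if size in (1, 2, 4, 8):
        return True
    return 0 < size <= MAX_DMA_BYTES and size % 16 == 0


def split_transfer(total_bytes):
    """Plan a large transfer as (offset, size) chunks of at most 16 KB.

    Assumes the total is a positive multiple of 16 bytes, so every chunk
    is itself a valid transfer size.
    """
    if total_bytes <= 0 or total_bytes % 16 != 0:
        raise ValueError("total size must be a positive multiple of 16 bytes")
    chunks = []
    offset = 0
    while offset < total_bytes:
        size = min(MAX_DMA_BYTES, total_bytes - offset)
        chunks.append((offset, size))
        offset += size
    return chunks


for offset, size in split_transfer(40 * 1024):
    print(f"offset={offset:6d} size={size:6d} valid={is_valid_dma_size(size)}")
```

Since transfers are asynchronous, a common design is to start the next chunk while the SPU is still working on the previous one, which hides much of the transfer time.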
  28. 28. Cell components [Figure; source: Matthew Scarpino]
  29. 29. Cell components <ul><li>EIB </li></ul><ul><ul><li>Connects all the system components </li></ul></ul><ul><ul><li>Consists of four data rings (two clockwise and two counter-clockwise) </li></ul></ul><ul><ul><li>A separate bus carries control signals </li></ul></ul><ul><ul><li>One bus cycle can transfer 16 bytes of data </li></ul></ul><ul><ul><li>Each ring can carry three DMA transfers simultaneously </li></ul></ul><ul><ul><li>Each DMA transfer takes at least 8 cycles to complete </li></ul></ul>
  30. 30. Cell components <ul><li>Memory Flow Controller (MFC) </li></ul><ul><ul><li>Coprocessor that handles communication between the SPU and the EIB </li></ul></ul><ul><ul><li>Processes data transfers without interrupting the SPU </li></ul></ul><ul><ul><li>SPU requests the MFC to get the data </li></ul></ul><ul><ul><li>MFC handles the rest of the data transfer </li></ul></ul>
  31. 31. Cell components <ul><li>Mailboxes </li></ul><ul><ul><li>Simplest way to transfer data between the PPU and an SPU </li></ul></ul><ul><ul><li>Each message can carry only 4 bytes of data </li></ul></ul><ul><ul><li>Provides one-to-one communication </li></ul></ul><ul><ul><li>Mailbox channels </li></ul></ul><ul><ul><ul><li>Outgoing mailbox </li></ul></ul></ul><ul><ul><ul><li>Outgoing interrupt mailbox </li></ul></ul></ul><ul><ul><ul><ul><li>Holds data for the outside world and causes an interrupt if applicable </li></ul></ul></ul></ul><ul><ul><ul><li>Incoming mailbox </li></ul></ul></ul>
  32. 32. Cell components <ul><li>Events and signals </li></ul><ul><ul><li>Commonly used for DMA notifications </li></ul></ul><ul><ul><li>Signals can be sent directly to the outside world </li></ul></ul><ul><ul><li>Signals can provide one-to-many style communication </li></ul></ul>
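The EIB figures on this slide combine into a few back-of-envelope numbers. The sketch below uses only the slide's values (four rings, three transfers per ring, 16 bytes per bus cycle, at least 8 cycles per transfer). The cycle estimate is a rough lower bound of my own that ignores arbitration and contention, so it should not be read as a measured figure.

```python
RINGS = 4                # data rings, from the slide
TRANSFERS_PER_RING = 3   # simultaneous transfers per ring, from the slide
BYTES_PER_CYCLE = 16     # bytes moved per bus cycle, from the slide
MIN_CYCLES = 8           # minimum cycles per transfer, from the slide

# Upper bound on transfers in flight across all rings.
max_concurrent_transfers = RINGS * TRANSFERS_PER_RING

# Data moved by one minimum-length transfer.
bytes_per_min_transfer = BYTES_PER_CYCLE * MIN_CYCLES


def min_bus_cycles(num_bytes):
    """Rough lower bound on bus cycles to move num_bytes on one ring."""
    cycles_for_data = -(-num_bytes // BYTES_PER_CYCLE)  # ceiling division
    return max(MIN_CYCLES, cycles_for_data)


print(max_concurrent_transfers)    # 12
print(bytes_per_min_transfer)      # 128
print(min_bus_cycles(16 * 1024))   # 1024
```

The 128-byte result matches the instruction fetch width of the local store mentioned earlier, which suggests why 128-byte blocks are a natural unit on this bus.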
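A mailbox behaves like a very short queue of 32-bit words. The sketch below models that behavior in plain Python so the 4-byte limit and the full and empty conditions are easy to see. The queue depths used in the example (four entries inbound to the SPU, one entry outbound) are assumptions based on my recollection of the hardware, and the class is hypothetical; it is not the SDK mailbox API.

```python
from collections import deque


class Mailbox:
    """Toy model of a mailbox: a bounded FIFO of 32-bit values."""

    def __init__(self, depth):
        self.depth = depth
        self._entries = deque()

    def try_write(self, value):
        """Queue a message; return False if the mailbox is full."""
        if len(self._entries) >= self.depth:
            return False
        # Only 4 bytes fit in one message, so keep the low 32 bits.
        self._entries.append(value & 0xFFFFFFFF)
        return True

    def try_read(self):
        """Return the oldest message, or None if the mailbox is empty."""
        if not self._entries:
            return None
        return self._entries.popleft()


inbound = Mailbox(depth=4)    # assumed depth: PPU to SPU
outbound = Mailbox(depth=1)   # assumed depth: SPU to PPU

print(outbound.try_write(0x1234))   # True
print(outbound.try_write(0x5678))   # False: the single slot is occupied
print(hex(outbound.try_read()))     # 0x1234
```

Because each message is only 4 bytes, mailboxes suit small control values such as status codes or the address of a larger buffer that is then fetched by DMA.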
  34. 34. Software development of Cell <ul><li>Different instruction sets for SPU and PPU </li></ul><ul><li>Different compilers are required to compile the two kinds of code </li></ul><ul><li>SPU code is embedded in the PPU executable </li></ul>
  35. 35. Software development of Cell <ul><li>Tools to compile an application for Cell </li></ul><ul><ul><li>PPU compiler </li></ul></ul><ul><ul><ul><li>ppu-gcc </li></ul></ul></ul><ul><ul><li>SPU compiler </li></ul></ul><ul><ul><ul><li>spu-gcc </li></ul></ul></ul><ul><ul><li>Embed SPU code in the PPU executable </li></ul></ul><ul><ul><ul><li>ppu-embedspu </li></ul></ul></ul>
  36. 36. Software development of Cell <ul><li>Cell simulator </li></ul><ul><ul><li>Full System Simulator </li></ul></ul><ul><ul><li>Emulates all system components </li></ul></ul><ul><ul><li>Can provide cycle-accurate information </li></ul></ul><ul><ul><li>Provides a graphical interface to see and interact with system components </li></ul></ul>
  37. 37. Software development of Cell [Figure; source: IBM Full System Simulator user guide]
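The three tools are typically run in sequence: compile the SPU program, wrap it in a PPU-linkable object, then compile and link the PPU program. The sketch below only assembles the command lines and does not run anything. The argument order shown for `ppu-embedspu` (symbol name, SPU executable, output object) and the `-lspe2` link flag follow my recollection of the IBM SDK and should be checked against its documentation; the file names and the helper function are hypothetical.

```python
def build_commands(spu_source, ppu_source, app_name):
    """Return the build steps for a Cell application as argument lists."""
    spu_program = f"{app_name}_spu"           # SPU executable, also used as symbol name
    embedded_obj = f"{app_name}_spu_embed.o"  # PPU object that wraps the SPU program
    return [
        # 1. Compile the SPU code with the SPU compiler.
        ["spu-gcc", spu_source, "-o", spu_program],
        # 2. Embed the SPU executable in a PPU object (assumed argument order).
        ["ppu-embedspu", spu_program, spu_program, embedded_obj],
        # 3. Compile the PPU code and link the embedded object (assumed -lspe2).
        ["ppu-gcc", ppu_source, embedded_obj, "-lspe2", "-o", app_name],
    ]


for step in build_commands("kernel_spu.c", "main_ppu.c", "demo"):
    print(" ".join(step))
```

The result is a single PPU executable that carries the SPU program inside it, which is what lets the PPU load work onto an SPU at run time.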
  38. 38. Software development of Cell <ul><li>Three modes </li></ul><ul><ul><li>Fast mode </li></ul></ul><ul><ul><li>Simple mode </li></ul></ul><ul><ul><li>Cycle mode </li></ul></ul><ul><li>Graphical visualization of SPU and PPU </li></ul><ul><li>Provides debugging and profiling information </li></ul><ul><li>Provides system utilization information </li></ul>
  39. 39. Software development of Cell
  40. 40. Software development of Cell [Figure; source: Matthew Scarpino]
