Summary Of Academic Projects
1. Setiawan Soekamtoputra. Summary of Academic Projects (Spring 2009 to Spring 2010). Master of Electrical and Computer Engineering, Illinois Institute of Technology, December 2010 graduate.
2. Contents: 32-bit Pipelined CPU with Multiplier Accumulator and Pipeline Optimization; Simple MC68000-based Monitor Program; High-Performance Pipelined MIPS Processor Design; Mesh-like Network on Chip Prototype Design; Ring-like Network on Chip Prototype Design; Small Office Network Design Prototype; 4-bit 10T Adder Circuit with Dual-Threshold Logic Design; Single-Ended 6T versus Standard 6T SRAM Bitcell Design Comparison.
3. 32-bit Pipelined CPU with Multiplier Accumulator and Pipeline Optimization. Class: VLSI Design. Instructor: Prof. Ken Choi. Requirements/Specifications: Modify the existing multiplier functional unit of an existing CPU design into a multiplier-accumulator unit; apply pipelining to optimize the new functional unit. Hardware Description Language: Verilog. Tools: Synopsys Design Compiler; Cadence SimVision and SoC Encounter; Mentor Graphics ModelSim.
4. 32-bit Pipelined CPU with Multiplier Accumulator and Pipeline Optimization (cont’d). Block diagram of the functional unit.
5. 32-bit Pipelined CPU with Multiplier Accumulator and Pipeline Optimization (cont’d). Provided: multiplier-accumulator block diagram; simple CPU design written in Verilog; all required tools. Implementation: Construct the aforementioned unit in Verilog and modify the CPU design to fit the new unit; insert registers to pipeline the unit. Design functionality test: Verify in simulation that the function F = (-10)*5 + (-60)*2 + (-60)*8 outputs the correct result.
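The functionality test above can be cross-checked with a small software reference model. The following is only a sketch, not the author's Verilog: it assumes a 32-bit two's-complement accumulator (the slides do not state the accumulator width), and the names `to_signed32`, `mac`, and `F_TERMS` are hypothetical.

```python
def to_signed32(value: int) -> int:
    """Wrap an integer to a 32-bit two's-complement value (assumed width)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def mac(pairs):
    """Accumulate a*b over all (a, b) pairs, wrapping to 32 bits each step."""
    acc = 0
    for a, b in pairs:
        acc = to_signed32(acc + a * b)
    return acc


# Operand pairs taken from the test function F on the slide.
F_TERMS = [(-10, 5), (-60, 2), (-60, 8)]
result = mac(F_TERMS)
print(result)  # -650
```

A simulation waveform that ends with the accumulator holding -650 (0xFFFFFD76 in 32-bit two's complement) would agree with this model.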
6. 32-bit Pipelined CPU with Multiplier Accumulator and Pipeline Optimization (cont’d). Results.
7. 32-bit Pipelined CPU with Multiplier Accumulator and Pipeline Optimization (cont’d). Additional analysis results. Finding the maximum frequency: the expected maximum frequency of the design is 58 MHz. Frequency vs. area vs. power consumption.
8. Simple MC68000-based Monitor Program. Class: Microprocessor. Instructor: Dr. Jafar Saniie. Requirements/Specifications: Construct a simple monitor program for the MC68000 processor that lets the user perform common memory and register accesses and provides basic exception handlers. Language: 68000 assembly language. Tools: EASy68K Editor/Assembler/Simulator.
12. High-Performance Pipelined MIPS Processor Design. Class: Computer Architecture. Instructor: Prof. Jia Wang. Requirements/Specifications: Design a pipelined MIPS processor with data forwarding and hazard handling capabilities. Language: VHDL. Tools: ModelSim PE 6.5; MARS 3.6 MIPS Simulator. Provided: data memory unit design; testbench code.
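Data forwarding and hazard handling reduce to a few comparisons between pipeline registers. The sketch below uses the textbook conditions for a classic five-stage MIPS pipeline; it is not the author's VHDL, and the names `PipeRegs`, `forward_select`, and `load_use_stall` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PipeRegs:
    """Fields of the EX/MEM and MEM/WB pipeline registers used for forwarding."""
    ex_mem_rd: int
    ex_mem_reg_write: bool
    mem_wb_rd: int
    mem_wb_reg_write: bool


def forward_select(src: int, regs: PipeRegs) -> str:
    """Choose where the EX stage should read source register `src` from."""
    # The most recent result (EX/MEM) takes priority over the older one (MEM/WB).
    if regs.ex_mem_reg_write and regs.ex_mem_rd != 0 and regs.ex_mem_rd == src:
        return "EX/MEM"
    if regs.mem_wb_reg_write and regs.mem_wb_rd != 0 and regs.mem_wb_rd == src:
        return "MEM/WB"
    return "REGFILE"


def load_use_stall(id_ex_mem_read: bool, id_ex_rt: int,
                   if_id_rs: int, if_id_rt: int) -> bool:
    """A load followed immediately by a consumer of its result needs one stall."""
    return id_ex_mem_read and id_ex_rt in (if_id_rs, if_id_rt)


# add $1,$2,$3 followed by sub $4,$1,$5: $1 is forwarded from EX/MEM.
regs = PipeRegs(ex_mem_rd=1, ex_mem_reg_write=True,
                mem_wb_rd=0, mem_wb_reg_write=False)
print(forward_select(1, regs))  # EX/MEM
print(forward_select(5, regs))  # REGFILE
```

Register 0 is excluded because it is hard-wired to zero in MIPS, so a write to it must never be forwarded.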
16. Mesh-like Network on Chip Prototype Design. Class: Hardware/Software Co-design (Project 2). Instructor: Prof. Jia Wang. Requirements/Specifications: Build a simple mesh-like NoC architecture; verify the correctness of node-to-node communication. Language: SystemC. Tools: Microsoft Visual C++. Provided: ring-like NoC architecture design code.
18. Mesh-like Network on Chip Prototype Design (cont’d). Results: generated packets. The result shows that the packets are delivered.
19. Mesh-like Network on Chip Prototype Design (cont’d). Results: Delays occur because only one packet is delivered to a processing element (PE) at a time.
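The effect of delivering only one packet to a PE at a time can be illustrated with a tiny queueing model. This is a sketch under stated assumptions, not the SystemC design: it assumes time is counted in cycles and that the PE accepts exactly one packet per cycle; `delivery_cycles` is a hypothetical helper.

```python
def delivery_cycles(arrival_cycles):
    """Cycle at which each packet reaches a PE that accepts one packet per cycle."""
    delivered = []
    next_free = 0  # first cycle in which the PE can accept another packet
    for arrival in sorted(arrival_cycles):
        start = max(arrival, next_free)
        delivered.append(start)
        next_free = start + 1
    return delivered


# Three packets arrive together at cycle 0, a fourth arrives alone at cycle 5.
arrivals = [0, 0, 0, 5]
delivered = delivery_cycles(arrivals)
delays = [d - a for d, a in zip(delivered, sorted(arrivals))]
print(delivered)  # [0, 1, 2, 5]
print(delays)     # [0, 1, 2, 0]
```

Packets that contend for the same PE are serialized, so each one waits for those ahead of it, while a packet that arrives alone sees no delay.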
20. Ring-like Network on Chip Prototype Design. Class: Hardware/Software Co-design (Project 1). Instructor: Prof. Jia Wang. Requirements/Specifications: Extend the two-node ring NoC architecture design to three nodes; create a new function for the new node. Language: SystemC. Tools: Microsoft Visual C++. Provided: two-node ring-like NoC architecture design code.
21. Ring-like Network on Chip Prototype Design (cont’d). Three-node NoC system diagram. Third node function (called PE_dumpbox): it receives all packets that cannot be processed by the destination processing unit because the network is overloaded.
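The role of PE_dumpbox can be pictured as an overflow path next to a bounded router buffer. The sketch below is an illustration only: the slides do not give the buffer depth or the exact diversion rule, so the `capacity` parameter, the `Router` class, and the "divert when full" policy are all assumptions.

```python
from collections import deque


class Router:
    """Bounded input buffer with an overflow path to a dump node (assumed policy)."""

    def __init__(self, capacity: int):
        self.capacity = capacity   # hypothetical buffer depth
        self.buffer = deque()      # packets waiting for the destination PE
        self.dumpbox = []          # packets diverted to the dump node

    def receive(self, packet) -> str:
        """Buffer the packet if there is room, otherwise divert it."""
        if len(self.buffer) < self.capacity:
            self.buffer.append(packet)
            return "buffered"
        self.dumpbox.append(packet)
        return "dumped"


router = Router(capacity=2)
outcomes = [router.receive(p) for p in ["p0", "p1", "p2", "p3"]]
print(outcomes)        # ['buffered', 'buffered', 'dumped', 'dumped']
print(router.dumpbox)  # ['p2', 'p3']
```

With this policy an overloaded buffer never blocks the ring; excess packets are simply redirected to the third node.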
22. Ring-like Network on Chip Prototype Design (cont’d). Results: overload in the Router 1 network buffer; the third processing unit, PE_dumpbox, receives the packet.
23. Small Office Network Design Prototype. Class: Intro to Computer Networks. Instructor: Dr. Tricha Anjali. Requirements/Specifications: Propose a prototype of a two-story small office computer network capable of serving 20 users, with three department LANs, four servers, and wireless Internet. Language: N/A. Tools: Microsoft Visio. Provided: none.
24. Small Office Network Design Prototype (cont’d). Proposed configurations: IP address allocation.
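An IP address allocation for three department LANs plus a server segment can be sketched with Python's standard `ipaddress` module. The actual allocation from the slide is not reproduced in this transcript, so the private block `192.168.10.0/24`, the /27 subnet size, and the segment names below are purely hypothetical.

```python
import ipaddress

# Hypothetical private address block for the whole office.
base = ipaddress.ip_network("192.168.10.0/24")

# Split into /27 subnets: 32 addresses each, 30 of them usable by hosts.
subnets = list(base.subnets(new_prefix=27))

# Hypothetical segment names: three department LANs and one server segment.
names = ["dept_a", "dept_b", "dept_c", "servers"]
plan = dict(zip(names, subnets))

for name, net in plan.items():
    hosts = list(net.hosts())
    print(f"{name}: {net} usable {hosts[0]} to {hosts[-1]} ({len(hosts)} hosts)")
```

Thirty usable addresses per segment leaves headroom for 20 users spread over three departments; the four unused /27 blocks remain available for wireless clients or growth.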
26. Small Office Network Design Prototype (cont’d). Office layout: 1st floor and 2nd floor. Colored arrows show how the cables are managed.
27. 4-bit 10T Adder Circuit with Dual-Vt Logic Design. Class: Advanced VLSI Design. Instructor: Prof. Erdal Oruklu. Requirements: Compare performance (delay and power consumption) of the 10T adder circuit using high-threshold-voltage (high-Vt), low-Vt, and dual-Vt transistors; simulations use the 45 nm technology node. Tools: Cadence Virtuoso Schematic Design; Synopsys HSPICE Simulator; Nanosim Simulator. Provided: The adder circuit is based on J. Lin, M. Sheu, and C. Ho. A Novel High-Speed and Energy Efficient 10-Transistor Full Adder Design. IEEE Trans. on Circuits and Systems, May 2007.
28. 4-bit 10T Adder Circuit with Dual-Vt Logic Design. Logic equations: Sum = (A XNOR B)·Cin + (A XOR B)·Cin_bar; Cout = (A XOR B)·Cin + (A XNOR B)·A. Design components: inverter (left) and multiplexer (right).
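The two logic equations can be checked exhaustively against ordinary binary addition. The sketch below models them at the bit level and chains four stages into a ripple-carry adder; it is a functional check only, not a transistor-level model of the 10T circuit, and `full_adder` and `add4` are hypothetical names.

```python
from itertools import product


def full_adder(a: int, b: int, cin: int):
    """One-bit full adder written exactly as the slide's Sum and Cout equations."""
    x = a ^ b    # A XOR B
    xn = 1 - x   # A XNOR B
    s = (xn & cin) | (x & (1 - cin))   # Sum  = XNOR·Cin + XOR·Cin_bar
    cout = (x & cin) | (xn & a)        # Cout = XOR·Cin  + XNOR·A
    return s, cout


def add4(a: int, b: int, cin: int = 0):
    """4-bit ripple-carry adder built from four full_adder stages."""
    total, carry = 0, cin
    for i in range(4):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry


# Exhaustive check of the one-bit equations against a + b + cin.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    assert s + 2 * cout == a + b + cin

print(add4(9, 8))  # (1, 1): 9 + 8 = 17 = 16 + 1
```

Both equations are multiplexer forms selected by A XOR B: when the inputs differ, Sum is the inverted carry-in and Cout is the carry-in; when they match, Sum is the carry-in and Cout is A.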
29. 4-bit 10T Adder Circuit with Dual-Vt Logic Design. 1-bit full adder (consisting of multiplexers and inverters) and its symbol. 4-bit full adder.
30. 4-bit 10T Adder Circuit with Dual-Vt Logic Design. Methodology: Combinations of input vectors are used to measure delay and power consumption. Delay: switching delay between the least significant bit (bit 0) and the most significant bit (bit 3). Power: average and maximum power during simulation. Results: delay (in seconds).
31. 4-bit 10T Adder Circuit with Dual-Vt Logic Design. Results: power consumption (in watts).
32. Single-Ended 6T vs. Standard 6T SRAM Bitcell Design Comparison. Class: High Performance VLSI/IC Systems. Instructor: Prof. Ken Choi. Requirements: Compare power consumption and delay between the standard 6-transistor (6T) SRAM and the single-ended 6T SRAM. Tools: Cadence Virtuoso Schematic Design; Synopsys HSPICE Simulator. Provided: single-ended SRAM bitcell design from J. Singh, et al. Single Ended 6T SRAM with Isolated Read-Port for Low-Power Embedded Systems. IEEE, 2009.
33. Single-Ended 6T vs. Standard 6T SRAM Bitcell Design Comparison. Standard SRAM design (using Cadence Virtuoso).
34. Single-Ended 6T vs. Standard 6T SRAM Bitcell Design Comparison. Single-ended SRAM design.
35. Single-Ended 6T vs. Standard 6T SRAM Bitcell Design Comparison. Comparison results: write delay. [3] Y. Chang, F. Lai, C. Yang. Zero-Aware Asymmetric SRAM Cell for Reducing Cache Power in Writing Zero. IEEE Trans. on VLSI Systems, Vol. 12, No. 8, August 2004.
36. Single-Ended 6T vs. Standard 6T SRAM Bitcell Design Comparison. Comparison results: power consumption comparison.