Defense

Transcript

  • 1. Design Automation Tool from Behavior Level to Transaction Level for Virtual Bus-Based Platforms. Advisor: Lih-Yih Chiou. Student: Hi-Ho Chen. 23 June 2008
  • 2. Outline
    • Motivation and Contributions
    • Previous Works
    • Proposed Design Automation Tool from Behavior Level to Transaction Level for Virtual Bus-Based Platforms
      • Representation
      • Design Flow Overview
      • Block Level
        • Methodology
        • Translation
      • Platform Level
        • Develop Library for CoWare
        • System Control Generator
    • Experiments
      • Scalar 176*144
      • DWT 44*36
    • Conclusions and Future works
    • References
  • 3. Introduction
    • In the SoC era, more and more IPs are integrated onto a single chip
    • ESL (Electronic System Level) design lets designers rapidly simulate the system's functional behavior at a higher abstraction level before hardware implementation
    • Communication design has become one of the important criteria for SoC design
  • 4. Top-down Design Flow [1] S. S. Pasricha, N. Dutt, and M. Ben-Romdhane, "Using TLM for exploring bus-based SoC communication architectures," 16th IEEE International Conference on Application-Specific Systems, Architecture Processors, 2005, pp. 79-85, 2005
  • 5. Arbitration Level vs. Simulation Speed [2] C. Lennard and D. Mista, "Taking Design to the System Level," 2006 [Online]. Available: http://www.arm.com/pdfs/ARM_ESL_20_3_JC.pdf
  • 6. High Level Synthesis
    • Behavior Synthesis
    • Separate the Control and Data path from the behavior description
      • Control
        • If then else
        • Switch case
      • Data Path
        • Data flow
    [3] SPARK Methodology, http://mesl.ucsd.edu/spark/methodology.shtml
  • 7. Contributions
      • Rapid system exploration
        • Fast exploration of multiple micro-architecture alternatives
      • Shorter verification/simulation cycle
        • Speed up by moving from behavior level to transaction level
      • Quickly obtain the power and performance information
        • Earlier estimation of design specifications
      • Increase the performance
        • Reduce the communication & computation
  • 9. Previous Works - SPARK (1)
    • Input: C
    • Output: C, VHDL
    • Advantages:
      • Defines a new synthesis tool for parallel designs
    • Disadvantages:
      • Does not consider the platform architecture
      • Does not consider communication issues
    [4] "SPARK: A High-Level Synthesis Framework for Applying Parallelizing Compiler Transformations," Proceedings of the 16th International Conference on VLSI Design, 4-8 Jan. 2003, pp. 461-466
  • 10. Previous Works - xPilot (2)
    • Input: C/SystemC
    • Output: Verilog/SystemC
    • Method
      • Phase 1
        • SSDM
      • Phase 2
        • Synthesis
    • Advantages:
      • Direct mapping to FPGA
      • Quick verification
    • Disadvantages:
      • Does not consider communication issues
    [5] "Platform-Based Behavior-Level and System-Level Synthesis," IEEE International SOC Conference, Sept. 2006, pp. 199-202
  • 11. Previous Works - MFASE (3)
    • MFASE (Multiple Functions SoCs Analysis Environment)
    • Design Flow
      • HW/SW partition
      • Architecture mapping
      • Communication analysis
      • …
    • Advantage
      • HW/SW co-design
    • Limitation
      • IP database
    [6] "MFASE: Multiple Functions SoCs Analysis Environment," VLSI Design/CAD Symposium, Taiwan, August 2007
  • 12. Summary
    • Previous works
      • Synthesis tool
        • SPARK and xPilot synthesize hardware C code into RTL code (VHDL or Verilog)
        • SPARK and xPilot do not consider communication issues
        • MFASE does not describe how the design is generated automatically
    • Thesis
        • Build an automation tool from Functional Level to Transaction Level for virtual bus-based platforms
          • Addresses both computation and communication issues
          • Automates the path from Behavior Level to Transaction Level
  • 14. Representation
    • Example: C to CDFG
    • Example for “if then else”
    • Example for “for loop”
  • 16. Design Flow Overview 1/2
  • 17. Design Flow Overview 2/2
    • Block Level
      • Methodology
        • Parallel
        • Cascade (Multi cycle)
      • Translation
        • State & Edge Reduction
        • STG to SystemC generator
    • Platform Level using Simple Bus
      • Approximate time simulation
    • Platform Level using CoWare
      • *.tcl generator
      • Peripheral generator
  • 19. Block Level
    • Input
      • Functional Level CDFG
      • Block Inside Configuration
        • Max parallel depth
        • Buffer Size
        • Boundary Case
      • Block to Bus Configuration
        • Max Burst size
        • Initial Address
        • Address offset
    • Output
      • TLM SystemC
  • 21. Block Level - Methodology 1/10
    • Computation Reduction
    • Parallel analysis
      • Step 1: C to CDFG format
      • Step 2: Unroll the “for loop” to determine the cycle count
      • Step 3: Find a solution that fits the “for loop” condition
        • Under the hardware constraint
        • GCD methodology
      • Step 4: Choose the closest solution that satisfies the hardware condition
      • Step 5: Update the CDFG
    for (j = 0; j < 2; j++) {
        for (i = 3; i < 7; i++) {
            b[j][i] = (a[j][i] + a[j][i+1]) >> 1;
        }
    }
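The loop nest above can illustrate Steps 2 to 4. The following is a minimal sketch, not the tool's actual algorithm. It assumes that the parallel depth is the number of iterations executed side by side, and that the unroll factor is the largest divisor of the trip count that does not exceed the hardware limit (one possible reading of the "GCD methodology"). The names `pick_unroll_factor`, `average_reference` and `average_unrolled2` are hypothetical.

```c
#include <assert.h>

#define ROWS 2
#define COLS 8   /* a[j][i+1] is read up to i = 6, so 8 columns are enough */

/* Largest factor <= max_parallel that divides trip_count evenly, so that
 * no remainder iterations are left over (assumed selection rule). */
int pick_unroll_factor(int trip_count, int max_parallel)
{
    assert(trip_count > 0 && max_parallel > 0);
    for (int f = max_parallel; f > 1; f--) {
        if (trip_count % f == 0)
            return f;
    }
    return 1;
}

/* Reference loop nest, as written on the slide. */
void average_reference(int a[ROWS][COLS], int b[ROWS][COLS])
{
    for (int j = 0; j < 2; j++)
        for (int i = 3; i < 7; i++)
            b[j][i] = (a[j][i] + a[j][i + 1]) >> 1;
}

/* Same computation with the inner loop unrolled by 2 (parallel depth 2):
 * two independent averages per iteration, half as many iterations. */
void average_unrolled2(int a[ROWS][COLS], int b[ROWS][COLS])
{
    for (int j = 0; j < 2; j++) {
        for (int i = 3; i < 7; i += 2) {
            b[j][i]     = (a[j][i]     + a[j][i + 1]) >> 1;
            b[j][i + 1] = (a[j][i + 1] + a[j][i + 2]) >> 1;
        }
    }
}
```

The inner loop has a trip count of 4, so with a hardware limit of 2 or 3 parallel operators this rule selects an unroll factor of 2, and both versions write identical results into `b`.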
  • 22. Block Level – Methodology 2/10
    • Communication factors
      • We assume the array is located in external memory
      • How do we get data from external memory?
      • Bus transfer
        • Single
        • Burst
      • Buffer size requirement
      • Parallelism and data transfer size influence performance and power
    [Figure: burst, new transfer, read and write]
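To make the single versus burst trade-off concrete, here is a small sketch that counts bus transactions for a given amount of data. It assumes every transaction carries at most `burst_size` words and ignores arbitration and address phases. The function name `bus_transactions` is hypothetical.

```c
#include <assert.h>

/* Number of bus transactions needed to move `words` data words when each
 * transaction carries at most `burst_size` words.
 * burst_size = 1 models single transfers. */
unsigned bus_transactions(unsigned words, unsigned burst_size)
{
    assert(burst_size > 0);
    return (words + burst_size - 1) / burst_size;  /* ceiling division */
}
```

Under these assumptions, moving 8 words takes 8 single transfers but only 2 bursts of 4 words. The price is that a larger burst needs a larger on-chip buffer to hold the data, which is the buffer size requirement listed on the slide.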
  • 23. Block Level - Methodology 3/10
    • Communication Reduction
  • 24. Block Level - Methodology 4/10
    • Case 1:
      • Parallel depth: 2; operator cycles: 1
      • Irregularity: 1
      • Bus access times: Read 4, Write 4
      • Max buffer size usage: 3
    B(): Burst size T(): Transaction number R: Read from bus W: Write to bus
  • 25. Block Level - Methodology 5/10
    • Case 2 :
      • Parallel depth: 2; operator cycles: 2
      • Irregularity: 1
      • Bus access times: Read 2, Write 2
      • Max buffer size usage: 5
    B(): Burst size T(): Transaction number R: Read from bus W: Write to bus
  • 26. Block Level - Methodology 6/10
    • Case 3:
      • Parallel depth: 2; operator cycles: 3
      • Irregularity: 2
      • Bus access times: Read 3, Write 3
      • Max buffer size usage: 8
    B(): Burst size T(): Transaction number R: Read from bus W: Write to bus
  • 27. Block Level - Methodology 7/10
    • Boundary case
    • Limitation: high address relation
    • Relation with the Memory location
  • 28. Block Level - Methodology 8/10
    • Case 4:
      • Parallel depth: 2; operator cycles: 4
      • Irregularity: 1
      • Bus access times: Read 2, Write 2
      • Max buffer size usage: 10
    B(): Burst size T(): Transaction number R: Read from bus W: Write to bus
  • 29. Block Level - Methodology 9/10
    • Which case is better to implement?
    • Problems
      • Case 1
        • Single operator cycle
        • Bus access times
      • Case 3
        • Control is too complex
        • Does not consider the boundary case
      • Case 4
        • Buffer size
    • We choose “Case 2” to implement
      • Satisfies the boundary case condition
      • Meets the buffer size constraint
      • Low bus access count
      • Regular
    Case     Irregularity   Read bus accesses   Write bus accesses   Max buffer size   Boundary case
    Case 1   1              4                   4                    3                 O
    Case 2   1              2                   2                    5                 O
    Case 3   2              3                   3                    8                 X
    Case 4   1              2                   2                    10                O
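The comparison above can be expressed as a small selection routine. This is only a sketch of one plausible decision rule, not the rule used by the tool. It assumes a case is eligible when it handles the boundary case and fits a buffer limit, and that eligible cases are ranked by total bus accesses, then irregularity, then buffer size. The struct, the array `slide_cases` and the function names are hypothetical. The numbers are the ones from the slide.

```c
#include <assert.h>
#include <stddef.h>

struct case_metrics {
    int id;
    int irregularity;
    int read_accesses;
    int write_accesses;
    int max_buffer;
    int handles_boundary;  /* 1 = "O", 0 = "X" on the slide */
};

/* Values taken from the comparison of Case 1 to Case 4. */
const struct case_metrics slide_cases[4] = {
    {1, 1, 4, 4, 3, 1},
    {2, 1, 2, 2, 5, 1},
    {3, 2, 3, 3, 8, 0},
    {4, 1, 2, 2, 10, 1},
};

/* Assumed ranking: fewer bus accesses, then lower irregularity,
 * then smaller buffer. Returns nonzero if x ranks before y. */
int is_better(const struct case_metrics *x, const struct case_metrics *y)
{
    int ax = x->read_accesses + x->write_accesses;
    int ay = y->read_accesses + y->write_accesses;
    if (ax != ay)
        return ax < ay;
    if (x->irregularity != y->irregularity)
        return x->irregularity < y->irregularity;
    return x->max_buffer < y->max_buffer;
}

/* Returns the id of the chosen case, or -1 if no case is eligible. */
int choose_case(const struct case_metrics *c, size_t n, int buffer_limit)
{
    const struct case_metrics *best = NULL;
    assert(c != NULL);
    for (size_t k = 0; k < n; k++) {
        if (!c[k].handles_boundary || c[k].max_buffer > buffer_limit)
            continue;  /* violates a hard constraint */
        if (best == NULL || is_better(&c[k], best))
            best = &c[k];
    }
    return best ? best->id : -1;
}
```

With a hypothetical buffer limit of 8 entries, `choose_case(slide_cases, 4, 8)` returns 2, which agrees with the choice of Case 2 on the slide.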
  • 30. Block Level - Methodology 10/10
    • Under condition
      • Parallel deep
      • Boundary Case
    • Analysis
      • Step 1: Trace the states by operator cycles
      • Step 2: Separate the read and write parts and find the period
      • Step 3: Estimate the cycles and hardware cost
      • Step 4: Find the best solution
    Legend: O(): operator cycles; B(): buffer size; R(): read counts; W(): write counts; S(): state sizes; Ir(): irregularity
    [Figure: state traces for case 1, case 2, case 3 and case 4]
  • 32. Translation 1/3
    • Example of CDFG to state transition graph (STG)
      • Fits the time steps
      • Easy to pass to the FSM generator
    [Figure: STG examples for “if then else” and “for loop”]
  • 33. Translation 2/3
    • Step 1
      • CDFG to STG
      • Un-rolling “for loop” condition
    • Step 2
      • Methodology
        • Reduce Computation
          • Parallel
        • Reduce Communication
          • Cascade
      • Architecture definition
    • Step 3
      • Translate to TLM SystemC
        • Header
        • Function
  • 34. Translation 3/3
    • Block Level
    • Interface
      • Block to Wrapper
      • Block to Block
    • Control
      • FSM
    • Data path
      • Operator assignment
      • Control signals
        • Block to Wrapper
        • Block to Data path
        • Block to Buffer
  • 36. Platform Level
    • Input :
      • Port mapping
      • Library location
      • CoWare setting
    • Output
      • *.tcl script for the CoWare-based platform
    • Communication Generator
      • System Control
      • Wrapper
      • Mux
      • PMU
      • Interrupt
  • 38. Develop Library for CoWare 1/3
    • Master Wrapper Generator
      • Based on the CoWare API for AMBA AHB
    • Advantage
      • Support any burst type
      • Burst Lock
    • Limitation
      • Buffer size
  • 39. Develop Library for CoWare 2/3
    • PMU Generator
    • Input: configuration
      • Block number: default 3
      • Idle cycles: default 1000
      • Wake-up cycles: default 1000
      • Policy: fixed time-out policy
    • Output: SystemC
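A fixed time-out policy can be sketched as a small state machine. The version below is an illustration under stated assumptions, not the generated SystemC: a block is put to sleep after it has been idle for `idle_timeout` consecutive cycles, and waking it up costs `wakeup_cycles` cycles before it is active again. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum pmu_state { PMU_ACTIVE, PMU_IDLE, PMU_SLEEP, PMU_WAKEUP };

struct pmu {
    enum pmu_state state;
    unsigned counter;        /* cycles spent idle or waking up so far */
    unsigned idle_timeout;   /* "Idle cycle" setting, default 1000 */
    unsigned wakeup_cycles;  /* "Wake Up cycle" setting, default 1000 */
};

void pmu_init(struct pmu *p, unsigned idle_timeout, unsigned wakeup_cycles)
{
    assert(p != NULL);
    p->state = PMU_ACTIVE;
    p->counter = 0;
    p->idle_timeout = idle_timeout;
    p->wakeup_cycles = wakeup_cycles;
}

/* Advance one clock cycle; `busy` is nonzero when the block has work.
 * Returns the state after this cycle. */
enum pmu_state pmu_step(struct pmu *p, int busy)
{
    assert(p != NULL);
    switch (p->state) {
    case PMU_ACTIVE:
    case PMU_IDLE:
        if (busy) {
            p->state = PMU_ACTIVE;
            p->counter = 0;              /* any activity restarts the time-out */
        } else if (++p->counter >= p->idle_timeout) {
            p->state = PMU_SLEEP;        /* fixed time-out expired */
            p->counter = 0;
        } else {
            p->state = PMU_IDLE;
        }
        break;
    case PMU_SLEEP:
        if (busy) {
            p->state = PMU_WAKEUP;       /* wake-up request */
            p->counter = 0;
        }
        break;
    case PMU_WAKEUP:
        if (++p->counter >= p->wakeup_cycles) {
            p->state = PMU_ACTIVE;       /* wake-up latency has elapsed */
            p->counter = 0;
        }
        break;
    }
    return p->state;
}
```

Calling `pmu_init(&p, 1000, 1000)` reproduces the default settings listed above. Smaller values are convenient for tracing the transitions by hand.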
  • 40. Develop Library for CoWare 3/3
    • Known parameters
      • Total simulation time
      • Operation frequency
      • Active duration
      • Total active number
    • Power Calculation
      • ACT Energy = Active power per unit time × Number of active counts
      • Idle Energy = Idle power per unit time × Number of idle counts
      • Total energy = ACT Energy + Idle Energy
      • Power = (ACT Energy + Idle Energy) / total time
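The power calculation can be written as a short function. This sketch assumes that the active and idle energies are the products of a per-unit-time power and the corresponding count, as the slide's labels suggest, and that all quantities use consistent units. The struct and function names are hypothetical.

```c
#include <assert.h>

struct power_result {
    double total_energy;   /* ACT Energy + Idle Energy */
    double average_power;  /* total energy / total time */
};

struct power_result compute_power(double active_power_per_unit,
                                  unsigned long active_counts,
                                  double idle_power_per_unit,
                                  unsigned long idle_counts,
                                  double total_time)
{
    struct power_result r;
    assert(total_time > 0.0);
    double act_energy  = active_power_per_unit * (double)active_counts;
    double idle_energy = idle_power_per_unit * (double)idle_counts;
    r.total_energy  = act_energy + idle_energy;
    r.average_power = r.total_energy / total_time;
    return r;
}
```

For example, 10 active counts at 2.0 power units and 30 idle counts at 0.5 power units over a total time of 40 give a total energy of 35 and an average power of 0.875.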
  • 42. System Control Generator
    • TOP Control Generator
      • Input
        • Block scheduling
        • Block numbers
        • Type setting
          • Parallel
          • Pipeline
          • Single (Default)
      • Output
        • SystemC
  • 44. CoWare - Scalar
    • Sequences: Foreman, Football (30 frames)
  • 45. Simple Bus Environment - Scalar
    • SystemC 2.1 simple bus: read transfer and write transfer
  • 46. CoWare Environment -Scalar
    • Top Platform for scalar application
    [Figure: top platform for the scalar application, Step 1 to Step 5]
  • 47. Experiments – Scalar
    • Performance in approximate time and cycle time

                                 Y part    Cb part   Cr part
      Approximate time (cycles)  239761    91775     91775
      Cycle time (cycles)        325296    100638    100638

    • Scalar performance and state size on a cycle-time basis (Scalar Y part, parallel constraint 4)

                        Code lines  Computation cycles  Communication cycles  Bus accesses  ST size  Max cascade
      Original C code   23          126720              0                     0             0        0
      Case 2            1403        31680               115118                9916          78       4
      Case 3            1668        31680               33388                 1724          81       11
  • 48. Experiments – Power Monitor
    • Power Library
    • Method
      • Search the Look up table
      • Block -> Module
        • FSM switch
        • InBuffer
        • OutBuffer
        • Register
      • Block ->Data Path
        • Operator
    Data path power

      Operator  Size  Active power  Idle power
      ADD       8     1.0444 mW     23.4124 nW
      SUB       8     808.2718 uW   21.3216 nW
      DIV       8     4.0100 mW     67.5333 nW
      SHR       8     425.8246 uW   9.9244 nW

    Module power

      Module    Width  Active power  Idle power
      FSM       6      0.418 mW      12 nW
      Register  32     1.7346 mW     66 nW
      Buffer    32     1.7346 mW     66 nW
  • 49. Experiments - Scalar
    • Scalar 176*144 power saving

      Case       PMU       Active cycles  Wake-up cycles  Sleep cycles  Power (mW)
      Scalar Y   No PMU    526572         X               X             22584673.08
      Scalar Y   With PMU  326296         1000            199276        14038522.54
      Scalar Cb  No PMU    526572         X               X             11000089.08
      Scalar Cb  With PMU  101638         1000            423934        2124065.68
      Scalar Cr  No PMU    526572         X               X             11000089.08
      Scalar Cr  With PMU  101638         1000            423934        2124065.68

      Total power: 44584851.24 mW without PMU, 18286653.9 mW with PMU
      Scalar power saving rate: 58.98%
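The saving rate can be checked from the two totals. The sketch below assumes the rate is defined as the relative reduction (no-PMU power minus with-PMU power, divided by no-PMU power), which reproduces the reported scalar figure. The function name `power_saving_rate` is hypothetical.

```c
#include <assert.h>

/* Relative power reduction in percent, given the total power without
 * and with the PMU (same units for both arguments). */
double power_saving_rate(double power_no_pmu, double power_with_pmu)
{
    assert(power_no_pmu > 0.0);
    return (power_no_pmu - power_with_pmu) / power_no_pmu * 100.0;
}
```

With the scalar totals, `power_saving_rate(44584851.24, 18286653.9)` evaluates to roughly 58.98, in line with the rate stated on the slide.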
  • 50. Experiments - DWT
    • DWT and IDWT
  • 51. Experiments - DWT
    • Top Platform for DWT application
    [Figure: top platform for the DWT application, Step 1 to Step 4]
  • 52. Experiments - DWT
    • Performance in approximate time and cycle time

                                 DWT
      Approximate time (cycles)  11088
      Cycle time (cycles)        76262

    • DWT performance and state size on a cycle-time basis (DWT, parallel constraint 1)

                        Code lines  Computation cycles  Communication cycles  Bus accesses  ST size  Max cascade
      Original C code   46          1584                0                     0             0        0
      Case 1            8630        1584                74678                 9504          42       1
  • 53. Experiments - DWT
    • DWT 44*36 power saving

      Case  PMU       Active cycles  Interrupt cycles  Sleep cycles  Power (mW)
      DWT   No PMU    145962         X                 X             4066501.32
      DWT   With PMU  76362          1000              68600         2155442.52
      IDWT  No PMU    145962         X                 X             4415350.53
      IDWT  With PMU  69600          1000              75362         2105550.72

      Total power: 8481851.85 mW without PMU, 4260993.24 mW with PMU
      DWT power saving rate: 49.765%
  • 55. Conclusions
    • We developed an automation tool that translates a behavior-level CDFG into TLM-level SystemC for virtual bus-based platform design
    • We incorporated methods that reduce bus access times, based on architecture-level profiling of the system design
    • We developed a library for virtual bus-based platforms
    • Fast architecture exploration reduces verification time
  • 56. Future Works
    • Model each module’s power with equations so that more accurate power management can be carried out
    • Add a test platform to the tool so that the corresponding test circuitry can be generated automatically
    • Extend the hardware library with more hardware architectures so that designers have more design options to choose from
  • 57. References
    • [1] S. S. Pasricha, N. Dutt, and M. Ben-Romdhane, "Using TLM for exploring bus-based SoC communication architectures," 16th IEEE International Conference on Application-Specific Systems, Architecture Processors, 2005, pp. 79-85, 2005
    • [2] C. Lennard and D. Mista, "Taking Design to the System Level," 2006 [Online]. Available: http://www.arm.com/pdfs/ARM_ESL_20_3_JC.pdf
    • [3] SPARK Methodology, http://mesl.ucsd.edu/spark/methodology.shtml
    • [4] S. Gupta, S. Gupta, N. Dutt, R. Gupta, and A. Nicolau, "SPARK: a high-level synthesis framework for applying parallelizing compiler transformations," Proceedings of 16th International Conference on VLSI Design, 2003, pp. 461-466, 2003
    • [5] J. Cong, F. Yiping, H. Guoling, J. Wei, and Z. Zhiru, "Platform-Based Behavior-Level and System-Level Synthesis," in IEEE International SOC Conference, 2006, pp. 199-202, 2006
    • [6] Ya-Shu Chen, Shih-Chun Chou, Chi-Sheng Shih and Tei-Wei Kuo, "MFASE: Multiple Functions SoCs Analysis Environment," in the VLSI Design/CAD Symposium, Taiwan, August 2007
  • 58. Thank you