Defense

Transcript

  • 1. Design Automation Tool from Behavior Level to Transaction Level for Virtual Bus-Based Platforms
    • Advisor: Lih-Yih Chiou
    • Student: Hi-Ho Chen
    • 23 June 2008
  • 2. Outline
    • Motivation and Contributions
    • Previous Works
    • Proposed Design Automation Tool from Behavior Level to Transaction Level for Virtual Bus-Based Platforms
      • Representation
      • Design Flow Overview
      • Block Level
        • Methodology
        • Translation
      • Platform Level
        • Develop Library for CoWare
        • System Control Generator
    • Experiments
      • Scalar 176*144
      • DWT 44*36
    • Conclusions and Future works
    • References
  • 3. Introduction
    • In the SoC era, more and more IPs are integrated onto a single chip
    • ESL (Electronic System Level) design lets designers rapidly simulate the system's functional behavior at a higher level before hardware implementation
    • Communication design has become one of the important criteria for SoC design
  • 4. Top-down Design Flow
    [1] S. S. Pasricha, N. Dutt, and M. Ben-Romdhane, "Using TLM for exploring bus-based SoC communication architectures," 16th IEEE International Conference on Application-Specific Systems, Architecture Processors, 2005, pp. 79-85, 2005
  • 5. Arbitration Level vs. Simulation Speed
    [2] C. Lennard and D. Mista, "Taking Design to the System Level," 2006 [Online]. Available: http://www.arm.com/pdfs/ARM_ESL_20_3_JC.pdf
  • 6. High Level Synthesis
    • Behavior Synthesis
    • Separate the Control and Data path from the behavior description
      • Control
        • If then else
        • Switch case
      • Data Path
        • Data flow
    [3] SPARK Methodology, http://mesl.ucsd.edu/spark/methodology.shtml
  • 7. Contributions
      • Rapid system exploration
        • Fast exploration of multiple micro-architecture alternatives
      • Shorter verification/simulation cycle
        • Speed up by moving from behavior level to transaction level
      • Quickly obtain the power and performance information
        • Earlier estimation of design specifications
      • Increase the performance
        • Reduce the communication & computation
  • 9. Previous Works - SPARK (1)
    • Input: C
    • Output: C, VHDL
    • Advantages:
      • They define a new synthesis tool for parallel design
    • Disadvantages:
      • No platform architecture
      • Communication issues are not addressed
    [4] "SPARK: A High-Level Synthesis Framework for Applying Parallelizing Compiler Transformations," Proceedings of the 16th International Conference on VLSI Design, 4-8 Jan. 2003, pp. 461-466
  • 10. Previous Works - xPilot (2)
    • Input: C/SystemC
    • Output: Verilog/SystemC
    • Method
      • Phase 1
        • SSDM
      • Phase 2
        • Synthesis
    • Advantages:
      • Direct mapping to FPGA
      • Quick verification
    • Disadvantages:
      • Communication issues are not addressed
    [5] "Platform-Based Behavior-Level and System-Level Synthesis," IEEE International SOC Conference, Sept. 2006, pp. 199-202
  • 11. Previous Works - MFASE (3)
    • MFASE (Multiple Functions SoCs Analysis Environment)
    • Design flow
      • HW/SW partition
      • Architecture mapping
      • Communication analysis
      • …
    • Advantage
      • HW/SW co-design
    • Limitation
      • IP database
    [6] "MFASE: Multiple Functions SoCs Analysis Environment," VLSI Design/CAD Symposium, Taiwan, August 2007
  • 12. Summary
    • Previous works
      • Synthesis tool
        • SPARK and xPilot synthesize hardware C code into RTL code (VHDL for SPARK, Verilog for xPilot)
        • SPARK and xPilot do not consider communication issues
        • MFASE does not describe how the design is generated automatically
    • Thesis
        • Building an automation tool from functional level to transaction level for virtual bus-based platforms
          • Computation and communication issues
          • Automation tool from behavior level to transaction level
  • 14. Representation
    • Example: C to CDFG
    • Example for “if then else”
    • Example for “for loop”
  • 16. Design Flow Overview 1/2
  • 17. Design Flow Overview 2/2
    • Block Level
      • Methodology
        • Parallel
        • Cascade (Multi cycle)
      • Translation
        • State & Edge Reduction
        • STG to SystemC generator
    • Platform Level using Simple Bus
      • Approximate time simulation
    • Platform Level using CoWare
      • *.tcl generator
      • Peripheral generator
  • 19. Block Level
    • Input
      • Functional Level CDFG
      • Block Inside Configuration
        • Max parallel depth
        • Buffer Size
        • Boundary Case
      • Block to Bus Configuration
        • Max Burst size
        • Initial Address
        • Address offset
    • Output
      • TLM SystemC
  • 21. Block Level - Methodology 1/10
    • Computation Reduction
    • Parallel analysis
      • Step 1: Convert C to CDFG format
      • Step 2: Unroll the “for loop” to determine the cycle count
      • Step 3: Find a solution that fits the “for loop” condition
        • Under hardware constraints
        • GCD methodology
      • Step 4: Pick the closest solution that the hardware constraints allow
      • Step 5: Update the CDFG
    for (j = 0; j < 2; j++) {
      for (i = 3; i < 7; i++) {
        b[j][i] = (a[j][i] + a[j][i+1]) >> 1;
      }
    }
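A minimal sketch of Steps 2 to 4 for the loop nest on this slide. It assumes that the "GCD methodology" amounts to choosing a parallel depth that evenly divides the loop trip count while staying within the hardware limit; the slides do not spell out the exact rule. The names `trip_count`, `pick_parallel_depth`, and `max_depth`, and the limit of 3, are illustrative and do not come from the tool.

```python
from math import gcd


def trip_count(start, stop, step=1):
    """Number of iterations of `for (i = start; i < stop; i += step)`."""
    return max(0, -(-(stop - start) // step))  # ceiling division


def pick_parallel_depth(trips, max_depth):
    """Largest depth <= max_depth that divides `trips` evenly (assumed rule)."""
    best = 1
    for depth in range(1, max_depth + 1):
        if gcd(trips, depth) == depth:  # true exactly when depth divides trips
            best = depth
    return best


inner = trip_count(3, 7)  # i runs over 3, 4, 5, 6
outer = trip_count(0, 2)  # j runs over 0, 1
depth = pick_parallel_depth(inner, max_depth=3)  # hypothetical hardware limit
cycles = outer * (inner // depth)  # operator cycles after unrolling by `depth`
```

With a limit of 3 operators the sketch settles on a depth of 2, because 3 does not divide the inner trip count of 4.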
  • 22. Block Level – Methodology 2/10
    • Communication factors
      • We assume the arrays are located in external memory
      • How do we get data from external memory?
      • Bus transfer
        • Single
        • Burst
      • Buffer size requirement
      • The parallel depth and the data transfer size both influence performance and power
    [Figure: bus read and write transfers, showing burst and new transfers]
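The trade-off between single and burst transfers can be illustrated with a small sketch. It assumes a simplified model in which a run of consecutive words is split into bursts of at most `max_burst` words and the buffer must hold the largest burst; `bus_transactions` is a hypothetical helper, not part of the tool, and the model ignores address alignment and boundary cases.

```python
def bus_transactions(words, max_burst):
    """Split `words` consecutive words into bursts of at most `max_burst` words."""
    if words < 0 or max_burst < 1:
        raise ValueError("words must be >= 0 and max_burst >= 1")
    bursts = []
    remaining = words
    while remaining > 0:
        size = min(remaining, max_burst)
        bursts.append(size)
        remaining -= size
    return bursts


single = bus_transactions(8, max_burst=1)  # one word per bus access
burst4 = bus_transactions(8, max_burst=4)  # four words per bus access
buffer_needed = max(burst4)                # words the buffer must hold
```

Under this model, a larger burst lowers the number of bus accesses and raises the buffer size requirement.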
  • 23. Block Level - Methodology 3/10
    • Communication Reduction
  • 24. Block Level - Methodology 4/10
    • Case 1:
      • Parallel depth 2, operator 1 cycle
      • Irregularity: 1
      • Bus access times: Read: 4, Write: 4
      • Max buffer size usage: 3
    Legend: B(): burst size; T(): transaction number; R: read from bus; W: write to bus
  • 25. Block Level - Methodology 5/10
    • Case 2:
      • Parallel depth 2, operator 2 cycles
      • Irregularity: 1
      • Bus access times: Read: 2, Write: 2
      • Max buffer size usage: 5
    Legend: B(): burst size; T(): transaction number; R: read from bus; W: write to bus
  • 26. Block Level - Methodology 6/10
    • Case 3:
      • Parallel depth 2, operator 3 cycles
      • Irregularity: 2
      • Bus access times: Read: 3, Write: 3
      • Max buffer size usage: 8
    Legend: B(): burst size; T(): transaction number; R: read from bus; W: write to bus
  • 27. Block Level - Methodology 7/10
    • Boundary case
    • Limitation: high address relation
    • Relation with the Memory location
  • 28. Block Level - Methodology 8/10
    • Case 4:
      • Parallel depth 2, operator 4 cycles
      • Irregularity: 1
      • Bus access times: Read: 2, Write: 2
      • Max buffer size usage: 10
    Legend: B(): burst size; T(): transaction number; R: read from bus; W: write to bus
  • 29. Block Level - Methodology 9/10
    • Which case is best to implement?
    • Problems
      • Case 1
        • Single operator cycle
        • Bus access times
      • Case 3
        • Control is too complex
        • Does not consider the boundary case
      • Case 4
        • Buffer size
    • We choose “Case 2” to implement
      • Under the boundary case condition
      • Under the buffer size constraint
      • Bus access times
      • Regular
    Case   | Irregularity | Max buffer size | Read bus access times | Write bus access times | Boundary case
    Case 1 | 1            | 3               | 4                     | 4                      | O
    Case 2 | 1            | 5               | 2                     | 2                      | O
    Case 3 | 2            | 8               | 3                     | 3                      | X
    Case 4 | 1            | 10              | 2                     | 2                      | O
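The selection on this slide can be sketched as a filter followed by a ranking. The per-case numbers are the ones given on the slides; the `buffer_limit` of 8 and the ranking key (fewest bus accesses, then lowest irregularity, then smallest buffer) are assumptions made for illustration, since the slides list the criteria without giving an explicit cost function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    name: str
    irregularity: int
    max_buffer: int
    reads: int
    writes: int
    boundary_ok: bool  # "O" on the slide means the boundary case is handled


CASES = [
    Case("Case 1", 1, 3, 4, 4, True),
    Case("Case 2", 1, 5, 2, 2, True),
    Case("Case 3", 2, 8, 3, 3, False),
    Case("Case 4", 1, 10, 2, 2, True),
]


def choose(cases, buffer_limit):
    """Keep cases that handle the boundary case and fit the buffer, then rank."""
    feasible = [c for c in cases if c.boundary_ok and c.max_buffer <= buffer_limit]
    # Assumed ranking: bus accesses first, then irregularity, then buffer size.
    return min(feasible, key=lambda c: (c.reads + c.writes, c.irregularity, c.max_buffer))


best = choose(CASES, buffer_limit=8)  # hypothetical buffer constraint
```

Under these assumptions the sketch picks Case 2, matching the choice made on the slide. `min` raises `ValueError` if no case is feasible.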
  • 30. Block Level - Methodology 10/10
    • Under the conditions
      • Parallel depth
      • Boundary case
    • Analysis
      • Step 1: Trace states by operator cycles
      • Step 2: Separate the read and write parts, and find the period
      • Step 3: Estimate the cycles and hardware cost
      • Step 4: Find the best solution
    Legend: O(): operator cycles; B(): buffer size; R(): read counts; W(): write counts; S(): state sizes; Ir(): irregularity
    [Figure: analysis of case 1, case 2, case 3, and case 4]
  • 32. Translation 1/3
    • Example for CDFG to state transition graph (STG)
      • Fits to time steps
      • Easy to pass to the FSM generator
    [Figures: example for “if then else”; example for “for loop”]
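Since the figures for this slide are not in the transcript, here is a hedged sketch of what an STG for a “for loop” might look like. The four states (`INIT`, `CHECK`, `BODY`, `DONE`) and the one-state-per-time-step assumption are illustrative choices, not the tool's actual state encoding.

```python
def run_for_loop_stg(start, stop):
    """Step a four-state STG for `for (i = start; i < stop; i++)`.

    Returns the list of visited states; each entry is one assumed time step.
    """
    state, i, trace = "INIT", None, []
    while state != "DONE":
        trace.append(state)
        if state == "INIT":      # loop initialisation: i = start
            i, state = start, "CHECK"
        elif state == "CHECK":   # loop condition: i < stop
            state = "BODY" if i < stop else "DONE"
        elif state == "BODY":    # loop body followed by i++
            i, state = i + 1, "CHECK"
    trace.append("DONE")
    return trace


trace = run_for_loop_stg(3, 7)
body_steps = trace.count("BODY")
```

Because every edge is a plain state-to-state transition, a graph of this shape maps directly onto a `switch`-style FSM in the generated code.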
  • 33. Translation 2/3
    • Step 1
      • CDFG to STG
      • Un-rolling “for loop” condition
    • Step 2
      • Methodology
        • Reduce Computation
          • Parallel
        • Reduce Communication
          • Cascade
      • Architecture definition
    • Step 3
      • Translate to TLM SystemC
        • Header
        • Function
  • 34. Translation 3/3
    • Block Level
    • Interface
      • Block to Wrapper
      • Block to Block
    • Control
      • FSM
    • Data path
      • Operator assignment
      • Control signals
        • Block to Wrapper
        • Block to Data path
        • Block to Buffer
  • 36. Platform Level
    • Input :
      • Port mapping
      • Library location
      • CoWare setting
    • Output
      • *.tcl script for the CoWare-based platform
    • Communication Generator
      • System Control
      • Wrapper
      • Mux
      • PMU
      • Interrupt
  • 38. Develop Library for CoWare 1/3
    • Master Wrapper Generator
      • Based on the CoWare API for AMBA AHB
    • Advantage
      • Support any burst type
      • Burst Lock
    • Limitation
      • Buffer size
  • 39. Develop Library for CoWare 2/3
    • PMU Generator
    • Input: configuration
      • Block number: default 3
      • Idle cycles: default 1000
      • Wake-up cycles: default 1000
      • Policy: fixed time-out policy
    • Output : SystemC
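A minimal sketch of a fixed time-out policy, using the default values listed above. The slides name the policy and its defaults but not its exact semantics, so the model here is an assumption: a block stays awake for `idle_cycles` after its last activity, sleeps for the rest of the gap, and pays `wake_cycles` each time it is woken. `fixed_timeout` is a hypothetical helper, not the generated SystemC.

```python
def fixed_timeout(idle_gaps, idle_cycles=1000, wake_cycles=1000):
    """Return (sleep_cycles, wakeups, wake_overhead) for a list of idle gaps.

    Each entry of `idle_gaps` is the number of cycles between two bursts of
    activity of one block.
    """
    sleep_cycles = 0
    wakeups = 0
    for gap in idle_gaps:
        if gap > idle_cycles:               # time-out expired: the block sleeps
            sleep_cycles += gap - idle_cycles
            wakeups += 1                    # it must be woken for the next burst
    return sleep_cycles, wakeups, wakeups * wake_cycles


sleep, wakeups, overhead = fixed_timeout([500, 1500, 4000])
```

Gaps shorter than the time-out never put the block to sleep, so a short time-out saves more energy at the cost of more wake-up overhead.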
  • 40. Develop Library for CoWare 3/3
    • Known parameters
      • Total simulation time
      • Operation frequency
      • Active duration
      • Total active number
    • Power Calculation
      • ACT Energy = Number of active counts × Active power per unit time
      • Idle Energy = Number of idle counts × Idle power per unit time
      • Total energy = ACT Energy + Idle Energy
      • Power = (ACT Energy + Idle Energy) / total time
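The power calculation on this slide can be written out as a short sketch. It assumes that the active and idle energies are each a count multiplied by the corresponding power per unit time, which is how the figure labels appear to pair up. The function name and the input values are made up for illustration and are not measurements from the thesis.

```python
def average_power(active_counts, idle_counts, active_power, idle_power, total_time):
    """Return (total_energy, average_power) from counts and per-unit-time powers."""
    act_energy = active_counts * active_power  # energy spent while active
    idle_energy = idle_counts * idle_power     # energy spent while idle
    total_energy = act_energy + idle_energy
    return total_energy, total_energy / total_time


# Hypothetical numbers: 300 active and 700 idle time units out of 1000.
energy, power = average_power(
    active_counts=300, idle_counts=700,
    active_power=2.0, idle_power=0.5,
    total_time=1000,
)
```

The result is in whatever units the per-unit-time powers use, so the inputs must share the same time base.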
  • 42. System Control Generator
    • TOP Control Generator
      • Input
        • Block scheduling
        • Block numbers
        • Type setting
          • Parallel
          • Pipeline
          • Single (Default)
      • Output
        • SystemC
  • 44. CoWare - Scalar
    • Sequences: Foreman, Football (30 frames)
  • 45. Simple Bus Environment - Scalar
    [Figure: SystemC 2.1 simple bus, with read transfer and write transfer]
  • 46. CoWare Environment - Scalar
    • Top platform for the scalar application
    [Figure: Step 1, Step 2, Step 3, Step 4, Step 5]
  • 47. Experiments – Scalar
    • Performance with approximate time and cycle time
    • Scalar performance and state size on a cycle-time basis
    Scalar                    | Y part | Cb part | Cr part
    Approximate time (cycles) | 239761 | 91775   | 91775
    Cycle time (cycles)       | 325296 | 100638  | 100638
    Scalar Y part, parallel constraint 4
    Case            | Max cascade | ST size | Bus access | Computation cycles | Communication cycles | Code lines
    Original C code | 0           | 0       | 0          | 126720             | 0                    | 23
    Case 2          | 4           | 78      | 9916       | 31680              | 115118               | 1403
    Case 3          | 11          | 81      | 1724       | 31680              | 33388                | 1668
  • 48. Experiments – Power Monitor
    • Power Library
    • Method
      • Search the Look up table
      • Block -> Module
        • FSM switch
        • InBuffer
        • OutBuffer
        • Register
      • Block ->Data Path
        • Operator
    Data path
    Operator | Size | Active power | Idle power
    ADD      | 8    | 1.0444 mW    | 23.4124 nW
    SUB      | 8    | 808.2718 uW  | 21.3216 nW
    DIV      | 8    | 4.0100 mW    | 67.5333 nW
    SHR      | 8    | 425.8246 uW  | 9.9244 nW
    Module
    Module   | Width | Power     | Idle power
    Buffer   | 32    | 1.7346 mW | 66 nW
    Register | 32    | 1.7346 mW | 66 nW
    FSM      | 6     | 0.418 mW  | 12 nW
  • 49. Experiments - Scalar
    • Scalar176*144 Power saving
    Case      | PMU      | Active cycle | Wake up cycle | Sleep cycle | Power (mW)
    Scalar Y  | NO PMU   | 526572       | X             | X           | 22584673.08
    Scalar Y  | WITH PMU | 326296       | 1000          | 199276      | 14038522.54
    Scalar Cb | NO PMU   | 526572       | X             | X           | 11000089.08
    Scalar Cb | WITH PMU | 101638       | 1000          | 423934      | 2124065.68
    Scalar Cr | NO PMU   | 526572       | X             | X           | 11000089.08
    Scalar Cr | WITH PMU | 101638       | 1000          | 423934      | 2124065.68
    Total: 44584851.24 mW with no PMU, 18286653.9 mW with PMU
    Scalar power saving rate: 58.98%
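The saving rate can be checked from the per-component figures on this slide. The sketch assumes the rate is defined as (power without PMU minus power with PMU) divided by power without PMU; the slide does not state the formula, but this definition reproduces the reported totals and the 58.98% figure. The variable names are illustrative.

```python
# Power figures (labelled "mw" on the slide) for the Y, Cb, and Cr parts.
with_pmu = {"Y": 14038522.54, "Cb": 2124065.68, "Cr": 2124065.68}
no_pmu = {"Y": 22584673.08, "Cb": 11000089.08, "Cr": 11000089.08}


def saving_rate(without, with_):
    """Fraction of power saved relative to the run without a PMU."""
    return (without - with_) / without


total_with = sum(with_pmu.values())
total_without = sum(no_pmu.values())
rate_percent = round(100 * saving_rate(total_without, total_with), 2)
```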
  • 50. Experiments - DWT
    • DWT && IDWT
    [Figure: DWT and IDWT]
  • 51. Experiments - DWT
    • Top Platform for DWT application
    [Figure: Step 1, Step 2, Step 3, Step 4]
  • 52. Experiments - DWT
    • Performance with approximate time and cycle time
    • DWT performance and state size on a cycle-time basis
    DWT                       | Value
    Approximate time (cycles) | 11088
    Cycle time (cycles)       | 76262
    DWT, parallel constraint 1
    Case            | Max cascade | ST size | Bus access | Computation cycles | Communication cycles | Code lines
    Original C code | 0           | 0       | 0          | 1584               | 0                    | 46
    Case 1          | 1           | 42      | 9504       | 1584               | 74678                | 8630
  • 53. Experiments - DWT
    • DWT 44*36 Power saving
    Case | PMU      | Active cycle | Interrupt cycle | Sleep cycle | Power (mW)
    DWT  | NO PMU   | 145962       | X               | X           | 4066501.32
    DWT  | WITH PMU | 76362        | 1000            | 68600       | 2155442.52
    IDWT | NO PMU   | 145962       | X               | X           | 4415350.53
    IDWT | WITH PMU | 69600        | 1000            | 75362       | 2105550.72
    Total: 8481851.85 mW with no PMU, 4260993.24 mW with PMU
    DWT power saving rate: 49.765%
  • 55. Conclusions
    • We developed an automation tool that goes from behavior-level CDFG to TLM-level SystemC for virtual bus-based platform design
    • We also incorporated methods that reduce bus access times during architecture-level profiling of the system design
    • We developed libraries for virtual bus-based platforms
    • The architecture can be explored quickly, which reduces verification time
  • 56. Future Works
    • Model each module's power using equations so that more accurate power management can be carried out
    • Add a test platform to the tool so that the corresponding test circuitry can be generated automatically
    • Include more hardware architectures in the hardware library so that designers have more design options to choose from
  • 57. References
    • [1] S. S. Pasricha, N. Dutt, and M. Ben-Romdhane, "Using TLM for exploring bus-based SoC communication architectures," 16th IEEE International Conference on Application-Specific Systems, Architecture Processors, 2005, pp. 79-85, 2005
    • [2] C. Lennard and D. Mista, "Taking Design to the System Level," 2006 [Online]. Available: http://www.arm.com/pdfs/ARM_ESL_20_3_JC.pdf
    • [3] SPARK Methodology, http://mesl.ucsd.edu/spark/methodology.shtml
    • [4] S. Gupta, S. Gupta, N. Dutt, R. Gupta, and A. Nicolau, "SPARK: a high-level synthesis framework for applying parallelizing compiler transformations," Proceedings of 16th International Conference on VLSI Design, 2003, pp. 461-466, 2003
    • [5] J. Cong, F. Yiping, H. Guoling, J. Wei, and Z. Zhiru, "Platform-Based Behavior-Level and System-Level Synthesis," in IEEE International SOC Conference, 2006, pp. 199-202, 2006
    • [6] Ya-Shu Chen, Shih-Chun Chou, Chi-Sheng Shih and Tei-Wei Kuo, "MFASE: Multiple Functions SoCs Analysis Environment," in the VLSI Design/CAD Symposium, Taiwan, August 2007, 2007
  • 58. Thank you