    Defense Presentation Transcript

    • Design Automation Tool from Behavior Level to Transaction Level for Virtual Bus-Based Platforms
      • Advisor: Lih-Yih Chiou
      • Student: Hi-Ho Chen
      • 23 June 2008
    • Outline
      • Motivation and Contributions
      • Previous Works
      • Proposed Design Automation Tool from Behavior Level to Transaction Level for Virtual Bus-Based Platforms
        • Representation
        • Design Flow Overview
        • Block Level
          • Methodology
          • Translation
        • Platform Level
          • Develop Library for CoWare
          • System Control Generator
      • Experiments
        • Scalar 176*144
        • DWT 44*36
      • Conclusions and Future works
      • References
    • Introduction
      • In the SoC era, more and more IPs are integrated onto a single chip
      • ESL (Electronic System Level) design lets designers quickly simulate the system's functional behavior at a higher level before hardware implementation
      • Communication design has become an important criterion in SoC design
    • Top-down Design Flow
      [1] S. S. Pasricha, N. Dutt, and M. Ben-Romdhane, "Using TLM for exploring bus-based SoC communication architectures," 16th IEEE International Conference on Application-Specific Systems, Architecture Processors, 2005, pp. 79-85
    • Arbitration Level vs. Simulation Speed
      [2] C. Lennard and D. Mista, "Taking Design to the System Level," 2006 [Online]. Available: http://www.arm.com/pdfs/ARM_ESL_20_3_JC.pdf
    • High Level Synthesis
      • Behavior Synthesis
      • Separate the Control and Data path from the behavior description
        • Control
          • If then else
          • Switch case
        • Data Path
          • Data flow
      [3] SPARK Methodology, http://mesl.ucsd.edu/spark/methodology.shtml
    • Contributions
        • Rapid system exploration
          • Fast exploration of multiple micro-architecture alternatives
        • Shorter verification/simulation cycle
          • Speed-up from behavior level to transaction level
        • Quickly obtain the power and performance information
          • Earlier estimation of design specifications
        • Increase the performance
          • Reduce the communication & computation
    • Previous Works - SPARK (1)
      • Input: C
      • Output: C, VHDL
      • Advantages:
        • Defines a new synthesis tool for parallel design
      • Disadvantages:
        • No platform architecture
        • Communication issues are not considered
      [4] "SPARK: A High-Level Synthesis Framework for Applying Parallelizing Compiler Transformations," Proceedings of the 16th International Conference on VLSI Design, 4-8 Jan. 2003, pp. 461-466
    • Previous Works - xPilot (2)
      • Input: C/SystemC
      • Output: Verilog/SystemC
      • Method
        • Phase 1
          • SSDM
        • Phase 2
          • Synthesis
      • Advantages:
        • Direct mapping to FPGA
        • Quick verification
      • Disadvantages:
        • Communication issues are not considered
      [5] "Platform-Based Behavior-Level and System-Level Synthesis," IEEE International SOC Conference, Sept. 2006, pp. 199-202
    • Previous Works - MFASE (3)
      • MFASE (Multiple Functions SoCs Analysis Environment)
      • Design flow
        • HW/SW partition
        • Architecture mapping
        • Communication analysis
        • …
      • Advantage
        • HW/SW co-design
      • Limitation
        • IP database
      [6] "MFASE: Multiple Functions SoCs Analysis Environment," VLSI Design/CAD Symposium, Taiwan, August 2007
    • Summary
      • Previous works
        • Synthesis tools
          • SPARK and xPilot synthesize hardware C code into RTL code (VHDL or Verilog)
          • SPARK and xPilot do not consider communication issues
          • MFASE does not describe how the design is generated automatically
      • Thesis
          • Build an automation tool from functional level to transaction level for virtual bus-based platforms
            • Addresses both computation and communication issues
            • Automates the path from behavior level to transaction level
    • Representation
      • Example: C to CDFG
      • Example for “if then else”
      • Example for “for loop”
    • Design Flow Overview 1/2
    • Design Flow Overview 2/2
      • Block Level
        • Methodology
          • Parallel
          • Cascade (Multi cycle)
        • Translation
          • State & Edge Reduction
          • STG to SystemC generator
      • Platform Level using Simple Bus
        • Approximate time simulation
      • Platform Level using CoWare
        • *.tcl generator
        • Peripheral generator
    • Block Level
      • Input
        • Functional Level CDFG
        • Block Internal Configuration
          • Max parallel depth
          • Buffer Size
          • Boundary Case
        • Block to Bus Configuration
          • Max Burst size
          • Initial Address
          • Address offset
      • Output
        • TLM SystemC
    • Block Level - Methodology 1/10
      • Computation Reduction
      • Parallel analysis
        • Step 1: Convert C to CDFG format
        • Step 2: Unroll the “for loop” to determine the cycle count
        • Step 3: Find a solution that fits the “for loop” condition
          • Under the hardware constraint
          • GCD methodology
        • Step 4: Find the closest solution that satisfies the hardware constraint
        • Step 5: Update the CDFG
      for (j = 0; j < 2; j++) {
        for (i = 3; i < 7; i++) {
          b[j][i] = (a[j][i] + a[j][i+1]) >> 1;
        }
      }
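The parallel analysis steps can be sketched in a few lines. This is only an illustration, not the tool's algorithm: it assumes that the "GCD methodology" amounts to choosing an unroll factor that divides the loop trip count evenly, and `max_parallel_depth` is a hypothetical name for the hardware constraint.

```python
def trip_count(start, stop, step=1):
    """Number of iterations of `for (i = start; i < stop; i += step)`."""
    return max(0, (stop - start + step - 1) // step)


def pick_unroll_factor(trips, max_parallel_depth):
    """Largest factor <= max_parallel_depth that divides `trips` evenly.

    Assumption: an even divisor is preferred so that every unrolled
    group is full and no leftover iterations need special control.
    """
    for factor in range(min(trips, max_parallel_depth), 0, -1):
        if trips % factor == 0:
            return factor
    return 1


# Loop nest from the slide: j in [0, 2), i in [3, 7)
outer = trip_count(0, 2)
inner = trip_count(3, 7)
total_ops = outer * inner            # one add-and-shift per iteration
factor = pick_unroll_factor(inner, max_parallel_depth=2)
cycles = outer * (inner // factor)   # cycles if each group takes one cycle
```

With a parallel depth of 2, the eight add-and-shift operations of the example fold into four groups, which is the configuration the case studies on the following slides start from.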
    • Block Level – Methodology 2/10
      • Communication factors
        • We assume the arrays are located in external memory
        • How do we get data from external memory?
        • Bus transfer
          • Single
          • Burst
        • Buffer size requirement
        • The parallel depth and the data transfer size influence performance and power
      [Figure labels: Burst, New Transfer, Read, Write]
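The difference between single and burst transfers can be made concrete by counting bus transactions. The sketch below is a simplified model under stated assumptions: every word costs one transaction in single mode, a burst carries up to `max_burst_size` words, and `bus_transactions` is a hypothetical helper, not part of the tool.

```python
def bus_transactions(words, max_burst_size=1):
    """Transactions needed to move `words` data words over the bus.

    max_burst_size=1 models single transfers; larger values model
    burst transfers that carry several words per transaction.
    """
    if words < 0 or max_burst_size < 1:
        raise ValueError("words must be >= 0 and max_burst_size >= 1")
    return -(-words // max_burst_size)   # ceiling division


# One row of the example loop reads a[j][3] .. a[j][7], i.e. 5 words.
single = bus_transactions(5)
burst = bus_transactions(5, max_burst_size=4)
```

Fewer transactions come at the cost of a larger buffer, since a burst of N words needs room for N words on the block side; this is the trade-off the following cases explore.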
    • Block Level - Methodology 3/10
      • Communication Reduction
    • Block Level - Methodology 4/10
      • Case 1:
        • Parallel depth 2, operator 1 cycle
        • Irregularity: 1
        • Bus accesses: Read 4, Write 4
        • Max buffer size usage: 3
      Legend: B(): burst size; T(): transaction number; R: read from bus; W: write to bus
    • Block Level - Methodology 5/10
      • Case 2 :
        • Parallel depth 2, operator 2 cycles
        • Irregularity: 1
        • Bus accesses: Read 2, Write 2
        • Max buffer size usage: 5
      Legend: B(): burst size; T(): transaction number; R: read from bus; W: write to bus
    • Block Level - Methodology 6/10
      • Case 3:
        • Parallel depth 2, operator 3 cycles
        • Irregularity: 2
        • Bus accesses: Read 3, Write 3
        • Max buffer size usage: 8
      Legend: B(): burst size; T(): transaction number; R: read from bus; W: write to bus
    • Block Level - Methodology 7/10
      • Boundary case
      • Limitation: high address relation
      • Relation with the Memory location
    • Block Level - Methodology 8/10
      • Case 4:
        • Parallel depth 2, operator 4 cycles
        • Irregularity: 1
        • Bus accesses: Read 2, Write 2
        • Max buffer size usage: 10
      Legend: B(): burst size; T(): transaction number; R: read from bus; W: write to bus
    • Block Level - Methodology 9/10
      • Which case is better to implement?
      • Problems
        • Case 1
          • Single operator cycle
          • Bus access count
        • Case 3
          • Control is too complex
          • Does not consider the boundary case
        • Case 4
          • Buffer size
      • We choose “Case 2” to implement
        • Handles the boundary case
        • Meets the buffer size constraint
        • Low bus access count
        • Regular
      Case    Irregularity  Max buffer size  Read bus accesses  Write bus accesses  Boundary case
      Case 1  1             3                4                  4                   O
      Case 2  1             5                2                  2                   O
      Case 3  2             8                3                  3                   X
      Case 4  1             10               2                  2                   O
      (O: boundary case handled; X: not handled)
    • Block Level - Methodology 10/10
      • Under the conditions
        • Parallel depth
        • Boundary case
      • Analysis
        • Step 1: Trace the states by operator cycles
        • Step 2: Separate the read and write parts and find the period
        • Step 3: Estimate the cycles and hardware cost
        • Step 4: Find the best solution
      Legend: O(): operator cycles; B(): buffer size; R(): read count; W(): write count; S(): state size; Ir(): irregularity
      [Figure: analysis of case 1, case 2, case 3 and case 4]
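The final selection step can be illustrated with the figures reported for the four cases on the preceding slides. The ranking rule here (fewest bus accesses, then lowest irregularity, then smallest buffer) and the `buffer_limit` value are assumptions made for this sketch; the slides do not state the exact cost function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    name: str
    operator_cycles: int
    irregularity: int
    max_buffer: int
    reads: int
    writes: int
    handles_boundary: bool   # read from the O/X row of the comparison


# Values as reported on the Methodology slides for cases 1 to 4.
CASES = [
    Case("Case 1", 1, 1, 3, 4, 4, True),
    Case("Case 2", 2, 1, 5, 2, 2, True),
    Case("Case 3", 3, 2, 8, 3, 3, False),
    Case("Case 4", 4, 1, 10, 2, 2, True),
]


def choose_case(cases, buffer_limit, need_boundary=True):
    """Pick the feasible case with the fewest bus accesses (assumed rule)."""
    feasible = [
        c for c in cases
        if c.max_buffer <= buffer_limit
        and (c.handles_boundary or not need_boundary)
    ]
    if not feasible:
        return None
    return min(
        feasible,
        key=lambda c: (c.reads + c.writes, c.irregularity, c.max_buffer),
    )


best = choose_case(CASES, buffer_limit=8)   # hypothetical buffer limit
```

Under these assumptions the sketch arrives at Case 2, the same choice the slides make, because Case 3 is excluded by the boundary case, Case 4 by the buffer limit, and Case 1 needs twice as many bus accesses.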
    • Translation 1/3
      • Example of CDFG to state transition graph (STG)
        • Fits to time steps
        • Easy to pass to the FSM generator
      [Figures: example for “if then else”; example for “for loop”]
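To show how a “for loop” maps onto states that fit time steps, here is a small hand-written state machine. The four state names (INIT, CHECK, BODY, STEP) and the one-state-per-time-step costing are assumptions of this sketch, not the tool's actual STG encoding.

```python
def run_for_loop_stg(start, stop, body):
    """Walk a four-state STG for `for (i = start; i < stop; i++) body(i);`.

    States: INIT -> CHECK -> BODY -> STEP -> CHECK ... -> DONE.
    Each visited state is counted as one time step (an assumption).
    """
    state, i, steps = "INIT", start, 0
    while state != "DONE":
        steps += 1
        if state == "INIT":
            i, state = start, "CHECK"
        elif state == "CHECK":      # control node: loop condition
            state = "BODY" if i < stop else "DONE"
        elif state == "BODY":       # data-path node: loop body
            body(i)
            state = "STEP"
        else:                       # STEP: increment the loop index
            i, state = i + 1, "CHECK"
    return steps


visited = []
steps = run_for_loop_stg(3, 7, visited.append)
```

The inner loop of the running example visits indices 3 to 6. Merging states such as BODY and STEP would lower the step count, which is the kind of saving state and edge reduction aims for.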
    • Translation 2/3
      • Step 1
        • CDFG to STG
        • Unroll the “for loop”
      • Step 2
        • Methodology
          • Reduce Computation
            • Parallel
          • Reduce Communication
            • Cascade
        • Architecture definition
      • Step 3
        • Translate to TLM SystemC
          • Header
          • Function
    • Translation 3/3
      • Block Level
      • Interface
        • Block to Wrapper
        • Block to Block
      • Control
        • FSM
      • Data path
        • Operator assignment
        • Control signals
          • Block to Wrapper
          • Block to Data path
          • Block to Buffer
    • Platform Level
      • Input:
        • Port mapping
        • Library location
        • CoWare setting
      • Output:
        • *.tcl script for the CoWare-based platform
      • Communication Generator
        • System Control
        • Wrapper
        • Mux
        • PMU
        • Interrupt
    • Develop Library for CoWare 1/3
      • Master Wrapper Generator
        • Based on the CoWare API for AMBA AHB
      • Advantage
        • Support any burst type
        • Burst Lock
      • Limitation
        • Buffer size
    • Develop Library for CoWare 2/3
      • PMU Generator
      • Input: configuration
        • Block Num: default 3
        • Idle cycle: default 1000
        • Wake-up cycle: default 1000
        • Policy: fixed-timeout policy
      • Output: SystemC
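A fixed-timeout policy can be sketched as a per-cycle state labelling. This is a simplified model: it assumes a block goes to sleep once it has been idle for more than `idle_timeout` consecutive cycles, and it ignores the wake-up latency that the generated PMU also models. The function name is hypothetical.

```python
def fixed_timeout_states(activity, idle_timeout=1000):
    """Label each cycle ACTIVE, IDLE or SLEEP under a fixed-timeout policy.

    activity: iterable of booleans, True when the block has work.
    Wake-up latency is not modelled in this sketch.
    """
    states, idle_run = [], 0
    for busy in activity:
        if busy:
            idle_run = 0
            states.append("ACTIVE")
        else:
            idle_run += 1
            states.append("SLEEP" if idle_run > idle_timeout else "IDLE")
    return states


# Small trace with a timeout of 3 cycles instead of the default 1000.
trace = [True, True] + [False] * 5 + [True]
states = fixed_timeout_states(trace, idle_timeout=3)
```

In this trace the block stays idle for three cycles, sleeps for two and then becomes active again.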
    • Develop Library for CoWare 3/3
      • Known parameters
        • Total simulation time
        • Operation frequency
        • Active duration
        • Total active number
      • Power Calculation
        • ACT Energy = (number of active counts) × (active power per unit time)
        • Idle Energy = (number of idle counts) × (idle power per unit time)
        • Total energy = ACT Energy + Idle Energy
        • Power = (ACT Energy + Idle Energy) / total time
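The power calculation can be written as a short function. It assumes, as the slide's labels suggest, that each energy term is a count multiplied by the matching power-per-unit-time figure; the function name and the 300/700 split in the example are hypothetical, while the ADD figures are taken from the power library slide.

```python
def average_power(active_counts, idle_counts,
                  active_power, idle_power, total_time):
    """Average power from the energy split described on the slide.

    Assumption: each energy term is a count multiplied by the matching
    power-per-unit-time figure from the power library.
    """
    act_energy = active_counts * active_power
    idle_energy = idle_counts * idle_power
    total_energy = act_energy + idle_energy
    return total_energy / total_time


# Hypothetical run: an 8-bit ADD (1.0444 mW active, 23.4124 nW idle)
# that is active for 300 of 1000 time units.
p_add = average_power(300, 700, 1.0444e-3, 23.4124e-9, 1000)
```

Because idle power is several orders of magnitude below active power, the result is dominated by the active share, which is why putting idle blocks to sleep pays off mainly for blocks with long idle periods.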
    • System Control Generator
      • TOP Control Generator
        • Input
          • Block scheduling
          • Block numbers
          • Type setting
            • Parallel
            • Pipeline
            • Single (Default)
        • Output
          • SystemC
    • CoWare - Scalar
      • Sequences: Foreman, Football (30 frames)
    • Simple Bus Environment - Scalar
      • SystemC 2.1 simple bus
      [Figure labels: Read Transfer, Write Transfer]
    • CoWare Environment - Scalar
      • Top platform for the scalar application
      [Figure: Step 1 to Step 5]
    • Experiments – Scalar
      • Performance in approximate time and cycle time
      • Scalar performance and state size on a cycle-time basis
      Scalar (cycles)   Y part  Cb part  Cr part
      Approximate time  239761  91775    91775
      Cycle time        325296  100638   100638

      Scalar Y part, parallel constraint 4
      Case             Max cascade  ST size  Bus access  Computation cycles  Communication cycles  Code lines
      Original C code  0            0        0           126720              0                     23
      Case 2           4            78       9916        31680               115118                1403
      Case 3           11           81       1724        31680               33388                 1668
    • Experiments – Power Monitor
      • Power Library
      • Method
        • Search the lookup table
        • Block -> Module
          • FSM switch
          • InBuffer
          • OutBuffer
          • Register
        • Block -> Data Path
          • Operator
      Data path  Size  Active power  Idle power
      ADD        8     1.0444 mW     23.4124 nW
      SUB        8     808.2718 uW   21.3216 nW
      DIV        8     4.0100 mW     67.5333 nW
      SHR        8     425.8246 uW   9.9244 nW

      Module    Width  Active power  Idle power
      FSM       6      0.418 mW      12 nW
      Register  32     1.7346 mW     66 nW
      Buffer    32     1.7346 mW     66 nW
    • Experiments - Scalar
      • Scalar 176*144 power saving
      Case       PMU       Active cycle  Wake-up cycle  Sleep cycle  Power (mW)
      Scalar Y   No PMU    526572        X              X            22584673.08
      Scalar Y   With PMU  326296        1000           199276       14038522.54
      Scalar Cb  No PMU    526572        X              X            11000089.08
      Scalar Cb  With PMU  101638        1000           423934       2124065.68
      Scalar Cr  No PMU    526572        X              X            11000089.08
      Scalar Cr  With PMU  101638        1000           423934       2124065.68

      Total power: 44584851.24 mW without PMU; 18286653.9 mW with PMU
      Scalar power saving rate: 58.98%
    • Experiments - DWT
      • DWT and IDWT
      [Figure labels: DWT, IDWT]
    • Experiments - DWT
      • Top platform for the DWT application
      [Figure: Step 1 to Step 4]
    • Experiments - DWT
      • Performance in approximate time and cycle time
      • DWT performance and state size on a cycle-time basis
      DWT (cycles)      DWT
      Approximate time  11088
      Cycle time        76262

      DWT, parallel constraint 1
      Case             Max cascade  ST size  Bus access  Computation cycles  Communication cycles  Code lines
      Original C code  0            0        0           1584                0                     46
      Case 1           1            42       9504        1584                74678                 8630
    • Experiments - DWT
      • DWT 44*36 power saving
      Case  PMU       Active cycle  Interrupt cycle  Sleep cycle  Power (mW)
      DWT   No PMU    145962        X                X            4066501.32
      DWT   With PMU  76362         1000             68600        2155442.52
      IDWT  No PMU    145962        X                X            4415350.53
      IDWT  With PMU  69600         1000             75362        2105550.72

      Total power: 8481851.85 mW without PMU; 4260993.24 mW with PMU
      DWT power saving rate: 49.765%
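The reported saving rates follow from the total power figures with and without the PMU. The helper below is a plain percentage calculation and `saving_rate` is a hypothetical name; it reproduces the scalar rate of 58.98% and gives about 49.76% for DWT, within rounding of the 49.765% shown on the slide.

```python
def saving_rate(power_without_pmu, power_with_pmu):
    """Percentage of power saved by enabling the PMU."""
    return 100.0 * (power_without_pmu - power_with_pmu) / power_without_pmu


# Totals as reported on the scalar and DWT power saving slides.
scalar_rate = saving_rate(44584851.24, 18286653.9)
dwt_rate = saving_rate(8481851.85, 4260993.24)
```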
    • Conclusions
      • We developed an automation tool from behavior-level CDFG to TLM-level SystemC for virtual bus-based platform design
      • We incorporated methods that reduce the number of bus accesses during architecture-level profiling
      • We developed libraries for virtual bus-based platforms
      • The architecture can be explored quickly, which reduces verification time
    • Future Works
      • Model each module’s power using equations so that more accurate power management can be carried out
      • Add a test platform to the tool so that the corresponding test circuitry can be generated automatically
      • Include more hardware architectures in the hardware library so that designers have more design options to choose from
    • References
      • [1] S. S. Pasricha, N. Dutt, and M. Ben-Romdhane, "Using TLM for exploring bus-based SoC communication architectures," 16th IEEE International Conference on Application-Specific Systems, Architecture Processors, 2005, pp. 79-85
      • [2] C. Lennard and D. Mista, "Taking Design to the System Level," 2006 [Online]. Available: http://www.arm.com/pdfs/ARM_ESL_20_3_JC.pdf
      • [3] SPARK Methodology, http://mesl.ucsd.edu/spark/methodology.shtml
      • [4] S. Gupta, S. Gupta, N. Dutt, R. Gupta, and A. Nicolau, "SPARK: a high-level synthesis framework for applying parallelizing compiler transformations," Proceedings of the 16th International Conference on VLSI Design, 2003, pp. 461-466
      • [5] J. Cong, F. Yiping, H. Guoling, J. Wei, and Z. Zhiru, "Platform-Based Behavior-Level and System-Level Synthesis," IEEE International SOC Conference, 2006, pp. 199-202
      • [6] Ya-Shu Chen, Shih-Chun Chou, Chi-Sheng Shih, and Tei-Wei Kuo, "MFASE: Multiple Functions SoCs Analysis Environment," VLSI Design/CAD Symposium, Taiwan, August 2007
    • Thank you