Software simulation at Register Transfer Level (e.g. ModelSim). Use of an embedded logic analyzer (e.g. ChipScope). Modelling, automated testing, code generation…
The presentation is organized as follows. In the first part, I will explain the framework for modeling the execution platform and the way an application is specified. We will also see how the application and the platform are simulated. In the second part, I will detail the tools and the software approach we used for designing and validating the application. The third part will give simulation results for the system and the application. Finally, I will conclude and give some perspectives.
A reconfigurable architecture is a circuit whose functionality is determined by a configuration loaded into the circuit's internal memories. Because they are reconfigurable, these architectures approach the flexibility of software while offering high performance close to that of dedicated circuits. This type of architecture is therefore a trade-off between flexibility and performance. They are used in domains such as telecommunications, prototyping of dedicated circuits… and are well suited to the evolving needs of embedded systems.
The diagram shows a generic reconfigurable architecture and its two main resources. The computing resources can implement any function in a look-up table (LUT). The routing resources connect the cells to one another to carry out the specified computation.
Advantage of a Smalltalk specification: it is executable and easy to debug in the environment.
Flow + screenshots? MADEO-FET flow: specification in C (dataflow), logic synthesis, floorplan, place and route, definition of the IOBs, bitstream extraction.
The talk is organized around the global flow of the methodology, which is implemented in a framework.
1 - At the highest level, the application is specified in Smalltalk, then refined into a high-level intermediate representation called CDFG, and then into a low-level representation at RTL. The automated transformations applied to CDFGs are represented by the blue ellipses.
2 - The application is simulated on a model of the target platform. This model is described with an object framework defining components and communication links.
3 - The application can be simulated from the functional specification in Smalltalk down to RTL. This simulation is integrated into the system-level simulation to form a global simulation. It produces Gantt and interaction diagrams that give information on the system behavior, as well as the signal waveforms generated by the application at RTL.
4 - Following the extreme programming methodology, debugging and testing steps are iterated during application development. These iterations depend on the debugging done at the different levels and on the simulation results.
5 - Once the application is validated, the low-level intermediate representation is the input to the synthesis tool. For now we target the M2000 FPGA, but synthesis is outside the scope of this talk. The test and debugging steps are also applied to the synthesis results.
Here is an example of a CDFG. The CDFG model defines all the elements needed to describe concurrent applications as communicating processes. There are two kinds of nodes: hierarchical and atomic. Hierarchical nodes can contain both atomic and hierarchical nodes, and come in four main types. Structuring nodes hold information about the application structure; for example, they can represent function calls or processes. Sequencing nodes describe the scheduling of their sub-operators, which can be executed in parallel or sequentially. Iteration nodes correspond to loops with fixed indices or to conditional loops. Conditional nodes represent if-then-else or switch-case control instructions. Atomic nodes represent classical computing operators. They can also correspond to constants used by the program or to memory access operations. Communication between processes is performed by send and receive operations, which are represented by atomic nodes.
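The node taxonomy above can be pictured as a Composite. The following is only a minimal Python sketch of that idea, not the framework's Smalltalk classes: the names `Node`, `AtomicNode`, `HierarchicalNode`, `count_atomic` and `build_example` are hypothetical, and the `kind` strings simply reuse the categories listed in the notes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Base class for every CDFG node (hypothetical name)."""
    name: str

    def count_atomic(self) -> int:
        raise NotImplementedError


@dataclass
class AtomicNode(Node):
    """Leaf: computing operator, constant, memory access, send or receive."""
    kind: str = "operator"

    def count_atomic(self) -> int:
        return 1


@dataclass
class HierarchicalNode(Node):
    """Composite: structuring, sequencing, iteration or conditional node."""
    kind: str = "sequencing"
    sub_operators: List[Node] = field(default_factory=list)

    def count_atomic(self) -> int:
        # Recurse through the hierarchy, as a Composite traversal would.
        return sum(child.count_atomic() for child in self.sub_operators)


def build_example() -> HierarchicalNode:
    """A process holding a constant and a loop around receive/add/send."""
    body = HierarchicalNode("body", "sequencing", [
        AtomicNode("recv", "receive"),
        AtomicNode("add", "operator"),
        AtomicNode("send", "send"),
    ])
    loop = HierarchicalNode("loop", "iteration", [body])
    return HierarchicalNode("process", "structuring",
                            [AtomicNode("c0", "constant"), loop])
```

Any traversal (simulation, checking, code generation) can be written the same way as `count_atomic`: atomic nodes answer directly and hierarchical nodes delegate to their sub-operators.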
The low-level CDFG model is an extension of the high-level model. It is produced by mapping a high-level CDFG to a reconfigurable architecture. It defines additional constructs specific to RTL, such as registers, primitive operators from libraries, nodes holding finite state machine descriptions in KISS format, and nodes holding logic in BLIF format. This CDFG is the input to the synthesis tools that produce an EDIF netlist. It is also used by the RTL simulator for debugging and for generating signal waveforms.
Design pattern: Composite. This feature is very useful when the same application is simulated at different abstraction levels. It makes it possible to analyze both the variables and their corresponding signals.
We have seen the different specification levels of the application, and in particular the CDFG model, which is used for multi-level simulation. The simulation of the application starts from a behavioural description written as Smalltalk processes communicating through channels. At this level, probes from the system-level simulator API are inserted in the code to produce events in the simulator and generate a Gantt diagram of the process activities. This gives a coarse view of how the processes behave. The high-level CDFG is also simulated through the system-level simulator API. Compared to the functional simulation in Smalltalk, the activity of each operator is traced on a Gantt diagram, giving a detailed view of the execution of the graph with its instruction-level parallelism. Then the low-level CDFG is simulated by a cycle-accurate simulator embedded in the system-level simulation of the platform. RTL involves a different notion of time compared to the system level. The RTL simulator, together with its application, is embedded in the system model as a component and is seen as an atomic task. A start signal is sent to the application to activate the RTL simulator, which returns the total latency of the application once it is done; this latency is then used at the system level to simulate an event. To ease debugging, the simulator keeps a correspondence between the low-level and the high-level CDFG, for example for loop indices or operators. It is also possible to configure the simulation with conditional breakpoints in order to stop it in a particular state. Typically these conditions are set on signals. Signals can also be probed to produce waveforms and help debug the application.
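To make the last points concrete, here is a small Python sketch of a cycle-accurate loop with probed signals and a conditional breakpoint. It is an illustration under assumptions, not the simulator's API: `run`, `accumulator_step`, `stop_when` and the toy design are all hypothetical. The number of cycles it returns plays the role of the total latency that the system-level simulation would then turn into an event.

```python
from typing import Callable, Dict, List, Optional

Signals = Dict[str, int]


def accumulator_step(signals: Signals) -> Signals:
    """One clock cycle of a toy design: a counter feeding an accumulator."""
    i = signals["i"] + 1
    return {"i": i, "acc": signals["acc"] + i}


def run(step: Callable[[Signals], Signals],
        state: Signals,
        max_cycles: int,
        probes: List[str],
        stop_when: Optional[Callable[[Signals], bool]] = None):
    """Advance cycle by cycle, record probed signals, stop on a breakpoint.

    Returns (cycles_executed, final_state, waveforms).
    """
    waveforms = {name: [] for name in probes}
    cycles = 0
    while cycles < max_cycles:
        state = step(state)
        cycles += 1
        for name in probes:
            waveforms[name].append(state[name])
        # Conditional breakpoint: a predicate evaluated on the signals.
        if stop_when is not None and stop_when(state):
            break
    return cycles, state, waveforms
```

For instance, `run(accumulator_step, {"i": 0, "acc": 0}, 100, ["acc"], lambda s: s["acc"] >= 10)` stops as soon as the accumulator reaches 10 and returns the recorded `acc` waveform.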
To take into account all the system activities of a SoC during the execution of an application, a model of the execution platform is needed. A model is built from an object framework defining components that can be composed hierarchically. A component approach enhances reusability and modularity in the model; moreover, the hierarchical organization makes it possible to abstract complex subsystems when they are connected. A component declares and schedules a set of processes or sub-components representing its behaviour. For communication, it defines an interface holding a set of input and output ports. These ports are connected by communication channels, which can be FIFOs or blocking channels. Local communications between processes are also performed through channels. Connections between components are declared by the encapsulating hierarchy. For example, in the figure the component Main declares the two sub-components Unit1 and Unit2 as well as the connections between their interfaces. Unit1 encapsulates three processes communicating through channels, and P1 and P2 communicate with Unit2.
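The Main/Unit1/Unit2 example can be sketched as follows. This is a hedged Python illustration of the idea that the encapsulating component declares the connections; the classes `Component` and `Channel`, the method names and the port names (`p1_out`, `in1`, ...) are assumptions made for the example, not the framework's actual interface.

```python
from collections import deque


class Channel:
    """Simple FIFO link between an output port and an input port."""

    def __init__(self):
        self._items = deque()

    def put(self, value):
        self._items.append(value)

    def get(self):
        return self._items.popleft()


class Component:
    """Hypothetical stand-in for the framework's abstract component."""

    def __init__(self, name, inputs=(), outputs=()):
        self.name = name
        self.inputs = {port: None for port in inputs}
        self.outputs = {port: None for port in outputs}
        self.subcomponents = {}

    def add(self, child):
        self.subcomponents[child.name] = child
        return child

    def connect(self, src, out_port, dst, in_port):
        """The encapsulating component declares a link between two children."""
        source, target = self.subcomponents[src], self.subcomponents[dst]
        if out_port not in source.outputs or in_port not in target.inputs:
            raise KeyError("unknown port")
        channel = Channel()
        source.outputs[out_port] = channel
        target.inputs[in_port] = channel
        return channel


def build_main():
    """Main declares Unit1, Unit2 and the links between their interfaces."""
    main = Component("Main")
    main.add(Component("Unit1", outputs=("p1_out", "p2_out")))
    main.add(Component("Unit2", inputs=("in1", "in2")))
    main.connect("Unit1", "p1_out", "Unit2", "in1")
    main.connect("Unit1", "p2_out", "Unit2", "in2")
    return main
```

Because the children never name each other, Unit1 or Unit2 can be reused in another hierarchy with different wiring.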
The framework is organized around two class hierarchies. An abstract component defines all the methods for declaring subcomponents, interface connectivity, processes, etc. Designers can define their own model by subclassing this framework, creating a new extension that can be reused. Two types of connection are defined: a FIFO whose size limit can be configured, and a blocking channel that synchronizes by rendezvous. In this work the connections are used without modification, but it is possible to extend the model and define new communication links with specific semantics.
To simulate the execution platform model, the modeling framework is coupled to an event-driven simulator. The simulator provides an API for simulating operator latencies, scheduling, and stopping activities. To use this API, the modeling framework inherits from the simulation class hierarchy. The components and connections are SimulationObjects and have access to the API. To run the simulation and perform initializations, the user has to subclass the Simulation class (on the right), which has access to the simulation kernel managing the event queue. In other words, it is the entry point of the simulator into the model. To summarize, a model defined by a designer is an extension of the simulator and modeling class hierarchies in which the elements of the model are, at the most abstract level, simulation objects.
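The coupling described here can be sketched with a tiny event-driven kernel. This Python version is only an illustration under assumptions: the class names echo the notes, but `schedule`, `hold_for`, the log format and the `demo` scenario (with its latencies of 5 and 12 time units) are invented for the example and do not reproduce the simulator's real API.

```python
import heapq
import itertools


class Simulation:
    """Owns the kernel: a time-ordered event queue (hypothetical API)."""

    def __init__(self):
        self.now = 0
        self.log = []
        self._queue = []
        self._ids = itertools.count()  # tie-breaker for equal timestamps

    def schedule(self, delay, action):
        heapq.heappush(self._queue, (self.now + delay, next(self._ids), action))

    def run(self):
        while self._queue:
            self.now, _, action = heapq.heappop(self._queue)
            action(self)


class SimulationObject:
    """Base of components and connections: gives access to the kernel."""

    def __init__(self, simulation, name):
        self.simulation = simulation
        self.name = name

    def hold_for(self, latency, label):
        """Model an operator latency: log the start, then the completion."""
        sim = self.simulation
        sim.log.append((sim.now, self.name, label + ":start"))
        sim.schedule(latency,
                     lambda s: s.log.append((s.now, self.name, label + ":done")))


def demo():
    """A DMA transfer followed by an accelerated task of assumed latency 12."""
    sim = Simulation()
    dma = SimulationObject(sim, "DMA")
    accelerator = SimulationObject(sim, "Accelerator")
    sim.schedule(0, lambda s: dma.hold_for(5, "transfer"))
    sim.schedule(5, lambda s: accelerator.hold_for(12, "compute"))
    sim.run()
    return sim.log
```

The `(time, object, label)` tuples in the log are the kind of raw material from which a Gantt diagram can be drawn.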
As an example, we consider the execution of an application specified as communicating processes and executed on a reconfigurable unit belonging to a system on chip. The structure of the system on chip is depicted in the figure. The components described in the modeling framework are a CPU, a DMA, a main memory and a reconfigurable accelerator connected to a set of local memories. All the components are connected by a bus. The application is composed of three processes, two of which perform operations on the local memories and feed a computing function. The CDFG of the application is given here.
The visualization tools make it possible to study the interactions or communications between the components with interaction diagrams, and the activities of the system with Gantt diagrams. The interaction diagram shows the communications between the components as a function of the simulation steps. Here the CPU sends a start signal to the DMA, which in turn sends requests through the bus to the memory in order to start transferring data to the local memories of the reconfigurable unit. A Gantt diagram is generated showing the activities of the components. The three traces are the data transfers performed on the bus, and these traces correspond to the tasks performed by the application specified as Smalltalk processes. One process reads the data; the computing process receives it and sends the result to another process, which writes it back to the local memories. The three processes form a pipeline.
The application simulated as a low-level CDFG can be analyzed by probing signals. The values of the signals are displayed as waveforms. This waveform corresponds to the start and stop signals of the application. It gives the total latency of the execution. It is also possible to probe loop indices and the values received on a channel. Simulation results are used to test the values produced. Traces record values for each cycle, so it is easy to write SUnit methods that check results at given cycles.
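The talk uses SUnit for these checks; the sketch below transposes the idea to Python's `unittest`, which follows the same xUnit style. The trace producer `simulate_counter` is a hypothetical stand-in for a low-level CDFG run, and the signal names `start`, `done` and `index` are assumptions chosen for the example.

```python
import unittest


def simulate_counter(cycles):
    """Stand-in for a LL-CDFG run: returns {signal: [value per cycle]}."""
    trace = {"start": [], "done": [], "index": []}
    for cycle in range(cycles):
        trace["start"].append(1 if cycle == 0 else 0)
        trace["index"].append(cycle)
        trace["done"].append(1 if cycle == cycles - 1 else 0)
    return trace


def latency(trace):
    """Cycles elapsed between the start pulse and the done pulse."""
    return trace["done"].index(1) - trace["start"].index(1)


class TraceTest(unittest.TestCase):
    """Checks recorded values at given cycles, in the spirit of SUnit."""

    def setUp(self):
        self.trace = simulate_counter(8)

    def test_loop_index_at_cycle_3(self):
        self.assertEqual(self.trace["index"][3], 3)

    def test_total_latency(self):
        self.assertEqual(latency(self.trace), 7)
```

Since a trace is just a list indexed by cycle, a check at a given cycle is a single assertion, which keeps the tests short enough to rerun at every iteration.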
The two types of test can be illustrated as follows. Test1 is a unit test. It checks whether the result returned by the mapping of a high-level CDFG is correct. Basically, the test verifies that the CDFGSynthesis API has produced a netlist. If not, a debugging step is necessary. The second example, test2, is a characterization test that compares two configurations for a mapping. In the first case the CDFG uses primitive operators from a library, and in the second case the operators are converted into random logic. Both low-level CDFGs are then simulated and the results are compared for equality, ensuring that the behavior of the CDFG does not depend on the mapping.
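The characterization test can be illustrated on a single operator. In this Python sketch an adder is modeled twice, once as a library primitive and once flattened into gate-level logic (a ripple-carry adder), and both are simulated on the same stimuli. The function names and the choice of an adder are assumptions for illustration; the actual test compares two low-level CDFGs produced by the framework.

```python
def add_primitive(a, b, width=8):
    """Mapping 1: a library adder primitive, modeled by integer addition."""
    return (a + b) & ((1 << width) - 1)


def add_random_logic(a, b, width=8):
    """Mapping 2: the same adder flattened into gates (ripple-carry)."""
    result, carry = 0, 0
    for bit in range(width):
        x = (a >> bit) & 1
        y = (b >> bit) & 1
        s = x ^ y ^ carry                      # sum bit of a full adder
        carry = (x & y) | (carry & (x ^ y))    # carry out of a full adder
        result |= s << bit
    return result


def characterize(stimuli, width=8):
    """Return the stimuli on which the two mappings disagree."""
    return [(a, b) for a, b in stimuli
            if add_primitive(a, b, width) != add_random_logic(a, b, width)]
```

An empty list from `characterize` means the behavior did not change with the mapping on the stimuli tried; a non-empty list gives the inputs to start debugging from.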
Dependencies are very flexible and can also be defined with an array of actions to trigger for a single signal.
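One way to read this remark is as an observer-style registry in which a signal maps to a list of actions. The Python sketch below assumes that reading; `SignalDependencies`, `when_changed` and `update` are hypothetical names, not the framework's dependency mechanism.

```python
from collections import defaultdict


class SignalDependencies:
    """Maps each signal name to the actions triggered when it is updated."""

    def __init__(self):
        self._actions = defaultdict(list)

    def when_changed(self, signal, *actions):
        """Register one action or a whole array of actions for a signal."""
        self._actions[signal].extend(actions)

    def update(self, signal, value):
        """Notify every action registered on the signal, in order."""
        for action in self._actions.get(signal, []):
            action(signal, value)
```

A single signal such as `done` could then trigger both a trace recording and a request to stop the simulation.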
Synthesis time-consuming -> debugging in situ
We have shown a methodology, implemented in a framework, for simulating and validating an application running on a reconfigurable accelerator of a system on chip. The CDFG intermediate format enables multi-level simulation and is the input to the synthesis tools. The execution platform is modeled by an object framework defining components and communication links. The global simulation includes the simulation of both the application and the platform. This methodology aims at bringing the advantages of software engineering techniques to hardware application design, through the use of the high-level language Smalltalk and its environment, which make it possible to debug and explore the model under simulation. These concepts are extended to synthesis, with the possibility of synthesizing the probes in hardware while keeping a high-level interface for inspecting the application. Additionally, the use of the extreme programming methodology is proposed for safer and more productive development. Of course the framework has some limitations. The execution platform is modeled and simulated at only one level of abstraction; only the application can be analyzed at different levels. Currently the low-level CDFG is synthesized for a single target, the M2000. The framework also lacks interfaces with external tools, for example with a Xilinx toolchain.
The first perspective is to improve the interface with external toolchains and with simulation tools for simulating netlists. The impact of the synthesized probes has not been quantified yet; measuring it would give relevant information on the overheads in terms of frequency and area for the application. We also plan to investigate the possibility of adding probes dynamically to the application loaded on the reconfigurable unit by exploiting partial reconfiguration. Finally, the synthesized application has to be interfaced at a high level in the framework, with the possibility of inspecting the circuit's state and controlling the execution, in order to keep a software approach.
Software Engineering Methodology for Reconfigurable Platforms
Damien Picard and Loic Lagadec
Architectures et Systèmes, Lab-STICC, Université de Bretagne Occidentale, France
ESUG’09, Brest, France, 2009
Introduction
Increasing complexity of modern System-on-Chip
Difficulty to program and to validate applications
Shrinking time-to-market
Common techniques for hardware validation
Testing/debugging at a very low abstraction level
Time-consuming and burdensome
Need for productive methodologies with a higher-level approach
Software development benefits from very efficient techniques
Our approach: applying software engineering methodologies to hardware design
Aim of this talk
This talk focuses on a key issue: validation of hardware applications targeting reconfigurable architectures (RA)
An HL synthesis flow for reconfigurable architectures based on MADEO [ESUG 08]
Multi-level simulation: from behavioral to hardware
Interfacing with third-party tools through code generation
Software-like debugging features embedded in hardware
Advocates for the use of software engineering techniques
Short development cycles, use of OO models, code generation, etc.
Outline
Overview of Reconfigurable Architectures
OO methodology for synthesis on RA
Simulation and testing methodology
Software-like debugging for RA
Conclusion
Reconfigurable Architectures
A reconfigurable architecture is a run-time programmable architecture based on hardware reconfiguration
Used as flexible hardware accelerators for intensive computations
Based on Look-Up-Table (LUT) = memory
General-purpose and high parallelism
Slow and area/power-inefficient (routing overhead)
Several reconfigurable platforms
FPGAs (vendors, e.g. Xilinx, Altera)
eFPGAs (e.g. M2000, Menta)
RSoCs (e.g. Morpheus project): RA IP composition
Reconfigurable Architectures
Trade-off between flexibility/performance
Functionality of the circuit determined by a configuration
[Figure: flexibility versus performance, from processor (program, data, results) through reconfigurable architecture (configuration, data, results) to ASIC (data, results)]
Software-engineering concepts applied to logic synthesis on reconfigurable architectures
MADEO: extensive use of OO methodology
Modeling, generic tools through polymorphism
An HL synthesis flow for RSoC based on the MADEO approach with validation methodology
Multi-level simulation: from behavioral to hardware
Target modeling
Interfacing with third-party tools through code generation
Software-like debugging features embedded in hardware
Global Flow
[Figure: global flow. Labels: Smalltalk method, high-level CDFG, low-level CDFG, export, netlist, back-end tools, synthesis; components framework, SoC model, system simulator, multi-level simulator, global simulation; system behavior (Gantt diagram, interaction diagram), application behavior (waveform); testing, debugging, iterations]
[Figure: Platypus tool and CDFG design. Labels: CDFG EXPRESS model; HLL CDFG API (Java) used by Tool X and Tool Y; HLL CDFG API (Smalltalk); CDFG instances (STEP files); CDFG checker; application; Madeo+ synthesis tool; target architecture description; Target 1: EDIF; Target 2: C-like code; Target 3: specific assembly code] [Lagadec, ESUG08]
ENTITY HierarchicalNode SUBTYPE OF (Node);
  localVariables : LIST OF AbstractData;
  subOperators : LIST [1 : ?] OF Node;
END_ENTITY;
ENTITY AccumulatorNode SUBTYPE OF (HierarchicalNode);
  init : AbstractData; -- "AccumulatorNode.init": the initial value we start accumulating from.
  toBeAccumulated : AbstractData;
DERIVE
  cumulatedArguments : LIST OF AbstractData := subOperators[SIZEOF(subOperators)].outputs;
WHERE
  toBeAccumulatedSource : SIZEOF(cumulatedArguments) = 1;
  typeCompat : cumulatedArguments[1].type = init.type;
END_ENTITY;
Application Intermediate Representation
Hierarchical Nodes
Structuring, process, sequencing, loop, conditional
[Figure: class hierarchies by abstraction level. Component with user classes #Main, #Unit1 and #Unit2; Connection with FIFO and Blocking Channel; arrows denote inheritance relations]
System-Level Simulation
Framework of classes
Based on an event-driven simulation kernel [Blue Book]
Defines an API for simulating operator latencies, scheduling, stopping activities
System modeling framework inherits from the simulator
Simulator + modeling class hierarchy
[Figure: class hierarchy. Labels: SimulationObject, Component, Connection, Simulation, user model]
Example
Case Study
Partial modeling of a reconfigurable system-on-chip
System activities: DMA transfers, synchronizations, etc.
Application accelerated on a RA
[Figure: a bus connecting the CPU, the DMA, the memory and the reconfigurable accelerator with its local memories]
System simulation
Interaction diagram between components
Behavior of the system and the application
LL-CDFG simulation
Cycle-accurate simulation
Stimuli on LL-CDFG signal interface defined in a method of the accelerator component
Interface between the system and accelerated function
Tracing values for LL-CDFG simulation
Simulator defines an API to set probes on the LL-CDFG signal interface
Tracing of the signal values
Graph generation
Tracing signals
Traced signals and stimuli set through a GUI
LL-CDFG simulation
Traces produced from signals
Values tested against expected results: SUnit testing
Software debugger features added to hardware application
Benefit from reconfigurability
Debugging support automatically inserted by the design framework (probes + controller)
All debugging features removed once design is validated
Probing signals
Breakpoints set through a GUI
Embedding software debugger features
Execution is controlled by the debugger
Breakpoints interfaced with a debug controller
[Figure: operators (Op) behind a control interface, driven by the schedule controller in the simulator and, after synthesis, by the debug controller]
Hardwired Breakpoint
Freeze the execution when triggered
Limit on conditional operators
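[Figure: hierarchical low-level CDFG (Top, Hier1, Hier2). Each conditional breakpoint compares inputs i1 and i2 (a probed operator output and a value) with the comparison selected by OpSel (=, !=, <, >) and produces output O; local controllers and a global OR feed the debugging controller, which drives the operators' enable and control signals]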
Hardwired Breakpoint
Configurable breakpoints
Pool of probed signals is static
Breakpoint conditions are configurable and can be enabled/disabled
Possibility to speculate and to backtrack execution
No need for re-synthesis
Configuration structure
2-D vector
Configuration word: contains operator selection, activation status and arguments
Extraction of the debug information: two execution modes
Running mode
Debug mode
Execution control: step-by-step, resume
Read back of the debugging information
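The slides do not give the layout of the configuration word, so the following Python sketch simply assumes one: a 2-bit operator selection, a 1-bit activation status and a 16-bit argument, stored in a 2-D vector indexed by hierarchy and by probe. The field widths, the function names and the meaning given to the two dimensions are all assumptions; only the four comparisons (=, !=, <, >) and the global OR come from the slides.

```python
# Operator selection codes (assumed encoding of =, !=, <, >).
OPS = {
    0: lambda a, b: a == b,
    1: lambda a, b: a != b,
    2: lambda a, b: a < b,
    3: lambda a, b: a > b,
}

VALUE_BITS = 16                      # assumed width of the argument field
VALUE_MASK = (1 << VALUE_BITS) - 1


def pack(op_sel, enabled, value):
    """Assumed layout: [op_sel: 2 bits | enabled: 1 bit | value: 16 bits]."""
    return ((op_sel & 0b11) << (VALUE_BITS + 1)) \
        | (int(enabled) << VALUE_BITS) \
        | (value & VALUE_MASK)


def unpack(word):
    """Inverse of pack: returns (op_sel, enabled, value)."""
    value = word & VALUE_MASK
    enabled = bool((word >> VALUE_BITS) & 1)
    op_sel = (word >> (VALUE_BITS + 1)) & 0b11
    return op_sel, enabled, value


def triggered(config, probes):
    """Global OR over every enabled breakpoint.

    config[h][k] is the word for probe k of hierarchy h; probes has the
    same shape and holds the current value of each probed signal.
    """
    for words, values in zip(config, probes):
        for word, signal in zip(words, values):
            op_sel, enabled, value = unpack(word)
            if enabled and OPS[op_sel](signal, value):
                return True
    return False
```

Changing a condition then amounts to rewriting one word of the vector, which is consistent with the point that no re-synthesis is needed as long as the pool of probed signals stays the same.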
Hardwired Watchpoint
Probed signals are wired to the top interface
Automatically crosses the hierarchies
Possibly conditional
Trace analysis
[Figure: hierarchical low-level CDFG (Top, Hier1, Hier2) with a probed signal wired from an operator (Op) up to the top interface]
Example: execution frozen
Conclusion
Methodology for validating an application running on an RSoC
Multi-level simulation of the application specified as CDFG
High-level modeling of platform by a component-based approach
Benefit from software expertise for hardware design
Taking advantage of the Smalltalk dynamic language and environment