
S. Jagadeesh 1, K. Ravinder 2, Dr. M. Ashok 3 / International Journal of Engineering Research and Applications (IJERA), ISSN: 2248-9622, www.ijera.com, Vol. 3, Issue 1, January-February 2013, pp. 1054-1060

SOC TRACER ARCHITECTURE USING AHB BUS

S. JAGADEESH 1, K. RAVINDER 2, DR. M. ASHOK 3
1 Associate Professor & HOD, Department of Electronics and Communication Engineering, Sri Sai Jyothi Engineering College, Gandipet, JNTUH (A.P.), India
2 P.G. Student, M.Tech. (VLSI), Department of Electronics and Communication Engineering, Sri Sai Jyothi Engineering College, Gandipet, JNTUH (A.P.), India
3 Professor, Department of Computer Science and Engineering, Sri Sai Jyothi Engineering College, Gandipet, JNTUH (A.P.), India

ABSTRACT
In system-on-chip (SoC) debugging and performance analysis/optimization, monitoring the on-chip bus signals is necessary. However, such signals are difficult to observe because they are deeply embedded in the SoC and there are not enough I/O pins to access them. Therefore, we embed a bus tracer in the SoC to capture the bus signals and store them. The stored trace can be loaded into trace analyzer software for analysis. In this paper, a multiresolution AHB on-chip bus tracer is proposed for SoC debugging and monitoring. It captures the bus trace at different resolutions and uses efficient built-in compression mechanisms. In addition, it allows the user to switch the trace resolution dynamically so that appropriate resolution levels can be applied to different segments of the trace. It also supports tracing after/before an event trigger, named post-triggering trace/pre-triggering trace, respectively. The SoC can be verified on a field-programmable gate array. The multiresolution AHB on-chip bus tracer is named SYS-HMRBT (AHB Multiresolution Bus Tracer) and is used for monitoring. Using SYS-HMRBT, we achieve 79%-96% compression, depending on the selected resolution mode.

Keywords - AHB, on-chip bus, compression, multi-resolution, tracing

1. Introduction
In the System-on-a-Chip (SoC) era, it is a challenge to verify and debug a system chip efficiently and rapidly. For design verification and debugging at system level and chip level, not only observation of external I/O signals but also tracing of internal signals can help designers efficiently analyze and verify the design, such as the software program, hardware protocol, and system performance. SoC bus signal tracing can be accomplished with either software or hardware approaches. The software approach traces only the program address, and the cost of a hardware implementation of software approaches would be high. The hardware approach, on the other hand, can trace signals on the target system directly. However, the main problem with hardware-based tracing techniques is that the size of cycle-based traces is usually extremely large.

To address the bus tracing issue, we propose embedded multi-resolution signal tracing for AMBA AHB. The bus tracer consists of two major tracing approaches: (1) a signal monitor/tracing approach, and (2) a trace reduction approach. The first approach provides a trade-off between trace accuracy and trace depth. Designers can decide to trace AHB signals in detail or roughly, depending on the debugging objectives; moreover, the trace granularity can be changed while tracing is in progress. In other words, the results of different trace strategies can be stored in a single trace file. In the second approach, the bus tracer compresses the trace data according to the characteristics of the AHB address, data, and control signals. The trace data is decompressed on the host, translated into VCD (Value Change Dump) [1] format, and displayed on a waveform viewer. The bus tracer is integrated into an ARM EASY (Example AMBA System) [2] [3] environment with a 3D graphics hardware acceleration system for demonstration.

This paper presents a real-time multiresolution AHB on-chip bus tracer, named SYS-HMRBT (AHB multiresolution bus tracer) [1]. The bus tracer adopts three trace compression mechanisms to achieve a high trace compression ratio. It supports "multiresolution tracing" by capturing traces at different timing and signal abstraction levels. In addition, it provides the "dynamic mode change" feature, which allows users to switch the resolution on the fly for different portions of the trace to match specific debugging/analysis needs. Given a trace memory of fixed size, the user can trade off between granularity and trace length to make the most of the trace memory. The bus tracer is also capable of tracing signals before/after the event trigger, named pre-T/post-T tracing, respectively. This feature provides more flexible tracing that focuses on the points of interest.

The rest of this paper is organized as follows. Section 1.1 surveys the related work. Section 2 presents the literature survey of the abstraction level. Section 3 presents the hardware architecture of our bus tracer. Section 4 provides simulation results of our bus tracer so as to analyze the expected errors while processing the signals through the bus tracer used in the SoC. Finally, Section 5 concludes this project and gives directions for future research.

1.1 Related Work
Existing on-chip bus tracers mostly adopt lossless compression approaches. ARM provides the AMBA AHB Trace Macrocell (HTM) [3], which is capable of tracing AHB bus signals. We characterize the bus signals into three categories: program address, data address/data, and control signals. For program addresses, a straightforward way is to discard the continuous instruction addresses and retain only the discontinuous ones, so-called branch/target filtering. This approach has been used in some commercial tracers, such as the TC1775 trace module in TriCore [5] and ARM's Embedded Trace Macrocell (ETM) [6], [4]. For data address/value, the most popular method is the differential approach, which records the difference between consecutive data. Since the difference can usually be represented with fewer bits than the original value, the information size is reduced. For control signals, ARM HTM [3] encodes them with the slice compression approach: the control signal is recorded only when its value changes.

The essence of a hardware tracer is its data reduction or compression techniques. For program address tracing, an intuitive way is to discard the continuous instruction addresses and retain only the discontinuous ones, such as the addresses of branch and target instructions, with some hardware filters. This approach has been used in some commercial processors, such as TriCore [4] [5] and ARM's Embedded Trace Macrocell [6]. The hardware overhead of these works is small since the filtering mechanism is simple to implement in hardware. However, the effectiveness of these techniques is mainly limited by the average basic block size, which is roughly four or five instructions per basic block, as reported in [7] and [8]. For data address and value tracing, the most popular method is the differential approach based on subtraction. Some researchers have shown that the differential method can reduce data address and data value traces by about 40 percent and 14 percent, respectively [9] [10].

Besides the address and data buses, there are several control signals on the system bus that need to be traced. Some FPGA boards have built-in signal trace tools, such as Altera SignalTap [11] and Xilinx ChipScope [12]. FS2 AMBA Navigator [13] supports a bus clock mode and a bus transfer mode to trace bus signals on every clock and on every bus transfer, respectively. The trace buffer stores bus cycles or bus transfers, depending on the size of the local internal memory. Although these approaches support multiple trace modes, such as tracing cycle by cycle or per signal transaction, only one mode can be used during a tracing process.

Other works address debugging at the transaction level. They point out that traditional hardware and software debugging cannot work collaboratively, since software debugging is at the functional level and hardware debugging is at the signal level. As a solution, transaction-level debugging can provide software and hardware designers with a common abstraction level to diagnose bugs collaboratively and thus help them focus on problems quickly. Both works indicate that transaction-level debugging is a must in SoC development.

2. Abstraction level
This paper presents a multi-resolution approach that can use different trace modes during a bus signal tracing process. Transaction-level debugging provides software and hardware designers with a common abstraction level to diagnose bugs. The abstraction level has two dimensions: timing abstraction and signal abstraction.

The timing dimension has two abstraction levels: the cycle level and the transaction level. The cycle level captures the signals at every cycle. The transaction level records the signals only when their values change.

The signal dimension groups the AHB bus signals into four categories: program address, data address/value, access control signals (ACS), and protocol control signals (PCS). We then define three abstraction levels for those signals: the full signal level, the bus state level, and the master operation level. The full signal level captures all bus signals. The bus state level further abstracts the PCS by encoding them as states according to the bus state machine (BSM). The states represent bus handshaking activities within a bus transaction. The master operation level further abstracts the bus state level by recording only the transfer activities of bus masters and ignoring the handshaking activities within transactions. This level also ignores the signals when the bus state is IDLE, WAIT, or BUSY. The BSM is designed based on the AMBA AHB 2.0 protocol to represent the key bus handshaking activities within a transaction. The transitions between BSM states follow the AMBA protocol control signals.

Combining the abstraction levels in the timing dimension and the signal dimension, we provide five modes of different granularity: Mode FC (full signal, cycle level), Mode FT (full signal, transaction level), Mode BC (bus state, cycle level), Mode BT (bus state, transaction level), and Mode MT (master operation, transaction level). In Mode FC, the tracer traces all bus signals cycle by cycle so that the detailed bus activities can be observed. In Mode FT, the tracer traces all signals only when their values change. In Mode BC, the tracer uses the BSM states, such as NORMAL, IDLE, ERROR, and so on, to represent bus transfer activities at cycle-accurate level. In Mode BT, the tracer uses the bus state to represent bus transfer activities at transaction level. Our bus tracer also supports a dynamic mode change (DMC) feature, which allows designers to change the trace mode dynamically in real time.

The post-T trace captures signals after a triggering event, while the pre-T trace captures signals before the triggering event. The post-T trace is usually used to observe signals after a known event. The pre-T trace is used to diagnose the cause of unexpected errors by capturing the signals before the errors. To overcome the problem of a trace losing its initial state, we adopt a periodical triggering technique. We divide the entire trace into several independent small traces. Destroying the initial state of one trace does not affect the other traces, since every trace has its own initial state. This technique can be accomplished by periodically triggering a new trace. With a minor modification to their control circuitry, existing trace compression engines can easily support it.

Fig. 1. BSM for encoding bus master behaviors.

TABLE I
Signal Abstraction

AHB Signals        Full Signal    Bus State    Master Operation
Program Address    All            All          Partial
Data Address       All            All          Partial
ACS                All            All          Partial
PCS                All            Encoded      N/A

3. Hardware architecture of bus tracer
Fig. 2 gives an overview of the bus tracer. It mainly contains four parts: the Event Generation Module, the Abstraction Module, the Compression Module, and the Packing Module. The Event Generation Module controls the start/stop time, the trace mode, and the trace depth of traces. This information is sent to the following modules. Based on the trace mode, the Abstraction Module abstracts the signals in both the timing dimension and the signal dimension. The abstracted data are further compressed by the Compression Module to reduce the data size. Finally, the compressed results are packed with proper headers and written to the trace memory by the Packing Module.

Fig. 2. SoC Tracer Architecture Using AHB Bus

3.1 Event Generation Module:
The event generation module decides when a trace starts and stops and which trace mode it uses. The triggering events on the bus are specified by event registers. A matching circuit compares the bus activities with the events specified in the event registers. We can connect an AHB bus protocol checker (HPChecker) [10] to the Event Generation Module, as shown in Fig. 2, to capture the bus-protocol-related trace.

TABLE II
Event Register
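The comparison of bus activity against event registers described in Section 3.1 can be illustrated with a small software model. This is only a sketch under stated assumptions: the paper's event register format (Table II) is not reproduced here, so the fields `addr_value`, `addr_mask`, `is_write`, and `trace_mode`, and the functions `matches` and `first_trigger`, are hypothetical names chosen for illustration. Only the AHB signal names HADDR and HWRITE come from the AMBA specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventRegister:
    """Hypothetical event register; the real format is given in Table II."""
    addr_value: int            # address bits to match against HADDR
    addr_mask: int             # 1 = compare this bit, 0 = ignore it
    is_write: Optional[bool]   # None = match both reads and writes
    trace_mode: str            # mode to start on a match: FC, FT, BC, BT or MT


def matches(ev, haddr, hwrite):
    """Model of the matching circuit for one event register and one bus cycle."""
    if (haddr & ev.addr_mask) != (ev.addr_value & ev.addr_mask):
        return False
    return ev.is_write is None or ev.is_write == hwrite


def first_trigger(events, bus_cycles):
    """Return (cycle index, trace mode) of the first matching cycle, or None."""
    for cycle, (haddr, hwrite) in enumerate(bus_cycles):
        for ev in events:
            if matches(ev, haddr, hwrite):
                return cycle, ev.trace_mode
    return None


# Trigger on any write into the 0x4000_xxxx region and start tracing in Mode FT.
events = [EventRegister(0x4000_0000, 0xFFFF_0000, True, "FT")]
cycles = [(0x0000_1000, False), (0x4000_0010, False), (0x4000_0020, True)]
trigger = first_trigger(events, cycles)
```

In this model the read at 0x4000_0010 does not trigger because the register asks for writes only; the hardware would evaluate all event registers in parallel in each cycle rather than in a loop.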
The Event Generation Module decides the starting and stopping of a trace and its trace mode. The module has configurable event registers, which specify the triggering events on the bus, and a corresponding matching circuit, which compares the bus activity with the events specified in the event registers. Optionally, this module can also accept events from external modules. Table II shows the format of an event register.

3.2 Abstraction Module
This module monitors the AMBA bus and selects/filters signals based on the abstraction mode. Depending on the abstraction mode, some signals are ignored and some signals are reduced.

Fig. 3 Multiresolution abstraction trace modes

3.3 Compression Module
The purpose of the Compression Module is to reduce the trace size. It accepts the signals from the abstraction module. To achieve real-time compression, the Compression Module is pipelined to increase performance. Every signal type has an appropriate compression method, as shown in Figure-1: the program address is compressed by a combination of branch/target filtering, dictionary-based compression, and slicing. The data address and the data value are compressed by a combination of the differential and encoding methods. The ACS and PCS signals are compressed by dictionary-based compression. Details are discussed below under Compression Mechanism.

3.3.1 Program Address Compression
We divide the program address compression into three phases that exploit spatial locality and temporal locality. Figure-1 shows the compression flow. The address compression has three parts:
--Branch/Target Filtering
--Dictionary based Compression
--Slicing

3.3.2 Branch/Target Filtering
This technique exploits the spatial locality of the program address. Spatial locality exists because program addresses are mostly sequential. Software programs (at assembly level) are composed of a number of basic blocks, and the instructions in each basic block are sequential. Because of these characteristics, branch/target filtering records only the first instruction's address (Target) and the last instruction's address (Branch) of a basic block. The rest of the instructions are filtered out since they are sequential and predictable.

3.3.3 Dictionary-Based Compression
To further reduce the size, we take advantage of temporal locality. Temporal locality exists because the basic blocks repeat frequently (loop structures), which implies that the branch and target addresses after Phase 1 repeat frequently. Therefore, we can use dictionary-based compression.

Fig. 4 Block diagram of the dictionary based compression circuit

3.3.4 Slicing
The miss address can also be compressed with the slicing approach. Because of spatial locality, the basic blocks are often near each other, which means the high-order bits of branch/target addresses rarely change. Therefore, the concept of slicing is to reduce the data size by recording only the digits that differ between two consecutive miss addresses. To implement this concept in hardware, the address is partitioned into several slices of equal size. The comparison between two consecutive miss addresses is made at the slice level. For example, consider three addresses in sequence: A (0001_0010_0000), B (0001_0010_0110), C (0001_0110_0110). At first, we record the full address of A. Next, since the upper two slices of address B are the same as those of address A, only the least significant slice is recorded. For address C, since the most significant slice is the same as that of address B, only the lower two slices are recorded. Figure 5 shows the hardware architecture. It has the register REG storing the previous data (din[i-1]).
Fig. 5 Block diagram of slicing circuit

3.3.5 Data Address/Value Compression
Data addresses and data values tend to be irregular and random. Therefore, there is no highly effective compression approach for data address/value. To achieve a good compression ratio with minimal hardware resources, we use a differential approach based on subtraction. Figure 6 shows the hardware compressor. The register REG saves the current datum din[i] and outputs the previous datum din[i-1]. By comparing the current datum with the previous datum, the three modules comp, differential, and sizeof output the encoded results. The comp module computes the sign bit (signed_bit) of the difference. The differential module calculates the absolute difference (value). Since the absolute difference between two data values may be small, we can neglect the leading zeros and use fewer digits to record it. Therefore, the sizeof module calculates the number of nonzero digits (size[i]) of the difference. Finally, the encoded datum is sent to the packing module along with size[i].

Fig. 6 Block diagram of differential compression circuit

3.3.6 Control Signal Compression
We classify the AHB control signals into two groups: access control signals (ACS) and protocol control signals (PCS). ACS are signals about the data access aspect, such as read/write, transfer size, and burst operations [4]. PCS are signals controlling the transfer behavior, such as master request, transfer type, arbitration, and transfer response. Control signals have two characteristics. First, the same combinations of the control signals repeat frequently, while other combinations happen rarely or never.

3.4 Packing Module
This module receives the compressed data from the compression module, processes them, and writes them to the trace memory. Packet management, circular buffer management, and mode change control are handled by this module.

4. Compression Mechanism
Compression is necessary to reduce the trace size. Since the signal characteristics of the address value, the data value, and the control signals are quite different, we propose different compression approaches for them: program address compression, branch/target filtering, dictionary-based compression, slicing, data address/value compression, and control signal compression. Integrating the bus tracer into a SoC is done by simply tapping the bus tracer onto the AHB bus. An on-chip processor or an external debugging host controls the bus tracer.

Real-time tracing is achieved when the bus tracer is pipelined to meet the on-chip bus frequency. Since the trace data processing is stream-based, the bus tracer can easily be divided into more pipeline stages to meet aggressive performance requirements.

4. Simulation Results

4.1 Checker Result

Fig. 7. Checker Simulation Result

The output of this module is a 44-bit ERROR register, in which each bit represents a different AHB protocol error. For example, when reset is asserted (HRESETn low), all the control signals should be in their initial state; otherwise an error is reported. The protocol list is given in the table.

4.2 Event Generator Result
This module is responsible for producing the control signal for the tracer, which represents the start and stop points of the trace. Trace_In_Progreess is the output signal of this module. This module also produces the trace mode in which the tracer operates.

Fig. 8 Event Generation Simulation Result

4.3 Abstraction Result

Fig. 9 Abstraction Simulation Result

The abstraction module takes its inputs from the AHB bus and the Event Generation Module. It divides the AHB signals into address signals, data signals, and control signals. It is also responsible for producing output that depends on the mode of operation. For example, if the trace mode is full signal, cycle level (FC), it produces output for every clock cycle. If it is in bus state, transaction level mode (BT), it first encodes the PCS control signals and generates output on transactions only.

4.4 Compression Result
The address, data, and control signals from the abstraction module are the inputs to the compression module. The signals are compressed using different compression techniques.

Fig. 10 Compression Simulation Result

4.5 Packing Result

Fig. 11 Packing Simulation Result

The compressed data is just a stream of bits. If we wrote it directly to the memory, the decoder would have great difficulty telling the data items apart. So we attach a header to each data item. Only with the header can the decoder distinguish the different packets. Each buffer is 32 bits wide. Whenever a buffer is full, it passes its data to the memory.

5. Conclusion
We have presented an on-chip bus tracer, SYS-HMRBT, for the development, integration, debugging, monitoring, and tuning of AHB-based SoCs. It is attached to the on-chip AHB bus and is capable of capturing and compressing the bus traces in real time with five modes of resolution. This is the advantage of the bus tracer used in the SoC. These modes can be switched dynamically while tracing. The bus tracer also supports both directions of traces: pre-T trace (trace before the triggering event) and post-T trace (trace after the triggering event). In addition, a graphical user interface, running on a host PC, has been developed to configure the bus tracer and analyze the captured traces. With the aforementioned features, SYS-HMRBT supports a diverse range of design/debugging/monitoring activities, including module development, chip integration, hardware/software integration and debugging, system behavior monitoring, and system performance/power analysis and optimization. Users can trade off trace granularity against trace depth in order to make the most of the on-chip trace memory or I/O pins. In the future, we would like to extend this work to more advanced buses/interconnects such as AXI or OCP. In addition, with its real-time abstraction capability, we would like to explore the possibility of bridging our bus tracer with ESL design methodology for advanced hardware/software co-development, debugging, monitoring, and analysis.

REFERENCES
[1] E. E. Johnson, J. Ha, and M. B. Zaidi, "Lossless trace compression," IEEE Trans. Comput., vol. 50, pp. 158-173, Feb. 2001.
[2] CoreSight: V1.0 Architecture Specification, ARM.
[3] R. Leatherman and N. Stollon, "An embedded debugging architecture for SoCs," IEEE Potentials, vol. 24, no. 1, pp. 12-16, Feb-Mar 2005.
[4] ARM Ltd., San Jose, CA, "ARM. AMBA AHB Trace Macrocell (HTM) technical reference manual ARM DDI 0328D," 2007.
[5] T. A. Welch, "A technique for high-performance data compression," IEEE Trans. Comput., pp. 8-19, 1984.
[6] Embedded Trace Macrocell Architecture Specification, ARM.
[7] C. MacNamee and D. Heffernan, "Emerging on-chip debugging techniques for real-time embedded systems," IEE Comput. Contr. Eng. J., pp. 295-303, Dec. 2000.
[8] B. Tabara and K. Hashmi, "Transaction-level modeling and debug of SoCs," presented at the IP SoC Conf., France, 2004.
[9] ARM, AMBA Specification (Rev 2.0) ARM IHI0011A, May 1999.
[10] C. C. Wang, AHB On-Chip Bus Protocol Checker, 2007.
[11] Fu-Ching Yang, Cheng-Lung Chiang, and Ing-Jer Huang, "A Reverse-Encoding-Based On-Chip Bus Tracer for Efficient Circular-Buffer Utilization," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 18, issue 5, pp. 732-741, May 2010.
[12] Y.-T. Lin, W.-C. Shiue, and I.-J. Huang, "A multi-resolution AHB bus tracer for real-time compression of forward/backward traces in a circular buffer," in Proc. Des. Autom. Conf. (DAC), Jul. 2008, pp. 862-865.
[13] AMBA Navigator Spec Sheet, First Silicon Solutions (FS2) Inc.