Amoeba - Heterogeneous Multiprocessor Debugging in a Single Session of GDB
  • 1.

Heterogeneous Multiprocessor Debugging in a Single Session of GDB

Santosh Kumar and Kalpak Shah
{santosh.raghuram, kalpak.shah}@gmail.com
Pune Institute of Computer Technology, Pune.

ABSTRACT

With the proliferation of both homogeneous and heterogeneous multiprocessor systems, debugging applications on such multiprocessors becomes a challenge. Today's SoCs combine the power of multiple RISCs and DSPs to produce powerful handheld devices, but debuggers have not evolved for such SoCs. GDB is an open-source debugger which can currently be configured only for a single architecture, and can only debug one target of that architecture at a time. Hence it cannot debug homogeneous or heterogeneous multiprocessor systems as a unit. We propose a GDB that can support multiple CPU target architectures and ABIs (Application Binary Interfaces) and simultaneously debug all the targets in a single session. This allows harnessing GDB's powerful scripting interface, which could be used in regression suites. Features like barrier breakpoints, lockstep, stop/continue-all, etc. will be provided for multiple processors. This will entail enhancing the design and user interface provided by GDB and allow GDB to be the preferred debugger for such architectures. Each multiprocessor vendor can then design a GDB port for their processors and use the powerful features offered by GDB. All the targets are multiplexed and deal with the same debugger. Hence the debugger can coordinate their activities and run them as a unit. A single session of GDB will be able to handle tasks such as processor interfacing, inter-processor communication, run-time execution and coordination. The processors can be run in lockstep to debug synchronization errors and mutex and semaphore problems. We also provide the design and implementation of a GUI in Eclipse CDT.

Keywords: Multiprocessor Debugging, GDB, Heterogeneous Multiprocessors, Interprocessor Coordination, Concurrent Debugging.

1. INTRODUCTION

SoCs today try to maximize power efficiency and processor utilization by using heterogeneous multiprocessor systems employing multiple RISCs and DSPs [4]. The role of the RISC processors is to provide overall control of the system, managing and monitoring the system activity. Hence the RISC processor generally hosts an operating system and can use the thousands of applications written for that operating system. DSPs, on the other hand, have only a primitive set of functions. However, a DSP can operate faster and more efficiently when it comes to executing specific kinds of calculations or tasks. Typical applications that can benefit from a DSP are media players, in which the DSP can act as a decoder.

Such systems demand the development of concurrent applications. The debugging of such parallel applications poses a great challenge to conventional debuggers. The debugging process, which takes at least 50% of the development effort [3], together with testing poses the following problems when it comes to concurrent debugging.

• The state of different targets and synchronization objects is not visible during debugging due to lack of debugger information. Knowledge about this state is essential to allow inferences about the execution stage of the program and its progress relative to the partial order of synchronization.

• Since each debugger session can control only one processor, no inter-processor control is possible. This is shown in Fig 1, where each session of GDB can communicate with only one processor.

• Each debugger session can be configured for only a single architecture and ABI.
  • 2.

• Non-availability of a GUI which can show code, disassembly and breakpoint windows for multiple programs.

[Fig 1. Difficulty posed due to different sessions of GDB: GDB Session 1 through GDB Session N each control a single processor (Processor 1 through Processor N) with its own local memory. The processors share memory, but there is no interprocessor control.]

Since the debugger could not impose inter-processor control, it was very difficult to debug data races, synchronization errors, etc. An ideal multiprocessor debugger should provide the following features:

• Ability to simultaneously debug multiple targets of varied architectures.

• Interprocessor breakpoints, which block a processor until all processors reach that breakpoint.

• Status inquiries about processors.

• Scheduling control, which provides the means to forcibly suspend and resume processors, and also to stop/continue all processors at a time.

The technical issues of these features are presented in detail in the description of the design.

This paper is structured as follows. Section 2 gives an overview of the design and describes the various components of the multiprocessor debugger. Section 3 explains the asynchronous operation of GDB. Section 4 details the design of the single GDB session and its advantages. Section 5 explains the salient features of the debugger. Section 6 describes the design and implementation of the Eclipse CDT GUI. Section 7 presents a motivating example and the implementation of a use case for the multiprocessor debugger. Section 8 compares our work to that of similar proprietary debuggers. Section 9 concludes the paper.

2. DESIGN OVERVIEW

The framework for multiprocessor debugging comprises two executable components and two interfaces. The executable components are the Eclipse CDT GUI on one side and GDB on the other side. The MI is the machine interface, which acts as a parser for GDB and as an output formatter and annotator for Eclipse.

[Fig 2. Framework for the debugger]

GDB is a widely used open-source debugger with in-built support for embedded debugging. GDB supports a wide range of features, such as:

• Breakpoints
• Single-stepping
• Support for multiple source languages
• Symbol handling
• Disassembly
• Remote debugging facility
• Multi-threaded support
• Scripting facility

Most of GDB's target manipulation passes through an abstraction known as the target vector. A target vector is similar in concept to a C++ class, with about 30-40 methods. GDB, being an open-source debugger, includes support for hundreds of targets. A multiprocessor system vendor just has to write their own target vector, plug it into GDB, and use all the existing features of GDB.
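The target-vector idea described above can be illustrated with a minimal Python sketch. GDB itself is written in C and its real target vector has many more operations; the class names `TargetVector` and `DummyArmTarget`, the three methods, and the `TARGETS` registry are all hypothetical names chosen for illustration, not GDB's actual API.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class TargetVector(ABC):
    """Hypothetical stand-in for a target vector: the set of operations
    the debugger core calls without knowing any target details."""

    @abstractmethod
    def resume(self) -> None:
        """Let the target run."""

    @abstractmethod
    def wait(self) -> str:
        """Block until the target reports an event; return its name."""

    @abstractmethod
    def read_memory(self, addr: int, length: int) -> bytes:
        """Read `length` bytes of target memory starting at `addr`."""


class DummyArmTarget(TargetVector):
    """An illustrative vendor port backed by a small in-process memory."""

    def __init__(self) -> None:
        self.memory = bytearray(64)
        self.running = False

    def resume(self) -> None:
        self.running = True

    def wait(self) -> str:
        self.running = False
        return "stopped"

    def read_memory(self, addr: int, length: int) -> bytes:
        return bytes(self.memory[addr:addr + length])


# Registry the debugger core would consult (an assumption of this sketch).
TARGETS: Dict[str, Type[TargetVector]] = {}


def register_target(name: str, cls: Type[TargetVector]) -> None:
    TARGETS[name] = cls


register_target("dummy-arm", DummyArmTarget)

target = TARGETS["dummy-arm"]()
target.resume()
event = target.wait()
first_word = target.read_memory(0, 4)
```

The point of the design is that the debugger core only ever sees the `TargetVector` interface, so a vendor port is added by registering one more class.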
  • 3.

GDB/MI is a line-based, machine-oriented text interface to GDB. The Eclipse CDT communicates with GDB through the MI. The MI has been added to GDB to provide a seamless and consistent interface to the UIs. Hence the MI was chosen to be the interface between GDB and Eclipse.

Eclipse [8] was initially developed as a Java IDE, and has since been expanded as an IDE for C and C++ programs too. On Linux, the IDE has been developed to combine with GDB and gdbserver installations as debuggers for C programs. Thus the Eclipse IDE, being platform independent, was the ideal choice for implementation of the GUI.

3. ASYNCHRONOUS OPERATION

The original GDB could only debug one processor at a time and hence it worked in synchronization with the execution of the program. Once the inferior began its execution, GDB had to wait till control returned to it, i.e. until the program stopped. The algorithm below explains the problem in existing GDB.

    algorithm wait_for_inferior()
    {
        while (1)
        {
            /* Wait for target to return */
            while (!target_wait());

            /* Check the event and take appropriate action depending on
               whether the event was a breakpoint hit, inferior exit,
               signal, etc. */
            handle_inferior_event();

            if (control is with GDB)
                break;
        }
    }

So GDB was forced to stay in this loop and process the events of only one inferior. The following GDB session shows this behaviour.

    (gdb) file a.out
    (gdb) r
    Continuing……

Making GDB asynchronous required a complete reorganization of the wait_for_inferior structure. The new wait_for_inferior now periodically queries the processors for events and reports these events to the interface or to a target vector. Here is the design for the new wait_for_inferior.

    /* Checks if any of the targets has raised any event or not */
    algorithm check_inferior()
    {
        store current_processor in temp_processor;

        for each processor temp in processor_chain
        {
            /* Usually there is a spurt of communication between GDB and
               gdbserver after long gaps. So we try to increase the
               responsiveness by getting all the events in one go, until
               the inferior times out. */
            while (timeout does not occur)
            {
                /* Set the GDB state to that of the "temp" processor */
                select_processor(temp);

                /* target_wait() returns an inferior process/thread id if
                   the inferior has raised an event. Else, if the inferior
                   is executing, target_wait() returns a "timeout". */
                ret_ptid = target_wait();

                if ret_ptid == valid ptid then
                {
                    /* GDB handles the event raised by the inferior, like
                       breakpoint hit, inferior started, inferior stopped */
                    handle_inferior_event();
                }
            }
        }
    }

4. ADVANTAGES OF A SINGLE SESSION OF GDB

A single session of GDB now needs to maintain the information of all the processors being debugged. Each processor may have a different architecture and ABI.
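As a rough illustration of the polling design of Section 3 combined with per-processor contexts, the following Python sketch drains pending events from each processor in turn until that processor "times out". Everything here is an assumption made for illustration: the `ProcessorContext` class, its fields, the simulated event queue standing in for the target, and the ABI labels are not taken from GDB's sources.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional, Tuple


@dataclass
class ProcessorContext:
    """Per-processor state kept by the single session (hypothetical)."""
    pid: int
    arch: str
    abi: str
    pending: Deque[str] = field(default_factory=deque)  # simulated events

    def target_wait(self) -> Optional[str]:
        """Return the next event, or None to model a timeout."""
        return self.pending.popleft() if self.pending else None


def check_inferior(chain: List[ProcessorContext]) -> List[Tuple[int, str]]:
    """Poll every processor once, draining all events it has queued."""
    handled: List[Tuple[int, str]] = []
    for proc in chain:
        while True:
            event = proc.target_wait()
            if event is None:       # timeout: move on to the next processor
                break
            handled.append((proc.pid, event))  # stands in for event handling
    return handled


chain = [
    ProcessorContext(pid=1, arch="arm", abi="illustrative-abi-a"),
    ProcessorContext(pid=2, arch="i386", abi="illustrative-abi-b"),
]
chain[0].pending.extend(["breakpoint-hit", "stopped"])
chain[1].pending.append("exited")

events = check_inferior(chain)
```

Because no processor is waited on indefinitely, the debugger returns to its command loop after each pass, which is what lets one session serve several targets.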
  • 4.

Hence a single session of GDB had to be configured for more than one architecture.

GDB's target architecture defines what sort of machine-language programs GDB can work with, and how it works with them. GDB provides a mechanism for handling variations in OS ABIs. There are two major components in the OS ABI mechanism: sniffers and handlers. A sniffer examines a file matching a BFD architecture/flavour pair in an attempt to determine the OS ABI of that file. But GDB can only sniff those architectures with which it is configured, so we had to make sure that GDB was configured with multiple architectures. We have added a new target "i386-arm"; when configured with this target, GDB can recognize i386 and arm binaries in any format (elf, coff, stabs, etc.).

[Fig 3. Single session of GDB debugging multiple processors: a single GDB session holds one context per processor, each with its own architecture and ABI. Processor 1 through Processor N have local memory and share memory, and the session has complete interprocessor control.]

Since a single session can now seamlessly interact with all the processors, inter-processor control is possible. This allows the programmer to work with the multi-core system as a whole instead of dealing with each processor individually. Hence our debugger has features like lockstep, barrier breakpoints, and continue/stop-all. It is possible to pause the entire system to examine the state of each core, preventing data from being processed (lost) while examining the state of the multi-core system.

5. FEATURES

The various features added to enable multiprocessor debugging can be enumerated as follows:

• Ability to maintain processor groups. This allows the programmer the flexibility to handle different processors as units. Processors can be added to or removed from groups at any time, thereby allowing the programmer to let some processors execute unaffected until a certain point and then include them in a group.

• The processors in a group can be run in lockstep. Instead of stepping individual processors separately, a group of processors can be stepped together, easing the job of the programmer.

• Barrier breakpoints help to debug synchronization problems. These breakpoints act as joins for the processors, as a processor in a barrier cannot proceed until all the processors have reached a specific point.

• We can also exercise inter-processor control by stopping all the processors in a group when one processor hits a breakpoint, or when a processor crashes.

• The programmer can also continue all processors in a group at the same time, or stop all the processors at almost the same time with very little skid. This is very useful because the programmer sometimes needs to stop all the processors instantaneously, which was not possible with separate sessions of GDB. Now the processors can be stopped with a single command and their state can be studied.
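The group, lockstep, and barrier features listed above can be sketched in a few lines of Python. This is only a model of the behaviour under stated assumptions: the `Processor`, `Group`, and `Barrier` classes, the `pc` counter used to represent a step, and the `hit` method are hypothetical and do not reflect the debugger's internal data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set, Tuple


@dataclass
class Processor:
    pid: int
    pc: int = 0            # simulated program counter
    running: bool = True

    def step(self) -> None:
        self.pc += 1


@dataclass
class Group:
    """A processor group that can be driven as a unit."""
    gid: int
    members: Dict[int, Processor] = field(default_factory=dict)

    def add(self, proc: Processor) -> None:
        self.members[proc.pid] = proc

    def remove(self, pid: int) -> None:
        self.members.pop(pid, None)

    def lockstep(self) -> None:
        """Step every member once, instead of stepping them separately."""
        for proc in self.members.values():
            proc.step()

    def stop_all(self) -> None:
        for proc in self.members.values():
            proc.running = False

    def continue_all(self) -> None:
        for proc in self.members.values():
            proc.running = True


class Barrier:
    """Barrier breakpoint over (processor id, breakpoint number) pairs:
    nobody proceeds until every listed pair has been reached."""

    def __init__(self, points: Iterable[Tuple[int, int]]) -> None:
        self.points: Set[Tuple[int, int]] = set(points)
        self.arrived: Set[Tuple[int, int]] = set()

    def hit(self, pid: int, bp_no: int) -> bool:
        """Record an arrival; return True once all parties have arrived."""
        if (pid, bp_no) in self.points:
            self.arrived.add((pid, bp_no))
        return self.arrived == self.points


group = Group(gid=1)
p1, p2 = Processor(1), Processor(2)
group.add(p1)
group.add(p2)
group.lockstep()
group.stop_all()

# Mirrors a barrier between {processor 1, breakpoint 2} and
# {processor 2, breakpoint 4}.
barrier = Barrier([(1, 2), (2, 4)])
first_release = barrier.hit(1, 2)
second_release = barrier.hit(2, 4)
```

In this model the first processor to arrive is held (`hit` returns `False`) and the barrier releases only when the last listed processor arrives.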
  • 5.

    NO  COMMAND               DESCRIPTION
    1   processor pid         Add a processor pid
    2   group gid             Add a group gid
    3   select-processor pid  Select processor pid as the current processor to work with
    4   select-group gid      Select processor group gid as the current group to work with
    5   lockstep              Lockstep multiple processors in the current group
    6   barrier 1.2, 2.4      Add a barrier breakpoint between {processor 1, breakpoint no. 2} and {processor 2, breakpoint no. 4}
    7   continue-all          Continue all processors
    8   stop-all              Stop all processors in the currently selected group
    9   info processors       Get the status of all the processors
    10  info groups           Get the status of the groups

    Table 1. List of multiprocessor commands added to GDB

6. ECLIPSE CDT GUI

The main issues which had to be resolved within Eclipse to enable multi-processor debugging can be listed as:

1. Multiple debug launches should be made to correspond to a single GDB session.
2. Information about which processor the user is currently debugging should be conveyed to GDB.
3. Additional commands, which are added to GDB as part of multi-processor GDB, have to be added on the Eclipse side also.

6.1 Single session of GDB

Eclipse in its original form opens a virtual terminal for a debug launch and then execs a GDB process on it. Each debug launch hence possesses a set of streams (error, log, input, output) which Eclipse taps for communication between Eclipse and GDB. Since Eclipse uses versions of MI to communicate with GDB, a debug launch also possesses an MI session.

In the multi-processor GDB, however, we have only a single GDB process. This GDB process is exec'ed on a virtual terminal, as originally, but multiple debug launches communicate with the same GDB process. Thus Eclipse should identify the presence of a primary session and should not exec another GDB process, but instead share the resources of the primary GDB process, and hence of its session. Eclipse will now have to communicate with a single GDB process, which implies sharing all the above-mentioned streams. The original and modified launch frameworks are shown in figures 4 and 5 respectively.

Eclipse implements the original model with two threads running concurrently:

    receiving thread: rxThread
    transmitting thread: txThread

[Fig 4. Normal Launch of Eclipse: in the original launch, Launch 1 and Launch 2 each send to and receive from their own GDB process, following the steps 1. Send command, 2. Wait, 3. Get output, 4. Process output, 5. Create events.]

This model has now been modified by implementing the framework as in figure 5, which runs three threads concurrently:
  • 6.

    receiving thread: rxThread
    transmitting thread: txThread
    multi-processor streamer: multiProcStreamer

[Fig 5. Multiprocessor Launch of Eclipse: in the multi-processor launch, Launch 1 and Launch 2 each follow the steps 1. Send command, 2. Wait, 3. Get output, 4. Process output, 5. Create events. Both send to a single GDB, and the multi-processor streamer distributes the output of GDB back to Proc 1 and Proc 2.]

The txThread checks for any commands to be sent to GDB and, if present, sends them to GDB.

The multiProcStreamer reads the output given by GDB line by line and sends it for processing to the appropriate rxThread. How the appropriate rxThread is chosen is explained in section 6.2.

The rxThread originally read the output of GDB itself, as the multiProcStreamer does now, and sent the output for processing, after which events were generated to be reflected to the user. Now the rxThread gets the output from the multiProcStreamer and sends it for processing.

Further, a processor with a unique ID is created with every launch and the corresponding MISession is given the same ID. Thus the rxThread, txThread and multiProcStreamer which belong to the MISession know the ID of the processor which they are enabling to debug.

6.2 Token Management

Every MI command is preceded by a number, the token, which indicates the number of the command. For effective multi-processor debugging, we have used this token to indicate the current processor or group which the user is debugging. In the GDB implementation the token is a 32-bit integer. Hence, in the original configuration, a total of 2^32 commands can be sent before the token wraps around.

The token format we propose and implement for multi-processor debugging is:

    Field       Bits
    g/p!        31
    p/g id      30 to 16
    cmd token   15 to 0

where:

    g/p!       if 1, the user is debugging a processor group gid;
               if 0, the user is debugging a processor pid.
    p/g id     id of the processor or group which is currently being debugged.
    cmd token  token representing the current command.

With this new configuration, the token wraps around every 2^16 commands. However, debugging remains unaffected by this, as Eclipse and GDB are both independent of a command once its response has been received.

As mentioned in the above section, the multiProcStreamer checks the token for its p/g id. The command response or output received by the multiProcStreamer is then forwarded to the rxThread which is part of the MISession whose id is the same as that in the GDB output.

6.3 Additional Commands

The following commands have been added for effective multi-processor debugging. Each maps to an MI command, as tabulated on the next slide.
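Returning to the token format of Section 6.2, the bit layout can be made concrete with a short Python sketch. It assumes, following the wrap-around at 2^16 commands, that the command counter occupies bits 15 to 0 and the processor/group id occupies bits 30 to 16. The function names `pack_token`, `unpack_token`, and `route` are hypothetical, and `route` only illustrates how a streamer might pick out the id from the decimal token that prefixes an MI output line.

```python
GROUP_BIT = 31        # g/p! flag: 1 = group, 0 = processor
ID_SHIFT = 16         # p/g id occupies bits 30 to 16
ID_MASK = 0x7FFF      # 15 bits
CMD_MASK = 0xFFFF     # cmd token occupies bits 15 to 0


def pack_token(is_group: bool, ident: int, cmd_counter: int) -> int:
    """Build a 32-bit token; the command counter wraps every 2**16."""
    if not 0 <= ident <= ID_MASK:
        raise ValueError("processor/group id does not fit in 15 bits")
    return (
        (int(is_group) << GROUP_BIT)
        | (ident << ID_SHIFT)
        | (cmd_counter & CMD_MASK)
    )


def unpack_token(token: int):
    """Return (is_group, ident, cmd_counter) for a packed token."""
    return (
        bool((token >> GROUP_BIT) & 1),
        (token >> ID_SHIFT) & ID_MASK,
        token & CMD_MASK,
    )


def route(line: str):
    """Split the leading decimal token off an MI output line.

    Returns (is_group, ident, rest_of_line), or None if the line
    carries no token.
    """
    digits = ""
    for ch in line:
        if ch in "0123456789":
            digits += ch
        else:
            break
    if not digits:
        return None
    is_group, ident, _ = unpack_token(int(digits))
    return is_group, ident, line[len(digits):]


token = pack_token(False, 2, 5)             # processor 2, command 5
wrapped = pack_token(True, 3, 0x10000)      # group 3, counter wraps to 0
destination = route(f"{token}^done")
```

Packing the id into the token means the reply from GDB carries its own routing information, so the streamer needs no separate table of outstanding commands.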
  • 7.

    Command                Corresponding MI Command
    Lockstep over          -exec-next(1)
    Lockstep into          -exec-step(1)
    Continue All           -exec-continue
    Barrier Insert         -barrier-insert(procs, lines)
    Barrier Remove         -barrier-remove(procs, lines)
    Add to Group           -group(proc, group)
    Processor              -processor(proc, group)
    Script                 -script(path)
    File Exec and Symbols  -file-exec-and-symbols(path)

7. MOTIVATING EXAMPLES

Most concurrent programming problems can be attributed to a lack of proper synchronization in the access of shared resources (for example, CPU and bus cycles, memory, and various devices). The problems are manifested in the form of data corruption, race conditions, deadlocks, stalls, and starvation.

[Fig 6. An example of multicore debugging: a single session of GDB connects over a serial link to the RISC processor/s and over a TCP/IP link to the DSP processor/s. The RISC and DSP processors communicate through IPC and shared memory.]

Consider the above example, which is prominently used in many embedded systems. Generally the RISC processors do the system management while the DSPs perform the calculation-intensive part [9]. The RISCs put the data to be processed in the shared memory and resume their processing. During this time the DSPs process the data and, upon completion, notify the RISC. There are generally shared IPC (Inter Processor Communication) libraries between the RISCs and DSPs [6]. The debugging of such concurrent environments can be better tackled by this debugger.

In order to showcase the utility of our multiprocessor debugger, we have created the following use case. We have created a .RAW to .PGM converter as a proof of concept. We have used QEMU [7] as a simulator for the RISC processor. We run a process under QEMU which does the file management and system administration. The DSP processor is simulated by the native x86 processor. The process which runs natively on x86 does the calculations required for the decoding. We have simulated inter-processor communication between the two processors by way of semaphores.

8. RELATED WORK

Each multiprocessor vendor has their own debugger, but these debuggers are proprietary. Cradle has their multiprocessor debugger called InspectorT [10], but it is designed to support only the features of Cradle's processors. Hence it cannot be used by any other entity working in the embedded systems field, and neither can it be extended. TotalView™ is a multiprocess debugger developed by Etnus [5]. It can be used for distributed concurrent debugging, but it is proprietary and expensive. Also, it cannot be used in heterogeneous environments.

ARM's Realview Developer Suite has a MULTI debugger which can simultaneously develop and debug applications on a system with multiple ARM cores, or an ARM core plus a DSP core, within the same debug environment. It is a proprietary product of ARM and does not include support for other processors. GDB, on the other hand, includes support for ARM, i386, MIPS, PowerPC, SPARC, and a lot more. Hence our debugger can be used out of the box for all these processors.

9. CONCLUSION

This paper presents the design and implementation of a heterogeneous multiprocessor debugger. The work is based on the paradigm of concurrent debugging. The paper also describes the extensions implemented in GDB and Eclipse to enable multiprocessor debugging. Sample implementations of the use cases for this debugger have shown its adequacy for debugging multiple processors. Our debugger provides a framework for seamless integration of new targets while simultaneously providing extensibility hitherto unavailable.
  • 8.

REFERENCES

[1] Stan Shebs. The Future of GDB. www.redhat.com/support/wpapers/cygnus/cygnus_gdb/
[2] www.gnu.org/softwares/gdb
[3] Daneil Shulz. "Thread Aware-Debugging"
[4] Amar Shan. "Heterogeneous Processing: a Strategy for Augmenting Moore's Law". www.linuxjournal.com
[5] http://www.etnus.com/
[6] DSP gateway
[7] Fabrice Bellard. "QEMU, a Fast and Portable Dynamic Translator". In USENIX Conference 2005
[8] www.eclipse.com
[9] http://focus.ti.com/omap/docs/omaphomepage.tsp
[10] www.cradle.com/products/rds_sdk.shtml