Slide 1 - hArtes Web Site - Home



  1. hArtes Toolchain. 18 September 2009. Andrea Michelotti, WP2 leader
  2. Agenda
     • hArtes toolchain installation
       • Q/A
     • hArtes toolchain flow
       • Overview
       • Partitioning
       • Mapping
       • Synthesis
     • hArtes projects
       • Build an hArtes project
       • Build examples
     • hArtes API
       • Memory allocation
       • Benchmarking
       • Thread
       • OpenMP
     • hArtes hands-on
  3. Ready? Requirements
     • Target board (DEB)
       • Ethernet cable connected to the host PC (required to download and debug the application)
       • Serial cable connected to the DBGU (USART), to control RT applications (optional if the network works)
       • SD card with Linux EABI (hArtes uses Linux 2.6.24-rt3)
     • Host PC, needed for hArtes cross-compilation
       • Ubuntu/Debian distribution, or its virtualization via our hArtes VirtualBox image (hArtes_vbox_hdd.vdi.bz2)
       • hArtes packages
     • file:///home/michelo/progetti/hArtes/SW/GNAM/doc/html/Installation.html
  4. hArtes toolchain overview
     • Objective: take a C project and map it to a platform composed of heterogeneous Processing Elements (i.e. GPP, DSP, FPGA).
     • Obtain a speed-up with respect to executing on the GPP alone.
  5. hArtes toolchain overview
     • The toolchain is composed of:
       • Partitioner (zebu)
       • Mapper (hArmonic)
       • GPP C compiler (HGCC)
       • DSP C compiler (chesscc)
       • FPGA C compiler (dwarv + xilinx)
       • Linker and binutils (LD, mex2elf..)
       • Debugger (HGDB)
     • The partitioner, mapper and compilers exchange annotated C files plus XML annotations.
     • These tools run on the host PC.
     • The toolchain produces a single ELF executable image for the target board.
  6. Partitioning
     • Groups operations (functions, loops...) into tasks (C functions) in order to exploit the parallelism of the application.
     • Inputs:
       • C source files, source annotations, XML file
     • Outputs:
       • C source files, partitioning and mapping annotations, XML file
       • Parallelism expressed as OpenMP-compliant C code
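
The slides do not show the code that zebu actually emits, so the following is only an illustrative sketch of what "parallelism expressed as OpenMP-compliant C code" can look like: two independent tasks, each a plain C function, placed in OpenMP sections. The function names (`task_scale`, `task_offset`, `run_partitioned`) are hypothetical and do not come from the hArtes toolchain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task 1: writes 2 * in[i] into its own output array. */
static void task_scale(const int *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = 2 * in[i];
}

/* Hypothetical task 2: writes in[i] + 1 into its own output array. */
static void task_offset(const int *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = in[i] + 1;
}

/* The two tasks share only read-only input, so they may run concurrently.
   Compiled without OpenMP support, the pragmas are ignored and the tasks
   simply run one after the other with the same result. */
void run_partitioned(const int *in, int *scaled, int *shifted, size_t n)
{
    assert(in != NULL && scaled != NULL && shifted != NULL);

    #pragma omp parallel sections
    {
        #pragma omp section
        task_scale(in, scaled, n);

        #pragma omp section
        task_offset(in, shifted, n);
    }
}
```

Keeping each task in its own function mirrors the slide's description of tasks as C functions, which is also the granularity at which the later mapping step assigns work to a Processing Element.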
  7. Mapping
     • Generates optimized versions (implementations) of tasks for the different Processing Elements.
     • Determines the cost of each task when executed on a particular Processing Element.
     • Assigns (schedules) each task to a PE.
     • Inputs:
       • C source files, source annotations, XML file
     • Outputs:
       • C source files for each PE plus mapping annotations, XML file
  8. Synthesis
     • Synthesis tools include compilers, runtime, libraries and utilities for the PEs (GPP, DSP, FPGA).
     • The hArtes framework uses them to generate a single executable ELF image for the HW platform.
     • Synthesis tools offer a "single processor abstraction" to the user. The application flow is executed on the GPP. The DSP and FPGA are used transparently to execute some GPP functions (like co-processors) and so reduce the overall execution time.
     • [Figure: under MAKE, GPP source(s) + annotations go to the GPP compiler (HGCC), DSP source(s) to the DSP compiler (chesscc), and FPGA source(s) to the FPGA compiler (C2VHDL); LINK produces the unified ELF executable.]
  9. GPP C compiler: HGCC
     • HGCC is the hArtes customization of the ARM GCC compiler. It produces binaries for the ARM.
     • It also produces the code needed to perform a "remote" function call on the DSP or FPGA. This code is what we call "stitch code" or "molen APIs".
     • The stitch code passes the parameters to the remote function, initializes and starts it on the PE, synchronizes (waits for the remote function to terminate), and retrieves the results. THIS OVERHEAD MUST BE COMPENSATED BY A FASTER EXECUTION.
     • [Figure: ARM main() calls ARM func0 and ARM func1 locally, and reaches DSP func3 and FPGA func2 through remote calls that carry stitch code and overhead.]
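
The four steps performed by the stitch code can be sketched as below. This is a host-side simulation under stated assumptions, not the real molen API: the `fake_pe` structure, `fake_pe_start` and `stitch_call` are hypothetical names, and the "remote" function is just an addition executed locally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a Processing Element; not the real hArtes runtime. */
typedef struct {
    int  args[2];  /* parameter exchange area            */
    bool done;     /* set when the remote function ends  */
    int  result;   /* value left behind by the PE        */
} fake_pe;

/* Stand-in for the function that would run on the DSP or FPGA. */
static void fake_pe_start(fake_pe *pe)
{
    pe->result = pe->args[0] + pe->args[1];
    pe->done = true;
}

/* Conceptual stitch wrapper for a remote `int func(int a, int b)`. */
int stitch_call(fake_pe *pe, int a, int b)
{
    assert(pe != NULL);

    pe->args[0] = a;          /* 1. pass the parameters to the PE        */
    pe->args[1] = b;
    pe->done = false;
    fake_pe_start(pe);        /* 2. initialize and start the function    */
    while (!pe->done) { }     /* 3. synchronize: wait for termination    */
    return pe->result;        /* 4. retrieve the result                  */
}
```

Every step in `stitch_call` costs time on the GPP, which is the overhead the slide warns about: offloading is only worthwhile when the PE runs the function fast enough to pay for it.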
  10. HGCC mapping pragmas
     • HGCC outputs stitch code for the GPP functions marked with "mapping annotations" such as: #pragma map call_hw <component id> [<implementation id>]. The component id identifies the targeted component. The implementation id identifies a particular implementation (many implementations of the same function can be generated).
     • Example from the slide:

         #pragma map call_hw dsp
         void func3();
         #pragma map call_hw fpga
         void func2();

         /* MY APPLICATION */
         main() {
             func1(); /* on ARM */
             func2(); /* on FPGA */
             func0(); /* on ARM */
             func3(); /* on DSP */
         }

     • [Figure: execution time. NO MAP: func1, func2, func0 and func3 all run on the ARM. MAPPED: ARM func1, FPGA func2, ARM func0, DSP func3, with overhead added to each remote call. Today only sequential gain.]
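
Since mapped tasks currently run one after the other, the gain from mapping a single function reduces to a simple comparison: the time on the PE plus the stitch-code overhead must be lower than the time on the GPP. A minimal sketch of that check follows; the function names and the idea of passing measured times as arguments are assumptions made for illustration, not part of the hArtes API.

```c
#include <assert.h>

/* Time of one task when offloaded: PE execution plus stitch-code overhead.
   All times must use the same unit (for example milliseconds). */
double offloaded_time(double t_pe, double t_overhead)
{
    assert(t_pe >= 0.0 && t_overhead >= 0.0);
    return t_pe + t_overhead;
}

/* Returns 1 when mapping the task to the PE beats running it on the GPP. */
int mapping_pays_off(double t_gpp, double t_pe, double t_overhead)
{
    return offloaded_time(t_pe, t_overhead) < t_gpp;
}
```

For instance, with made-up figures, a task taking 10 units on the GPP and 4 on a PE still pays off with 2 units of overhead, whereas one taking 9 units on the PE does not.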
  11. DSP C compiler
     • The hArtes toolchain uses the command-line interface of the DSP C compiler/linker/librarian: chesscc.
     • The DSP disassembler can be called with: darts -d <magic image>
     • Documentation is in <TARGETDIRINSTALL>/magic-07Q2.3.hartes/doc/manuals/
  12. GPP binutils: disassembler, librarian, debugger, linker and scripts
     • The GPP disassembler: arm-softfloat-linux-gnueabi-objdump -d <elf image>
     • The GPP librarian, used to make libraries: arm-softfloat-linux-gnueabi-ar
     • The GPP linker, which links all the hArtes objects produced to build a single ARM ELF Linux executable: arm-softfloat-linux-gnueabi-ld
     • The hArtes debugger: arm-elf-gdb
     • An hArtes linker script defines the memory areas where the text, bss and data sections of all PEs are allocated. It also defines the shared sections (used to allocate data in shared memories). It can be found in <hartes framework>/lib/target_dek_linux.ld.
  13. Debugging
     • IT'S HARD, HARD WORK! The architecture is not homogeneous, and there are different compiler chains (different ABIs, different debug information).
     • First release of an integrated debugger: mid-September 2009.
     • I use the command-line arm-elf-gdb on the host plus gdbserver on the HW board to debug the hArtes application.
     • We plan to offer more integrated debugging capabilities in Eclipse.
  14. hArtes runtime
     • The hArtes runtime is in <hartesframework>/lib/libhartes.a
     • It contains the code needed to map the HW resources (shared memories, DSP/FPGA I/O spaces).
     • It contains the molen APIs for DSP and FPGA used by HGCC to make transparent remote calls.
     • It contains the code to load PE sections.
     • It contains APIs and macros to access HW resources (that you shouldn't use).
     • It will contain APIs to support hArtes memory allocation and thread creation.
  15. Supported programming models
     • The current programming model is co-processor, where tasks are serial.
     • The next hArtes optimization is automatic task parallelization, where tasks can be overlapped.
     • Explicit thread creation is also supported via hthread_create, hthread_join..
     • [Figure: execution time. NO MAP: func1, func2, func0 and func3 all run on the ARM. SERIAL MAPPED: ARM func1, FPGA func2, ARM func0, DSP func3 run one after the other, with overhead on each remote call. PARALLEL MAPPED: the same tasks with the FPGA and DSP functions overlapped with ARM execution.]
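
The slides name hthread_create and hthread_join but do not give their signatures, so the sketch below uses POSIX `pthread_create` and `pthread_join` as a stand-in to show the overlapped model: one task runs in a second thread while the caller works on another. Treat the correspondence with the hthread calls as an assumption; the `job`, `sum_job` and `overlapped_sum` names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* One unit of work: sum `n` integers starting at `data`. */
typedef struct {
    const int *data;
    size_t     n;
    long       sum;
} job;

static void *sum_job(void *arg)
{
    job *j = arg;
    j->sum = 0;
    for (size_t i = 0; i < j->n; ++i)
        j->sum += j->data[i];
    return NULL;
}

/* Sums the two halves of `data` as overlapped tasks and returns the total.
   If the thread cannot be created, both halves are summed serially. */
long overlapped_sum(const int *data, size_t n)
{
    assert(data != NULL);

    size_t half = n / 2;
    job lo = { data, half, 0 };
    job hi = { data + half, n - half, 0 };

    pthread_t worker;
    int started = (pthread_create(&worker, NULL, sum_job, &hi) == 0);
    if (!started)
        sum_job(&hi);            /* serial fallback                      */

    sum_job(&lo);                /* caller's task overlaps the worker    */

    if (started)
        pthread_join(worker, NULL);  /* wait for the overlapped task     */

    return lo.sum + hi.sum;
}
```

The join plays the same role as the synchronization step of the stitch code: results from the overlapped task are only read after it has terminated.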
  16. Agenda (hands-on)
     • Example flow
       • Example application compiled only for ARM
         • makefile
         • looking at the outputs
         • debugging
         • HW profiling
       • Example application fully partitioned and mapped automatically
         • makefile
         • looking at the outputs
         • HW profiling
       • Example application manually mapped
         • makefile
         • debugging
         • HW profiling
       • Contributing
  17. Looking at the outputs (synthesis tools)
     • release/<example>_dek_linux.magic is the image generated by the CHESSDE tools (DSP tools). It is the entry point for other DSP tools such as the debugger or the disassembler.
     • release/<example> is the MAP file generated by the DSP linker; it lists function and variable allocation.
     • release/<example>_dek_linux.magic.elf is the conversion of release/<example>_dek_linux.magic into an ARM ELF object that can be linked with ARM objects.
     • release/<example>_dek_linux.elf is the final hArtes executable image that can be uploaded to the HW target.
     • release/<example> is the MAP file generated by the hArtes linker, associated with the hArtes executable image.
  18. Looking at the outputs (DSE tools)
     • release/partitioned/main_partitioned.c is the output of the partitioning tool (zebu). Recall that zebu processes the files listed in DSECOMPILESRC.
     • release/mapped/ARM.c and release/mapped/MAGIC.c are the outputs of the mapping tools. ARM.c is processed by HGCC, while MAGIC.c is processed by the CHESSCC compiler (DSP compiler). The mapping tools generate one file for each PE involved in the mapping.
     • hartes.xml contains the list of implementations generated by the mapping tools. It is the updated version of <hartesframework>/lib/dek_arch.xml (copied into <project>/<project>.xml), which describes an "ARM+DSP+FPGA" architecture.
  19. Contributing to make the toolchain more robust and efficient
     • Your contribution is important.
     • We ask you that:
     • Your application compiles and works on the platform using only the ARM (real-time applications must be downgraded), because we need a reference:
       • to benchmark the hArtes toolchain speed-up (if any);
       • to help debug an application that stops working after passing through the hArtes toolchain.
     • You know how to improve the performance of your application. For instance, you could provide us with:
       • a manually mapped version of your application;
       • hints on what the compilation chain could do to map your application more efficiently;
       • requests for new features.
  20. Debugging session
     • The MAGIC debugger can stop MAGIC execution at any moment and show you the current position in the code.
     • To start an ARM+MAGIC debug session, start the MAGIC debugger first and type the command gbon on its command line. Then start the application on the HW, with or without gdbserver localhost:9000 <appname>, depending on whether you also want to debug the ARM side.
     • The application blocks before reaching its main(), allowing the ARM and MAGIC debuggers to set breakpoints in the application.
     • Then type gboff to let ARM and MAGIC run.