
B Compile


Published in: Technology

  1. 1. Section B Compilation Process
  2. 2. Optimization Control Philosophy <ul><li>Low optimization levels: </li></ul><ul><ul><li>Shorter compile time </li></ul></ul><ul><ul><li>Safer optimizations </li></ul></ul><ul><ul><li>Greater computational accuracy </li></ul></ul><ul><ul><li>Slower generated code </li></ul></ul><ul><li>High optimization levels: </li></ul><ul><ul><li>Longer compile time </li></ul></ul><ul><ul><li>More aggressive optimizations </li></ul></ul><ul><ul><li>Precision compromised for higher performance </li></ul></ul><ul><ul><li>Faster generated code </li></ul></ul><ul><li>Fine control over optimization via a multitude of options </li></ul><ul><ul><li>PathOpt2 can be a lot of help </li></ul></ul>
  3. 3. Optimization Flags vs. Phases Invoked <ul><li>-O0 (the default under -g) </li></ul><ul><ul><li>Front-end and code generator, all optimizations disabled </li></ul></ul><ul><li>-O1 </li></ul><ul><ul><li>Front-end and code generator, local optimizations only </li></ul></ul><ul><li>-O2 (the default) </li></ul><ul><ul><li>Add WOPT and rest of CG's optimizations </li></ul></ul><ul><li>-O3 </li></ul><ul><ul><li>Add LNO </li></ul></ul><ul><li>-ipa (can be any opt level) </li></ul><ul><ul><li>Add IPA </li></ul></ul><ul><li>-Ofast </li></ul><ul><ul><li>Same as -O3 -ipa -OPT:Ofast -fno-math-errno -ffast-math </li></ul></ul><ul><li>-OPT:Ofast is: </li></ul><ul><ul><li>-OPT:ro=2:Olimit=0:div_split=ON:alias=typed </li></ul></ul>
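The flag-to-phase mapping on this slide can be sketched as a small lookup. This is a hypothetical illustration in Python, not the driver's actual logic: the function names `expand_ofast` and `phases_added` are assumptions, and only the flag strings and phase names come from the slide.

```python
# Flag expansions exactly as listed on the slide.
OFAST = ["-O3", "-ipa", "-OPT:Ofast", "-fno-math-errno", "-ffast-math"]
OPT_OFAST = "-OPT:ro=2:Olimit=0:div_split=ON:alias=typed"


def expand_ofast(args):
    """Replace -Ofast, then -OPT:Ofast, with the flags they stand for."""
    expanded = []
    for arg in args:
        if arg == "-Ofast":
            expanded.extend(OFAST)
        else:
            expanded.append(arg)
    return [OPT_OFAST if arg == "-OPT:Ofast" else arg for arg in expanded]


def phases_added(level, ipa=False):
    """Optimizations layered on top of the front-end and code generator."""
    added = []
    if level >= 1:
        added.append("CG local optimizations")
    if level >= 2:
        added += ["WOPT", "remaining CG optimizations"]
    if level >= 3:
        added.append("LNO")
    if ipa:  # -ipa can be combined with any optimization level
        added.append("IPA")
    return added
```

Calling `phases_added(0)` returns an empty list, which matches the slide's statement that -O0 disables all optimizations.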
  4. 4. Option Groups <ul><li>Options organized into groups by compiler phase or by class of feature </li></ul><ul><li>General syntax: </li></ul><ul><ul><ul><li>-GROUPNAME:opt[=val]{:opt[=val]} </li></ul></ul></ul><ul><li>Some GNU-style flags map to these options </li></ul><ul><ul><ul><li>-march -ffast-math -ffloat-store -fno-inline </li></ul></ul></ul><ul><li>Group names: </li></ul><ul><ul><li>-LNO: Loop nest optimization </li></ul></ul><ul><ul><li>-WOPT: Global scalar optimization </li></ul></ul><ul><ul><li>-CG: Code generation </li></ul></ul><ul><ul><li>-LANG: Language features </li></ul></ul><ul><ul><li>-IPA: Inter-procedural analysis </li></ul></ul><ul><ul><li>-INLINE: Back-end inlining </li></ul></ul><ul><ul><li>-TENV: Target environment </li></ul></ul><ul><ul><li>-TARG: Target machine </li></ul></ul><ul><ul><li>-OPT: Optimizations </li></ul></ul><ul><ul><li>-LIST: User listing </li></ul></ul>
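To make the group syntax concrete, here is a minimal Python sketch that splits one group option into its group name and its option/value pairs. The function name `parse_group_option` is hypothetical and the real driver's parsing rules may differ (for example in how it treats abbreviations or repeated options).

```python
def parse_group_option(flag):
    """Split '-GROUP:opt[=val]:opt[=val]...' into (group, {opt: val}).

    Options given without '=val' map to None.
    """
    if not flag.startswith("-") or ":" not in flag:
        raise ValueError(f"not a group option: {flag!r}")
    group, _, rest = flag[1:].partition(":")
    options = {}
    for item in rest.split(":"):
        if not item:
            continue  # tolerate a trailing or doubled ':'
        name, sep, value = item.partition("=")
        options[name] = value if sep else None
    return group, options
```

For instance, `parse_group_option("-OPT:ro=2:Olimit=0")` yields the group `"OPT"` with `ro` set to `"2"` and `Olimit` set to `"0"`; values are kept as strings because the sketch does not know each option's type.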
  5. 5. Roles of the Compiler Driver <ul><li>Implemented in open64/driver </li></ul><ul><li>Handles all command line options </li></ul><ul><li>Invokes all compilation phases: </li></ul><ul><ul><li>Preprocessor </li></ul></ul><ul><ul><li>Front-end </li></ul></ul><ul><ul><li>Inliner </li></ul></ul><ul><ul><li>Backend (be, lno, wopt, cg) </li></ul></ul><ul><ul><li>Assembler </li></ul></ul><ul><ul><li>Linker </li></ul></ul><ul><li>Maintain compatibility with GNU options </li></ul>
  6. 6. Compiler Driver <ul><li>open64/driver/OPTIONS </li></ul><ul><ul><li>Table of option specifications </li></ul></ul><ul><ul><li>Can map an option to a different option </li></ul></ul><ul><li>Single executable, multiple soft links </li></ul><ul><li>argv[0] string used to identify the language </li></ul><ul><li>Queries compiler-relevant environment variables </li></ul><ul><li>Queries the host processor under -march=auto (default) </li></ul><ul><li>Looks up the compiler.defaults file for system-specific options </li></ul>
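The "single executable, multiple soft links" design means the driver decides which language to compile from the name it was invoked under. A rough Python sketch follows; the link names in the table are assumptions chosen for illustration, and the real mapping lives in the driver sources.

```python
import os

# Hypothetical link names, for illustration only.
LANGUAGE_BY_NAME = {
    "opencc": "C",
    "openCC": "C++",
    "openf90": "Fortran",
    "openf95": "Fortran",
}


def language_from_argv0(argv0):
    """Return the language implied by the invoked name, or None if unknown."""
    return LANGUAGE_BY_NAME.get(os.path.basename(argv0))
```

Because only the basename is inspected, the same binary behaves identically whether it is reached through a relative path, an absolute path, or a symbolic link in another directory.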
  7. 7. C/C++ Front-end History <ul><li>GNU 2.95 when open-sourced (2000) </li></ul><ul><ul><li>Direct translation from GNU internal trees to WHIRL </li></ul></ul><ul><ul><li>Separate C and C++ front-ends embedded inside Open64 </li></ul></ul><ul><li>Updated to GNU 3.3.1 (2004) </li></ul><ul><li>Defined .spin file format as virtual machine target (2006) </li></ul><ul><ul><li>GNU compiler no longer maintained as part of Open64 </li></ul></ul><ul><ul><li>Streamlined efforts for updating to each GNU release </li></ul></ul><ul><ul><li>Duplicate code between C and C++ eliminated </li></ul></ul><ul><li>GNU 4.0.2 front-ends shipped March 2007 </li></ul><ul><li>GNU 4.2.0 front-ends shipped October 2007 </li></ul><ul><li>Path to additional GNU languages in future </li></ul>
  8. 8. Using GNU Compiler as Front-end <ul><li>Start with GNU compiler configured for X86-64 </li></ul><ul><li>Old Way : </li></ul><ul><li>open64/kgccfe for C </li></ul><ul><li>open64/kg++fe for C++ </li></ul><ul><li>Calls for WHIRL generation embedded in GNU code </li></ul><ul><li>C++ requires running entire compilation to assembly to produce complete translation data </li></ul><ul><li>Duplicate source trees between C and C++ </li></ul>
  9. 9. Using GNU Compiler as Front-end <ul><li>Start with GNU compiler configured for X86-64 </li></ul><ul><li>New Way : </li></ul><ul><li>gspin tree nodes – components to model GNU trees </li></ul><ul><ul><li>Utilities implemented in libspin repository </li></ul></ul><ul><ul><li>gspin tree nodes dumped out to .spin file </li></ul></ul><ul><li>Identify points to intercept GNU trees in gcc’s compilation </li></ul><ul><li>gspin tree nodes generated from GNU trees in gcc/tree.c </li></ul><ul><li>open64/wgen translates gspin tree nodes (.spin file) to WHIRL nodes (.B file) </li></ul><ul><ul><li>wgen's mode of operation modeled after kgccfe/kg++fe </li></ul></ul>
  10. 10. Gspin Tree Nodes <ul><li>Purpose: encode complete information in GNU trees for dumping to .spin file </li></ul><ul><li>8-byte sized gspin node as atomic building block </li></ul><ul><ul><li>defined in libspin/gspin-tree.h </li></ul></ul><ul><ul><li>represents a field of information in GNU's tree node </li></ul></ul><ul><ul><li>aggregate of contiguous gspin nodes represents a GNU tree node </li></ul></ul><ul><ul><li>representation scheme defined in libspin/gspin-tel.h </li></ul></ul><ul><li>Allocation of gspin nodes managed by libspin </li></ul><ul><li>I/O of gspin nodes via mmap() </li></ul><ul><li>ASCII dumper </li></ul><ul><ul><li>Each node only dumped once to avoid infinite recursion </li></ul></ul>
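The "dump each node once" rule for the ASCII dumper matters because tree nodes can refer back to each other, so a naive recursive dump would never terminate. The Python sketch below shows the general visited-set technique only; the `Node` class and `dump` function are hypothetical and do not reflect libspin's data structures.

```python
class Node:
    """Hypothetical stand-in for a tree node with outgoing references."""

    def __init__(self, label):
        self.label = label
        self.kids = []


def dump(node, seen=None, depth=0, out=None):
    """Return indented dump lines, expanding each node only the first time."""
    seen = set() if seen is None else seen
    out = [] if out is None else out
    indent = "  " * depth
    if id(node) in seen:
        # Already expanded elsewhere: emit a back-reference and stop.
        out.append(f"{indent}{node.label} (already dumped)")
        return out
    seen.add(id(node))
    out.append(f"{indent}{node.label}")
    for kid in node.kids:
        dump(kid, seen, depth + 1, out)
    return out
```

With two nodes that point at each other, the dump stops at the first repeated node instead of recursing forever.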
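  11. 11. Example of gspin nodes <ul><li>GNU’s PLUS tree node has 14 fields </li></ul><ul><li>Root gspin node to encode PLUS </li></ul><ul><li>13 more gspin nodes for rest of fields </li></ul><ul><li>Root: tree_code = PLUS </li></ul><ul><ul><li>0: tree_code_class = BINARY </li></ul></ul><ul><ul><li>1: tree_type </li></ul></ul><ul><ul><li>2: tree_chain = NULL </li></ul></ul><ul><ul><li>3: flags </li></ul></ul><ul><ul><li>4: arity = 13 </li></ul></ul><ul><ul><li>5: file name </li></ul></ul><ul><ul><li>6: line no </li></ul></ul><ul><ul><li>7: operand 0 </li></ul></ul><ul><ul><li>8: operand 1 </li></ul></ul><ul><ul><li>9: unused </li></ul></ul><ul><ul><li>10: unused </li></ul></ul><ul><ul><li>11: unused </li></ul></ul><ul><ul><li>12: unused </li></ul></ul>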
  12. 12. FORTRAN Front-end <ul><li>Originated from the Cray Fortran compiler </li></ul><ul><li>Front-end consists of: </li></ul><ul><li>Cray Fortran front-end (crayf90/fe90) </li></ul><ul><li>Adaptor for WHIRL generation (crayf90/sgi) </li></ul><ul><li>Multiple run-time library directories </li></ul><ul><li>libF77, libI77, libU77, libfi, libf, libu </li></ul><ul><li> contains all of these </li></ul><ul><li>Numerous bug fixes and enhancements at PathScale </li></ul><ul><li>TR15580 and TR15581 </li></ul>
  13. 13. FORTRAN Front-end Implementation <ul><li>Three sub-phases in fe90: </li></ul><ul><li>1. Lexer (src_input.c, lex.c) and Parser (p_*.c) </li></ul><ul><li>Program is represented in tree form </li></ul><ul><li>Tree nodes are entries in a variety of tables (sytb.*): </li></ul><ul><li>scp-tbl for scopes </li></ul><ul><li>SH_Tbl for statement headers </li></ul><ul><li>global_name_tbl and name_tbl for Fortran identifiers </li></ul><ul><li>AT_Tbl for attribute nodes for variables and procedures </li></ul><ul><li>CN_Tbl for constant values </li></ul><ul><li>IR_Tbl for operator nodes </li></ul><ul><li>IL_Tbl for list linking nodes </li></ul><ul><li>More for array bounds, file names, etc </li></ul>
  14. 14. FORTRAN Front-end Implementation <ul><li>2. Semantic pass (s_*.c) </li></ul><ul><li>Operations that could not be performed on-the-fly during parsing </li></ul><ul><li>3. WHIRL generation </li></ul><ul><li>cvrt_to_pdg() and send*() (in icvrt.c) traverse trees and symbol tables </li></ul><ul><li>They call routines in crayf90/sgi to generate WHIRL </li></ul><ul><li>To debug: </li></ul><ul><li>Build with -D_DEBUG </li></ul><ul><li>Run mfef95 with: </li></ul><ul><li>-uall (dump all tables) </li></ul><ul><li>-uir2 (dump the most frequently useful tables) </li></ul><ul><li>Routines in fe90/debug.c can be called when running debugger on mfef95 </li></ul>
  15. 15. Goto Conversion <ul><li>Converts loops written with gotos into high-level loop forms that are friendly to LNO </li></ul><ul><li>Based on paper by Ana Erosa and Laurie Hendren </li></ul><ul><li>Originally applied once before LNO (be/com/opt_goto.cxx) </li></ul><ul><li>GNU 4.2 front-end no longer generates high-level loop constructs </li></ul><ul><li>Added new phase to cater to VH WHIRL at beginning of backend (be/be/goto_conv.cxx) </li></ul>
  16. 16. Very High WHIRL Optimizer <ul><li>Lower to High WHIRL while performing optimizations </li></ul><ul><li>First part deals with common language constructs (be/vho/vho_lower.cxx): </li></ul><ul><li>Bit-field optimizations </li></ul><ul><li>Short-circuit boolean expressions </li></ul><ul><li>Switch statement optimization </li></ul><ul><li>Simple if-conversion </li></ul><ul><li>Assignments of small structs: lower struct copy to assignments of individual fields </li></ul><ul><li>Convert patterns of code sequences to intrinsics: </li></ul><ul><ul><li>Saturated subtract, abs() </li></ul></ul><ul><li>Other pattern-based optimizations </li></ul><ul><ul><li>max, min </li></ul></ul>
  17. 17. Very High WHIRL Optimizer <ul><li>Second part generates efficient code from FORTRAN 90 constructs (be/vho/f90_lower.cxx): </li></ul><ul><li>array section operations expanded to loops </li></ul><ul><li>introduce array temporaries in order to preserve parallel semantics </li></ul><ul><li>A(1:n) = A(B(1:n)) </li></ul><ul><ul><li>expands to </li></ul></ul><ul><li>do i = 1, n </li></ul><ul><li>t(i) = A(B(i)) </li></ul><ul><li>enddo </li></ul><ul><li>do i = 1, n </li></ul><ul><li>A(i) = t(i) </li></ul><ul><li>enddo </li></ul>
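The array temporary is needed because Fortran 90 semantics require the whole right-hand side to be evaluated before any element of the left-hand side is stored. The following Python sketch (0-based indices, hypothetical function names) contrasts the two-loop expansion from the slide with a naive single loop that reads elements it has already overwritten.

```python
def assign_with_temporary(a, b):
    """A(1:n) = A(B(1:n)) as expanded on the slide, using a temporary t."""
    t = [a[j] for j in b]        # first loop:  t(i) = A(B(i))
    for i in range(len(b)):      # second loop: A(i) = t(i)
        a[i] = t[i]
    return a


def assign_in_place(a, b):
    """Naive single loop; later reads may see already-overwritten elements."""
    for i in range(len(b)):
        a[i] = a[b[i]]
    return a
```

With `a = [10, 20, 30]` and `b = [2, 0, 1]`, the version with the temporary produces the expected permutation, while the in-place loop smears the first stored value across the array.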
  18. 18. Lowering <ul><li>All lowering after VHO is performed by calling wn_lower() (be/com/wn_lower.cxx) </li></ul><ul><li>Each bit in the LOWER_ACTIONS parameter controls one class of lowering </li></ul><ul><li>Recursively walks the tree and applies the lowering relevant to each node </li></ul><ul><li>Mostly simple tree transformations </li></ul>
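The bitmask-driven recursive walk can be sketched as follows. This is an illustration of the technique in Python over tuple-based expression trees, not the code in wn_lower.cxx: the action names `MINMAX` and `ABS`, the `select` operator, and the tree encoding are all assumptions.

```python
from enum import IntFlag


class LowerActions(IntFlag):
    """Hypothetical action bits; each enables one class of lowering."""
    MINMAX = 1
    ABS = 2


def lower(tree, actions):
    """Recursively lower a tree of the form (op, kid, ...) or a leaf."""
    if not isinstance(tree, tuple):
        return tree  # leaf: variable name or constant
    op, *kids = tree
    kids = [lower(kid, actions) for kid in kids]
    if op == "max" and actions & LowerActions.MINMAX:
        a, b = kids
        return ("select", (">", a, b), a, b)
    if op == "abs" and actions & LowerActions.ABS:
        (a,) = kids
        return ("select", ("<", a, 0), ("neg", a), a)
    return (op, *kids)
```

Passing `LowerActions.MINMAX | LowerActions.ABS` enables both transformations in one walk, while passing a single bit leaves the other operator untouched for a later call.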
  19. 19. WHIRL Simplifier <ul><li>Simplify a WHIRL tree to a more efficient form </li></ul><ul><li>Implemented in common/com/wn_simp_code.h </li></ul><ul><li>Node types mapped by cpp to either wn or coderep when invoked from wopt </li></ul><ul><li>(should have used C++ template) </li></ul><ul><li>Evaluate constant expressions to constants </li></ul><ul><li>Used by front-ends to handle constant expressions in declarations </li></ul><ul><li>Automatically called during WHIRL tree generation </li></ul><ul><li>The cheapest optimization </li></ul><ul><li>Should be called whenever transformation occurs </li></ul>
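As a rough picture of what such a simplifier does, the Python sketch below folds constant subexpressions and applies two algebraic identities on tuple-based binary trees. It is a toy under stated assumptions (integer constants, three operators, a hypothetical `simplify` function) and covers far fewer cases than the rules in wn_simp_code.h.

```python
import operator

# Operators this toy simplifier knows how to fold.
FOLD = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def simplify(tree):
    """Simplify a binary tree (op, lhs, rhs); leaves are ints or names."""
    if not isinstance(tree, tuple):
        return tree
    op, lhs, rhs = tree
    lhs, rhs = simplify(lhs), simplify(rhs)
    if isinstance(lhs, int) and isinstance(rhs, int):
        return FOLD[op](lhs, rhs)   # constant folding
    if op == "+" and rhs == 0:
        return lhs                  # x + 0 => x
    if op == "*" and rhs == 1:
        return lhs                  # x * 1 => x
    return (op, lhs, rhs)
```

Because children are simplified before their parent, a fold deep in the tree can expose a further simplification higher up, which is why calling the simplifier after every transformation is cheap and worthwhile.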
  20. 20. Linkage Convention <ul><li>Implemented in common/com/x8664/targ_sim.cxx </li></ul><ul><li>Called from the lowerer </li></ul><ul><li>Controls: </li></ul><ul><li>How parameters of different types are passed </li></ul><ul><li>How function return values of different types are returned </li></ul><ul><li>Fake parameters for return structs introduced by lowerer </li></ul>
  21. 21. Data Layout <ul><li>Refers to how program variables are allocated in memory </li></ul><ul><li>Program variables remain discrete until laid out in memory </li></ul><ul><li>Optimization opportunities arise from: </li></ul><ul><li>Alignment </li></ul><ul><li>Locality of references </li></ul><ul><li>Strategy: delay until benefits seen for certain relative positioning </li></ul>
  22. 22. Data Layout Mechanism <ul><li>Designed so it can occur continuously throughout optimization and compilation </li></ul><ul><li>Happens during: </li></ul><ul><li>IPA (common block splitting and padding) </li></ul><ul><li>LNO (enforcing alignment) </li></ul><ul><li>Hierarchical layout representation: for each symbol, </li></ul><ul><li>ST_base : symbol relative to which it is allocated (originally set to itself) </li></ul><ul><li>ST_ofst : position of symbol in ST_base 's block </li></ul><ul><li>A symbol is laid out by setting its ST_base and ST_ofst fields </li></ul><ul><li>Symbol ST_base itself may not be laid out until later </li></ul>
  23. 23. Data Layout for Stack Frame <ul><li>Implemented in be/com/data_layout.cxx </li></ul><ul><li>Segments for: </li></ul><ul><li>Formal parameters </li></ul><ul><li>Fixed temporaries for Fortran alternate entry parameters </li></ul><ul><li>Actual (outgoing) parameters </li></ul><ul><li>Locals (user or compiler-generated) </li></ul><ul><li>Different stack models: </li></ul><ul><li>Small ($sp only) </li></ul><ul><li>Large ($fp and $sp) </li></ul><ul><li>Dynamic ($fp and $sp) </li></ul><ul><li>Stack frame finalized at end of code generation </li></ul><ul><li>Final resolution of ST_base either $sp or $fp </li></ul>
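  24. 24. Data Layout Example <ul><li>Variable A is 32 bytes off $sp </li></ul><ul><li>See Base_Symbol_And_Offset() in common/com/symtab.cxx </li></ul><ul><li>Layout chain: </li></ul><ul><ul><li>A: base = B, offset = 12 </li></ul></ul><ul><ul><li>B: base = C, offset = 20 </li></ul></ul><ul><ul><li>C: base = $sp, offset = 0 </li></ul></ul>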
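The 32-byte figure comes from following the base chain and summing offsets: 12 + 20 + 0. A minimal Python sketch of that resolution is shown below. The `Symbol` class and `base_symbol_and_offset` function are hypothetical models of the ST_base/ST_ofst scheme described in the slides, not the actual routine in symtab.cxx.

```python
class Symbol:
    """Hypothetical symbol with the two layout fields from the slides."""

    def __init__(self, name, base=None, ofst=0):
        self.name = name
        # A symbol that has not been laid out is its own base.
        self.base = base if base is not None else self
        self.ofst = ofst


def base_symbol_and_offset(sym):
    """Follow the base chain to its root, accumulating offsets."""
    offset = 0
    while sym.base is not sym:
        offset += sym.ofst
        sym = sym.base
    return sym, offset


sp = Symbol("$sp")
c = Symbol("C", base=sp, ofst=0)
b = Symbol("B", base=c, ofst=20)
a = Symbol("A", base=b, ofst=12)
root, total = base_symbol_and_offset(a)  # root is $sp, total is 32
```

Laying out C later, for example at a nonzero offset from `$sp`, changes the resolved offset of both A and B without touching their own fields, which is what allows layout decisions to be deferred.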