A Intro


Published in: Technology

  1. Section A: Introduction, Overview and IR
  2. Historical Perspectives
     [Timeline figure; only its labels survive]
     • Years shown: 1980-83, 1987, 1991, 1989, 1994, 1997, 2000
     • Milestones: global optimization under -O2; loop optimization under -O3; software pipelining; floating-point performance
     • External influences: Stanford SUIF, Rice IPA
     • Compiler lineage: Stanford RISC compiler research; MIPS Ucode Compiler (R2000); MIPS Ucode Compiler (R4000); Cydrome Cydra5 Compiler; SGI Ragnarok Compiler (R8000); SGI MIPSpro Compiler (R10000); Pro64/Open64 Compiler (Itanium)
  3. Open64 Key Events
     • 1994: Development started, targeting the MIPS R10000
     • 1996: SGI MIPSpro compiler shipped
     • 1998: Retargeting to Itanium started; front-ends changed to gcc/g++
     • 2000: Pro64 compiler open-sourced under the GPL
     • 2001: SGI dropped support; UDel renamed the compiler to Open64
     • 2001-2004: ORC project funded by Intel for Itanium
     • 2003: PathScale started work on retargeting to X86
     • 2004: PathScale X86 compiler shipped in April
  4. Important Attributes
     • Most modern design among today's production compilers
     • Infrastructure designed to facilitate implementing optimizations
     • In production since 1996, open-sourced in 2000
     • Compatible and interoperable with gcc/g++
       • Source constructs, command line, ABI, linking
     • Easy to extend and enhance
     • Easy to retarget to new processors
     • Widely used today in optimization research
     • Caters to small-team development environments
  5. PathScale Compiler Key Contributions
     • Retargeted to X86/X86-64
     • Performance leader for 64-bit X86 Linux since 2004
     • Bridge between the GNU and Open64 code bases to ease tracking of GNU releases
     • Proprietary OpenMP runtime library
     • QA and build infrastructure
  6. Open64 Overall Design
     • Compiler infrastructure that can support a full spectrum of optimization capabilities
     • Major ingredient: a common intermediate representation, WHIRL
       • Single back-end for multiple front-ends
       • One IR, multiple levels of representation
       • Compilation process continuously lowers the representation
     • Specialized components based on optimization disciplines:
       • LNO: loop-oriented, based on data dependency
       • WOPT: global scalar optimization based on SSA
       • IPA: inter-procedural, requiring whole-program analysis
       • CG: target-dependent
     • No duplicated effort among the phases
       • WHIRL simplifier callable from all phases
       • Phases share analysis results
       • Phases call each other to get work done
  7. Role of IR in a Compiler
     • Support multiple front-ends (languages)
     • Support multiple processor targets
     • Medium for performing optimizing transformations
     • Common interface among phases
     • Promote modularity in compiler design
     • Reduce duplicated functionality
     • Key infrastructure of a modern compiler
  8. Semantic Level of IR
     [Figure: scale running from source program (high level) to machine instructions (low level)]
     • At a higher level:
       • More kinds of constructs
       • Shorter code sequences
       • More program information present
       • Hierarchical constructs
       • Many optimizations cannot be performed
     • At a lower level:
       • Less program information
       • Fewer kinds of constructs
       • Longer code sequences
       • Flat constructs
       • All optimizations can be performed
  9. Open64’s WHIRL IR
     • WHIRL was developed by the Open64 team at SGI
     • One IR, multiple levels of representation
     • Compilation process continuously lowers the representation
     • Each optimization has a best level at which to be performed
     • Phases share analysis results
     • No duplicated effort among phases
  10. Compilation Flow
  11. Optimization Design Philosophy
      • Perform each optimization at the highest level where the transformation is expressible
      • More program information to help analysis
      • Shorter code sequences to manipulate
      • Fewer variations for the same computation
      • Results:
        • Less implementation effort
        • Faster, more efficient optimization
        • Greater robustness (easier to test, quicker stabilization)
  12. Phase Ordering Design Principles
      • Dictated by the lowering process
        • Optimizations exposed at a lower level must be done late
      • Early phases transform the code to expose optimization opportunities
        • Inlining
        • Constant propagation
        • Loop fusion
      • Early phases compute information for the benefit of later phases
        • Aliasing and pointer disambiguation
        • Use-def
        • Data dependency
      • Early phases canonicalize the code
        • Less code variation for later phases
        • May not speed up the program
        • Depends on clean-up by later phases
      • Cheap optimizations are applied multiple times
      • Optimizations that destroy program information are applied as late as possible
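To make the "early phases expose opportunities" point concrete, here is a minimal sketch of forward constant propagation over straight-line code, where substituting a known constant turns `n * 2` into a foldable expression. The tuple encoding of statements and expressions and the names `fold` and `propagate` are assumptions made for this illustration; they are not Open64 data structures or APIs.

```python
def fold(expr, consts):
    """Substitute known constants into expr and fold constant subtrees."""
    kind = expr[0]
    if kind == "const":
        return expr
    if kind == "load":
        name = expr[1]
        return ("const", consts[name]) if name in consts else expr
    if kind in ("add", "mul"):
        left = fold(expr[1], consts)
        right = fold(expr[2], consts)
        if left[0] == "const" and right[0] == "const":
            value = left[1] + right[1] if kind == "add" else left[1] * right[1]
            return ("const", value)
        return (kind, left, right)
    raise ValueError(f"unknown expression kind: {kind}")


def propagate(stmts):
    """Forward constant propagation over a straight-line list of stores."""
    consts = {}  # variables currently known to hold a constant
    out = []
    for kind, name, expr in stmts:
        if kind != "store":
            raise ValueError(f"unsupported statement kind: {kind}")
        simplified = fold(expr, consts)
        if simplified[0] == "const":
            consts[name] = simplified[1]
        else:
            consts.pop(name, None)  # value no longer known
        out.append(("store", name, simplified))
    return out


# n = 4; m = n * 2; y = x + m
program = [
    ("store", "n", ("const", 4)),
    ("store", "m", ("mul", ("load", "n"), ("const", 2))),
    ("store", "y", ("add", ("load", "x"), ("load", "m"))),
]
optimized = propagate(program)
```

After this pass the stores to `n` and `m` may have no remaining uses, which is the kind of clean-up the slide leaves to later phases.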
  13. Design of WHIRL
      • WHIRL tree nodes for executable code
        • WHIRL node defined in common/com/wn_core.h
        • Each function body represented by one big tree
      • WHIRL symbol table for declarations
        • Different tables for different declaration constructs
        • See common/com/symtab*.h
      • Smallest WHIRL node is 24 bytes
        • Packed fields to save space
      • Binary reader and writer
        • WHIRL file in ELF file format
      • Unique WHIRL file suffix according to phase:
        • Front-end: .B
        • IPA/inliner: .I
        • LNO: .N
        • WOPT: .O
      • ASCII dumper
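As a rough picture of "one big tree per function body" plus an ASCII dumper, the sketch below builds the tree for `b = a + 1` and prints one node per line. The `Node` class, its fields, and the dump layout are hypothetical stand-ins chosen for this illustration; they do not reproduce the node layout in wn_core.h or the output format of the real dumper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node: an operation name, child nodes, optional operand."""
    opcode: str
    kids: List["Node"] = field(default_factory=list)
    operand: Optional[str] = None


def dump(node: Node, depth: int = 0) -> List[str]:
    """Return an indented text listing of the tree, parent before children."""
    text = node.opcode if node.operand is None else f"{node.opcode} {node.operand}"
    lines = [" " * (2 * depth) + text]
    for kid in node.kids:
        lines.extend(dump(kid, depth + 1))
    return lines


# b = a + 1: one store statement whose value is an expression tree
body = Node("BLOCK", [
    Node("STID", [
        Node("ADD", [Node("LDID", operand="a"), Node("INTCONST", operand="1")]),
    ], operand="b"),
])
listing = dump(body)
print("\n".join(listing))
```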
  14. Important WHIRL Concepts
      • operator: name of the operation
      • desc: machine (scalar) type of the operands
      • rtype: machine (scalar) type of the result
      • opcode: the triplet (operator, rtype, desc)
      • symbol table index: uniquely identifies a program symbol
      • high-level type: type construct as declared in the program
        • Important for implementing the ANSI alias rule
      • field-id: uniquely identifies a field in a struct or union
      • New 128-bit SIMD machine types defined for X86’s MMX/SSE
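The opcode triplet can be pictured as three small integers packed into one word. The sketch below is a minimal illustration under stated assumptions: the enum values, the bit widths `OPERATOR_BITS` and `MTYPE_BITS`, and the helper names are invented for the example and are not the encoding Open64 uses.

```python
from enum import IntEnum


class Operator(IntEnum):
    """A few operation names; the numeric values are arbitrary."""
    ADD = 1
    LDID = 2
    CVT = 3


class Mtype(IntEnum):
    """A few machine (scalar) types; the numeric values are arbitrary."""
    V = 0
    I4 = 1
    I8 = 2
    F8 = 3


OPERATOR_BITS = 8  # assumed width, for illustration only
MTYPE_BITS = 6     # assumed width, for illustration only


def make_opcode(op: Operator, rtype: Mtype, desc: Mtype) -> int:
    """Pack (operator, rtype, desc) into a single integer."""
    return (int(desc) << (OPERATOR_BITS + MTYPE_BITS)) | (int(rtype) << OPERATOR_BITS) | int(op)


def split_opcode(opcode: int):
    """Recover the (operator, rtype, desc) triplet from a packed opcode."""
    op = Operator(opcode & ((1 << OPERATOR_BITS) - 1))
    rtype = Mtype((opcode >> OPERATOR_BITS) & ((1 << MTYPE_BITS) - 1))
    desc = Mtype(opcode >> (OPERATOR_BITS + MTYPE_BITS))
    return op, rtype, desc


def opcode_name(opcode: int) -> str:
    """Readable name in rtype-desc-operator order (an illustrative convention)."""
    op, rtype, desc = split_opcode(opcode)
    return f"{rtype.name}{desc.name}{op.name}"


# Convert a 4-byte integer operand (desc) to an 8-byte float result (rtype).
cvt = make_opcode(Operator.CVT, Mtype.F8, Mtype.I4)
```

Keeping rtype and desc separate is what lets a single operator such as a conversion describe both what it consumes and what it produces.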
  15. WHIRL Concepts for Optimization
      • Statement nodes are sequence points
      • Side effects are only possible at statement boundaries
      • Statements with side effects are limited to:
        • Stores
        • Calls
        • Asm statements
      • Expression nodes are not sequence points
      • Expression nodes perform computation with no side effects
      • This allows aggressive expression transformations
      • Source position information (to support debugging) applies only to statement nodes
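The value of side-effect-free expressions can be seen in a small sketch: because only store statements change state, the operands of an expression may be reordered freely without changing what the program computes. The tuple encoding and the functions `eval_expr`, `exec_stmt`, `commute`, and `run` are assumptions made for this example, not WHIRL structures.

```python
def eval_expr(expr, env):
    """Evaluate an expression tree; reads env but never modifies it."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "load":
        return env[expr[1]]
    if kind == "add":
        return eval_expr(expr[1], env) + eval_expr(expr[2], env)
    if kind == "mul":
        return eval_expr(expr[1], env) * eval_expr(expr[2], env)
    raise ValueError(f"unknown expression kind: {kind}")


def exec_stmt(stmt, env):
    """Execute a statement; the store is the only place state changes."""
    kind, name, expr = stmt
    if kind != "store":
        raise ValueError(f"unsupported statement kind: {kind}")
    env[name] = eval_expr(expr, env)


def commute(expr):
    """Swap the operands of every add and mul in the tree."""
    kind = expr[0]
    if kind in ("add", "mul"):
        return (kind, commute(expr[2]), commute(expr[1]))
    return expr


def run(stmts, env):
    """Run statements in order on a copy of env and return the final state."""
    env = dict(env)
    for stmt in stmts:
        exec_stmt(stmt, env)
    return env


# t = a + b * 3; u = t * a
program = [
    ("store", "t", ("add", ("load", "a"), ("mul", ("load", "b"), ("const", 3)))),
    ("store", "u", ("mul", ("load", "t"), ("load", "a"))),
]
rewritten = [(kind, name, commute(expr)) for kind, name, expr in program]
before = run(program, {"a": 2, "b": 5})
after = run(rewritten, {"a": 2, "b": 5})
```

The statement order is left untouched; only the expression trees inside each statement are rearranged.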
  16. WHIRL Maps
      • For annotating WHIRL nodes with additional information
      • Overcome the fixed size of WHIRL nodes
      • Good for transient information
      • WHIRL nodes are classified into categories
      • A map_id is stored in each WHIRL node
      • A map_id is unique only within a category
      • Information is stored in map tables with map_id as the index
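The map mechanism amounts to side tables keyed by a small per-node id. Here is a minimal sketch of that idea, assuming one id counter per node category and one table per (map, category) pair; the `Node` and `MapTables` classes and the category names are hypothetical and do not mirror the Open64 map API.

```python
class Node:
    """Hypothetical fixed-size node: it only has room for a category and a map_id."""

    def __init__(self, category):
        self.category = category
        self.map_id = None  # assigned lazily, on first annotation


class MapTables:
    """Side tables that hold per-node annotations outside the nodes themselves."""

    def __init__(self):
        self._next_id = {}  # category -> next unused map_id
        self._tables = {}   # (map name, category) -> {map_id: value}

    def assign_id(self, node):
        """Give the node a map_id that is unique within its category."""
        if node.map_id is None:
            node.map_id = self._next_id.get(node.category, 0)
            self._next_id[node.category] = node.map_id + 1
        return node.map_id

    def set(self, map_name, node, value):
        table = self._tables.setdefault((map_name, node.category), {})
        table[self.assign_id(node)] = value

    def get(self, map_name, node, default=None):
        if node.map_id is None:
            return default
        return self._tables.get((map_name, node.category), {}).get(node.map_id, default)


maps = MapTables()
load = Node("load")
store = Node("store")
maps.set("alias_class", load, 7)
maps.set("alias_class", store, 9)
```

Both nodes end up with map_id 0 because ids are only unique within a category, yet their annotations stay separate because each category has its own table.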
  17. Very High WHIRL
      • Preserves abstraction present in the source language
      • Can be translated back to C/F90 with minor loss of semantics
      • Originally defined for FORTRAN 90; usage later extended to C/C++
      • Constructs allowed only in VH WHIRL:
        • Comma operator
        • Nested function calls
        • C select operator (? :)
        • For F90: triplet, arrayexp, arrsection, where
      • Inliner can work at this level
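One way to picture leaving the Very High level is a lowering step that rewrites the comma and select operators into plain statements. The sketch below does this only when the operator sits at the top of a stored value; the tuple encoding, the `cselect` and `comma` tags, and the function names are assumptions for illustration, not the Open64 lowerer.

```python
def lower_store(name, expr):
    """Lower `name = expr` into statements free of top-level comma/select."""
    kind = expr[0]
    if kind == "comma":
        # (stmts, value): run the statements first, then store the value
        _, stmts, value = expr
        return lower_block(stmts) + lower_store(name, value)
    if kind == "cselect":
        # cond ? a : b becomes an if statement with a store in each arm
        _, cond, if_true, if_false = expr
        return [("if", cond, lower_store(name, if_true), lower_store(name, if_false))]
    return [("store", name, expr)]


def lower_stmt(stmt):
    """Lower one statement into a list of statements."""
    kind = stmt[0]
    if kind == "store":
        _, name, expr = stmt
        return lower_store(name, expr)
    if kind == "if":
        _, cond, then_stmts, else_stmts = stmt
        return [("if", cond, lower_block(then_stmts), lower_block(else_stmts))]
    raise ValueError(f"unsupported statement kind: {kind}")


def lower_block(stmts):
    """Lower a list of statements, preserving their order."""
    out = []
    for stmt in stmts:
        out.extend(lower_stmt(stmt))
    return out


# x = c ? 1 : 2
select_store = ("store", "x", ("cselect", ("load", "c"), ("const", 1), ("const", 2)))
# y = (t = 1, t)
comma_store = ("store", "y", ("comma", [("store", "t", ("const", 1))], ("load", "t")))
lowered = lower_block([select_store, comma_store])
```

A fuller version would also have to handle these operators when they are nested inside other expressions or inside the condition, which this sketch deliberately leaves out.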
  18. High WHIRL
      • Constructs that support loop-level optimizations
      • Fixed (though not explicit) control flow
      • Key constructs:
        • ARRAY (data dependency analysis, vectorization)
        • DO loops
        • IF statements
        • FORTRAN I/O statements
      • IPA, PREOPT and LNO work at this level
      • Can be translated back to the source language
        • Allows the user to view the effects of inlining and LNO
  19. Mid WHIRL
      • One-to-one mapping to RISC instructions
      • Control flow explicit via jumps
      • Address computation exposed
      • Bitfield accesses exposed
      • Complex numbers expanded to operations on floats
      • WOPT works at this level
      • Uniform representation enhances optimization opportunities
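"Address computation exposed" can be illustrated by expanding an array element reference into explicit arithmetic. The sketch below assumes zero-based indices and row-major layout, and uses a made-up tuple encoding with helper names (`lower_array`, `evaluate`) invented for this example.

```python
def const(value):
    return ("const", value)


def add(left, right):
    return ("add", left, right)


def mul(left, right):
    return ("mul", left, right)


def lower_array(base, indices, dims, elem_size):
    """Build base + linear_index * elem_size for a row-major array reference."""
    if not indices or len(indices) != len(dims):
        raise ValueError("need one index per dimension")
    linear = indices[0]
    # Horner-style linearization: ((i0 * d1) + i1) * d2 + i2 ...
    for index, extent in zip(indices[1:], dims[1:]):
        linear = add(mul(linear, const(extent)), index)
    return add(base, mul(linear, const(elem_size)))


def evaluate(expr, env):
    """Evaluate the lowered address expression against concrete values."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "load":
        return env[expr[1]]
    if kind == "add":
        return evaluate(expr[1], env) + evaluate(expr[2], env)
    if kind == "mul":
        return evaluate(expr[1], env) * evaluate(expr[2], env)
    raise ValueError(f"unknown expression kind: {kind}")


# Address of a[i][j] for a 3 x 4 array of 8-byte elements
address = lower_array(
    ("load", "a_base"),
    [("load", "i"), ("load", "j")],
    dims=[3, 4],
    elem_size=8,
)
result = evaluate(address, {"a_base": 1000, "i": 2, "j": 1})
```

Once the multiplications and additions are visible as ordinary expression nodes, a scalar optimizer can treat them like any other arithmetic.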
  20. Low WHIRL
      • Final form of WHIRL, to facilitate translation to machine instructions in CG
      • Some intrinsics replaced by calls
      • Linkage convention exposed
      • Data layout completed