Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
AN LLVM BACKENDFOR GHCDavid Terei & Manuel ChakravartyUniversity of New South WalesHaskell Symposium 2010Baltimore, USA, 3...
What is ‘An LLVM Backend’?• Low Level Virtual Machine• Backend Compiler framework• Designed to be used by compiler develop...
Motivation• Simplify  • Reduce ongoing work  • Outsource!• Performance  • Improve run-time
Examplecollat :: Int -> Word32 -> Int                          Run-time (ms)collat c 1 = c                                ...
Competitors         C Backend (C)            Native Code Generator (NCG)• GNU C Dependency               • Huge amount of ...
GHC’s Compilation Pipeline                                                     C   Core            STG             Cmm    ...
Compiling to LLVMWon’t be covering, see paper for full details:• Why from Cmm and not from STG/Core• LLVM, C-- & Cmm langu...
Handling the STG Registers                               STG Register    X86 RegisterImplement either by:           Base  ...
Handling the STG RegistersLLVM handles by implementing a new calling convention:             STG Register    X86 Register ...
Handling the STG Registers• Issue: If implemented naively then all the STG registers have a live range of the entire funct...
Handling the STG Registers• We handle this by storing undef into the STG register when it is no longer needed. We manually...
Handling Tables-Next-To-Code   Un-optimised Layout        Optimised Layout                         How to implement in LLVM?
Handling Tables-Next-To-CodeUse GNU Assembler sub-section feature.• Allows code/data to be put into numbered sub-section• ...
Handling Tables-Next-To-Code        LLVM Mangler                     C Mangler• 180 lines of Haskell (half   • 2,000 lines...
Evaluation: Simplicity         Code Size         LLVM25000                      • Half of code is representation of       ...
Evaluation: Performance                                      Run-time against LLVM (%)Nofib:                           3• ...
Repa Performance                   runtimes (s)16141210 8                                      LLVM                       ...
Compile Times, Object Sizes      Compile Times Vs         Object Sizes Vs LLVM           LLVM           060               ...
Result• LLVM Backend is simpler.• LLVM Backend is as fast or faster.• LLVM developers now work for GHC!
Get It• LLVM   • Our calling convention has been accepted upstream!   • Included in LLVM since version 2.7                ...
Questions?
Why from Cmm?A lot less work then from STG/CoreBut…Couldn’t you do a better job from STG/Core?Doubtful…Easier to fix any d...
Dealing with SSALLVM language is SSA form:• Each variable can only be assigned to once• ImmutableHow do we handle converti...
Type Systems?LLVM language has a fairly high level type system• Strings, Arrays, Pointers…When combined with SSA form, gre...
Upcoming SlideShare
Loading in …5
×

Haskell Symposium 2010: An LLVM backend for GHC

1,407 views

Published on

Slides from talk at Haskell Symposium 2010, on an LLVM backend for GHC.

Published in: Technology
  • Be the first to comment

  • Be the first to like this

Haskell Symposium 2010: An LLVM backend for GHC

  1. 1. AN LLVM BACKENDFOR GHCDavid Terei & Manuel ChakravartyUniversity of New South WalesHaskell Symposium 2010Baltimore, USA, 30th Sep
  2. 2. What is ‘An LLVM Backend’?• Low Level Virtual Machine• Backend Compiler framework• Designed to be used by compiler developers, not by software developers• Open Source• Heavily sponsored by Apple
  3. 3. Motivation• Simplify • Reduce ongoing work • Outsource!• Performance • Improve run-time
  4. 4. Examplecollat :: Int -> Word32 -> Int Run-time (ms)collat c 1 = c 2000collat c n | even n = 1800 collat (c+1) $ n `div` 2 1600 | otherwise = 1400 collat (c+1) $ 3 * n + 1 1200 1000pmax x n = x `max` (collat 1 n, n) 800 600main = print $ foldl pmax (1,1) 400 [2..1000000] 200 0 LLVM C NCGDifferent, updated run-times compared to paper
  5. 5. Competitors C Backend (C) Native Code Generator (NCG)• GNU C Dependency • Huge amount of work • Badly supported on • Very limited portability platforms such as Windows • Does very little• Use a Mangler on optimisation work assembly code• Slow compilation speed • Takes twice as long as the NCG
  6. 6. GHC’s Compilation Pipeline C Core STG Cmm NCG LLVMLanguage, Abstract Machine, Machine Configuration Heap, Stack, Registers
  7. 7. Compiling to LLVMWon’t be covering, see paper for full details:• Why from Cmm and not from STG/Core• LLVM, C-- & Cmm languages• Dealing with LLVM’s SSA form• LLVM type systemWill be covering:• Handling the STG Registers• Handling GHC’s Table-Next-To-Code optimisation
  8. 8. Handling the STG Registers STG Register X86 RegisterImplement either by: Base ebx• In memory Heap Pointer edi• Pin to hardware registers Stack Pointer ebp R1 esiNCG?• Register allocator permanently stores STG registers in hardwareC Backend?• Uses GNU C extension (global register variables) to also permanently store STG registers in hardware
  9. 9. Handling the STG RegistersLLVM handles by implementing a new calling convention: STG Register X86 Register Base ebx Heap Pointer edi Stack Pointer ebp R1 esi define f ghc_cc (Base, Hp, Sp, R1) { … tail call g ghc_cc (Base, Hp’, Sp’, R1’); return void; }
  10. 10. Handling the STG Registers• Issue: If implemented naively then all the STG registers have a live range of the entire function.• Some of the STG registers can never be scratched (e.g Sp, Hp…) but many can (e.g R2, R3…).• We need to somehow tell LLVM when we no longer care about an STG register, otherwise it will spill and reload the register across calls to C land for example.
  11. 11. Handling the STG Registers• We handle this by storing undef into the STG register when it is no longer needed. We manually scratch them.define f ghc_cc (Base, Hp, Sp, R1, R2, R3, R4) { … store undef %R2 store undef %R3 store undef %R4 call c_cc sin(double %f22); … tail call ghc_cc g(Base,Hp’,Sp’,R1’,R2’,R3’,R4’); return void;}
  12. 12. Handling Tables-Next-To-Code Un-optimised Layout Optimised Layout How to implement in LLVM?
  13. 13. Handling Tables-Next-To-CodeUse GNU Assembler sub-section feature.• Allows code/data to be put into numbered sub-section• Sub-sections are appended together in order• Table in <n>, entry code in <n+1> .text 12 sJ8_info: movl ... movl ... jmp ... [...] .text 11 sJ8_info_itable: .long ... .long 0 .long 327712
  14. 14. Handling Tables-Next-To-Code LLVM Mangler C Mangler• 180 lines of Haskell (half • 2,000 lines of Perl is documentation) • Needed for every platform• Needed only for OS X
  15. 15. Evaluation: Simplicity Code Size LLVM25000 • Half of code is representation of LLVM language20000 C15000 • Compiler: 1,100 lines • C Headers: 2,000 lines10000 • Perl Mangler: 2,000 lines 5000 NCG • Shared component: 8,000 lines 0 • Platform specific: 4,000 – 5,000 LLVM C NCG for X86, SPARC, PowerPC
  16. 16. Evaluation: Performance Run-time against LLVM (%)Nofib: 3• Egalitarian benchmark suite, everything is equal 2.5• Memory bound, little 2 room for optimisation once at Cmm stage 1.5 1 0.5 0 NCG C -0.5
  17. 17. Repa Performance runtimes (s)16141210 8 LLVM NCG 6 4 2 0 Matrix Mult Laplace FFT
  18. 18. Compile Times, Object Sizes Compile Times Vs Object Sizes Vs LLVM LLVM 060 NCG C -240 -420 -6 0 NCG C -8-20 -10-40 -12-60-80 -14
  19. 19. Result• LLVM Backend is simpler.• LLVM Backend is as fast or faster.• LLVM developers now work for GHC!
  20. 20. Get It• LLVM • Our calling convention has been accepted upstream! • Included in LLVM since version 2.7 http://llvm.org• GHC • In HEAD • Should be released in GHC 7.0• Send me any programs that are slower!
  21. 21. Questions?
  22. 22. Why from Cmm?A lot less work then from STG/CoreBut…Couldn’t you do a better job from STG/Core?Doubtful…Easier to fix any deficiencies in Cmm representation andcode generator
  23. 23. Dealing with SSALLVM language is SSA form:• Each variable can only be assigned to once• ImmutableHow do we handle converting mutable Cmm variables?• Allocate a stack slot for each Cmm variable• Use load and stores for reads and writes• Use ‘mem2reg’ llvm optimisation pass • This converts our stack allocation to LLVM variables instead that properly obeys the SSA requirement
  24. 24. Type Systems?LLVM language has a fairly high level type system• Strings, Arrays, Pointers…When combined with SSA form, great for development• 15 bug fixes required after backend finished to get test suite to pass• 10 of those were motivated by type system errors• Some could have been fairly difficult (e.g returning pointer instead of value)

×