Haskell Symposium 2010: An LLVM backend for GHC


Published on

Slides from talk at Haskell Symposium 2010, on an LLVM backend for GHC.

Published in: Technology
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Haskell Symposium 2010: An LLVM backend for GHC

  1. 1. AN LLVM BACKENDFOR GHCDavid Terei & Manuel ChakravartyUniversity of New South WalesHaskell Symposium 2010Baltimore, USA, 30th Sep
  2. 2. What is ‘An LLVM Backend’?• Low Level Virtual Machine• Backend Compiler framework• Designed to be used by compiler developers, not by software developers• Open Source• Heavily sponsored by Apple
  3. 3. Motivation• Simplify • Reduce ongoing work • Outsource!• Performance • Improve run-time
  4. 4. Examplecollat :: Int -> Word32 -> Int Run-time (ms)collat c 1 = c 2000collat c n | even n = 1800 collat (c+1) $ n `div` 2 1600 | otherwise = 1400 collat (c+1) $ 3 * n + 1 1200 1000pmax x n = x `max` (collat 1 n, n) 800 600main = print $ foldl pmax (1,1) 400 [2..1000000] 200 0 LLVM C NCGDifferent, updated run-times compared to paper
  5. 5. Competitors C Backend (C) Native Code Generator (NCG)• GNU C Dependency • Huge amount of work • Badly supported on • Very limited portability platforms such as Windows • Does very little• Use a Mangler on optimisation work assembly code• Slow compilation speed • Takes twice as long as the NCG
  6. 6. GHC’s Compilation Pipeline C Core STG Cmm NCG LLVMLanguage, Abstract Machine, Machine Configuration Heap, Stack, Registers
  7. 7. Compiling to LLVMWon’t be covering, see paper for full details:• Why from Cmm and not from STG/Core• LLVM, C-- & Cmm languages• Dealing with LLVM’s SSA form• LLVM type systemWill be covering:• Handling the STG Registers• Handling GHC’s Table-Next-To-Code optimisation
  8. 8. Handling the STG Registers STG Register X86 RegisterImplement either by: Base ebx• In memory Heap Pointer edi• Pin to hardware registers Stack Pointer ebp R1 esiNCG?• Register allocator permanently stores STG registers in hardwareC Backend?• Uses GNU C extension (global register variables) to also permanently store STG registers in hardware
  9. 9. Handling the STG RegistersLLVM handles by implementing a new calling convention: STG Register X86 Register Base ebx Heap Pointer edi Stack Pointer ebp R1 esi define f ghc_cc (Base, Hp, Sp, R1) { … tail call g ghc_cc (Base, Hp’, Sp’, R1’); return void; }
  10. 10. Handling the STG Registers• Issue: If implemented naively then all the STG registers have a live range of the entire function.• Some of the STG registers can never be scratched (e.g Sp, Hp…) but many can (e.g R2, R3…).• We need to somehow tell LLVM when we no longer care about an STG register, otherwise it will spill and reload the register across calls to C land for example.
  11. 11. Handling the STG Registers• We handle this by storing undef into the STG register when it is no longer needed. We manually scratch them.define f ghc_cc (Base, Hp, Sp, R1, R2, R3, R4) { … store undef %R2 store undef %R3 store undef %R4 call c_cc sin(double %f22); … tail call ghc_cc g(Base,Hp’,Sp’,R1’,R2’,R3’,R4’); return void;}
  12. 12. Handling Tables-Next-To-Code Un-optimised Layout Optimised Layout How to implement in LLVM?
  13. 13. Handling Tables-Next-To-CodeUse GNU Assembler sub-section feature.• Allows code/data to be put into numbered sub-section• Sub-sections are appended together in order• Table in <n>, entry code in <n+1> .text 12 sJ8_info: movl ... movl ... jmp ... [...] .text 11 sJ8_info_itable: .long ... .long 0 .long 327712
  14. 14. Handling Tables-Next-To-Code LLVM Mangler C Mangler• 180 lines of Haskell (half • 2,000 lines of Perl is documentation) • Needed for every platform• Needed only for OS X
  15. 15. Evaluation: Simplicity Code Size LLVM25000 • Half of code is representation of LLVM language20000 C15000 • Compiler: 1,100 lines • C Headers: 2,000 lines10000 • Perl Mangler: 2,000 lines 5000 NCG • Shared component: 8,000 lines 0 • Platform specific: 4,000 – 5,000 LLVM C NCG for X86, SPARC, PowerPC
  16. 16. Evaluation: Performance Run-time against LLVM (%)Nofib: 3• Egalitarian benchmark suite, everything is equal 2.5• Memory bound, little 2 room for optimisation once at Cmm stage 1.5 1 0.5 0 NCG C -0.5
  17. 17. Repa Performance runtimes (s)16141210 8 LLVM NCG 6 4 2 0 Matrix Mult Laplace FFT
  18. 18. Compile Times, Object Sizes Compile Times Vs Object Sizes Vs LLVM LLVM 060 NCG C -240 -420 -6 0 NCG C -8-20 -10-40 -12-60-80 -14
  19. 19. Result• LLVM Backend is simpler.• LLVM Backend is as fast or faster.• LLVM developers now work for GHC!
  20. 20. Get It• LLVM • Our calling convention has been accepted upstream! • Included in LLVM since version 2.7 http://llvm.org• GHC • In HEAD • Should be released in GHC 7.0• Send me any programs that are slower!
  21. 21. Questions?
  22. 22. Why from Cmm?A lot less work then from STG/CoreBut…Couldn’t you do a better job from STG/Core?Doubtful…Easier to fix any deficiencies in Cmm representation andcode generator
  23. 23. Dealing with SSALLVM language is SSA form:• Each variable can only be assigned to once• ImmutableHow do we handle converting mutable Cmm variables?• Allocate a stack slot for each Cmm variable• Use load and stores for reads and writes• Use ‘mem2reg’ llvm optimisation pass • This converts our stack allocation to LLVM variables instead that properly obeys the SSA requirement
  24. 24. Type Systems?LLVM language has a fairly high level type system• Strings, Arrays, Pointers…When combined with SSA form, great for development• 15 bug fixes required after backend finished to get test suite to pass• 10 of those were motivated by type system errors• Some could have been fairly difficult (e.g returning pointer instead of value)