Oh the compilers you'll build

Compilers have been improving programmer productivity ever since IBM produced the first FORTRAN compiler in 1957. Today we mostly take them for granted, but even after more than 60 years, compiler researchers and practitioners continue to push the boundaries of what compilers can achieve and of how easily developers can leverage the sophisticated code bases that encapsulate those six decades of learning. In this talk, I want to highlight how industry trends, such as the migration to cloud infrastructures and data centers and the rise of flexibly licensed open source projects like LLVM and Eclipse OMR, are paving the way towards more effective and powerful compilation infrastructures than have ever existed: compilers that can contribute to programmer productivity in more ways than simply better hardware instruction sequences, and with simpler APIs so they can be readily used in scenarios where even today's most amazing Just In Time compilers are not really practical.

Published in: Software

  1. 1. Oh, the compilers you’ll build! • Mark Stoodley @mstoodle mstoodle@ca.ibm.com • Eclipse OpenJ9 and OMR project lead • Production JIT developer since 2002 • Creator of the OMR JitBuilder library
  2. 2. Important Disclaimers • THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. • WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. • ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED ENVIRONMENT. YOUR OWN TEST RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR INFRASTRUCTURE DIFFERENCES. • ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE. • IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM, WITHOUT NOTICE. • IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. • NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF: • CREATING ANY WARRANTY OR REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS
  3. 3. Outline • Musings on role of compilation in software • Making compilers more accessible with Eclipse OMR • Not one, but TWO demos! • Where can we go from here? 3
  4. 4. ”Textbook” goal of a compiler Translate software programs written in one language to another language 4
  5. 5. What first commercial compiler did in 1957 Translate software programs written in one language (a high level language called “FORTRAN”) to another language (machine instructions) 5
  6. 6. What compilers do in 2018 Translate software programs written in one language (usually a high-level language like Java or JavaScript) to another language (usually machine instructions)
  7. 7. What compilers do in 2018 Translate software programs written in one language (usually a high-level language like Java or JavaScript) to another language (usually machine instructions) (+ runs alongside the program, profiling, optimizing more aggressively and speculatively, responding to the program changing itself and its objects, etc.)
  8. 8. Compilers today are SO MUCH more sophisticated than compilers were 60 years ago BUT 8
  9. 9. Our model for how to employ compiler technology has not really changed much in 60+ years 9
  10. 10. Role of the compiler To translate the operations and data structures mostly written by programmers to efficient native code. (Source code input to the compiler, generally speaking, is designed to solve some general class of problems.)
  11. 11. A thought experiment: What if…. 11
  12. 12. Role for the compiler To translate the operations and data structures of the precise problem being solved to efficient native code. (The precise problem is as described by a program designed to solve a general class of problems.)
  13. 13. Nah, that won’t work! 1. Runtime cost to generate and compile all the code to solve a problem probably won’t pay off in many cases 2. Obviously requires a dynamic compiler for all but incredibly trivial scenarios 3. Would require a different approach to writing software 13
  14. 14. But maybe there’s the kernel of a good idea here Accelerating specific tasks by generating part(s) of their programs at runtime using a dynamic compiler to optimize 14
  15. 15. Kinda like a JIT compiler! program ~ interpreter “generator” ~ JIT compiler 15
  16. 16. [Diagram: Application] Start with an application
  17. 17. [Diagram: Application; Perf critical task] With a performance critical task
  18. 18. [Diagram: Application; Perf critical task; Task Accelerator; Specialized performance critical task; Native Code Cache] …can be specialized by an accelerator
  19. 19. [Diagram: Application; Perf critical task; Task Accelerator; Specialized performance critical task; Native Code Cache] Integrate with the rest of application through (hopefully) lower frequency paths
  20. 20. Not really a new idea... • E.g. Apache Spark: Catalyst compiler generates problem-specific Java bytecodes at runtime that are shipped to all the worker JVMs • E.g. Rosie Pattern Language compiles a pattern-specific LPEG bytecode program at runtime to execute its pattern matching • E.g. many regex engines create representations (e.g. NFA) specific to the pattern at runtime and then “execute” on that data structure (e.g. the TRegex project uses Graal to convert the NFA to native code) • … and probably hundreds of other examples
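To make the "pattern-specific program built at runtime" idea concrete, here is a minimal sketch in C++. It is not taken from any of the projects named above: `compilePattern`, `matches` and the tiny pattern language (literal characters plus `?` for any character) are hypothetical, and closures stand in for the bytecode or native code a real engine would generate.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// One specialized step: checks a single input character.
using Step = std::function<bool(char)>;

// "Compile" a pattern made of literal characters and '?' (any character)
// into a pattern-specific list of steps. Assumption: closures stand in for
// the code a real dynamic compiler would emit.
std::vector<Step> compilePattern(const std::string &pattern) {
    std::vector<Step> steps;
    for (char p : pattern) {
        if (p == '?')
            steps.push_back([](char) { return true; });
        else
            steps.push_back([p](char c) { return c == p; });
    }
    return steps;
}

// Run the specialized program against one input string.
bool matches(const std::vector<Step> &steps, const std::string &input) {
    if (input.size() != steps.size()) return false;
    for (std::size_t i = 0; i < steps.size(); ++i)
        if (!steps[i](input[i])) return false;
    return true;
}
```

The pattern is analysed once in `compilePattern`; every later call to `matches` runs only the steps that this particular pattern needs, which is the shape of specialization the examples above share.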
  21. 21. Are these solutions really as effective as they could be? 21
  22. 22. My (biased?) opinion: only a partial solution • Few go all the way down to native code • Spark -> Java bytecodes which then need to be compiled by Java’s JIT compiler • Rosie PL -> lpeg byte codes which are interpreted (could be JIT compiled by lpeg JIT) • Don’t know of any regex compiler that goes direct to native (TRegex takes a generated NFA to native code) • The *actual* native code compiler still probably has little to no idea about the characteristics of the actual problem being solved • Hampered by the expressiveness of the input (which may not be easy to change) • Limited by what JITs can afford to prove at runtime 22
  23. 23. How could we do better? 23
  24. 24. How could we do better? One option: Use strong native code compilers directly (just generate IL and you’re off to the races!) 24
  26. 26. It’s too late for direct compiler use (my opinion) • Maybe could have worked in the 1980s: • Fewer popular languages • Developers generally knowledgeable about how compilers and processors work • But dynamic JIT compiler technology was in its infancy • Challenges today: • Tons of programming languages in active use, less familiarity with C • Many developers have little idea how processors work • Many developers never exposed to even basic compiler course, let alone to the details governing how sophisticated JIT compilers operate 26
  27. 27. What about in this decade? Any realistic solution will need: • To be “easy” to use for non compiler developers • To be available natively from lots of languages • A dynamic compilation story that can make it cost effective • Strong cross-platform dynamic native compiler technology
  28. 28. What about in this decade? Any realistic solution will need: • To be “easy” to use for non compiler developers • To be available natively from lots of languages • A dynamic compilation story that can make it cost effective • Strong cross-platform dynamic native compiler technology • Up next: what we’re doing about it in Eclipse OMR
  29. 29. Eclipse OMR: Language Runtime Technology Components • http://www.eclipse.org/omr • https://github.com/eclipse/omr • https://developer.ibm.com/open/omr/ • Dual License: Eclipse Public License V2.0, Apache 2.0 • Users and Contributors very welcome: https://github.com/eclipse/omr/blob/master/CONTRIBUTING.md
  30. 30. 1. How can we make it easier for non compiler developers to use compiler technologies? 30
  31. 31. Q: What’s the interface to most compilers? 31
  32. 32. Q: What’s the interface to most compilers? A: Functions to build intermediate language (e.g. LLVM: routines to build SSA) (so you have to understand SSA to start!) 32
  33. 33. That’s why I created OMR JitBuilder library • Goal: a higher level API that can evolve independently of compiler IL • Make it easy to describe the structure of data • Make it easy to describe the structure of functions • Expose no compiler IL details • Still a lower level API at the moment, but still evolving “up” 33
  34. 34. Simple Example
      “How you’ll call the function”:
          SimpleMB::SimpleMB(TR::TypeDictionary *d)
              : MethodBuilder(d) {
              DefineName("increment");
              DefineParameter("value", Int32);
              DefineReturnType(Int32);
          }
      “What should the function do”:
          bool SimpleMB::buildIL() {
              Return(
                  Add(
                      Load("value"),
                      ConstInt32(1)));
              return true;
          }
  35. 35. Building fib() code with JitBuilder
      Source to be built by MethodBuilder mb:
          int fib(int n) {
              if (n < 2) {
                  return n;
              } else {
                  return fib(n-1) + fib(n-2);
              }
          }
  36. 36. Building fib() code with JitBuilder
      Source so far:
          int fib(int n) {
              if (n < 2) {
              } else {
              }
          }
      Code shape inside MethodBuilder mb:
          int t1 = n;
          int t2 = 2;
          int t3 = t1 < t2;
          if (!t3) goto label_else;
          [thenBuilder]
      label_else:
          [elseBuilder]
      label_merge:
      JitBuilder calls:
          IlBuilder *thenBuilder=NULL;
          IlBuilder *elseBuilder=NULL;
          mb->IfThenElse(
              &thenBuilder,
              &elseBuilder,
              mb->LessThan(
                  mb->Load("n"),
                  mb->ConstInt32(2)));
  37. 37. Building fib() code with JitBuilder
      Source so far:
          int fib(int n) {
              if (n < 2) {
                  return n;
              } else {
              }
          }
      Code shape inside MethodBuilder mb:
          int t1 = n;
          int t2 = 2;
          int t3 = t1 < t2;
          if (!t3) goto label_else;
          [thenBuilder]
              t4 = n;
              return t4;
      label_else:
          [elseBuilder]
      label_merge:
      JitBuilder calls:
          b = thenBuilder;
          b->Return(
              b->Load("n"));
  38. 38. Building fib() code with JitBuilder
      Source:
          int fib(int n) {
              if (n < 2) {
                  return n;
              } else {
                  return fib(n-1) + fib(n-2);
              }
          }
      Code shape inside MethodBuilder mb:
          int t1 = n;
          int t2 = 2;
          int t3 = t1 < t2;
          if (!t3) goto label_else;
          [thenBuilder]
              t4 = n;
              return t4;
      label_else:
          [elseBuilder]
              t5 = n - 1;
              t6 = fib(t5);
              t7 = n - 2;
              t8 = fib(t7);
              t9 = t6 + t8;
              return t9;
      label_merge:
      JitBuilder calls (wrapped in Return to match “return t9”):
          b = elseBuilder;
          b->Return(
              b->Add(
                  b->Call("fib",
                      b->Sub(
                          b->Load("n"),
                          b->ConstInt32(1))),
                  b->Call("fib",
                      b->Sub(
                          b->Load("n"),
                          b->ConstInt32(2)))));
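The builder pattern walked through on the fib() slides can be tried without the OMR library using a toy stand-in. The sketch below is an assumption-laden illustration, not JitBuilder: `ToyBuilder` and `buildFib` are hypothetical, only the method names mirror the slides, values are interpreted closures where JitBuilder would build compiler IL, and `Call` models self-recursion only.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for a builder-style API. This is NOT JitBuilder: it builds
// closures that are interpreted, where JitBuilder would build compiler IL.
using Env = std::map<std::string, int32_t>;         // named locals
using Value = std::function<int32_t(const Env &)>;  // a computed value

struct ToyBuilder {
    // A statement returns true once a Return has executed.
    using Stmt = std::function<bool(const Env &, int32_t &)>;

    explicit ToyBuilder(ToyBuilder *r = nullptr) : root(r ? r : this) {}

    Value ConstInt32(int32_t v) { return [v](const Env &) { return v; }; }
    Value Load(const std::string &name) {
        return [name](const Env &e) { return e.at(name); };
    }
    Value Add(Value a, Value b) { return [a, b](const Env &e) { return a(e) + b(e); }; }
    Value Sub(Value a, Value b) { return [a, b](const Env &e) { return a(e) - b(e); }; }
    Value LessThan(Value a, Value b) {
        return [a, b](const Env &e) { return static_cast<int32_t>(a(e) < b(e)); };
    }
    // Only self-recursion is modelled, so the callee name is ignored.
    Value Call(const std::string & /*name*/, Value arg) {
        ToyBuilder *m = root;
        return [m, arg](const Env &e) { return m->invoke(arg(e)); };
    }
    void Return(Value v) {
        stmts.push_back([v](const Env &e, int32_t &r) { r = v(e); return true; });
    }
    // Hands back two fresh nested builders, one per path.
    void IfThenElse(ToyBuilder **thenB, ToyBuilder **elseB, Value cond) {
        ToyBuilder *t = *thenB = newChild();
        ToyBuilder *f = *elseB = newChild();
        stmts.push_back([t, f, cond](const Env &e, int32_t &r) {
            return (cond(e) ? t : f)->run(e, r);
        });
    }
    bool run(const Env &e, int32_t &result) const {
        for (const Stmt &s : stmts)
            if (s(e, result)) return true;
        return false;
    }
    // Entry point: binds the single parameter "n" and runs the body.
    int32_t invoke(int32_t n) const {
        int32_t result = 0;
        run(Env{{"n", n}}, result);
        return result;
    }

private:
    ToyBuilder *newChild() {
        children.push_back(std::make_unique<ToyBuilder>(root));
        return children.back().get();
    }
    ToyBuilder *root;
    std::vector<Stmt> stmts;
    std::vector<std::unique_ptr<ToyBuilder>> children;
};

// Same sequence of calls as on the slides, against the toy builder.
std::unique_ptr<ToyBuilder> buildFib() {
    auto mb = std::make_unique<ToyBuilder>();
    ToyBuilder *thenBuilder = nullptr;
    ToyBuilder *elseBuilder = nullptr;
    mb->IfThenElse(&thenBuilder, &elseBuilder,
                   mb->LessThan(mb->Load("n"), mb->ConstInt32(2)));
    ToyBuilder *b = thenBuilder;
    b->Return(b->Load("n"));
    b = elseBuilder;
    b->Return(b->Add(
        b->Call("fib", b->Sub(b->Load("n"), b->ConstInt32(1))),
        b->Call("fib", b->Sub(b->Load("n"), b->ConstInt32(2)))));
    return mb;
}
```

Calling `buildFib()->invoke(n)` runs the built function. The point of the sketch is the control flow of the API: `IfThenElse` returns the two nested builders through out-parameters, and each path is then filled in independently.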
  39. 39. JitBuilder API still evolving • Already lots of “lower level” programming constructs • Conditional structures, looping, etc. • Familiar procedural concepts to software developers! • Increasing support for modelling virtual machine states • E.g. operand stack • Enables better JIT optimization of bytecode-based languages • Probably not useful for most software developers • New services being added and more are welcome: • E.g. VectorLoopBuilder: goal to write code once to generate vector and residue loop 39
  40. 40. But all that’s C++ code… :( 40
  41. 41. 2. How can we enable developers to access the API natively in their language of choice? 41
  42. 42. Taking JitBuilder to a language near you! • JitBuilder API used to be defined by C++ classes • Now defined by a JSON file • Generate public (client) API classes automatically from the JSON • We now have a C++ API generator • Currently working on the C API generator • Once we have C: easier to expand out to other languages like Java, Python, Rust, JavaScript, Ruby, Lua, … • Generators reduce the cost of supporting N languages for an evolving API while offering the opportunity to use appropriate idioms in each language • Leonardo Banderali gave a talk earlier today at the TURBO 2018 workshop: https://2018.splashcon.org/event/turbo-2018-tutorial-taking-eclipse-omr-jitbuilder-to-a-language-near-you
  43. 43. Example: Transaction(persistentFailureBuilder, transientFailureBuilder, transactionBuilder)
      …
      { "name": "Transaction"
      , "overloadsuffix": ""
      , "flags": []
      , "return": "none"
      , "parms": [ {"name":"persistentFailureBuilder","type":"IlBuilder","attributes":["in_out"]},
                   {"name":"transientFailureBuilder","type":"IlBuilder","attributes":["in_out"]},
                   {"name":"transactionBuilder","type":"IlBuilder","attributes":["in_out"]} ]
      },
      …
      + hundreds more, validated by a schema
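As a rough illustration of how one description can drive several language bindings, the sketch below renders the Transaction entry into two declarations. Everything here is hypothetical: the `Parm` and `Service` structs, the mapping of `in_out` to pointer-to-pointer, and the `IlBuilder_` prefix and `self` parameter in the C-style output are assumptions for the example, not the output of the real OMR generators. The entry is held in memory to keep the sketch free of a JSON parser.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// In-memory form of one API description entry (mirrors the JSON fields).
struct Parm {
    std::string name;
    std::string type;
    bool inOut;  // true when "attributes" contains "in_out"
};
struct Service {
    std::string name;
    std::string returnType;  // "none" means no return value
    std::vector<Parm> parms;
};

// Shared helper: renders the parameter list.
// Assumption: in_out object parameters become pointer-to-pointer.
std::string renderParms(const Service &s, bool leadingComma) {
    std::string out;
    for (std::size_t i = 0; i < s.parms.size(); ++i) {
        const Parm &p = s.parms[i];
        if (i > 0 || leadingComma) out += ", ";
        out += p.type + (p.inOut ? " **" : " *") + p.name;
    }
    return out;
}

std::string renderReturn(const Service &s) {
    return s.returnType == "none" ? std::string("void") : s.returnType;
}

// Generator 1: a C++ member-function declaration.
std::string emitCppDecl(const Service &s) {
    return renderReturn(s) + " " + s.name + "(" + renderParms(s, false) + ");";
}

// Generator 2: a C-style free function taking the receiver explicitly.
// The IlBuilder_ prefix and "self" are hypothetical naming choices.
std::string emitCDecl(const Service &s) {
    return renderReturn(s) + " IlBuilder_" + s.name + "(IlBuilder *self" +
           renderParms(s, true) + ");";
}

// The Transaction entry from the slide, as data.
Service transactionService() {
    return Service{"Transaction", "none",
                   {{"persistentFailureBuilder", "IlBuilder", true},
                    {"transientFailureBuilder", "IlBuilder", true},
                    {"transactionBuilder", "IlBuilder", true}}};
}
```

Adding a language then means adding one more `emit…` function, while the description of the API itself stays in one place.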
  44. 44. Once generators for different languages written… • Generate efficient native code for dynamic tasks from: • C++ • C • Java • Python • Etc. • Easy and efficient native code generation at runtime directly by writing code in your language of choice 44
  45. 45. But how to overcome the costs? • CPU cycles • Dynamic memory requirement • Time until compiled code is ready • Time to complete the task
  46. 46. In-process JIT has a tough challenge • Has to do all the work and pay all the CPU and memory costs • AND has to be able to reap the reward quickly enough to pay off • But can work for long running or repetitive tasks with good specialization opportunities • What if… • Move JIT compilation effort to an independent service (in the cloud!) • Aggregate JIT work from multiple clients (either in time or via horizontal scaling) • Deliver code from earlier compilations if the “same” compile is requested • Classify compiles that are “almost” the same: share less optimized code?
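A back-of-the-envelope way to read "pay off" is sketched below. The symbols are illustrative assumptions, not from the talk: $C$ is the one-time cost of generating and compiling the code, $t_{\text{gen}}$ and $t_{\text{spec}}$ are the per-run times of the general and the specialized code (with $t_{\text{gen}} > t_{\text{spec}}$), $n$ is the number of runs, $k$ is the number of clients sharing one compile, and $r$ is a per-client request overhead.

```latex
% In-process: the client pays the whole compile cost C itself.
n \,(t_{\text{gen}} - t_{\text{spec}}) > C
\quad\Longleftrightarrow\quad
n > \frac{C}{t_{\text{gen}} - t_{\text{spec}}}

% Shared service: one compile is amortized over k clients,
% each paying a request overhead r.
n \,(t_{\text{gen}} - t_{\text{spec}}) > \frac{C}{k} + r
```

Under these assumptions, sharing helps whenever $C/k + r < C$, which is why long running or repetitive tasks, and many clients requesting the "same" compile, are the favourable cases.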
  47. 47. We call this JIT as a Service 47
  48. 48. Demo 1: JitBuilder as a Service
      IBM Extreme Blue student project over the summer:
      1. Record JitBuilder API calls on the client side
      2. Send that record to a JIT server
      3. If no code generated already for the provided record:
         4. Replay the recorded calls to JitBuilder implementation in the JIT server
         5. Store generated code using the record as key
      6. Send the code back to be installed in client’s code cache
      7. Client can call the native code via a function pointer
      Created a client JIT for ultra simple (but Turing complete!) “BF” language https://en.wikipedia.org/wiki/Brainf***
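The record, replay and cache steps listed for Demo 1 can be sketched in a single process. This is a toy under stated assumptions, not the student project's code: `Op`, `Record`, `ToyJitServer` and the three-operation vocabulary are hypothetical, the "server" is an ordinary object instead of a remote service, and a closure stands in for the native code that would be sent back.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// One recorded builder call, e.g. {"ConstInt32", 1}.
struct Op {
    std::string name;
    int32_t operand;
};
using Record = std::vector<Op>;                // step 1: what the client records
using Code = std::function<int32_t(int32_t)>;  // stands in for native code

// Serialize a record so it can be used as the cache key (step 5).
std::string keyOf(const Record &record) {
    std::string key;
    for (const Op &op : record)
        key += op.name + ":" + std::to_string(op.operand) + ";";
    return key;
}

class ToyJitServer {
public:
    // Steps 2-6: receive a record, reuse or build code, hand it back.
    Code compile(const Record &record) {
        const std::string key = keyOf(record);
        auto hit = cache_.find(key);
        if (hit != cache_.end()) return hit->second;  // "same" compile seen before
        ++compiles_;
        Code code = replay(record);
        cache_.emplace(key, code);
        return code;
    }
    int compiles() const { return compiles_; }

private:
    // Step 4: replay the recorded calls once, composing closures on a stack.
    static Code replay(const Record &record) {
        std::vector<Code> stack;
        for (const Op &op : record) {
            if (op.name == "Load") {
                stack.push_back([](int32_t value) { return value; });
            } else if (op.name == "ConstInt32") {
                const int32_t c = op.operand;
                stack.push_back([c](int32_t) { return c; });
            } else if (op.name == "Add") {
                if (stack.size() < 2) throw std::invalid_argument("Add needs two operands");
                Code rhs = stack.back(); stack.pop_back();
                Code lhs = stack.back(); stack.pop_back();
                stack.push_back([lhs, rhs](int32_t v) { return lhs(v) + rhs(v); });
            }
            // "Return" leaves the top of the stack as the function result.
        }
        if (stack.empty()) throw std::invalid_argument("record produces no value");
        return stack.back();
    }

    std::map<std::string, Code> cache_;
    int compiles_ = 0;
};
```

A client that records `Load`, `ConstInt32 1`, `Add`, `Return` (the increment example) gets back a callable; a second request with the same record is answered from the cache, so `compiles()` stays at 1.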
  49. 49. Looks neat, but…. Would it work for anything real? 49
  50. 50. Eclipse OpenJ9 • http://www.eclipse.org/openj9 • https://github.com/eclipse/openj9 • Dual License: Eclipse Public License v2.0, Apache 2.0 • Users and contributors very welcome: https://github.com/eclipse/openj9/blob/master/CONTRIBUTING.md • Created Sept 2017 • New 0.10 release Oct 2, 2018! Now for JDK11! • New 0.11 release Oct 19, 2018! New goodies!
  51. 51. The OpenJ9 JITaaS Architecture • A client-server model • Bidirectional communication • Java bytecode + metadata sent to server • Compiled code + metadata returned to client • [Diagram: timeline between Client and Server showing “Compilation begins”, “VM queries ×N”, “Compilation ends”]
  52. 52. Demo 2: OpenJ9 Java JITaaS 52
  53. 53. Private 53
  54. 54. Where to take this? 54
  55. 55. State of the world • Current state at Eclipse OMR and Eclipse OpenJ9 • Lots of plumbing completed and in progress • IBM production JIT team actively pushing forward in both projects • Other projects also have some of the basic pieces • LLVM: ORC JIT remote compilation service code • Java developers: Truffle for describing tasks as ASTs with hookup to Graal compiler • GCC: LibJIT project • + others
  56. 56. Going forward • JitBuilder bindings for different languages • Table stakes to put a strong JIT into as many developers’ hands as possible • Expanding the API to meet developers where and how they develop • E.g. dynamic specialized data structure services such as generating map<int, double> or, who knows, map< struct<int,int,double,float>, struct<double,int,float,int> > • Improved vectorizing and parallel (GPU, FPGA?) services • Target developers using functional or other paradigms • Accelerate tasks, not just write JITs for languages • E.g. JSON/regex/graph processing/query tasks, your input needed here! • Machine learning and optimizations applied to client optimization in the cloud • How to classify different clients as doing “the same” thing? • How to leverage that for optimization versus sharing? • How to spread profiling overheads across a set of clients? • What are the first not-for-JIT optimizations we should consider applying?
  57. 57. Learn more! • Eclipse OMR @EclipseOMR • Web site: https://www.eclipse.org/omr • Repo: https://github.com/eclipse/omr • Slack: https://eclipse-omr.slack.com • TURBO Workshop in Stuart room Monday and Tuesday • Eclipse OpenJ9 @Openj9 • Web site: https://www.eclipse.org/openj9 • Repo: https://github.com/eclipse/openj9 • Slack: https://openj9.slack.com • JitBuilder: google it! 57
  58. 58. Thank you! • Mark Stoodley @mstoodle mstoodle@ca.ibm.com • Eclipse OpenJ9 and OMR project lead • Production JIT developer since 2002 • Creator of the OMR JitBuilder library
