
GEM - GNU C Compiler Extensions Framework



  1. 1. GCC Hacks Alexey Smirnov GRC’06
  2. 2. Introduction <ul><li>The GNU Compiler Collection includes compilers for C, C++, Java, and other languages, plus their runtime libraries </li></ul><ul><li>Standard compiler for Linux </li></ul><ul><li>Latest release: GCC 4.1 </li></ul>
  3. 3. Introduction <ul><li>GEM – compiler extensibility framework </li></ul><ul><li>Examples: syntactic sugar, BCC, Propolice, etc. </li></ul><ul><li>Dynamically loaded modules simplify development and deployment </li></ul>
  4. 4. Overview <ul><li>GCC 3.4 Tutorial </li></ul><ul><li>GEM Overview </li></ul><ul><li>Hacks, hacks, hacks. </li></ul>
  5. 5. GCC Architecture <ul><li>Driver program gcc: finds the appropriate compiler, then calls the compiler, assembler, and linker </li></ul><ul><li>C language: cc1, as, collect2 </li></ul><ul><li>This presentation: cc1 </li></ul>
  6. 6. GCC Architecture <ul><li>Front end, middle end, back end. </li></ul>
  7. 7. Representations <ul><li>AST – abstract syntax tree </li></ul><ul><li>RTL – register transfer language </li></ul><ul><li>Object – assembly code of target platform </li></ul><ul><li>Other representations used for optimizations </li></ul>
  8. 8. GCC Initialization <ul><li>cc1 is both the preprocessor and the compiler </li></ul><ul><li>toplev.c: toplev_main() handles command-line option processing, front end/back end initialization, and global scope creation </li></ul><ul><li>Front end is initialized with </li></ul><ul><ul><li>standard types: char_type_node, integer_type_node, unsigned_type_node. </li></ul></ul><ul><ul><li>built-in functions: builtin_memcpy, builtin_strlen </li></ul></ul><ul><li>These objects are instances of tree. </li></ul>
  9. 9. Tree data type <ul><li>Code, operands. </li></ul><ul><li>MODIFY_EXPR – an assignment expression. TREE_OPERAND(t,0) – left-hand side, TREE_OPERAND(t,1) – right-hand side. </li></ul><ul><li>ARRAY_TYPE – an array type. TREE_TYPE(t) – type of array element, TYPE_DOMAIN(t) – type of index. </li></ul><ul><li>CALL_EXPR – function call. TREE_OPERAND(t,0) – the function being called, TREE_OPERAND(t,1) – function arguments. </li></ul><ul><li>debug_tree() prints out the AST </li></ul>
  10. 10. Parser <ul><li>Identifier after identifier </li></ul><ul><li>get_identifier() char* -> tree with IDENTIFIER_NODE code. </li></ul><ul><li>A declaration is a tree node with _DECL code. lookup_name() returns declaration corresponding to the symbol </li></ul><ul><li>Symbol table not constructed. C_DECL_INVISIBLE attribute used instead. </li></ul>
  11. 11. AST to RTL to assembly <ul><li>start_decl() / finish_decl() </li></ul><ul><li>start_function() / finish_function() </li></ul><ul><li>tree build_function_call (tree function, tree params) </li></ul><ul><li>Once a function is parsed, it is converted to RTL either immediately or after the whole file is parsed, depending on the -funit-at-a-time option </li></ul><ul><li>finish_function() </li></ul><ul><li>Assembly code is generated from RTL. output_asm_insn() is executed for each instruction </li></ul>
  12. 12. GEM Framework <ul><li>The idea is similar to that of LSM </li></ul><ul><li>Module loaded using an option: </li></ul><ul><ul><li>-fextension-module=test.gem </li></ul></ul><ul><li>Hooks throughout GCC code </li></ul><ul><ul><li>AST </li></ul></ul><ul><ul><li>Assembly output </li></ul></ul><ul><ul><li>New hooks added when needed </li></ul></ul>
  13. 13. GEM Framework Hooks <ul><li>gem_handle_option </li></ul><ul><li>gem_c_common_nodes_and_builtins </li></ul><ul><li>gem_macro_name, gem_macro_def </li></ul><ul><li>gem_start_decl, gem_start_func </li></ul><ul><li>gem_finish_function </li></ul><ul><li>gem_output_asm_insn </li></ul>
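The hook mechanism described above can be pictured as a table of function pointers that the compiler consults at fixed points, in the spirit of LSM. The following is only a minimal sketch under stated assumptions: the `toy_` names, the hook signatures, and the `ctx` argument are hypothetical, since the slides list the GEM hook names but not their prototypes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook table. The real GEM hook signatures are not shown
   in the slides; these two-argument callbacks are an assumption. */
struct toy_hooks {
    void (*start_decl)(const char *name, void *ctx);
    void (*finish_function)(const char *name, void *ctx);
};

/* Zero-initialized, so no hook is installed until a module registers. */
static struct toy_hooks active_hooks;

/* Called by an extension module when it is loaded. */
void toy_register(const struct toy_hooks *h)
{
    assert(h != NULL);
    active_hooks = *h;
}

/* Stand-in for a compiler routine that fires hooks around its own work. */
void toy_compile_function(const char *name, void *ctx)
{
    if (active_hooks.start_decl)
        active_hooks.start_decl(name, ctx);
    /* ... the compiler's normal processing would happen here ... */
    if (active_hooks.finish_function)
        active_hooks.finish_function(name, ctx);
}

/* Example module callback: counts how many times any hook fired. */
void toy_count_hook(const char *name, void *ctx)
{
    (void)name;
    ++*(int *)ctx;
}
```

With no module registered, `toy_compile_function` behaves as if the framework were absent. That is the property that lets new hooks be added to the compiler without affecting ordinary builds.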
  14. 14. Traversing an AST <ul><li>walk_tree </li></ul><ul><li>static tree callback(tree *tp, …) { </li></ul><ul><li>switch (TREE_CODE(*tp)) { </li></ul><ul><li>case CALL_EXPR: </li></ul><ul><li>… </li></ul><ul><li>case VAR_DECL: </li></ul><ul><li>… </li></ul><ul><li>} </li></ul><ul><li>return NULL_TREE; </li></ul><ul><li>} </li></ul><ul><li>walk_tree(&t, callback, NULL, NULL); </li></ul>
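The callback pattern on this slide can be tried outside GCC with a toy tree. The sketch below is not GCC code: `toy_node`, `toy_walk_tree`, and the `TOY_` codes are hypothetical stand-ins, and the real callback takes an extra `walk_subtrees` argument that is omitted here. It keeps the same convention as `walk_tree`: a callback that returns non-NULL stops the walk, and that value is returned to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for GCC's tree codes and tree type (assumptions). */
enum toy_code { TOY_CALL_EXPR, TOY_VAR_DECL, TOY_MODIFY_EXPR };

struct toy_node {
    enum toy_code code;
    struct toy_node *operands[2];
};

typedef struct toy_node *toy_tree;
typedef toy_tree (*toy_walk_fn)(toy_tree *tp, void *data);

/* Pre-order walk. Stops at the first non-NULL callback result. */
toy_tree toy_walk_tree(toy_tree *tp, toy_walk_fn fn, void *data)
{
    if (tp == NULL || *tp == NULL)
        return NULL;
    toy_tree hit = fn(tp, data);
    if (hit != NULL)
        return hit;
    for (size_t i = 0; i < 2; i++) {
        hit = toy_walk_tree(&(*tp)->operands[i], fn, data);
        if (hit != NULL)
            return hit;
    }
    return NULL;
}

struct toy_counts { int calls; int vars; };

/* Callback in the style of the slide: dispatch on the node code. */
toy_tree toy_count_cb(toy_tree *tp, void *data)
{
    struct toy_counts *c = data;
    switch ((*tp)->code) {
    case TOY_CALL_EXPR: c->calls++; break;
    case TOY_VAR_DECL:  c->vars++;  break;
    default:            break;
    }
    return NULL; /* NULL means: keep walking */
}
```

Passing a pointer to the node pointer, as `walk_tree` does, lets a callback replace the node in place by assigning to `*tp`.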
  15. 15. Creating trees <ul><li>t = build_int_2(val, 0); </li></ul><ul><li>build1(ADDR_EXPR, build_pointer_type(T_T(t)), t); </li></ul><ul><li>build(MODIFY_EXPR, TREE_TYPE(left), left, val); </li></ul>
  16. 16. Hacks <ul><li>Syntactic sugar </li></ul><ul><li>Operating systems </li></ul><ul><li>Security </li></ul>
  17. 17. Syntactic Sugar <ul><li>When a compiler error occurs, fix the compiler rather than the program. </li></ul><ul><li>Examples: </li></ul><ul><ul><li>Function overloading as in C++ </li></ul></ul><ul><ul><li>toString() in each structure as in Java </li></ul></ul><ul><ul><li>Invoke a block of code from a function as in Ruby </li></ul></ul><ul><ul><li>Use functions to initialize a variable </li></ul></ul><ul><ul><li>Default argument values </li></ul></ul>
  18. 18. Security <ul><li>DIRA: detection, identification, and repair of control hijacking attacks </li></ul><ul><li>PASAN: signature and patch generation </li></ul><ul><li>Propolice -fstack-protector </li></ul>
  19. 19. Operating Systems <ul><li>Dusk: develop in userland, install at kernel level. </li></ul>
  20. 20. Function Overloading <ul><li>Two functions: </li></ul><ul><ul><li>void add(int, int); </li></ul></ul><ul><ul><li>void add(int, char*); </li></ul></ul><ul><li>The idea is to replace function name so that it includes argument types: </li></ul><ul><ul><li>add_i_i </li></ul></ul><ul><ul><li>add_i_pch </li></ul></ul><ul><li>gem_start_decl() </li></ul><ul><li>gem_start_function() </li></ul><ul><li>gem_build_function_call() </li></ul>
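The renaming scheme on this slide (`add_i_i`, `add_i_pch`) amounts to appending one type code per argument to the function name. Below is a minimal standalone sketch of that idea. The `arg_kind` enum and `mangle_name` are hypothetical names for illustration, and only the two codes that appear on the slide are covered; the actual module derives the codes from GCC's type trees.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical argument kinds; the slide shows only "i" and "pch". */
typedef enum { ARG_INT, ARG_CHAR_PTR } arg_kind;

const char *arg_code(arg_kind k)
{
    switch (k) {
    case ARG_INT:      return "i";
    case ARG_CHAR_PTR: return "pch";
    }
    return "unknown";
}

/* Writes base followed by "_<code>" for each argument into out.
   Returns 0 on success, -1 if out is too small. */
int mangle_name(const char *base, const arg_kind *args, size_t nargs,
                char *out, size_t out_size)
{
    assert(base != NULL && out != NULL);
    size_t used = strlen(base);
    if (used + 1 > out_size)
        return -1;
    memcpy(out, base, used + 1);
    for (size_t i = 0; i < nargs; i++) {
        const char *code = arg_code(args[i]);
        size_t need = strlen(code) + 1;      /* '_' plus the code */
        if (used + need + 1 > out_size)
            return -1;
        out[used] = '_';
        memcpy(out + used + 1, code, need);  /* copies code and its NUL */
        used += need;
    }
    return 0;
}
```

Calling `mangle_name("add", ...)` with the kinds `{ARG_INT, ARG_INT}` yields `add_i_i`, and with `{ARG_INT, ARG_CHAR_PTR}` yields `add_i_pch`, matching the two names on the slide.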
  21. 21. Alias Each Declaration <ul><li>cfo_find_symtab(&t_func, func_name); </li></ul><ul><li>if (t_func==NULL_TREE || DECL_BUILT_IN(t_func)) { return; } </li></ul><ul><li>If found then alias and create new declaration. </li></ul>
  22. 22. Alias Each Declaration <ul><li>strcpy(new_name, func_name); </li></ul><ul><li>strcat(new_name, cfo_build_name(TREE_PURPOSE(T_O(declarator, 1)))); </li></ul><ul><li>cfo_find_symtab(&t_func_alias, name); </li></ul><ul><li>If not found: </li></ul><ul><li>t_alias_attr=tree_cons(get_identifier("alias"), tree_cons(NULL_TREE, get_identifier(name), NULL_TREE), NULL_TREE); TYPE_ATTRIBUTES(T_T(t_func)) = t_alias_attr; DECL_ATTRIBUTES(t_func)=t_alias_attr; </li></ul><ul><li>T_O(declarator,0) = get_identifier(new_name); </li></ul>
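The `alias` attribute that this code attaches to a declaration is also available at source level in GCC, which gives a quick way to see its effect. The sketch below assumes an ELF target such as Linux, since the attribute is not supported on every platform. Which symbol aliases which is also an assumption made for illustration; the slides do not spell out the direction.

```c
#include <assert.h>

/* Definition under the type-encoded name. */
int add_i_i(int a, int b)
{
    return a + b;
}

/* A second symbol bound to the same code. The alias target must be
   defined in the same translation unit. */
int add(int, int) __attribute__((alias("add_i_i")));

/* Sanity check: both names refer to the same function. */
void check_alias(void)
{
    assert(add == add_i_i);
}
```

Because both names resolve to one function body, code that still refers to the original name keeps linking after the definition has been renamed.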
  23. 23. Replace function calls <ul><li>name = cfo_build_decl_name(t_func, t_parm); </li></ul><ul><li>t_new_func = get_identifier(name); </li></ul><ul><li>if (t_new_func) { t_new_func = lookup_name(t_new_func); } </li></ul><ul><li>*func = t_new_func; </li></ul>
  24. 24. Conclusion <ul><li>GCC is a big program, so we thought it was a good idea to document it. </li></ul><ul><li>GEM makes it possible to implement GCC extensions. </li></ul><ul><li>Examples: programming languages, security, OS. </li></ul>
  25. 25. Thank you