Transcript

  • 1. GCC Hacks Alexey Smirnov GRC’06 http://gcchacks.info
  • 2. Introduction
    • GNU Compiler Collection includes C, C++, Java, etc. compilers and libraries for them
    • Standard compiler for Linux
    • Latest release: GCC 4.1
    • http://gcc.gnu.org
  • 3. Introduction
    • GEM – compiler extensibility framework
    • Examples: syntactic sugar, BCC, Propolice, etc.
    • Dynamically loaded modules simplify development and deployment
  • 4. Overview
    • GCC 3.4 Tutorial
    • GEM Overview
    • Hacks, hacks, hacks.
  • 5. GCC Architecture
    • Driver program gcc: finds the appropriate compiler, then calls the compiler, assembler, and linker
    • C language: cc1, as, collect2
    • This presentation: cc1
  • 6. GCC Architecture
    • Front end, middle end, back end.
  • 7. Representations
    • AST – abstract syntax tree
    • RTL – register transfer language
    • Object – assembly code of target platform
    • Other representations used for optimizations
  • 8. GCC Initialization
    • cc1 is preprocessor and compiler
    • toplev.c: toplev_main() command-line option processing, front end/back end initialization, global scope creation
    • Front end is initialized with
      • standard types: char_type_node, integer_type_node, unsigned_type_node.
      • built-in functions: builtin_memcpy, builtin_strlen
    • These objects are instances of tree.
  • 9. Tree data type
    • Code, operands.
    • MODIFY_EXPR – an assignment expression. TREE_OPERAND(t,0), TREE_OPERAND(t,1)
    • ARRAY_TYPE – declaration of type. TREE_TYPE(t) – type of array element, TYPE_DOMAIN(t) – type of index.
    • CALL_EXPR – function call. TREE_OPERAND(t,0) – function definition, TREE_OPERAND(t,1) – function arguments.
    • debug_tree() prints out AST
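The "code plus operands" idea can be sketched with a toy node type. This is a simplified model written for illustration, not GCC's actual tree definition: every `toy_` and `TOY_` name is hypothetical, and `toy_format()` is only loosely analogous to debug_tree().

```c
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-ins for GCC tree codes; the real set is far larger. */
enum toy_code { TOY_INTEGER_CST, TOY_VAR_DECL, TOY_MODIFY_EXPR };

/* Hypothetical node layout: a code plus up to two operands. */
struct toy_tree {
    enum toy_code code;
    long value;                    /* used by TOY_INTEGER_CST */
    const char *name;              /* used by TOY_VAR_DECL */
    struct toy_tree *operands[2];  /* used by TOY_MODIFY_EXPR */
};

/* Accessors in the spirit of TREE_CODE() and TREE_OPERAND(). */
#define TOY_CODE(t) ((t)->code)
#define TOY_OPERAND(t, i) ((t)->operands[i])

static struct toy_tree *toy_node(enum toy_code code) {
    struct toy_tree *t = calloc(1, sizeof *t);
    if (t == NULL)
        abort();
    t->code = code;
    return t;
}

static struct toy_tree *toy_int(long value) {
    struct toy_tree *t = toy_node(TOY_INTEGER_CST);
    t->value = value;
    return t;
}

static struct toy_tree *toy_var(const char *name) {
    struct toy_tree *t = toy_node(TOY_VAR_DECL);
    t->name = name;
    return t;
}

/* Assignment: operand 0 is the left-hand side, operand 1 the right. */
static struct toy_tree *toy_assign(struct toy_tree *lhs, struct toy_tree *rhs) {
    struct toy_tree *t = toy_node(TOY_MODIFY_EXPR);
    TOY_OPERAND(t, 0) = lhs;
    TOY_OPERAND(t, 1) = rhs;
    return t;
}

/* Writes a textual form of the node into buf; returns snprintf's result. */
static int toy_format(const struct toy_tree *t, char *buf, size_t size) {
    switch (TOY_CODE(t)) {
    case TOY_INTEGER_CST:
        return snprintf(buf, size, "%ld", t->value);
    case TOY_VAR_DECL:
        return snprintf(buf, size, "%s", t->name);
    case TOY_MODIFY_EXPR: {
        char lhs[64], rhs[64];
        toy_format(TOY_OPERAND(t, 0), lhs, sizeof lhs);
        toy_format(TOY_OPERAND(t, 1), rhs, sizeof rhs);
        return snprintf(buf, size, "(%s = %s)", lhs, rhs);
    }
    }
    return 0;
}
```

Building `toy_assign(toy_var("x"), toy_int(42))` and formatting it gives a node whose code is `TOY_MODIFY_EXPR` and whose two operands are the variable and the constant.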
  • 10. Parser
    • Identifier after identifier
    • get_identifier() converts a char* into a tree with IDENTIFIER_NODE code.
    • A declaration is a tree node with _DECL code. lookup_name() returns declaration corresponding to the symbol
    • Symbol table not constructed. C_DECL_INVISIBLE attribute used instead.
  • 11. AST to RTL to assembly
    • start_decl() / finish_decl()
    • start_function() / finish_function()
    • tree build_function_call (tree function, tree params)
    • When a function is parsed, it is converted to RTL either immediately or after the whole file is parsed, depending on the option -funit-at-a-time
    • finish_function()
    • Assembly code is generated from RTL. output_asm_insn() is executed for each instruction
  • 12. GEM Framework
    • The idea is similar to that of LSM (Linux Security Modules)
    • Module loaded using an option:
      • -fextension-module=test.gem
    • Hooks throughout GCC code
      • AST
      • Assembly output
      • New hooks added when needed
  • 13. GEM Framework Hooks
    • gem_handle_option
    • gem_c_common_nodes_and_builtins
    • gem_macro_name, gem_macro_def
    • gem_start_decl, gem_start_func
    • gem_finish_function
    • gem_output_asm_insn
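The hook mechanism can be pictured as a table of function pointers that the compiler consults at fixed points. The sketch below is a toy model only: the struct layout, the `toy_` names, and the registration function are assumptions made for illustration and do not reproduce GEM's actual interface.

```c
#include <stddef.h>

/* Hypothetical hook table; GEM's real hooks have other signatures. */
struct toy_hooks {
    void (*start_decl)(const char *name);
    void (*finish_function)(const char *name);
};

/* All hooks are NULL until a module registers itself. */
static struct toy_hooks toy_hooks_table;

/* A hook point inside the "compiler": call the module if one is loaded. */
static void toy_compiler_start_decl(const char *name) {
    if (toy_hooks_table.start_decl != NULL)
        toy_hooks_table.start_decl(name);
    /* ... the compiler's normal handling of the declaration follows ... */
}

/* An example "module" that counts the declarations it sees. */
static int toy_decl_count;

static void toy_count_decl(const char *name) {
    (void)name;  /* the name is not needed for counting */
    toy_decl_count++;
}

/* Module entry point: fill in only the hooks the module cares about. */
static void toy_module_init(struct toy_hooks *hooks) {
    hooks->start_decl = toy_count_decl;
}
```

Leaving unused hooks as NULL keeps the compiler's behavior unchanged when no module is loaded, which is the property that makes this style of extension cheap to deploy.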
  • 14. Traversing an AST
    • walk_tree
    • static tree callback(tree *tp, …) {
    • switch (TREE_CODE(*tp)) {
    • case CALL_EXPR:
    • case VAR_DECL:
    • }
    • return NULL_TREE;
    • }
    • walk_tree(&t, callback, NULL, NULL);
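The callback pattern above can be tried outside the compiler with a toy walker. This is a sketch under simplifying assumptions: the node type and all `toy_` names are hypothetical, nodes have at most two operands, and the walker omits details of GCC's walk_tree such as the walk_subtrees flag. It keeps one behavior of the real function: the walk continues while the callback returns NULL and stops at the first non-NULL result.

```c
#include <stddef.h>

enum toy_code { TOY_VAR_DECL, TOY_CALL_EXPR, TOY_MODIFY_EXPR };

/* Hypothetical node: a code and up to two operands (NULL when unused). */
struct toy_tree {
    enum toy_code code;
    struct toy_tree *operands[2];
};

typedef struct toy_tree *(*toy_walk_fn)(struct toy_tree **tp, void *data);

/* Pre-order walk; returns the first non-NULL value the callback returns. */
static struct toy_tree *toy_walk(struct toy_tree **tp, toy_walk_fn fn,
                                 void *data) {
    if (*tp == NULL)
        return NULL;
    struct toy_tree *result = fn(tp, data);
    if (result != NULL)
        return result;
    for (int i = 0; i < 2; i++) {
        result = toy_walk(&(*tp)->operands[i], fn, data);
        if (result != NULL)
            return result;
    }
    return NULL;
}

/* Callback: count function calls, never stop the walk early. */
static struct toy_tree *toy_count_calls(struct toy_tree **tp, void *data) {
    switch ((*tp)->code) {
    case TOY_CALL_EXPR:
        ++*(int *)data;
        break;
    default:
        break;
    }
    return NULL;
}
```

Passing a pointer to the node pointer, as the slide's callback does, lets a callback replace a subtree in place by assigning to `*tp`.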
  • 15. Creating trees
    • t = build_int_2(val, 0);
    • build1(ADDR_EXPR, build_pointer_type(T_T(t)), t);
    • build(MODIFY_EXPR, TREE_TYPE(left), left, val);
  • 16. Hacks
    • Syntactic sugar
    • Operating systems
    • Security
  • 17. Syntactic Sugar
    • When a compiler error occurs, fix compiler rather than program.
    • Examples:
      • Function overloading as in C++
      • toString() in each structure as in Java
      • Invoke a block of code from a function, as in Ruby
      • Use functions to initialize a variable
      • Default argument values
  • 18. Security
    • DIRA: detection, identification, and repair of control hijacking attacks
    • PASAN: signature and patch generation
    • Propolice -fstack-protector
  • 19. Operating Systems
    • Dusk: develop in userland, install at kernel level.
  • 20. Function Overloading
    • Two functions:
      • void add(int, int);
      • void add(int, char*);
    • The idea is to replace function name so that it includes argument types:
      • add_i_i
      • add_i_pch
    • gem_start_decl()
    • gem_start_function()
    • gem_build_function_call()
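The renaming scheme can be sketched as a small standalone function. The suffixes "i" and "pch" are taken from the names on the slide; everything else (the `toy_type` enum, `toy_mangle()`, the error convention) is a hypothetical illustration and not the module's actual code.

```c
#include <stdio.h>

/* Only the two argument types that appear on the slide. */
enum toy_type { TOY_INT, TOY_CHAR_PTR };

static const char *toy_type_suffix(enum toy_type t) {
    switch (t) {
    case TOY_INT:
        return "i";
    case TOY_CHAR_PTR:
        return "pch";
    }
    return "unknown";
}

/* Builds "<func>_<suffix>_<suffix>..." in out.
   Returns 0 on success, -1 if out is too small. */
static int toy_mangle(const char *func, const enum toy_type *args,
                      size_t nargs, char *out, size_t size) {
    int len = snprintf(out, size, "%s", func);
    if (len < 0 || (size_t)len >= size)
        return -1;
    size_t used = (size_t)len;
    for (size_t i = 0; i < nargs; i++) {
        len = snprintf(out + used, size - used, "_%s",
                       toy_type_suffix(args[i]));
        if (len < 0 || (size_t)len >= size - used)
            return -1;
        used += (size_t)len;
    }
    return 0;
}
```

With the two declarations from the slide, `add(int, int)` maps to `add_i_i` and `add(int, char*)` maps to `add_i_pch`, so the two functions no longer collide in C's single namespace.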
  • 21. Alias Each Declaration
    • cfo_find_symtab(&t_func, func_name);
    • if (t_func==NULL_TREE || DECL_BUILT_IN(t_func)) { return; }
    • If found then alias and create new declaration.
  • 22. Alias Each Declaration
    • strcpy(new_name, func_name);
    • strcat(new_name, cfo_build_name(TREE_PURPOSE(T_O(declarator, 1))));
    • cfo_find_symtab(&t_func_alias, name);
    • If not found:
    • t_alias_attr=tree_cons(get_identifier("alias"), tree_cons(NULL_TREE, get_identifier(name), NULL_TREE), NULL_TREE); TYPE_ATTRIBUTES(T_T(t_func)) = t_alias_attr; DECL_ATTRIBUTES(t_func)=t_alias_attr;
    • T_O(declarator,0) = get_identifier(new_name);
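At source level, the attribute that the code above attaches to a declaration corresponds to GCC's `alias` function attribute. The fragment below is a minimal sketch of that attribute on its own, not of the module's output: it assumes GCC or a compatible compiler on a target that supports aliases (such as ELF on Linux), and it returns `int` instead of the slide's `void` only so that the result can be checked.

```c
/* The definition carries the type-encoded name. */
int add_i_i(int a, int b) {
    return a + b;
}

/* A second symbol for the same function. The alias target must be
   defined in the same translation unit. */
int add(int a, int b) __attribute__((alias("add_i_i")));
```

Calling `add(2, 3)` executes the body of `add_i_i`, since both names refer to the same code.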
  • 23. Replace function calls
    • name = cfo_build_decl_name(t_func, t_parm);
    • t_new_func = get_identifier(name);
    • if (t_new_func) { t_new_func = lookup_name(t_new_func); }
    • *func = t_new_func;
  • 24. Conclusion
    • GCC is a big program, so we thought it would be a good idea to document it:
      • http://en.wikibooks.org/GNU_C_Compiler_Internals
    • GEM makes it possible to implement GCC extensions. http://www.ecsl.cs.sunysb.edu/gem
    • Examples: programming languages, security, OS.
  • 25. Thank you
    • http://gcchacks.info