JVM code reading -- C2

A material for fifth JVM source code reading

Published in: Technology, Business

  1. C2: The Server Compiler. 5th JVM Source Code Reading Meetup, @ytoshima
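  2. Compilation trigger
      The interpreter counts executions of invoke* bytecodes and of goto bytecodes with a negative offset (backward branches). When a count exceeds its threshold, a compilation request is sent to CompileBroker. The request is placed on a queue, and a CompilerThread watching that queue starts the compilation. Compilation is designed to run concurrently with GC activity.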
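The counter-and-queue flow described above can be sketched as a toy model. All names here (MethodCounters, CompileRequest, ToyCompileBroker) are hypothetical stand-ins and not HotSpot types, and a single shared threshold is an assumption made for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical per-method counters, bumped by an imaginary interpreter.
struct MethodCounters {
    std::string name;
    int invocations = 0;    // incremented on invoke*
    int back_branches = 0;  // incremented on backward goto
    bool queued = false;    // avoid enqueuing the same method twice
};

struct CompileRequest {
    std::string method;
    bool osr;  // true when triggered by a loop back-edge
};

class ToyCompileBroker {
public:
    explicit ToyCompileBroker(int threshold) : threshold_(threshold) {
        assert(threshold > 0);
    }
    void on_invocation(MethodCounters& m) {
        if (++m.invocations >= threshold_) request(m, false);
    }
    void on_back_branch(MethodCounters& m) {
        if (++m.back_branches >= threshold_) request(m, true);
    }
    // What a compiler thread would call to fetch its next task.
    bool next(CompileRequest& out) {
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop_front();
        return true;
    }
    std::size_t pending() const { return queue_.size(); }

private:
    void request(MethodCounters& m, bool osr) {
        if (m.queued) return;
        m.queued = true;
        queue_.push_back(CompileRequest{m.name, osr});
    }
    int threshold_;
    std::deque<CompileRequest> queue_;
};
```

The real broker also handles compile levels, blocking requests, and thread synchronization, all of which this sketch leaves out.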
  3. Compilation trigger
      // Method invocation
      SimpleCompPolicy::method_invocation_event
        CompileBroker::compile_method(...)
      // OSR
      SimpleCompPolicy::method_back_branch_event
        CompileBroker::compile_method(...)
  4. Compilation trigger
      CompileBroker::compile_method
        CompileBroker::compile_method_base
          :
          CompileQueue* queue = compile_queue(comp_level);
          task = create_compile_task(queue, compile_id, method, osr_bci, comp_level,
                                     hot_method, hot_count, comment, blocking);
  5. Compile
      Compile::Compile
        :
        if ((jvms = cg->generate(jvms)) == NULL) // Parse
        :
        Optimize();
        :
        Code_Gen();
  6. Compiler data structures: Node, MachNode, Type, ciObject, Phase, JVMState, nmethod
  7. Compiler data structures: Compile::build_start_state + aload_0 (this)
  8. Compiler data structures: Graph Text Representation
  9. Compiler data structures
      $ ~/jdk1.7.0-b147/fastdebug/bin/java -XX:+PrintCompilation -XX:+PrintIdeal -XX:CICompilerCount=1 sum
          214    1    sum::doit (22 bytes)
      VM option +PrintIdeal
      ...
       21  ConI === 0 [[ 180 ]] #int:0
       180 Phi  === 184 21 70 [[ 179 ]] #int !orig=[159],[139],[66] !jvms: sum::doit @ bci:10
       179 AddI === _ 180 181 [[ 178 ]] !orig=[154],[137],70,[145] !jvms: sum::doit @ bci:12
       178 AddI === _ 179 181 [[ 177 ]] !orig=[153],[146],[135],86,[71] !jvms: sum::doit @ bci:14
       177 AddI === _ 178 181 [[ 176 ]] !orig=[165],[152],86,[71] !jvms: sum::doit @ bci:14
       148 ConI === 0 [[ 87 ]] #int:97
       176 AddI === _ 177 181 [[ 190 ]] !orig=[168],[152],86,[71] !jvms: sum::doit @ bci:14
      // <idx> <node type> === <in[]> [[out[]]] <additional desc>
      // jvms = JVMState; root()->dump(9999) would dump the IR as above
      Real example: https://gist.github.com/1369656
  10. Ideal Graph Visualizer: graph display option
      At level 4, the IR can be displayed at each stage of parsing and at each stage of optimization.
  11. Ideal Graph Visualizer
      public int getValue() { return value; }
      enum { Control, I_O, Memory, FramePtr, ReturnAdr, Parms };
  12. Ideal Graph Visualizer: Final Code
      The nodes have become machine-dependent ones such as Mach* nodes, Epilog, Prolog, and Ret.
  13. Compiler data structures: Ideal Graph Visualizer
      http://ssw.jku.at/General/Staff/TW/igv.html
      etc/idealgraphvisualizer.conf:
        default_options="-J-Xmx400m --branding idealgraphvisualizer"
      You need a fastdebug or a debug build. In the OpenJDK build dir:
        $ make fastdebug_build
      or
        $ make debug_build
  14. Compiler data structures: options to generate data for Ideal Graph Visualizer
      -XX:PrintIdealGraphLevel=0 [0: none, 4: most verbose]
      -XX:PrintIdealGraphPort=4444
      -XX:PrintIdealGraphAddress="127.0.0.1"
      -XX:PrintIdealGraphFile=<path to IR xml file>
      Ideal Graph Visualizer listens on port 4444 by default.
  15. Compiler data structures
      // -XX:PrintIdealGraphFile=<path>; Ideal Graph Visualizer can display this
      <graphDocument>
        <group>
          <properties>
            <p name="name"> virtual jint Call.doit()</p>
          </properties>
          <graph name="Bytecode 0: aload_0">
            <nodes>
              <node id="159337448">
                <properties>
                  <p name="name"> Root</p>
                  <p name="type"> bottom</p>
                  <p name="idx"> 0</p>
                  ...
                </properties>
              </node>
              <node id="159414516"> ... </node>
              <node id="159337448"> ... </node>
              ...
            </nodes>
            <edges>
              <edge index="0" to="159337448" from="159337448"></edge>
              <edge index="0" to="159414516" from="159414516"></edge>
      Example: https://gist.github.com/1369620
  16. Compiler data structures: Node
      # _in : Node**          // use-def
      # _out: Node**          // def-use
      # _cnt: node_idx_t      // # of required inputs
      # _max: node_idx_t      // actual input array length
      # _outcnt: node_idx_t
      # _outmax: node_idx_t
      - _class_id: jushort
      - _flags: jushort
      // _in is ordered; the position of each input matters
      // 340 subclasses in total, versus 62 in C1
      // _class_id is a 16-bit value used to determine the node type for ideal and mach nodes
      // _flags holds enum NodeFlags: Flag_is_Copy, Flag_is_Call,
      // Flag_is_macro, Flag_is_Con, ...
  17. Compiler data structures
      // Insert a new required input at the end
      void Node::ins_req( uint idx, Node *n ) {
        assert( is_not_dead(n), "can not use dead node");
        add_req(NULL);                 // Make space
        ...
        _in[idx] = n;                  // Stuff over old required edge
        if (n != NULL) n->add_out((Node *)this); // Add reciprocal def-use edge
      }
      void add_out( Node *n ) {
        if (is_top()) return;
        if( _outcnt == _outmax ) out_grow(_outcnt);
        _out[_outcnt++] = n;
      }
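The reciprocal use-def / def-use bookkeeping shown above can be illustrated with a much smaller model. ToyNode is a hypothetical type that uses std::vector instead of HotSpot's hand-managed arrays; it only demonstrates that every input edge gets a matching output edge.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy node, not HotSpot's Node: `in` holds ordered use-def edges,
// `out` holds unordered def-use edges kept in sync with the inputs.
struct ToyNode {
    std::vector<ToyNode*> in;
    std::vector<ToyNode*> out;

    // Append an input and record the reverse (def-use) edge on the input.
    void add_req(ToyNode* n) {
        in.push_back(n);
        if (n != nullptr) n->out.push_back(this);
    }

    // Insert n at position idx, shifting later inputs to the right.
    void ins_req(std::size_t idx, ToyNode* n) {
        assert(idx <= in.size());
        in.insert(in.begin() + static_cast<std::ptrdiff_t>(idx), n);
        if (n != nullptr) n->out.push_back(this);
    }
};
```

Because input positions carry meaning, the insert preserves the order of the existing inputs; the output list has no such ordering requirement.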
  18. Compiler data structures
      RegionNode: merges control.
      PhiNode: merges the data that accompanies a control merge. It points to the corresponding RegionNode.
  19. Compiler data structures: Node
      // Optimize functions
      // more ideal node, canonicalize
      virtual Node *Ideal(PhaseGVN *phase, bool can_reshape);
      // set of values this node can take
      virtual const Type *Value( PhaseTransform *phase ) const;
      // existing node which computes same
      virtual Node *Identity( PhaseTransform *phase );
  20. Compiler data structures: Node
      Ideal is defined in Add*Node, MinINode, StartNode, ReturnNode, RethrowNode, SafePointNode, AllocateArrayNode, LockNode, UnlockNode, RegionNode, PhiNode, PCTableNode, NeverBranchNode, CMove*Node, ConstraintCastNode, CheckCastPPNode, Conv?2?Node, Div?Node, Mod?Node, IfNode, LoopNode, CountedLoopNode, LoopLimitNode, Load*Node, Store*Node, ClearArrayNode, StrIntrinsicNode, MemBarNode, MergeMemNode, Mul*Node, And*Node, LShift*Node, URShift*Node, RootNode, HaltNode, Sub*Node, Cmp*Node, BoolNode.
      Many subclasses define their own Ideal, Value, and Identity to suit their purpose.
  21. Compiler data structures: AddNode Ideal
      Convert "(x+1)+2" into "x+(1+2)"
      Convert "(x+1)+y" into "(x+y)+1"
      Convert "x+(y+1)" into "(x+y)+1"
      [Figure: node graphs over x, Con1, and Con2, each with two Add nodes]
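As a rough illustration of the first rewrite, "(x+1)+2" into "x+(1+2)", here is a sketch over a tiny expression tree. Expr, var, con, add, and ideal_add are hypothetical names; returning nullptr for "no change" mirrors the convention visible in the transform loop, but nothing here is HotSpot code. Constant folding uses plain int addition, so the sketch assumes small constants and ignores Java's wrap-around semantics.

```cpp
#include <cassert>
#include <memory>

struct Expr {
    enum Kind { Var, Con, Add } kind;
    int value;                       // meaningful for Con only
    std::shared_ptr<Expr> lhs, rhs;  // meaningful for Add only
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr var() { return std::make_shared<Expr>(Expr{Expr::Var, 0, nullptr, nullptr}); }
ExprPtr con(int v) { return std::make_shared<Expr>(Expr{Expr::Con, v, nullptr, nullptr}); }
ExprPtr add(ExprPtr a, ExprPtr b) {
    assert(a && b);
    return std::make_shared<Expr>(Expr{Expr::Add, 0, a, b});
}

// Rewrite "(x + c1) + c2" into "x + (c1 + c2)" with the constants folded.
// Returns nullptr when the pattern does not match (no progress made).
ExprPtr ideal_add(const ExprPtr& n) {
    if (n->kind != Expr::Add) return nullptr;
    const ExprPtr& l = n->lhs;
    const ExprPtr& r = n->rhs;
    if (l->kind == Expr::Add && l->rhs->kind == Expr::Con && r->kind == Expr::Con) {
        return add(l->lhs, con(l->rhs->value + r->value));
    }
    return nullptr;
}
```

Pushing constants to one side like this is what lets later passes see "x + 3" instead of a chain of additions.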
  22. Compiler data structures: AddINode Ideal
      Node* in1 = in(1);
      Node* in2 = in(2);
      int op1 = in1->Opcode();
      int op2 = in2->Opcode();
      // Fold (con1-x)+con2 into (con1+con2)-x
      if ( op1 == Op_AddI && op2 == Op_SubI ) {
        // Swap edges to try optimizations below
        in1 = in2;
        in2 = in(1);
        op1 = op2;
        op2 = in2->Opcode();
      }
      if( op1 == Op_SubI ) {
        // "(a-b)+(c-d)" into "(a+c)-(b+d)"
        // "(a-b)+(b+c)" into "(a+c)"
        // "(a-b)+(c+b)" into "(a+c)"
  23. Compiler data structures
      const Type *AddNode::Value(...)
        // Either input is TOP ==> the result is TOP
        // Either input is BOTTOM ==> the result is the local BOTTOM
        // Check for an addition involving the additive identity
  24. Compiler data structures
      Node *AddNode::Identity(...)
        // If either input is a constant 0, return the other input.
        const Type *zero = add_id(); // The additive identity
        if( phase->type( in(1) )->higher_equal( zero ) ) return in(2);
        if( phase->type( in(2) )->higher_equal( zero ) ) return in(1);
        return this;
  25. Compiler data structures
      Node *AddINode::Identity(...)
        // Fold (x-y)+y OR y+(x-y) into x
        if( in(1)->Opcode() == Op_SubI && phase->eqv(in(1)->in(2),in(2)) ) {
          return in(1)->in(1);
        } else if( in(2)->Opcode() == Op_SubI && phase->eqv(in(2)->in(2),in(1)) ) {
          return in(2)->in(1);
        }
        return AddNode::Identity(phase);
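The two Identity rules (additive zero, and (x-y)+y) can be modelled in a few lines. Term and identity_add are hypothetical names; node equality is approximated by pointer equality, which stands in for phase->eqv only loosely.

```cpp
#include <cassert>

struct Term {
    enum Op { Leaf, Con, Add, Sub } op;
    int value;      // Con only
    const Term* a;  // first input (Add/Sub)
    const Term* b;  // second input (Add/Sub)
};

bool is_zero(const Term* t) { return t->op == Term::Con && t->value == 0; }

// Return an existing term that computes the same value, or n itself
// when no simpler equivalent is known.
const Term* identity_add(const Term* n) {
    assert(n != nullptr && n->op == Term::Add);
    // (x - y) + y  ==>  x
    if (n->a->op == Term::Sub && n->a->b == n->b) return n->a->a;
    // y + (x - y)  ==>  x
    if (n->b->op == Term::Sub && n->b->b == n->a) return n->b->a;
    // 0 + x  ==>  x   and   x + 0  ==>  x
    if (is_zero(n->a)) return n->b;
    if (is_zero(n->b)) return n->a;
    return n;
}
```

Unlike an Ideal-style rewrite, nothing new is allocated here: the result is always a term that already exists.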
  26. Compiler data structures
      Node *PhaseGVN::transform_no_reclaim
      // Return a node which computes the same function
      // as this node, but in a faster or cheaper fashion.
      while( 1 ) {
        Node *i = k->Ideal(this, /*can_reshape=*/false);
        if( !i ) break;
        ...
      }
      const Type *t = k->Value(this); // Get runtime Value set
      k->raise_bottom_type(t);
      Node *i = k->Identity(this);
      if (i != k) return i;
      i = hash_find_insert(k);
      if( i && (i != k)) return i;
      Every Node created during Parse is passed through transform, so GVN runs while parsing.
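The hash_find_insert step is the value-numbering part: a node with the same opcode and the same inputs as an existing node is replaced by that existing node. A minimal sketch, assuming nodes are identified by small integer ids and have exactly two inputs (ToyValueTable is a hypothetical name):

```cpp
#include <cassert>
#include <map>
#include <tuple>

// Toy value-numbering table keyed by (opcode, in1, in2), where the
// inputs are ids of nodes that were numbered earlier.
class ToyValueTable {
public:
    // Leaves (parameters, constants) always get a fresh id in this sketch.
    int fresh() { return next_id_++; }

    // Return the id of an existing equivalent node, or register a new one.
    int find_insert(char opcode, int in1, int in2) {
        assert(in1 >= 0 && in1 < next_id_ && in2 >= 0 && in2 < next_id_);
        auto key = std::make_tuple(opcode, in1, in2);
        auto it = table_.find(key);
        if (it != table_.end()) return it->second;
        table_[key] = next_id_;
        return next_id_++;
    }

private:
    std::map<std::tuple<char, int, int>, int> table_;
    int next_id_ = 0;
};
```

Because parsing creates inputs before their users, a single lookup per new node is enough to share common subexpressions as the graph is built.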
  27. Compiler data structures
      Node                  // raise_bottom_type related
        TypeNode            // Type* _type
          ConNode           // ConINode, ConPNode ..
          PhiNode           // TypePtr* _adr_type
                            // int _inst_id, _inst_index, _inst_offset
          ConvI2LNode
        MemNode             // TypePtr* _adr_type
          LoadNode          // Type* _type
            LoadPNode       // load obj or arr
            LoadINode
      https://gist.github.com/1369608
  28. Compiler data structures
      Node
       ^ RegionNode
           Can be mapped to basic blocks. Its inputs are control sources.
           A PhiNode has an input that points to its RegionNode. The merged data inputs of a PhiNode correspond one-to-one with the inputs of the RegionNode. Input 0 of a PhiNode is the RegionNode, and input 0 of a RegionNode is the RegionNode itself.
           PhiNode* has_phi() const
          ^ LoopNode // Simple Loop Header
              short _loop_flags
             ^ RootNode
  29. Compiler data structures: MultiNode, SafePointNode
  30. Compiler data structures
      TypeNode
       ^ ConNode
  31. Compiler data structures
      Node
       ^ ProjNode  // project a single elem out of a tuple or signature type
          ^ ParmNode  // incoming Parameters
      const uint _con;        // The field in the tuple we are projecting
      const bool _is_io_use;  // Used to distinguish between the projections
                              // used on the control and io paths from a macro node
  32. Compiler data structures
      Node
       ^ MergeMem  // (See comment in memnode.cpp near MergeMemNode::MergeMemNode for semantics.)
      in(AliasIdxTop) = in(1) is always the top node
      in(0) is NULL
      in(AliasIdxBot) is a "wide" memory state.
      For in(AliasIdxRaw) = in(3) and above, mem state for alias type <N> or top
      base_memory()  // wide state
      memory_at(N)   // for alias type <N>
      Identity: returns base if base is empty, otherwise this
      Ideal: Simplify stacked MergeMem
  33. Compiler data structures
      TypeNode
       ^ PhiNode
      Merges values from different control paths. Slot 0 is the controlling RegionNode.
  34. Compiler data structures
      class ConINode : public ConNode {
      public:
        ConINode( const TypeInt *t ) : ConNode(t) {}
        virtual int Opcode() const;
        // Factory method:
        static ConINode* make( Compile* C, int con ) {
          return new (C, 1) ConINode( TypeInt::make(con) );
        }
      class ConNode : public TypeNode {
      public:
        ConNode( const Type *t ) : TypeNode(t,1) {
          init_req(0, (Node*)Compile::current()->root());
          init_flags(Flag_is_Con);
        }
      class TypeNode : public Node {
        const Type* const _type;
        TypeNode( const Type *t, uint required ) : Node
  35. Compiler data structures
      // Add pointer plus integer to get pointer. NOT commutative, really.
      // So not really an AddNode. Lives here, because people associate it with
      // an add.
      class AddPNode : public Node {
      public:
        enum { Control,   // When is it safe to do this add?
               Base,      // Base oop, for GC purposes
               Address,   // Actually address, derived from base
               Offset };  // Offset added to address
        AddPNode( Node *base, Node *ptr, Node *off ) : Node(0,base,ptr,off) {
          init_class_id(Class_AddP);
        }
      Identity: if one input is 0, return in(Address), otherwise this
      Ideal:
        If the left input is an addition of a constant, flatten the expression tree.
        For a raw pointer that is NULL, use CastX2PNode(offset).
        If the right input is an addition of a constant, change (ptr + (offset + con)) into (ptr + offset) + con.
  36. Compiler data structures
      // Return from subroutine node
      class ReturnNode : public Node {
      public:
        ReturnNode( uint edges, Node *cntrl, Node *i_o, Node *memory,
                    Node *retadr, Node *frameptr );
        virtual int Opcode() const;
        virtual bool is_CFG() const { return true; }
  37. Compiler data structures: JVMState
      JVMState* _caller       // for scope chains
      uint _depth, _locoff, _stkoff, _monoff,
      uint _scloff            // offset of scalar objs
      uint _endoff
      uint _sp
      int _bci
      ReexecuteState _reexecute
      ciMethod* _method
      SafePointNode* _map
  38. Compiler data structures
      class Type {
      public:
        enum TYPES {
          Bad = 0, Control,
          Top,
          Int, Long, Half, NarrowOop,
          Tuple, Array,
          AnyPtr, RawPtr, OopPtr, InstPtr, AryPtr, KlassPtr,
          Function, Abio, Return_Address, Memory,
          FloatTop, FloatCon, FloatBot,
          DoubleTop, DoubleCon, DoubleBot,
          Bottom, lasttype
        };
      private:
        const Type *_dual;
      protected:
        const TYPES _base;
  39. Compiler data structures
      class Type {
        :
      public:
        TYPES base();
        static const Type *make(enum TYPES);
        static int cmp(Type*, Type*);
        int higher_equal( Type *t)
        const Type *meet(Type *t);
        virtual const Type *widen(Type *old, Type* limit)
        virtual const Type *narrow(Type *old)
  40. Compiler data structures
      class Dict;
      class Type;
      class   TypeD;
      class   TypeF;
      class   TypeInt;
      class   TypeLong;
      class   TypeNarrowOop;
      class   TypeAry;
      class   TypeTuple;
      class   TypePtr;
      class     TypeRawPtr;
      class     TypeOopPtr;
      class       TypeInstPtr;
      class       TypeAryPtr;
      class       TypeKlassPtr;
  41. Compiler data structures
      [Class diagram rooted at Phase. Classes shown: PhaseTransform, Compile, PhaseIdealLoop, GraphKit, Matcher, PhaseCFG, PhaseValues, PhaseBlockLayout, PhaseGVN, PhaseCoalesce, PhaseIterGVN, PhaseAggressiveCoalesce, PhaseCCP, PhaseConservativeCoalesce, PhasePeephole, PhaseIFG, PhaseStringOpts, PhaseLive, PhaseMacroExpand, PhaseRegAlloc, PhaseChaitin, PhaseRemoveUseless]
  42. Compiler data structures
      class Phase : public StackObj {
      public:
        enum PhaseNumber { Compiler, Parser, Remove_Useless, ... }
      protected:
        enum PhaseNumber _pnum;
      public:
        Compile * C;
      }
  43. Compiler data structures
      class Compile : public Phase {
        const int        _compile_id;
        ciMethod*        _method;
        int              _entry_bci;
        const TypeFunc*  _tf;
        InlineTree*      _ilt;
        Arena            _comp_arena;
        ConnectionGraph* _congraph;
        uint             _unique;
        Arena            _node_arena;
        RootNode*        _root;
        Node*            _top;
        :
      }
  44. Compiler data structures
      class Compile : public Phase {
        :
        PhaseGVN*         _initial_gvn;
        Unique_Node_List  _for_igvn;
        WarmCallInfo*     _warm_calls;
        PhaseCFG*         _cfg;
        Matcher*          _matcher;
        PhaseRegAlloc*    _regalloc;
        OopMapSet*        _oop_map_set;
        :
      }
  45. Compiler data structures
      class PhaseTransform : public Phase {
      protected:
        Arena*     _arena;
        Node_Array _nodes;
        Type_Array _types;
        ConINode*  _icons[...];
        ConLNode*  _lcons[...];
        ConNode*   _zcons[...];
        :
      }
  46. Compiler data structures
      class PhaseTransform : public Phase {
      public:
        const Type* type(const Node* n) const;
        const Type* type_or_null(const Node* n) const;
        void set_type(const Node* n, const Type* t);
        void set_type_bottom(const Node* n);
        void ensure_type_or_null(const Node* n);
        ConNode* makecon(const Type* t);
        ConINode* intcon(jint i);
        ConLNode* longcon(jlong l);
        virtual Node *transform(Node *) = 0;
        :
      }
  47. Compiler data structures: manages values in a table
      class PhaseValues : public PhaseTransform {
      protected:
        NodeHash _table; // for value-numbering
      public:
        bool  hash_delete(Node *n);
        bool  hash_insert(Node *n);
        Node *hash_find_insert(Node* n);
        Node *hash_find(Node* n);
        :
      }
  48. Compiler data structures: local, pessimistic GVN-style optimization
      class PhaseGVN : public PhaseValues {
      public:
        Node *transform(Node *n);
        Node *transform_no_reclaim(Node *n);
        :
      }
  49. Compiler data structures: iterative local, pessimistic GVN-style optimization and ideal transformations
      class PhaseIterGVN : public PhaseGVN {
      private:
        bool _delay_transform;
        virtual Node *transform_old(Node *a_node);
        void subsume_node(Node *old, Node *nn);
      protected:
        virtual Node *transform(Node *a_node);
        void init_worklist(Node *a_root);
        virtual const Type* saturate(Type*, Type*, Type*)
      public:
        Unique_Node_List _worklist;
        void optimize();
        :
      }
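  50. Parse
      The first pass identifies the blocks. The second pass visits each block and processes the bytecodes in it, creating objects of Node subclasses, creating and updating JVMState, and optimizing along the way. So that values propagate well, blocks are visited in an order in which the blocks flowing into a block have been parsed beforehand as far as possible.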
  51. Parse
      #0 Parse::do_one_bytecode()
      #1 Parse::do_one_block()
      #2 Parse::do_all_blocks()
      #3 Parse::Parse(JVMState*, ciMethod*, float) ()
      #4 ParseGenerator::generate(JVMState*)
      #5 Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool)
  52. Parse do_one_bytecode
      switch (bc()) {
      case Bytecodes::_nop:
        // do nothing
        break;
      case Bytecodes::_lconst_0:
        push_pair(longcon(0));
        break;
      :
      case Bytecodes::_iconst_5: push(intcon( 5)); break;
      case Bytecodes::_bipush:   push(intcon(iter().get_constant_u1())); break;
      case Bytecodes::_sipush:   push(intcon(iter().get_constant_u2())); break;
      Helper functions such as makecon and intcon return nodes that represent constants.
  53. Parse do_one_bytecode
      case Bytecodes::_ldc:
      case Bytecodes::_ldc_w:
      case Bytecodes::_ldc2_w:
        // If the constant is unresolved, run this BC once in the interpreter.
        {
          ciConstant constant = iter().get_constant();
          if (constant.basic_type() == T_OBJECT &&
              !constant.as_object()->is_loaded()) {
            int index = iter().get_constant_pool_index();
  54. Parse do_one_bytecode
      case Bytecodes::_aload_0:
        push( local(0) );
        break;
      :
      case Bytecodes::_aload:
        push( local(iter().get_index()) );
        break;
      push and local end up changing the state of the JVMState and the SafePointNode. iter() can be used to fetch the operands of a bytecode.
  55. Parse do_one_bytecode
      case Bytecodes::_fstore_0:
      case Bytecodes::_istore_0:
      case Bytecodes::_astore_0:
        set_local( 0, pop() );
        break;
      :
      case Bytecodes::_fstore:
      case Bytecodes::_istore:
      case Bytecodes::_astore:
        set_local( iter().get_index(), pop() );
        break;
  56. Parse do_one_bytecode
      case Bytecodes::_pop:  _sp -= 1; break;
      case Bytecodes::_pop2: _sp -= 2; break;
      case Bytecodes::_swap:
        a = pop();
        b = pop();
        push(a);
        push(b);
        break;
      case Bytecodes::_dup:
        a = pop();
        push(a);
        push(a);
        break;
  57. Parse do_one_bytecode
      case Bytecodes::_baload: array_load(T_BYTE);   break;
      case Bytecodes::_caload: array_load(T_CHAR);   break;
      case Bytecodes::_iaload: array_load(T_INT);    break;
      case Bytecodes::_saload: array_load(T_SHORT);  break;
      case Bytecodes::_faload: array_load(T_FLOAT);  break;
      case Bytecodes::_aaload: array_load(T_OBJECT); break;
      case Bytecodes::_laload: {
        a = array_addressing(T_LONG, 0);
        if (stopped()) return; // guaranteed null or range check
        _sp -= 2;              // Pop array and index
        push_pair( make_load(control(), a, TypeLong::LONG, T_LONG, TypeAryPtr::LONGS));
        break;
      }
  58. Parse do_one_bytecode
      case Bytecodes::_bastore: array_store(T_BYTE);  break;
      case Bytecodes::_castore: array_store(T_CHAR);  break;
      case Bytecodes::_iastore: array_store(T_INT);   break;
      case Bytecodes::_sastore: array_store(T_SHORT); break;
      case Bytecodes::_fastore: array_store(T_FLOAT); break;
      case Bytecodes::_aastore: {
        d = array_addressing(T_OBJECT, 1);
        if (stopped()) return; // guaranteed null or range check
        array_store_check();
        c = pop(); // Oop to store
        b = pop(); // index (already used)
        a = pop(); // the array itself
        const TypeOopPtr* elemtype = _gvn.type(a)->is_aryptr()->elem()->make_oopptr();
        const TypeAryPtr* adr_type = TypeAryPtr::OOPS;
        Node* store = store_oop_to_array(control(), a, d, adr_type, c, elemtype, T_OBJECT);
  59. Parse do_one_bytecode
      case Bytecodes::_getfield:
        do_getfield();
        break;
      case Bytecodes::_getstatic:
        do_getstatic();
        break;
      case Bytecodes::_putfield:
        do_putfield();
        break;
      case Bytecodes::_putstatic:
        do_putstatic();
        break;
  60. Parse do_one_bytecode
      // implementation of _get* and _put* bytecodes
      void do_getstatic() { do_field_access(true,  false); }
      void do_getfield () { do_field_access(true,  true);  }
      void do_putstatic() { do_field_access(false, false); }
      void do_putfield () { do_field_access(false, true);  }
  61. Parse do_one_bytecode
      Parse::do_field_access
        Parse::do_get_xxx(Node* obj, ciField* field, bool is_field)
          Node *adr = basic_plus_adr(obj, obj, offset);
          :
          Node* ld = make_load(NULL, adr, type, bt, adr_type, is_vol);
      Node* GraphKit::basic_plus_adr(Node* base, Node* ptr, Node* offset) {
        // short-circuit a common case
        if (offset == intcon(0)) return ptr;
        return _gvn.transform( new (C, 4) AddPNode(base, ptr, offset) );
      }
  62. Parse do_one_bytecode
      // factory methods in "int adr_idx"
      Node* GraphKit::make_load(Node* ctl, Node* adr, const Type* t, BasicType bt,
                                int adr_idx, bool require_atomic_access) {
        Node* mem = memory(adr_idx);
        Node* ld;
        if (require_atomic_access && bt == T_LONG) {
          ld = LoadLNode::make_atomic(C, ctl, mem, adr, adr_type, t);
        } else {
          ld = LoadNode::make(_gvn, ctl, mem, adr, adr_type, t, bt);
        }
        return _gvn.transform(ld);
      }
  63. Parse do_one_bytecode
      Node* GraphKit::memory(uint alias_idx) {
        MergeMemNode* mem = merged_memory();
        Node* p = mem->memory_at(alias_idx);
        _gvn.set_type(p, Type::MEMORY); // must be mapped
        return p;
      }
  64. Parse do_one_bytecode: _iadd
      b = pop(), a = pop()
      push(_gvn.transform( new (C, 3) AddINode(a,b)))
      // GraphKit::pop()
      Node* pop() { ..; return _map->stack(_map->_jvms, --_sp); }
      // SafePointNode::stack
      Node *stack(JVMState* jvms, uint idx) const {
        return in(jvms->stkoff() + idx);
      }
  65. Parse do_one_bytecode
      case Bytecodes::_iinc:      // Increment local
        i = iter().get_index();   // Get local index
        set_local( i, _gvn.transform(
            new (C, 3) AddINode(
                _gvn.intcon(iter().get_iinc_con()), local(i) ) ) );
        break;
  66. Parse do_one_bytecode: _goto, _goto_w
      int target_bci = (bc() == Bytecodes::_goto) ?
          iter().get_dest() : iter().get_far_dest();
      // If this is a backwards branch in the bytecodes, add Safepoint
      maybe_add_safepoint(target_bci);
      // Update method data
      profile_taken_branch(target_bci);
      // Add loop predicate if it goes to a loop
      if (should_add_predicate(target_bci)) {
        add_predicate();
      }
      // Merge the current control into the target basic block
      merge(target_bci);
      ...
  67. Parse do_one_bytecode: _goto, _goto_w
      ...
      // See if we can get some profile data and hand it off to the next block
      Block *target_block = block()->successor_for_bci(target_bci);
      if (target_block->pred_count() != 1) break;
      ciMethodData* methodData = method()->method_data();
      if (!methodData->is_mature()) break;
      ciProfileData* data = methodData->bci_to_data(bci());
      assert( data->is_JumpData(), "" );
      int taken = ((ciJumpData*)data)->taken();
      taken = method()->scale_count(taken);
      target_block->set_count(taken);
      break;
  68. Parse do_one_bytecode
      case _ifnull:    btest = BoolTest::eq; goto handle_if_null;
      case _ifnonnull: btest = BoolTest::ne; goto handle_if_null;
      handle_if_null:
        // If this is a backwards branch in the bytecodes, add Safepoint
        maybe_add_safepoint(iter().get_dest());
        a = null();
        b = pop();
        c = _gvn.transform( new (C, 3) CmpPNode(b, a) );
        do_ifnull(btest, c);
        break;
  69. Parse do_one_bytecode
      case _if_acmpeq: btest = BoolTest::eq; goto handle_if_acmp;
      case _if_acmpne: btest = BoolTest::ne; goto handle_if_acmp;
      handle_if_acmp:
        // If this is a backwards branch in the bytecodes, add Safepoint
        maybe_add_safepoint(iter().get_dest());
        a = pop();
        b = pop();
        c = _gvn.transform( new (C, 3) CmpPNode(b, a) );
        do_if(btest, c);
        break;
  70. Parse do_one_bytecode
      case Bytecodes::_tableswitch:
        do_tableswitch();
        break;
      case Bytecodes::_lookupswitch:
        do_lookupswitch();
        break;
  71. Parse do_one_bytecode
      case Bytecodes::_invokestatic:
      case Bytecodes::_invokedynamic:
      case Bytecodes::_invokespecial:
      case Bytecodes::_invokevirtual:
      case Bytecodes::_invokeinterface:
        do_call();
        break;
      case Bytecodes::_checkcast:
        do_checkcast();
        break;
      case Bytecodes::_instanceof:
        do_instanceof();
        break;
  72. Parse do_one_bytecode
      getClass is inlined and becomes a LoadKlass followed by a memory access. hashCode becomes a static call.
      public class Call {
        public static void main(String[] args) {
          Call c = new Call();
          for (int i = 0; i < 100000; i++) {
            c.doit();
          }
        }
        int doit() {
          return getClass().hashCode();
        }
      }
  73. Parse do_one_bytecode
      case Bytecodes::_anewarray:
        do_anewarray();
        break;
      case Bytecodes::_newarray:
        do_newarray((BasicType)iter().get_index());
        break;
      case Bytecodes::_multianewarray:
        do_multianewarray();
        break;
      case Bytecodes::_new:
        do_new();
        break;
  74. Parse do_one_bytecode
      case Bytecodes::_jsr:
      case Bytecodes::_jsr_w:
        do_jsr();
        break;
      case Bytecodes::_ret:
        do_ret();
        break;
  75. Parse do_one_bytecode
      case Bytecodes::_monitorenter:
        do_monitor_enter();
        break;
      case Bytecodes::_monitorexit:
        do_monitor_exit();
        break;
  76. Optimize
      PhaseIterGVN igvn(initial_gvn)
      igvn.optimize()
      ConnectionGraph::do_analysis(this, &igvn) // EscapeAnalysis
      igvn.optimize()
      PhaseIdealLoop ideal_loop(igvn, true)
      PhaseIdealLoop ideal_loop(igvn, ...)
      PhaseIdealLoop ideal_loop(igvn, ...)
      PhaseCCP ccp( &igvn )
      PhaseMacroExpand mex(igvn)
  78. Optimize
      PhaseIterGVN igvn(initial_gvn)
      igvn.optimize()
      // Take a node from the worklist and transform it.
      // If the node changed, update the edge information
      // and put its users on the worklist.
      while( _worklist.size() ) {
        Node *n = _worklist.pop();
        if (++loop_count >= K * C->unique()) { // bounds check
          ...
        }
        if (n->outcnt() != 0) {
          Node *nn = transform_old(n);
        } else if (!n->is_top()) {
          remove_dead_node(n);
        }
      }
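The worklist discipline above can be shown with a toy graph in which the only "transform" is constant folding of additions. WNode and optimize are hypothetical; the users lists are supplied by the caller, and dead-node removal is omitted.

```cpp
#include <cassert>
#include <vector>

struct WNode {
    enum Kind { Param, Con, Add } kind;
    int value;               // valid when kind == Con
    int in1, in2;            // input indices when kind == Add
    std::vector<int> users;  // indices of nodes that use this node
};

// Fold Add nodes whose inputs are both constants. Whenever a node
// changes, its users are pushed back onto the worklist so they get
// another chance. Returns the number of nodes folded.
int optimize(std::vector<WNode>& g) {
    const int n_nodes = static_cast<int>(g.size());
    std::vector<int> worklist;
    for (int i = 0; i < n_nodes; ++i) worklist.push_back(i);
    int folded = 0;
    while (!worklist.empty()) {
        int idx = worklist.back();
        worklist.pop_back();
        WNode& n = g[idx];
        if (n.kind != WNode::Add) continue;
        assert(n.in1 >= 0 && n.in1 < n_nodes && n.in2 >= 0 && n.in2 < n_nodes);
        const WNode& a = g[n.in1];
        const WNode& b = g[n.in2];
        if (a.kind == WNode::Con && b.kind == WNode::Con) {
            n.value = a.value + b.value;
            n.kind = WNode::Con;
            ++folded;
            for (int u : n.users) worklist.push_back(u);
        }
    }
    return folded;
}
```

A node visited before its inputs have been simplified is not lost: it is revisited when an input changes, which is the reason for re-queuing users.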
  79. Optimize
      Node *PhaseIterGVN::transform_old( Node *n )
      Points to note:
      - can_reshape passed to Ideal is true.
      - For a node that evaluates to a constant, subsume_node redirects its users to the new node.
  81. Optimize
      ConnectionGraph::do_analysis(this, &igvn) // EscapeAnalysis
        if (congraph->compute_escape()) {
          // There are non escaping objects.
          C->set_congraph(congraph);
        }
      LockNode and UnlockNode consult congraph; if the object they operate on does not escape, the lock and unlock are eliminated. Locking and unlocking a local object is meaningless.
  82. Optimize
      ConnectionGraph::compute_escape()
      - Returns false if there is no Java object allocation.
      - Puts AddP, MergeMem, and similar nodes on a worklist, along with their outs.
      - Examines the nodes on the worklist in detail.
      - Registers them in GrowableArray<PointsToNode> _nodes, classifies them as GlobalEscape, ArgEscape, or NoEscape, and propagates the state to reachable nodes.
      // comment in escape.hpp
      // flags: PrintEscapeAnalysis PrintEliminateAllocations
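The propagation step can be sketched as a small fixed-point computation. This is only an illustration of the idea "an object reachable from a more escaping object escapes at least as much"; the graph encoding and the function name propagate are assumptions, and the real ConnectionGraph distinguishes several node and edge kinds that are ignored here.

```cpp
#include <cassert>
#include <vector>

// Ordered so that a larger value means "escapes further".
enum EscapeState { NoEscape = 0, ArgEscape = 1, GlobalEscape = 2 };

// edges[i] lists the objects reachable from object i through its fields.
// state holds each object's initial classification; the result raises
// every reachable object to at least the state of whatever reaches it.
std::vector<EscapeState> propagate(const std::vector<std::vector<int>>& edges,
                                   std::vector<EscapeState> state) {
    assert(edges.size() == state.size());
    std::vector<int> worklist;
    for (int i = 0; i < static_cast<int>(state.size()); ++i) worklist.push_back(i);
    while (!worklist.empty()) {
        int from = worklist.back();
        worklist.pop_back();
        for (int to : edges[from]) {
            if (state[to] < state[from]) {
                state[to] = state[from];
                worklist.push_back(to);  // its own targets may need raising too
            }
        }
    }
    return state;
}
```

Objects that remain NoEscape after propagation are the candidates for lock elimination and scalar replacement mentioned on the neighbouring slides.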
  83. Optimize
      class ConnectionGraph: public ResourceObj
        // escape state of a node
        PointsToNode::EscapeState escape_state(Node *n);
        // other information we have collected
        bool is_scalar_replaceable(Node *n) {
          if (_collecting || (n->_idx >= nodes_size()))
            return false;
          PointsToNode* ptn = ptnode_adr(n->_idx);
          return ptn->escape_state() == PointsToNode::NoEscape && ptn->_scalar_replaceable;
        }
  85. Optimize
      PhaseIdealLoop::PhaseIdealLoop(PhaseIterGVN &igvn, bool do_split_ifs)
        build_and_optimize(do_split_ifs);
  86. Optimize
      // Convert to counted loops where possible
      PhaseIdealLoop::is_counted_loop( Node *x, IdealLoopTree *loop )
      PhaseIdealLoop::is_counted_loop attempts the conversion to a CountedLoop. counted_loop is also called recursively on child loops.
      void PhaseIdealLoop::do_peeling( IdealLoopTree *loop, Node_List &old_new )
      // Peels off the first iteration. Illustrated with diagrams in loopTransform.cpp.
      void PhaseIdealLoop::do_unroll( IdealLoopTree *loop, Node_List &old_new, bool adjust_min_trip )
      void PhaseIdealLoop::do_maximally_unroll( IdealLoopTree *loop, Node_List &old_new )
      // Eliminate range-checks and other trip-counter vs loop-invariant tests.
      void PhaseIdealLoop::do_range_check( IdealLoopTree *loop, Node_List &old_new )
  87. Optimize -- PhaseIdealLoop: After Parsing
      static int doit() {
        int sm = 0;
        for (int i = 0; i < 100; i++) sm += i;
        return sm;
      }
  88. Optimize -- PhaseIdealLoop: After CountedLoop
      static int doit() {
        int sm = 0;
        for (int i = 0; i < 100; i++) sm += i;
        return sm;
      }
  89. Optimize -- PhaseIdealLoop: Optimization Finished. Unrolling?
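To make the unrolling question concrete, here is the doit() loop written by hand in rolled form and unrolled by a factor of 4. The factor and the function names are illustrative choices; the sketch assumes the trip count is a multiple of 4, so it leaves out the pre and post loops a real compiler would need for the general case, and it says nothing about the factor C2 actually picks.

```cpp
#include <cassert>

// The loop from the slides, parameterized by its trip count.
int sum_rolled(int n) {
    int sm = 0;
    for (int i = 0; i < n; i++) sm += i;
    return sm;
}

// Hand-unrolled by 4: one loop test and one increment per four additions.
int sum_unrolled4(int n) {
    assert(n >= 0 && n % 4 == 0);  // no cleanup loop in this sketch
    int sm = 0;
    for (int i = 0; i < n; i += 4) {
        sm += i;
        sm += i + 1;
        sm += i + 2;
        sm += i + 3;
    }
    return sm;
}
```

Both versions return the same value for n = 100; the unrolled body is the shape that shows up as repeated AddI nodes in the PrintIdeal output shown earlier.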
  91. Optimize
      PhaseCCP ccp( &igvn )
      ccp.do_transform
        C->set_root( transform(C->root())->as_Root() );
      Replaces whatever can be replaced by a constant.
  93. Optimize
      PhaseMacroExpand mex(igvn)
      mex.expand_macro_nodes()
        ...
        eliminate_allocate_node
          scalar_replacement
      Acts on the results of escape analysis. Converts allocations into stack operations?
  94. Code_Gen
      Matcher m(proj_list)
      m.match()
      PhaseCFG cfg(node_arena(), root())
      cfg.Dominators()
      cfg.Estimate_Block_Frequency()
      cfg.GlobalCodeMotion(m, unique(), proj)
      PhaseChaitin regalloc(unique, cfg, m)
      regalloc->Register_Allocate()
      PhaseBlockLayout
      PhasePeephole
      Output
  95. Matcher
      #0 in addI_eRegNode::Expand(State*, Node_List&, Node*) ()
      #1 in Matcher::ReduceInst(State*, int, Node*&) ()
      #2 in Matcher::match_tree(Node const*) ()
      #3 in Matcher::xform(Node*, int) ()
      #4 in Matcher::match() ()
      #5 in Compile::Code_Gen() ()
      #6 in Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool) ()
  96. Matcher
      // x86_32.ad, an ADLC file
      instruct addI_eReg(eRegI dst, eRegI src, eFlagsReg cr) %{
        match(Set dst (AddI dst src));
        effect(KILL cr);
        size(2);
        format %{ "ADD $dst,$src" %}
        opcode(0x03);
        ...
      %}
  97. PhaseCFG
      PhaseCFG::build_cfg()
      Builds the CFG (Control Flow Graph) from RegionNode and StartNode, so that the later, more machine-oriented operations can be performed.
  98. PhaseCFG
      class PhaseCFG : public Phase
      + _num_blocks: uint
      + _blocks: RootNode*
      + _bbs: Block_Array
      + _broot: Block*
      + _rpo_ctr: uint
      + _root_loop:
      + _node_latency: GrowableArray<uint>*
  99. PhaseCFG
      class Block : public CFGElement
      + _nodes : Node_List
      + _succs : Block_Array
      + _num_succs: uint
      + _pre_order: uint // Pre-order DFS #
      + _dom_depth: uint
      + _idom : Block*
      + _loop : CFGLoop*
      + _rpo : uint
      : // reg pressure, etc
  100. PhaseCFG
      PhaseCFG::Dominators()
      // Lengauer & Tarjan algorithm
      // Sets _dom_depth and _idom of each Block
      // This data is the basis for Code Motion
      PhaseCFG::Estimate_Block_Frequency()
      // Computes block frequency from the IfNode probabilities
      // and stores it in _freq, a field of Block's parent class
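For readers who want to see what an immediate-dominator table looks like, here is a sketch that uses the simple iterative formulation (in the style of Cooper, Harvey, and Kennedy) instead of Lengauer-Tarjan. It computes the same result on reducible and irreducible graphs but is not the algorithm HotSpot uses. It assumes blocks are already numbered in reverse postorder with block 0 as the entry; the function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Walk both candidates up the partially built dominator tree until they
// meet. With reverse-postorder numbering, a larger index is deeper.
int intersect(const std::vector<int>& idom, int a, int b) {
    while (a != b) {
        while (a > b) a = idom[a];
        while (b > a) b = idom[b];
    }
    return a;
}

// preds[b] lists the predecessors of block b. Returns idom[b] for every
// block; the entry block is its own idom, unreachable blocks get -1.
std::vector<int> immediate_dominators(const std::vector<std::vector<int>>& preds) {
    const int n = static_cast<int>(preds.size());
    assert(n > 0);
    std::vector<int> idom(n, -1);
    idom[0] = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        for (int b = 1; b < n; ++b) {
            int new_idom = -1;
            for (int p : preds[b]) {
                if (idom[p] == -1) continue;  // predecessor not processed yet
                new_idom = (new_idom == -1) ? p : intersect(idom, p, new_idom);
            }
            if (new_idom != idom[b]) {
                idom[b] = new_idom;
                changed = true;
            }
        }
    }
    return idom;
}
```

The dominator depth follows directly from this table by counting the steps from a block up to the entry.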
  101. PhaseCFG
      PhaseCFG::GlobalCodeMotion
        schedule_early
        schedule_late
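  102. Register allocation
      Briggs-Chaitin register coloring. An improved version of the following scheme: color the interference graph of variable live ranges using as many colors as there are registers; if no coloring is found, add spills and retry. (I have not read this part yet.)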
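The basic coloring scheme can be sketched in its textbook simplify/select form. This is a deliberately reduced model and not PhaseChaitin: the function name color is hypothetical, a live range that cannot be simplified is just marked as spilled (result -1) instead of inserting spill code and retrying, and no coalescing or Briggs-style optimistic coloring is attempted.

```cpp
#include <cassert>
#include <vector>

// adj[i] lists the live ranges that interfere with live range i.
// Returns a register number in [0, k) per live range, or -1 for a spill.
std::vector<int> color(const std::vector<std::vector<int>>& adj, int k) {
    assert(k > 0);
    const int n = static_cast<int>(adj.size());
    std::vector<int> degree(n), result(n, -1), stack;
    std::vector<bool> removed(n, false);
    for (int i = 0; i < n; ++i) degree[i] = static_cast<int>(adj[i].size());

    // Simplify: remove one node per step. Prefer a node with fewer than
    // k remaining neighbours; otherwise spill the highest-degree node.
    for (int step = 0; step < n; ++step) {
        int pick = -1;
        bool spill = false;
        for (int i = 0; i < n; ++i)
            if (!removed[i] && degree[i] < k) { pick = i; break; }
        if (pick == -1) {
            spill = true;
            for (int i = 0; i < n; ++i)
                if (!removed[i] && (pick == -1 || degree[i] > degree[pick])) pick = i;
        }
        removed[pick] = true;
        for (int nb : adj[pick])
            if (!removed[nb]) --degree[nb];
        if (!spill) stack.push_back(pick);
    }

    // Select: pop in reverse removal order and take the lowest colour
    // not used by an already coloured neighbour.
    while (!stack.empty()) {
        int v = stack.back();
        stack.pop_back();
        std::vector<bool> used(k, false);
        for (int nb : adj[v])
            if (result[nb] >= 0) used[result[nb]] = true;
        for (int c = 0; c < k; ++c)
            if (!used[c]) { result[v] = c; break; }
    }
    return result;
}
```

A node pushed during simplify had fewer than k neighbours left at that moment, so a free color is always available when it is popped; only the spilled nodes stay uncolored.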
  103. Output
      Replace StartNode with MachPrologNode
      Set up the unverified entry point
      Place a MachEpilogNode before each return
      ScheduleAndBundle()
      BuildOopMap()
      Fill_buffer()
        Prepare the CodeBuffer
        for (i = 0; i < _cfg->_num_blocks; i++)
          for (j = 0; j < last_inst; j++)
            …
            n->emit(*cb, _regalloc)
      -XX:+PrintOptoAssembly to dump instructions
      https://gist.github.com/1376858
  104. Bonus
      [Chart: bytecode size vs arena use. x-axis: bytecode size; y-axis: bytes; series: comp, node, res]
  105. [Chart: compiler memory use. x-axis: unique (number of nodes); y-axis: bytes; series: comp_arena, node_arena, res_area]
  106. References
      http://www.usenix.org/events/jvm01/full_papers/paleczny/paleczny.pdf
      http://ssw.jku.at/Research/Papers/Wuerthinger07Master/Wuerthinger07Master.pdf
