SlideShare a Scribd company logo
What is WALA?
!  Javalibraries for static and dynamic program
  analysis
!  Initially   developed at IBM T.J. Watson Research
  Center
!  Open source release in 2006 under Eclipse
  Public License
!  Key  design goals
   •  Robustness
   •  Efficiency
   •  Extensibility
(Some) Previous Uses of WALA
!    Research
      •  over 40 publications from 2003-present
      •  Including one at PLDI’10 (MemSAT)
      •  http://wala.sf.net/wiki/index.php/Publications.php
!    Products
      •  Rational Software Analyzer: NPEs (Loginov et al.
         ISSTA’08), resource leak detection (Torlak and
         Chandra, ICSE’10)
      •  Rational AppScan: taint analysis (Tripp et al.,
         PLDI’09), string analysis (Geay et al., ICSE’09)
      •  Tivoli Storage Manager: Javascript analysis
      •  WebSphere: analysis of J2EE apps
WALA Features: Static Analysis
!    Pointer analysis / call graph construction
      •  Several algorithms provided (RTA, variants of
         Andersen’s analysis)
      •  Highly customizable (e.g., context sensitivity policy)
      •  Tuned for performance (time and space)
!    Interprocedural dataflow analysis framework
      •  Tabulation solver (Reps-Horwitz-Sagiv POPL’95) with
         extensions
      •  Also tuned for performance
!    Context-sensitive slicing framework
      •  With customizable dependency tracking
Other Key WALA Features
!    Multiple language front-ends
     •  Bytecode: Java, .NET (internal)
     •  Source (CAst): Java, Javascript, X10, PHP (partial,
        internal), ABAP (internal)
     •  Add your own!
!    Generic analysis utilities / data structures
     •  Graphs, sets, maps, constraint solvers, …
!    Limited code transformation
      •  Java bytecode instrumentation via Shrike
      •  But, main WALA IR is immutable, and no code gen
         provided
        "  designed primarily for computing analysis info
What We’ll Cover
!  Overviews of main modules
  •  Important features
  •  Key class names
  •  How things fit together
  •  How to customize
!  “Deep  dives” into real code
  •  Interprocedural dataflow analysis example
  •  CAst Javascript front-end
How to get WALA
!    Walkthrough on “Getting Started” page at wala.sf.net
!    Code available in SVN repository
      •  Trunk or previous tagged releases
      •  Split into several Eclipse projects, e.g.,
         com.ibm.wala.core, com.ibm.wala.core.tests
!    Dependence on Eclipse
      •  Easiest to build / run from Eclipse, but command line
         also supported
      •  Runtime dependence on some Eclipse plugins
         (progress monitors, GUI functionality, etc.); must be
         in classpath
GENERAL UTILITIES
!"#"$%&'&$(')*+'*),-




   4"5+2"-#$6*#*7028$%209&)'


    ()*+,'$*-.$/012)"#,3'


           !"#$%&#'
!"#$%&!$'&()'&*)+,)#)-'"'$.-#
                              !"#$%#



                                   *+#',-%!"#$%#
               $&'()%!"#$%#

                                        ./#0%1#2(.')%




*+#',-%$&'()%!"#$%#           :;;)%#./#0%1#2(           ./#0%1#2(
                               , ,64 ,6"
3=<
5              "                   !!!!!!                435            "
      !!!!!!
                                 ,')%78,9                      !!!!!!
!"#$%&'(%)'*+)',+-$+.+/)0)%1/
                    ('51()$*+'!"#$%&',-#('#
                      !"#$!%&#$'   (%)!%&#$'


         !"#$%&'()$*+',-#('#         .//+'#01#2'3#4*




*-2%)'#%).+)')1'.03+'.-04+'5.%/6'&+/.+'0/&'.-0$.+'-0$).
       ! 7+/.+'81$&.9':;0<'=';%/>'?'#%).'-+$'81$&
       ! *-0$.+'81$&.9'/5;#+$'1@'.+)'#%).
       ! A024520)+'#+.)'5.+'1@'0'.%/62+'&+/.+'-1$)%1/
       ! ,+#020/4+'1/';5)0)%1/B'0;1$)%C%/6')1'.03+'41.)
!"#$%&'()*'!%*'+%,$%-%.*#*)/.
                     !"#$%&'(3$*'4./#0'1#2*,-#('#

            !"#$%&'&()*(+,)&($+-./(*01

        ./#0'1#2*5')2+/#2*6

                              ./#0'1#2*   !"#$%&'()$*+',-#('#


!#0%'-,#1%'23'-"#$).4'1/55/.',/$*)/.-'/6'2)*'-%*-
      ! !*#*%'-,7)*').*/'1/55/.'#.&',$)0#*%',/$*)/.-
      ! +%,/-)*/$3'5#.#4%-'-%*'/6'1/55/.',/$*)/.-
      ! 8/55/.',/$*)/.-'1/5%'#.&'4/'/.'&%5#.&
!"#$%%&'%()&*(+"*,
                           !!"#$%&'!$((%)*+!*,,-&+.!&/!0
                           !!!!$((%)*+!"1!2!%)3.4*,,-&/5
         !"#$%%&'%(')*")   !!!!$((%)*+!61!2!1&78.4*,,-&/5
           +",)   -&%.)
                           !!!!&3!-"1!92!61/!0
                           !!!!!!*66)1.!"1!22!615
                           !!!!:
/$)0#+"(')*")
                           !!!!1).#1+!"15
          /$)0#+"(')*")    !!:


! -".'*(*/(0"12(3"#$%('"4(#&*,"*(&521"5"'*.*&/',
     "   6.7.5"*"7&8"3(#9(*4/(/*0"7(&521"5"'*.*&/',
     "   :,,"7*(*4/(&521"5"'*.*&/',(%&;"(,.5"(7",$1*,
     "   <.=*/79(&'*"7>.="(.11/4,($,"(.,(,*.'3.73(#&*,"*,
     "   </7(3";"1/25"'*(/'19?(5.@/7(*&5"(.'3(,2.="(=/,*,
!"#$%&'(")*&+,)(,#,-.".$/-
           !"#$%&'&($)*+,     -#($%&'&($)*+,

!"#$%"&$'()*+"#$%"&$,-        !"#$%"&$'()*/#";$#60&6#7,(*.-

+."*/#"0123#$450&6#7,-        +."*/#";$#60&6#<&1.",(*.-

8&+6*%660&6#,(*.-             !"#$%"&$'()*/#"=1::0&6#7,(*.-

3&&9#%.*:&."%+.70&6#,(*.-     +."*/#"=1::0&6#<&1.",(*0-

                              8&+6*%66>6/#,(*7$:?*(*67"-

                              3&&9#%.*@%7>6/#,(*7$:?*(*67"-



                         .)&/0*+,
!"#$%&%'()&*+,(
               -%+&%.%/0*012/
            !"#$%&%'!('%)*+*,%&-./

                        !"#$%&%'0',%)*+*,%&-./

!"#$%&#'()*&+,-$'.        4"#8&#$%&#8(99'/0&'()*&+7,-$"/0&.

-$%&#'/0&,!"#$"()*&+.     4"#8&#$%&#:+&0'/0&'()*&+7,-$"/0&.

!"#$%&#123'()*&+,.

4#&+2#/+5-6$!#&+2#&'/0&7,4"#8&#$7.


                !"#$%&%'1&*23-./
!"#$%$&'()"*+',$*)$-$./"/01.
              !"#$%$&'&($)"*"($+,-./01

   !"#$%&$'()*%+(,$*-.
   /%$0(%10234"#$%50$6716$8-3"79"!"*(,$*.
   /%$0(%102:"$;%$<68"!4"#$%50$6+(,$*8-3"7.
   =<%"#$%50$6716$>1)<%-3"79"!"*(,$*.
   /%$0(%102:"$;%$<68"34"#$%?)@@716$8-3"79"!"*(,$*.
   /%$0(%102:"$;%$<68"!4"#$%?)@@+(,$*8-3"7.
   =<%"#$%?)@@716$>1)<%-3"79"!"*(,$*.
   A1=6"(66B6#$-3"80@9"3"68%9"!"*(,$*.
   ,11*$(<"C(8B6#$-3"80@9"3"68%9"!"*(,$*.
   ?$%2:"$;%$<68"!4"#$%B6#$+(,$*8-3"80@9"3"68%.



!"#$%$&5+"67,-.01    234#$+$&!"#$%$&'&($)"*"($+,-.01
!"#"$%&'!$()*'+)"$(,%-#./'
  0$"(1,*'2%$.,'3"($&*
                            (
                        -        .
                            0
                        /        1
                            2

(   -   .   /   0   1   2       !"#$%&'(%)'

        (   -   .   /   0   !)*+,&,!"#$%&'(%)'


                (   -   /   !"#3(%4"5+,&'
!"#"$%&'!$()*'+)"$(,%-#./
       0"),*'1%$.,'2"($&*
            %
        7       8
            9               %   7    6
                                     9   %
                                         :   ;   6
        6       :                   !"#$%&'"()*+,
            ;


6   ;   :   9   7   8   %       !"#"()(-'1(2+3&+,%&/,
                    4526+,+*!"#"()(-'1(2+3&+,%&/,

%   7   9   :   ;   6   8             !"#!(-./0+,1(2+3&+,%&/,
                            4526+,+*!"#!(-./0+,1(2+3&+,%&/,
!"#"$%&'!$()*'+)"$(,%-#./
          011.




         !""#$%&'$(&
!"#$%&'()*"+,%-.

       !"#$%&'"()


               *+#,-(-.!"#$%&'"()


       !"#$%&%/-0("%'$-()
!"#"$%&'()*+#,-+

              !"#$$%&$$'(&)*+,(-

.*/'$,+0(&1234               .*/5(&678*('96:/#,6;(,<#%*(34

                          !5(&678*('96:/#,6;(,<#%*(

                                          .*/>**/?1*(&/,(34
                                            83<!"&'''4=!=<#$%
                                   .*/@,%*5(&678*('96:/#,634
                          83<!"4=!=<#$%
                  .*/A%.*5(&678*('96:/#,634
        83<!"4=!=<#$%
!"#"$%&'()*"+,%-
      ! )
                        12345678)
                        12345678*
      " *
                        9:77578:338;:3<57
# +         $ ,

      % -
              !485:=>8?@A5B846:?715681<?=42@?8:AA782478?<CD56

      & .

' /         ( 0


              !"#$%&'()*+$(,*#%-.#/#)&+01'2/
!"#"$%&'()*"+,%-
!"#$%&'()*+,-!.+*/0+12%/3.&/0+3*+%*#$.4'
5./607.8+*)9:.+;")&/%0)<=/+%)5')07.>'?
''+./"+)').@'2%/3.&/0+()%0)A0)9/*)/<
''''B*$".9C5./D*!!.7E)7.F<)07.>>G'H

!"#$%&'()*+,-!.+*/0+12%/3.&/0+3*+%*#$.4'
5./I75.8+*)9:.+;")&/%0)<=/+%)5':+0JK'=/+%)5'/0>'?
''%:'<:+0J'LL')07.9MNO'PP'/0'LL')07.9MQO>
''''+./"+)').@'2%/3.&/0+;%$/.+<R.+0<>>G
''.$9.'%:'<:+0J'LL')07.9MNO'PP'/0'LL')07.9MSO>
''''+./"+)').@'2%/3.&/0+;%$/.+<0).<>>G
''.$9.'?
''''+./"+)'2%/3.&/0+E7.)/%/,C%)9/*)&.<>G
HH

!"#$%&'T#9/+*&/D../-!.+*/0+12%/3.&/0+3*+%*#$.4'
5./D../-!.+*/0+<>'?
''+./"+)'2%/3.&/0+()%0)C%)9/*)&.<>G'H
!"#"$%&'()*"+,%-
      ! )


      " *         !"#$%&'()%*%+%(%,
                  !"#$%-'.)%*%+%(%.%,
# +         $ /   !"#$%/'0)%*%+%(%0%,
                  !"#$%1'2)%*%+%.%2%,
      % ,         !"#$%3'4)%*%+%(%.%0%2%4%,
                  !"#$%5'6)%*%+%(%.%0%2%4%6%,
                  !"#$%7'8)%*%+%8%,
      & -         !"#$%9':)%*%+%:%,

' .         ( 0
INTERMEDIATE
REPRESENTATION
!"#$%&'()*+,


                                 !"#$%&'()

!"#$%&'!$()*+,-.*/($0(,-11"2*3(4*/15


                !"                      *+(,-.!"#$%&'()


                                         /0&!"#$%&'()
!"#$%&'(%'&)
                       !"


=       #$%&%'#(%&&%#)*              BCD
(
                                     =>)
)            !+%#$
$                              $>-
0   ,0%&%12#345%56789:%#(/#)
                                     ?>@
-
?        ,-%&%.'#$/#0*
A
@          ;5<7;2%#-
!"#$%&'$()"*+,-.#
                       !!"#$%&'()&*+$


  !!",+&+#$%&'()&*+$

!!"-+./0'1#$%&'()&*+$            !!""7%&'0)&#$2+81#$%&'()&*+$

!!"-+$21'%*+$#$%&'()&*+$                !!"#$2+81#$%&'()&*+$




                !!"3*145"))1%%#$%&'()&*+$

   !!",1&#$%&'()&*+$                    !!"6(&#$%&'()&*+$
!"#$%&'$()"*+,-.#
                  !!"#$%&'()&*+$


           !!"",%&'-)&#$.+/0#$%&'()&*+$



!!"#$.+/0#$%&'()&*+$


                       6(5&*70&('$8-5(0#$.+/0#$%&'()&*+$



                            ",%&'-)&304*)-5#$.+/0



                               1-.-!)'*2&#$.+/0
!"#$%&'(%'&)*#+,#-./)0
! +,#1./)0#2&.3,/)#/,0%,1(%#345')#1'67)&0#,1#(.1%)8%
      " 9#:(.2;<#.=#4#345')#,1#4#/,0%,1(%#(.1%)8%
      " )>?>#,10,/)#4#(.1/,%,.145#.&#5..2
! @0)/#%.#/)1.%)#2&)(,0)#,1=.&64%,.1
      " )>?>#45,40,1?#A,%B#.%B)&#345')0
      " )>?>#2&)(,0)#%;2)
! $2)(,=,)/#7;#2.5,(;#/'&,1?#!"#(&)4%,.1


               !!"#$%&'(#&)$*+
                3(1#$4$/012.*1$&/5


                  ,-).(     $/012.*1$&/
!"#$%&'(%'&)*#+,#-./)0
        !"#$#%!&#$$#!'(                BCD

                                       >?'
>            )*#!"
&                                "?,
'         !:#$#;%!'(#
             )*#!"
                                       :?@
"
/   +/#$#01!234#456789#!&.!:
,                              =<64#E<71FG
:         !A#$#;%!'(#
A
                               *7894#E<71FG
@        +,#$#-%!".!/(


           <4=6<1#!,
!"#$%&'&%&()*#+(,$)(
                                               A5+B:5
     !"


=         #$%&%'#(%&&%#)*         C5<A5+'#*
(
                                 D5<A5+'$*%E
)              !+%#$
$
0   ,0%&%12#345%56789:%#(/#)
-                                 C5<B:5:'#*
>         ,-%&%.'#$/#0*
                               D5<B:5:'$*%E
?
@            ;5<7;2%#-
!"#$%&'&%&()*#+,-

      ()*                      +,-./

      !"#
                         $"%           !"#
&"'                                      0
                                       &"'
      $"%
!"#$%&'&%&()*#+,-(#!./(0(.1(
  (           !"#$%&#''#()                    9:;05
  &
            %,#'#/01#2677
  ,
  .         %.#'#/01#2345               677           345

  *                              %,'677-#%.'345-#%*'9:;05
             %*#'#+$%,-%.)
  8
234-5%(#43)%#-0(1&)(#643)%#7(.(08'#%,-(6
    ! $)()#9(1'80(9#%,-()#8.9#3%:(0#;.3<.#%,-()
    ! (=7=#13.10(%(#%,-()#/034#13.)%8.%)
    ! (=7=#13.10(%(#%,-()#/034#8''318%&3.

!.%(0/81(#8''3<)#5)(#8103))#'8.7587()
new TypeInference(ir, doPrimitives); getType(vn)
!"#$%&'()*#+,-.#$%&'()#/01
  !"                !#$%&'(

.$%#$%&'(/0
              )*%#$%&'(

         ,-*%123%,'-+'*,%,'-/,-($40

                                 +'*,%,'-     5$%="7/0
       5$%6,1*%7,-$/0                      5$%7<*%:;;*$%/0
           5$%6,1*%8'9/0      5$%7<*%7,-$/0
                  5$%6,1*%:;;*$%/0       5$%7<*%8'9/0


       ,-*%123%,'-+'*,%,'->%<?$*><->';;*$%>,->%&$>,-*%123%,'-><11<@
!"#$%&'()*#+,-)(%.)#/01
       !"                !#$%&'(

    ,$%#$%&'(-.
               !)*%$+'($#$%&'(




       /$%)*%$+'($!0($1-20%.
                          /$%320$4567$8-20%.




,$%)*%$+'($!0($19%:;$<9:09'==<$%9209%&$920<%85+%2'09:88:*
!"#$%&'()*#+%(,-#.,/)0


!"    !0-%45(%/'02/06$7    &'()*28)*5$205,9$4




          #$%&'()*+),$-./0%12/0%3


                          :44);2'<20),$-
SCOPES AND CLASS
HIERARCHIES
Building a Call Graph
buildCG(String jarFile) {
  // represents code to be analyzed
  AnalyisScope scope = AnalysisScopeReader             (1)
      .makeJavaBinaryAnalysisScope(jarFile, null);
  // a class hierarchy for name resolution, etc.
  IClassHierarchy cha = ClassHierarchy.make(scope);    (2)
  // what are the call graph entrypoints?
  Iterable<Entrypoint> e =                             (3)
      Util.makeMainEntrypoints(scope, cha);
  // encapsulates various analysis options
  AnalysisOptions o = new AnalysisOptions(scope, e);   (4)
  // builds call graph via pointer analysis
  CallGraphBuilder builder =                           (5)
    Util.makeZeroCFABuilder(o, new AnalysisCache(),
                             cha, scope);
  CallGraph cg = builder.makeCallGraph(o, null);       (6)
}
AnalysisScope
!    Represents a set of files to analyze
!    To construct from classpath:
     AnalysisScopeReader.makeJavaBinaryAnalysisScope()
!    To read info from scope text file:
     AnalysisScopeReader.readJavaScope()
     •    Each line of scope file gives loader, lang, type, val
           "  E.g., “Application,Java,jarFile,bcel-5.2.jar”
           "  Common types: classFile, sourceFile, binaryDir, jarFile
           "  Examples in com.ibm.wala.core.tests/dat

!    Exclusions: exclude some classes from consideration
      •  Used to improve scalability of pointer analysis, etc.
      •  Also specified in text file; see, e.g.,
         com.ibm.wala.core.tests/dat/GUIExclusions.txt
Background: Class Loaders
!    In Java, a class is identified by name and class loader
      •  E.g., < Primordial, java.lang.Object >
!    Class loaders form a tree, rooted at Primordial
!    Name lookup first delegates to parent class loader
      •  So, can’t write an app class java.lang.Object
!    User-defined class loaders can provide isolation
      •  Used by Eclipse for plugins, J2EE app servers
!    WALA naming models class loaders
Multiple Names in Bytecode
// this is java.lang.Object
class Object {
  public String toString() { … }
}
// no overriding of toString()
class B extends Object {}
class A extends B {}
Legal names in bytecode:
<Application, A, toString()>,
<Application, B, toString()>,
<Application, java.lang.Object, toString()>,
<Primordial, java.lang.Object, toString()>
Resolved entity:
<Primordial, java.lang.Object, toString()>
WALA Name Resolution

Entity references resolved via IClassHierarchy

Entity      Reference Type    Resolved Resolver Method
                              Type
class       TypeReference     IClass    lookupClass()

method      MethodReference   IMethod   resolveMethod()


Field       FieldReference    IField    resolveField()
More on class hierarchies
!    For Java class hierarchy: ClassHierarchy.make(scope)
!    Supports Java-style subtyping (single inheritance,
     multiple interfaces)
!    Necessary for constructing method IRs, since only
     resolved IMethods have bytecode info
!    Watch out for memory leaks!
     •  Resolved entities (IClass, IMethod, etc.) keep
        pointers back to class hierarchy
     •  In general, use entity references in analysis results
INTERPROCEDURAL
DATAFLOW ANALYSIS
Tabulation-Based Analysis
                 (Reps, Horwitz, Sagiv, POPL95)

!    “Functional approach” to context-sensitive analysis
     (Sharir and Pnueli, 1981)
!    Tabulates partial function summaries on demand
!    Some enhancements in WALA’s implementation
      •  Multiple return sites for calls (for exceptions)
      •  Optional merge operators
      •  Handles partially balanced problems
      •  Customizable worklist orderings
Tabulation Overview
                          T
              TabulationDomain          IFlowFunctionMap
              Provides numbering of     Edge flow functions for
                   domain facts              supergraph

                    ISupergraph
                    Supergraph over            Seeds
                    which analysis is    Initial path edges for
TabulationProblem      computed                  analysis




                    TabulationSolver


                    TabulationResult
Supergraph
!    Collection of “procedure graphs,” connected by calls
      •  ICFGSupergraph: procedure graphs are CFGs
      •  SDGSupergraph: procedure graphs are PDGs
!    Example call representation for ICFGSupergraph
      •  (For general supergraph , possibly many calls, returns, entries, exits)
CALLER                                 CALLEE
               CALL
                                           ENTRY                Normal call-to-ret
                                                                  Exc. call-to-ret
                                                               Normal call-to-entry
             RETURN                                             Normal ret-to-exit
                                            EXIT
   EXC.                                                           Exc. ret-to-exit
                        EXIT
 HANDLER
Domain / Flow Functions / Seeds
!    TabulationDomain
      •  Maintains mapping from facts to integers
      •  Controls worklist priorities (optional)
!    IFlowFunctionMap
      •  Flow functions on supergraph edges
      •  All functions map int (or two ints) to IntSet
         (via TabulationDomain)
      •  Function for each type of edge (normal, call->return,
         call->entry, exit->return)
      •  Also, call->return function for “missing” calls
         "  For handling missing code, CG expansion (Snugglebug)
!    Seeds (TabulationProblem.initialSeeds())
      •  Depends on problem / domain representation
Partially Balanced Problems
!    For when flows can start / end in a non-entrypoint
      •  E.g., slice from non-entrypoint statement
!    PartiallyBalancedTabulationProblem
      •  Additional “unbalanced” return flow function for
         return without a call
      •  getFakeEntry(): source node for path edges of
         partially balanced flows
!    Compute with PartiallyBalancedTabulationSolver
!    Examples: ContextSensitiveReachingDefs, Slicer
Debugging Your Analysis
!    IFDSExplorer
      •  Gives GUI view of analysis result
      •  Needs paths to GraphViz dot executable and PDF
         viewer
!    Set VM property com.ibm.wala.fixedpoint.impl.verbose
     to true for occasional lightweight info
!    Increase TabulationSolver.DEBUG_LEVEL for detailed info
Deep Dive: Reaching Defs
!    The classic dataflow analysis, for Java static fields
!    Three implementations available in
     com.ibm.wala.core.tests
      •  IntraprocReachingDefs
         "  Uses BitVectorSolver, a specialized DataflowSolver
     •  ContextInsensitiveReachingDefs
         "  Uses BitVectorSolver over interprocedural CFG
     •  ContextSensitiveReachingDefs
         "  Uses TabulationSolver

!    We’ll focus on ContextSensitiveReachingDefs
Example
    class StaticDataflow {
      static int f, g;
      static void m() { f = 2; }
      static void testInterproc() {
        f = 3;
        m();                         (1)
        g = 4;
        m();                        (2)
      }
    }

Context-sensitive analysis should give different result
                   after (1) and (2)
The Domain and Supergraph
!    Supergraph: ICFGSupergraph
      •  Procedures: call graph nodes (CGNode)
           "  In context-sensitive call graph, possibly many CGNodes for one
              Imethod
     •    Nodes: BasicBlockInContext<IExplodedBasicBlock>
           "  “exploded” basic block has at most one instruction (eases
              writing transfer functions)
           "  BasicBlockInContext pairs BB with enclosing CGNode

!    Domain (static field writes): Pair<CGNode,Integer>
      •  Integer is index in IR instruction array; only valid way to
         uniquely identify IR instruction
      •  ReachingDefsDomain extends MutableMapping to
         maintain fact numbering
Flow Functions (1)
!    Normal flow for non-putstatics is
     IdentityFlowFunction.identity()
!    Most call-related flow functions are also identity
     •  Since static fields are in global scope
!    Call-to-return function is
     KillEverything.singleton()
      •  Defs must survive callee to reach return
Flow Functions (2)
             Normal flow function for putstatic
                (modified for formatting / clarity)
public IntSet getTargets(int d1) {
  IntSet result = MutableSparseIntSet.makeEmpty();
  // first, gen this statement
  int factNum = domain.getMappedIndex(Pair.make(node, index));
  result.add(factNum);
  // if incoming statement defs the same static field, kill it;
  // otherwise, keep it
  if (d1 != factNum) { // must be different statement
    IField sf = cha.resolveField(putInstr.getField());
    Pair<CGNode, Integer> other = domain.getMappedObject(d1);
    SSAPutInstruction otherPut = getPutInstr(other);
    IField otherSF = cha.resolveField(otherPut.getField());
    if (!sf.equals(otherSF)) { result.add(d1); }
  }
  return result;
}
Seeds
!    Standard tabulation approach: special ‘0’ fact
      •  Add ‘0 -> 0’ edge to all flow functions
      •  Seed with (main_entry, 0) -> (main_entry, 0)
!    Our approach: partially balanced tabulation
     •  For field write numbered n in basic block b of method
        m, add seed (m_entry, n) -> (b, n)
         "  (source fact doesn’t matter)
     •  Unbalanced flow function is just identity
     •  Advantage: keeps other flow functions cleaner
     •  See ReachingDefsProblem.collectInitialSeeds()
Putting it all Together
!    ReachingDefsProblem collects domain, supergraph,
     flow functions, seeds
!    Running analysis (simplified):
     PartiallyBalancedTabulationSolver solver =
       new PartiallyBalancedTabulationSolver(
         new ReachingDefsProblem());
     TabulationResult result = solver.solve();

!    In the real code:
      •  Lots of long generic type instantiations (sigh)
      •  Handling CancelException (enables cancelling
         running analysis from GUI)
CALL GRAPHS / POINTER
ANALYSIS
Call Graph Builder Overview

AnalysisOptions               Heap Model
 Specifies entrypoints,
                               How should       Context Selector
                               objects and        What context to use
how to handle reflection,
                               pointers be       when analyzing call to
          etc.
                               abstracted?          some method?




                            CallGraphBuilder



                   CallGraph             PointerAnalysis
Entrypoints
!    What are entrypoint methods?
     •  main() method
        "  Util.makeMainEntrypoints()
     •  All application methods
        "  AllApplicationEntrypoints
     •  JavaEE Servlet methods, Eclipse plugin entries, …
!    What types are passed to entrypoint arguments?
     •  Just declared parameter types (DefaultEntrypoint)
     •  Some concrete subtype (ArgumentTypeEntrypoint)
     •  All subtypes (SubtypesEntrypoint)
Heap Model
!    Controls abstraction of pointers and object instances
!    InstanceKey: abstraction of an object
      •  All objects of some type (ConcreteTypeKey)
      •  Objects allocated by some statement in some calling
         context (AllocationSiteInNode)
      •  ZeroXInstanceKeys: customizable factory
!    PointerKey: abstraction of a pointer
      •  Local variable in some calling context
         (LocalPointerKey)
      •  Several merged vars (offline substitution), etc.
Context Selector
!    Gives context to use for callee method at some call site
!    Context examples
      •  The default context (Everywhere)
      •  A call string (CallStringContext)
      •  Receiver object (ReceiverInstanceContext)
!    ContextSelector examples
      •  nCFAContextSelector: n-level call strings
      •  ContainerContextSelector: object sensitivity for
         containers
Built-In Algorithms
      (Grove and Chambers, TOPLAS 2001)

       Rapid Type Analysis (RTA)

                     0-CFA




                                                Increasing precision
      context-insensitive, class-based heap


              0-1-CFA
          context-insensitive,
      allocation-site-based heap

              0-1-Container-CFA
     0-1-CFA with object-sensitive containers


For builders, see com.ibm.wala.ipa.callgraph.impl.Util
Performance Tips
!    Use AnalyisScope exclusions
      •  Often, much of standard library (e.g., GUI libraries) is
         irrelevant
!    Analyze older libraries
      •  Java 1.4 libraries much smaller than Java 6
!    Tune context-sensitivity policy
      •  E.g., more sensitivity just for containers
Code Modelling (Advanced)
!    SSAContextInterpreter: builds SSA for a method
      •  Normally, based on bytecode
      •  Customized for reflection, Object.clone(), etc.
!    MethodTargetSelector: determines method dispatch
      •  Normally, based on types / class hierarchy
      •  Customized for native methods, JavaEE, etc.
!    ClassTargetSelector: determines types of allocated
     objects
      •  Normally, type referenced in new expression
      •  Customized for adding synthetic fields, JavaEE, etc.
Refinement-Based Points-To
                     Analysis
!    Refines analysis precision as requested by client
      •  Computes results on demand
      •  See Sridharan and Bodik, PLDI 2006
!    Implemented in DemandRefinementPointsTo
      •  Baseline analysis is context insensitive
           "  Field sensitivity, on-the-fly call graph via refinement
     •    Context sensitivity (modulo recursion), other regular
          properties via additional state machines
     •    Refinement policy can be easily customized
     •    Can also compute “flows to” on demand
!    Usage fairly well documented on wiki
!    See DemandCastChecker for an example client
SLICING
Slicer Overview
 Statement                      DataDependenceOptions
Seed for the slicer                Which data deps to consider


             ControlDependenceOptions
                  Which control deps to consider


       CallGraph                      PointerAnalysis



   Slicer.compute{Forward|Backward}Slice()


                 Collection<Statement>
Statement
!    Identifies a node in the System Dependence Graph (SDG)
     [Horwitz,Reps,Binkley,TOPLAS’90]
!    Key statement types
      •  NormalStatement
           "  Normal SSA IR instruction
           "  Represented by CGNode and instruction index
     •    ParamCaller, ParamCallee
           "  Extra nodes for modeling parameter passing
           "  SDG edges from def -> ParamCaller -> ParamCallee -> use
     •    NormalReturnCaller, NormalReturnCallee
           "  Analogous to ParamCaller, ParamCallee
           "  Also ExceptionalReturnCaller/Callee
     •    HeapParamCaller, HeapParamCallee, etc.
           "  For modeling interprocedural heap-based data deps
           "  Edges via interprocedural mod-ref analysis
Dependence Options
!    DataDependenceOptions
      •  FULL (all deps) and NONE (no deps)
      •  NO_BASE_PTRS: ignore dependencies for memory
         access base pointers
        "  E.g., exclude forward deps from defs of x to y=x.f
     •  NO_HEAP: ignore dependencies to/from heap locs
     •  NO_EXCEPTIONS: ignore deps from throw to catch
     •  Various combinations (e.g., NO_BASE_NO_HEAP)
!    ControlDependenceOptions
      •  FULL (all deps) and NONE (no deps)
      •  NO_EXCEPTIONAL_EDGES: ignore exceptional control
         flow
Thin Slicing
!    Just “top-level” data dependencies
      (see [Sridharan-Fink-Bodik PLDI’07])
!    For context-sensitive thin slicing, use Slicer with
     DataDependenceOptions.NO_BASE_PTRS and
     ControlDependenceOptions.NONE
!    For efficient context-insensitive thin slicing, use the
     CISlicer class
Performance Tips
!    Some configs do not scale to large programs
      •  E.g., context-sensitive slicing with heap deps
      •  Discussion in [SFB07]
!    Run with minimum dependencies needed
!    Apply pointer analysis scalability tips
      •  Exclusions, earlier Java libraries
INSTRUMENTING BYTECODES
WITH SHRIKE
Key Shrike Features
!    Patch-based instrumentation API
      •  Each instrumentation pass implemented as a patch
      •  Several patches can be applied simultaneously to original
         bytecode
           "  No worries about instrumenting the instrumentation
     •    Branch targets / exc. handlers automatically updated
!    Efficient
      •  Unmodified class methods copied without parsing
      •  Efficient bytecode representation / parsing
           "  Array of immutable instruction objects
           "  Constant instrs represented with single shared object

!    Some ugliness hidden
      •  JSRs, exception handler ranges, 64k method limit
Key Shrike Classes

ClassReader              ClassWriter
Immutable view            Generates JVM
of .class file info;    representation of a       ShrikeCT:
reads data lazily              class           reading / writing
                                                  .class files

 MethodEditor              ClassInstrumenter
    Core class for
                           Utility for instrumenting an
    transforming
                            existing class (mutable)
bytecodes via patches

MethodData                   CTCompiler                     ShrikeBT:
 Mutable view of        Compiles ShrikeBT method          instrumenting
  method info              into JVM bytecodes               bytecodes
Instrumenting A Method (1)


instrument(byte[] orig, int i) {
  // mutable helper class
  ClassInstrumenter ci =                     (1)
    new ClassInstrumenter(orig);
  // mutable representation of method data
  MethodData md = ci.visitMethod(i);         (2)
  // see next slide; mutates md, ci
  doInstrumentation(md);                     (3)
  // output instrumented class in JVM format
  ClassWriter w = ci.emitClass();           (4)
  byte[] modified = w.makeBytes();         (5)
}
Instrumenting A Method (2)


doInstrumentation(MethodData md) {
  // manages the patching process
  MethodEditor me = new MethodEditor(md);   (1)
  me.beginPass();                           (2)
  // add patches
  me.insertAtStart(new Patch() { … });       (3)
  me.insertBefore(j, new Patch() { … });    (4)
  …
  // apply patches (simultaneously)
  me.applyPatches(); me.endPass();          (5)
}
Shrike Clients
!    Small example: see com.ibm.wala.shrike.bench.Bench
!    Dila (com.ibm.wala.dila in incubator)
      •  Dynamic call graph construction
          (CallGraphInstrumentation)
      •  Utilities for runtime instrumentation
        "  Instrumenting class loader
        "  Mechanisms for controlling what gets instrumented
     •  Work continues on better WALA integration / docs
Java Annotation Support
!    Supported features
      •  Reading .class file attributes
      •  Parsing some annotation info from attributes
        "  E.g., generics (com.ibm.wala.types.generics)
     •  Manipulating JVM class attributes directly
        "  See ClassWriter.addClassAttribute()

!    Missing features
     •  Higher-level APIs for modifying known annotations
     •  Automatic fixing of StackMapTable attribute after
        instrumentation
        "  Speeds bytecode verification in Java 6
Eclipse Support
!    WALA projects are Eclipse plug-ins
     •  Easy to invoke from other plug-in
!    Various utilities in com.ibm.wala.ide project
      •  EclipseProjectPath: creates AnalysisScope for
         Eclipse project
      •  JdtUtil: find all Java projects, code within projects,
         etc.
!    JDT CAst frontend for Java source analysis
!    Prototype utils in com.ibm.wala.eclipse project
      •  E.g., display call graph for selected project
FRONT ENDS / CAST
!"#"$%&'()$*(+


                                             =/.+5.$>
                             #/*:;/:5
           '())(*$"+,$       -./*+0/,(.
          -($%&$-./*+0/,(.                   =/.+5.$?

!"#"$%&

          12.345$67,58(95                 %*,5.*/0$12.345
          -($%&$-./*+0/,(.   67,58(95
                              <(.)/,
                                          &;*,3)5$03@./.7
!"#"$%&'()*+($,-*.'$/.+


                                 !"#"

                       89:#

                              <=>3$)?5/*@,
            '()*+,$
!"#"$%&   -./,012,
                       ;%#
            31$%&$
          3)45674/1)            21/<A5?

                       "-;

                                34@4)*5
!"#$%&'()'*+,-.#/0.$+,



  !"#$%&                                  )*+*
                 !"#$%&'(
 1&23&#4                                ,-./&#0&#



'50&6-3&4                   '()*+,$%&         !"#"$%&
!"#$%&'(&)*&#+

  !"#$%&    !"#$%&,-
 '&()&#*          ! ./01'23)4)'567&89*&:

            ;<='*97>?@A
                 ! ,B1
+,-&./)&*        ! 2$@7&#@)C'BD0'9@C6:

            EB<FG
                 ! ,B1
                 ! 2BD0H'E$@*9I+'9@C6:

            09J$CC)'-)K)#$@
                  ! LD,'2L87$9@!8#$M7:
                  ! 2A@*&#'*&4&C9MK&@7:
!"#"$%&'()*$+&,*$-(&./$0.,
                        1/2/34.56,
                        -./*+0/,5(*      &C5*(


                           8D8         %9)$6/.+:.
                        -./*+0/,5(*


          '())(*$"+,$                     ;:*<
                           EFG
!"#"$%&      -($%&$     -./*+0/,5(*
           -./*+0/,(.                   8(0?@0(,
                           1/2/
                        -./*+0/,5(*
                                       =4056+:$1>-

                          "7"8
                        -./*+0/,5(*   'A+,()$"B-#&
!"#$%&'%()*)+,$-.*

#)&%'+         94;#
                                     >%+
 <(.+#         <%)=                                     !7
                                     !7
              #)&%'+




                       !"#$%&'$()"
                       *+"+%,$()"

                    -)"$%)./0.)1           449/-)":+%#()"
    >,%#+%
                   *%,23/-%+,$()"

                   4)&%'+/5,22("6
                     7+')%8("6
!"#"$%&'()*+,-)&.%)'/,*01,1&")'/,
                               -,.#
  #)&%'3                                                 4%3
                               1%)2
   1(03#                                                 !5
                              #)&%'3



             4/%#3%*                     !"#$%&'$()"#*
       56(").)-#$.%/"#0/$)%            +,-#$.%/"#0/$)%
2&",-3")1*45',/*-)&.%).&1-*)/*6787*9/::/,*7$2*;97-)<
      ! 9/:=',")'/,*/>*?1,1&'%*",@*!"#"$%&'()*7$2*,/@1-
      ! A,3B*('1%1*/>*%/@1*)5")*.,@1&-)",@-*45',/

2&",-3")1*97-)*7$2*)/*6787*C&1*+4*>/&:
      ! $5"&1@*=B*",/)51&*',)1&,"3*!"#"$%&'()*)&",-3")/&
      ! DE)1,@-*?1,1&'%*)&",-3")'/,*:"%5',1&B
!"#$%&'$()"*+,",%-$()"
   *..%96#:(#;
   ####70827-#9$$:#<<#9=012345%:(>
                                            !"#$#%!&#$$#!'(
   ?

                                                   )*#!"
             70827-
                                       +"#$#,-!./0#012345#!&6!'
               <<

                      ,-!./0#
    9#$$#:
                    9=012345%:(
                                               70827-#!"

@58A73-5438.7#,BC40B0-85#70D275,!0#@EA#8700#F34/
!"#$%"&'(&")'*%+,-'!%.+$/"#
                                  E+,F4"G:#$&H&00(&)
                                  &&IJ/&E+,F4"G:#$14+!,(6
!""#$%&'(&)                       &&I=&/&E+,F4"G:#$1*8EK,(6
&&&&*+,-*.&$//'&00&$1+2-345#'(6   &&I@&/&.+L&F4"G:#(
7                                 &&IJ15-GG+55"*&/&I@6
                                  &&I@13MM#NIJ19&//&,*-+O(
                                  &&I@1!345+&/&I=6
          *+,-*.                  7
                                  &&                BCD

            00                                      ;<=


                   8.9":+&                    @<A
 $&//&'
                 $1+2-345#'(
                                                    ><?


          P5,Q*3.543,"*&I-84M5&BCD&*+G-*589+4'&"9+*&,K+&PRQ
!"#$%&'(")*+*",'-.//*,0
*..%96#:(#;
####70827-#9$$:#<<#9=012345%:(>
                                          !"#$#%!&#$$#!'(
?
                                     F&6&GH#I#F&6&KH
                                                )*#!"
          70827-                     F&6&GH#I#F&6'JH
                                    +"#$#,-!./0#012345#!&6!'
            <<                       F&6&JH#I#F&6'JH

                   ,-!./0#
 9#$$#:                              F&6'H#I#F&6'JH
                 9=012345%:(
                                             70827-#!"

    @58A73-5438.7#B.C,05#5.27B0#C.5,8,.-5#*7.D#@EA
!!"#$%&'()*+%&
    !"#$#%!&#$$#!'(                      !"#$#%!&#$$#!'(


         )*#!"                                )*#!"


!"#$#+,!-./#/01234#!&5!'               !8#$#+,!-./#/01234#
                                              !&5!'

                                          !9#$#:%!"5!8(


       6/716,#!"                            6/716,#!9

         ;<=<#>-/4#?-@A#@6-@2B27+-,#+,#CC<#?-,!/64+-,
         ;<=<#+D@3/D/,74#*133AE@61,/>#CC<#F-,!/64+-,
           ! +G/G#@H+4#-,3A#+,4/67/>#*-6#3+!/#!231/4

More Related Content

Similar to WALA Tutorial at PLDI 2010

Materializing Energy
Materializing EnergyMaterializing Energy
Materializing Energy
James Pierce
 
illustration art market report illustrated gallery
illustration art market report illustrated galleryillustration art market report illustrated gallery
illustration art market report illustrated gallery
Ingrid Bond
 
Startershandboek
StartershandboekStartershandboek
Startershandboek
dijkhuizen
 
QUIETING THE ECHOES - a case study for creatives
QUIETING THE ECHOES - a case study for creativesQUIETING THE ECHOES - a case study for creatives
QUIETING THE ECHOES - a case study for creatives
Let's Make Great!
 
Ssijialiye
SsijialiyeSsijialiye
Ssijialiye
renata7
 
IBC_Profile
IBC_ProfileIBC_Profile
IBC_Profile
Pankaj Sikka
 
2018 Asia Business Outlook
2018 Asia Business Outlook2018 Asia Business Outlook
2018 Asia Business Outlook
Rob Koepp
 
El color
El colorEl color
El color
lcarber
 
Varias formas de se ver uma loja Artigo para a Revista Dirigente Lojista
Varias formas de se ver uma loja Artigo para a Revista Dirigente LojistaVarias formas de se ver uma loja Artigo para a Revista Dirigente Lojista
Varias formas de se ver uma loja Artigo para a Revista Dirigente Lojista
Flávio Radamarker, RDI
 
Rocket Profile (Pan-Africa)
Rocket Profile (Pan-Africa)Rocket Profile (Pan-Africa)
Rocket Profile (Pan-Africa)
Diana Muchoki
 
Ico corporate presentation en
Ico corporate presentation enIco corporate presentation en
Ico corporate presentation en
Harpreet kaur
 
Nearby Startup Pitch for SUU 2013 conference
Nearby Startup Pitch for SUU 2013 conferenceNearby Startup Pitch for SUU 2013 conference
Nearby Startup Pitch for SUU 2013 conference
Adam Nemeth
 
EDUTEC-e: "Innovando en la Educación para la empleabilidad y el desarrollo de...
EDUTEC-e: "Innovando en la Educación para la empleabilidad y el desarrollo de...EDUTEC-e: "Innovando en la Educación para la empleabilidad y el desarrollo de...
EDUTEC-e: "Innovando en la Educación para la empleabilidad y el desarrollo de...
Alba Santa
 
Blooms verbs and assessment types
Blooms verbs and assessment typesBlooms verbs and assessment types
Blooms verbs and assessment types
Carol1430
 
Arduino notebook v1-1
Arduino notebook v1-1Arduino notebook v1-1
Arduino notebook v1-1
Anil Yadav
 
Carrot cake
Carrot cakeCarrot cake
SydStart 2010 Spring Pitch by Eric Bae CEO of Freally recycling
SydStart 2010 Spring Pitch by Eric Bae CEO of Freally recyclingSydStart 2010 Spring Pitch by Eric Bae CEO of Freally recycling
SydStart 2010 Spring Pitch by Eric Bae CEO of Freally recycling
The Start Society
 
Sept 19 jobseekers 1
Sept 19 jobseekers 1Sept 19 jobseekers 1
Sept 19 jobseekers 1
Hack the Hood
 
Est ce que la mise au point d'un vaccin changerait les stratégies de lutte co...
Est ce que la mise au point d'un vaccin changerait les stratégies de lutte co...Est ce que la mise au point d'un vaccin changerait les stratégies de lutte co...
Est ce que la mise au point d'un vaccin changerait les stratégies de lutte co...
Institut Pasteur de Madagascar
 
Observatoire des taxes foncières - Période 2012 / 2017
Observatoire des taxes foncières - Période 2012 / 2017Observatoire des taxes foncières - Période 2012 / 2017
Observatoire des taxes foncières - Période 2012 / 2017
Monimmeuble.com
 

Similar to WALA Tutorial at PLDI 2010 (20)

Materializing Energy
Materializing EnergyMaterializing Energy
Materializing Energy
 
illustration art market report illustrated gallery
illustration art market report illustrated galleryillustration art market report illustrated gallery
illustration art market report illustrated gallery
 
Startershandboek
StartershandboekStartershandboek
Startershandboek
 
QUIETING THE ECHOES - a case study for creatives
QUIETING THE ECHOES - a case study for creativesQUIETING THE ECHOES - a case study for creatives
QUIETING THE ECHOES - a case study for creatives
 
Ssijialiye
SsijialiyeSsijialiye
Ssijialiye
 
IBC_Profile
IBC_ProfileIBC_Profile
IBC_Profile
 
2018 Asia Business Outlook
2018 Asia Business Outlook2018 Asia Business Outlook
2018 Asia Business Outlook
 
El color
El colorEl color
El color
 
Varias formas de se ver uma loja Artigo para a Revista Dirigente Lojista
Varias formas de se ver uma loja Artigo para a Revista Dirigente LojistaVarias formas de se ver uma loja Artigo para a Revista Dirigente Lojista
Varias formas de se ver uma loja Artigo para a Revista Dirigente Lojista
 
Rocket Profile (Pan-Africa)
Rocket Profile (Pan-Africa)Rocket Profile (Pan-Africa)
Rocket Profile (Pan-Africa)
 
Ico corporate presentation en
Ico corporate presentation enIco corporate presentation en
Ico corporate presentation en
 
Nearby Startup Pitch for SUU 2013 conference
Nearby Startup Pitch for SUU 2013 conferenceNearby Startup Pitch for SUU 2013 conference
Nearby Startup Pitch for SUU 2013 conference
 
EDUTEC-e: "Innovando en la Educación para la empleabilidad y el desarrollo de...
EDUTEC-e: "Innovando en la Educación para la empleabilidad y el desarrollo de...EDUTEC-e: "Innovando en la Educación para la empleabilidad y el desarrollo de...
EDUTEC-e: "Innovando en la Educación para la empleabilidad y el desarrollo de...
 
Blooms verbs and assessment types
Blooms verbs and assessment typesBlooms verbs and assessment types
Blooms verbs and assessment types
 
Arduino notebook v1-1
Arduino notebook v1-1Arduino notebook v1-1
Arduino notebook v1-1
 
Carrot cake
Carrot cakeCarrot cake
Carrot cake
 
SydStart 2010 Spring Pitch by Eric Bae CEO of Freally recycling
SydStart 2010 Spring Pitch by Eric Bae CEO of Freally recyclingSydStart 2010 Spring Pitch by Eric Bae CEO of Freally recycling
SydStart 2010 Spring Pitch by Eric Bae CEO of Freally recycling
 
Sept 19 jobseekers 1
Sept 19 jobseekers 1Sept 19 jobseekers 1
Sept 19 jobseekers 1
 
Est ce que la mise au point d'un vaccin changerait les stratégies de lutte co...
Est ce que la mise au point d'un vaccin changerait les stratégies de lutte co...Est ce que la mise au point d'un vaccin changerait les stratégies de lutte co...
Est ce que la mise au point d'un vaccin changerait les stratégies de lutte co...
 
Observatoire des taxes foncières - Période 2012 / 2017
Observatoire des taxes foncières - Période 2012 / 2017Observatoire des taxes foncières - Période 2012 / 2017
Observatoire des taxes foncières - Période 2012 / 2017
 

WALA Tutorial at PLDI 2010

  • 1.
  • 2. What is WALA? !  Javalibraries for static and dynamic program analysis !  Initially developed at IBM T.J. Watson Research Center !  Open source release in 2006 under Eclipse Public License !  Key design goals •  Robustness •  Efficiency •  Extensibility
  • 3. (Some) Previous Uses of WALA !  Research •  over 40 publications from 2003-present •  Including one at PLDI’10 (MemSAT) •  http://wala.sf.net/wiki/index.php/Publications.php !  Products •  Rational Software Analyzer: NPEs (Loginov et al. ISSTA’08), resource leak detection (Torlak and Chandra, ICSE’10) •  Rational AppScan: taint analysis (Tripp et al., PLDI’09), string analysis (Geay et al., ICSE’09) •  Tivoli Storage Manager: Javascript analysis •  WebSphere: analysis of J2EE apps
  • 4. WALA Features: Static Analysis !  Pointer analysis / call graph construction •  Several algorithms provided (RTA, variants of Andersen’s analysis) •  Highly customizable (e.g., context sensitivity policy) •  Tuned for performance (time and space) !  Interprocedural dataflow analysis framework •  Tabulation solver (Reps-Horwitz-Sagiv POPL’95) with extensions •  Also tuned for performance !  Context-sensitive slicing framework •  With customizable dependency tracking
  • 5. Other Key WALA Features !  Multiple language front-ends •  Bytecode: Java, .NET (internal) •  Source (CAst): Java, Javascript, X10, PHP (partial, internal), ABAP (internal) •  Add your own! !  Generic analysis utilities / data structures •  Graphs, sets, maps, constraint solvers, … !  Limited code transformation •  Java bytecode instrumentation via Shrike •  But, main WALA IR is immutable, and no code gen provided "  designed primarily for computing analysis info
  • 6. What We’ll Cover !  Overviews of main modules •  Important features •  Key class names •  How things fit together •  How to customize !  “Deep dives” into real code •  Interprocedural dataflow analysis example •  CAst Javascript front-end
  • 7. How to get WALA !  Walkthrough on “Getting Started” page at wala.sf.net !  Code available in SVN repository •  Trunk or previous tagged releases •  Split into several Eclipse projects, e.g., com.ibm.wala.core, com.ibm.wala.core.tests !  Dependence on Eclipse •  Easiest to build / run from Eclipse, but command line also supported •  Runtime dependence on some Eclipse plugins (progress monitors, GUI functionality, etc.); must be in classpath
  • 9. !"#"$%&'&$(')*+'*),- 4"5+2"-#$6*#*7028$%209&)' ()*+,'$*-.$/012)"#,3' !"#$%&#'
  • 10. !"#$%&!$'&()'&*)+,)#)-'"'$.-# !"#$%# *+#',-%!"#$%# $&'()%!"#$%# ./#0%1#2(.')% *+#',-%$&'()%!"#$%# :;;)%#./#0%1#2( ./#0%1#2( , ,64 ,6" 3=< 5 " !!!!!! 435 " !!!!!! ,')%78,9 !!!!!!
  • 11. !"#$%&'(%)'*+)',+-$+.+/)0)%1/ ('51()$*+'!"#$%&',-#('# !"#$!%&#$' (%)!%&#$' !"#$%&'()$*+',-#('# .//+'#01#2'3#4* *-2%)'#%).+)')1'.03+'.-04+'5.%/6'&+/.+'0/&'.-0$.+'-0$). ! 7+/.+'81$&.9':;0<'=';%/>'?'#%).'-+$'81$& ! *-0$.+'81$&.9'/5;#+$'1@'.+)'#%). ! A024520)+'#+.)'5.+'1@'0'.%/62+'&+/.+'-1$)%1/ ! ,+#020/4+'1/';5)0)%1/B'0;1$)%C%/6')1'.03+'41.)
  • 12. !"#$%&'()*'!%*'+%,$%-%.*#*)/. !"#$%&'(3$*'4./#0'1#2*,-#('# !"#$%&'&()*(+,)&($+-./(*01 ./#0'1#2*5')2+/#2*6 ./#0'1#2* !"#$%&'()$*+',-#('# !#0%'-,#1%'23'-"#$).4'1/55/.',/$*)/.-'/6'2)*'-%*- ! !*#*%'-,7)*').*/'1/55/.'#.&',$)0#*%',/$*)/.- ! +%,/-)*/$3'5#.#4%-'-%*'/6'1/55/.',/$*)/.- ! 8/55/.',/$*)/.-'1/5%'#.&'4/'/.'&%5#.&
  • 13. !"#$%%&'%()&*(+"*, !!"#$%&'!$((%)*+!*,,-&+.!&/!0 !!!!$((%)*+!"1!2!%)3.4*,,-&/5 !"#$%%&'%(')*") !!!!$((%)*+!61!2!1&78.4*,,-&/5 +",) -&%.) !!!!&3!-"1!92!61/!0 !!!!!!*66)1.!"1!22!615 !!!!: /$)0#+"(')*") !!!!1).#1+!"15 /$)0#+"(')*") !!: ! -".'*(*/(0"12(3"#$%('"4(#&*,"*(&521"5"'*.*&/', " 6.7.5"*"7&8"3(#9(*4/(/*0"7(&521"5"'*.*&/', " :,,"7*(*4/(&521"5"'*.*&/',(%&;"(,.5"(7",$1*, " <.=*/79(&'*"7>.="(.11/4,($,"(.,(,*.'3.73(#&*,"*, " </7(3";"1/25"'*(/'19?(5.@/7(*&5"(.'3(,2.="(=/,*,
  • 14. !"#$%&'(")*&+,)(,#,-.".$/- !"#$%&'&($)*+, -#($%&'&($)*+, !"#$%"&$'()*+"#$%"&$,- !"#$%"&$'()*/#";$#60&6#7,(*.- +."*/#"0123#$450&6#7,- +."*/#";$#60&6#<&1.",(*.- 8&+6*%660&6#,(*.- !"#$%"&$'()*/#"=1::0&6#7,(*.- 3&&9#%.*:&."%+.70&6#,(*.- +."*/#"=1::0&6#<&1.",(*0- 8&+6*%66>6/#,(*7$:?*(*67"- 3&&9#%.*@%7>6/#,(*7$:?*(*67"- .)&/0*+,
  • 15. !"#$%&%'()&*+,( -%+&%.%/0*012/ !"#$%&%'!('%)*+*,%&-./ !"#$%&%'0',%)*+*,%&-./ !"#$%&#'()*&+,-$'. 4"#8&#$%&#8(99'/0&'()*&+7,-$"/0&. -$%&#'/0&,!"#$"()*&+. 4"#8&#$%&#:+&0'/0&'()*&+7,-$"/0&. !"#$%&#123'()*&+,. 4#&+2#/+5-6$!#&+2#&'/0&7,4"#8&#$7. !"#$%&%'1&*23-./
  • 16. !"#$%$&'()"*+',$*)$-$./"/01. !"#$%$&'&($)"*"($+,-./01 !"#$%&$'()*%+(,$*-. /%$0(%10234"#$%50$6716$8-3"79"!"*(,$*. /%$0(%102:"$;%$<68"!4"#$%50$6+(,$*8-3"7. =<%"#$%50$6716$>1)<%-3"79"!"*(,$*. /%$0(%102:"$;%$<68"34"#$%?)@@716$8-3"79"!"*(,$*. /%$0(%102:"$;%$<68"!4"#$%?)@@+(,$*8-3"7. =<%"#$%?)@@716$>1)<%-3"79"!"*(,$*. A1=6"(66B6#$-3"80@9"3"68%9"!"*(,$*. ,11*$(<"C(8B6#$-3"80@9"3"68%9"!"*(,$*. ?$%2:"$;%$<68"!4"#$%B6#$+(,$*8-3"80@9"3"68%. !"#$%$&5+"67,-.01 234#$+$&!"#$%$&'&($)"*"($+,-.01
  • 17. !"#"$%&'!$()*'+)"$(,%-#./' 0$"(1,*'2%$.,'3"($&* ( - . 0 / 1 2 ( - . / 0 1 2 !"#$%&'(%)' ( - . / 0 !)*+,&,!"#$%&'(%)' ( - / !"#3(%4"5+,&'
  • 18. !"#"$%&'!$()*'+)"$(,%-#./ 0"),*'1%$.,'2"($&* % 7 8 9 % 7 6 9 % : ; 6 6 : !"#$%&'"()*+, ; 6 ; : 9 7 8 % !"#"()(-'1(2+3&+,%&/, 4526+,+*!"#"()(-'1(2+3&+,%&/, % 7 9 : ; 6 8 !"#!(-./0+,1(2+3&+,%&/, 4526+,+*!"#!(-./0+,1(2+3&+,%&/,
  • 19. !"#"$%&'!$()*'+)"$(,%-#./ 011. !""#$%&'$(&
  • 20. !"#$%&'()*"+,%-. !"#$%&'"() *+#,-(-.!"#$%&'"() !"#$%&%/-0("%'$-()
  • 21. !"#"$%&'()*+#,-+ !"#$$%&$$'(&)*+,(- .*/'$,+0(&1234 .*/5(&678*('96:/#,6;(,<#%*(34 !5(&678*('96:/#,6;(,<#%*( .*/>**/?1*(&/,(34 83<!"&'''4=!=<#$% .*/@,%*5(&678*('96:/#,634 83<!"4=!=<#$% .*/A%.*5(&678*('96:/#,634 83<!"4=!=<#$%
  • 22. !"#"$%&'()*"+,%- ! ) 12345678) 12345678* " * 9:77578:338;:3<57 # + $ , % - !485:=>8?@A5B846:?715681<?=42@?8:AA782478?<CD56 & . ' / ( 0 !"#$%&'()*+$(,*#%-.#/#)&+01'2/
  • 24. !"#"$%&'()*"+,%- ! ) " * !"#$%&'()%*%+%(%, !"#$%-'.)%*%+%(%.%, # + $ / !"#$%/'0)%*%+%(%0%, !"#$%1'2)%*%+%.%2%, % , !"#$%3'4)%*%+%(%.%0%2%4%, !"#$%5'6)%*%+%(%.%0%2%4%6%, !"#$%7'8)%*%+%8%, & - !"#$%9':)%*%+%:%, ' . ( 0
  • 26. !"#$%&'()*+, !"#$%&'() !"#$%&'!$()*+,-.*/($0(,-11"2*3(4*/15 !" *+(,-.!"#$%&'() /0&!"#$%&'()
  • 27. !"#$%&'(%'&) !" = #$%&%'#(%&&%#)* BCD ( =>) ) !+%#$ $ $>- 0 ,0%&%12#345%56789:%#(/#) ?>@ - ? ,-%&%.'#$/#0* A @ ;5<7;2%#-
  • 28. !"#$%&'$()"*+,-.# !!"#$%&'()&*+$ !!",+&+#$%&'()&*+$ !!"-+./0'1#$%&'()&*+$ !!""7%&'0)&#$2+81#$%&'()&*+$ !!"-+$21'%*+$#$%&'()&*+$ !!"#$2+81#$%&'()&*+$ !!"3*145"))1%%#$%&'()&*+$ !!",1&#$%&'()&*+$ !!"6(&#$%&'()&*+$
  • 29. !"#$%&'$()"*+,-.# !!"#$%&'()&*+$ !!"",%&'-)&#$.+/0#$%&'()&*+$ !!"#$.+/0#$%&'()&*+$ 6(5&*70&('$8-5(0#$.+/0#$%&'()&*+$ ",%&'-)&304*)-5#$.+/0 1-.-!)'*2&#$.+/0
  • 30. !"#$%&'(%'&)*#+,#-./)0 ! +,#1./)0#2&.3,/)#/,0%,1(%#345')#1'67)&0#,1#(.1%)8% " 9#:(.2;<#.=#4#345')#,1#4#/,0%,1(%#(.1%)8% " )>?>#,10,/)#4#(.1/,%,.145#.&#5..2 ! @0)/#%.#/)1.%)#2&)(,0)#,1=.&64%,.1 " )>?>#45,40,1?#A,%B#.%B)&#345')0 " )>?>#2&)(,0)#%;2) ! $2)(,=,)/#7;#2.5,(;#/'&,1?#!"#(&)4%,.1 !!"#$%&'(#&)$*+ 3(1#$4$/012.*1$&/5 ,-).( $/012.*1$&/
  • 31. !"#$%&'(%'&)*#+,#-./)0 !"#$#%!&#$$#!'( BCD >?' > )*#!" & "?, ' !:#$#;%!'(# )*#!" :?@ " / +/#$#01!234#456789#!&.!: , =<64#E<71FG : !A#$#;%!'(# A *7894#E<71FG @ +,#$#-%!".!/( <4=6<1#!,
  • 32. !"#$%&'&%&()*#+(,$)( A5+B:5 !" = #$%&%'#(%&&%#)* C5<A5+'#* ( D5<A5+'$*%E ) !+%#$ $ 0 ,0%&%12#345%56789:%#(/#) - C5<B:5:'#* > ,-%&%.'#$/#0* D5<B:5:'$*%E ? @ ;5<7;2%#-
  • 33. !"#$%&'&%&()*#+,- ()* +,-./ !"# $"% !"# &"' 0 &"' $"%
  • 34. !"#$%&'&%&()*#+,-(#!./(0(.1( ( !"#$%&#''#() 9:;05 & %,#'#/01#2677 , . %.#'#/01#2345 677 345 * %,'677-#%.'345-#%*'9:;05 %*#'#+$%,-%.) 8 234-5%(#43)%#-0(1&)(#643)%#7(.(08'#%,-(6 ! $)()#9(1'80(9#%,-()#8.9#3%:(0#;.3<.#%,-() ! (=7=#13.10(%(#%,-()#/034#13.)%8.%) ! (=7=#13.10(%(#%,-()#/034#8''318%&3. !.%(0/81(#8''3<)#5)(#8103))#'8.7587() new TypeInference(ir, doPrimitives); getType(vn)
  • 35. !"#$%&'()*#+,-.#$%&'()#/01 !" !#$%&'( .$%#$%&'(/0 )*%#$%&'( ,-*%123%,'-+'*,%,'-/,-($40 +'*,%,'- 5$%="7/0 5$%6,1*%7,-$/0 5$%7<*%:;;*$%/0 5$%6,1*%8'9/0 5$%7<*%7,-$/0 5$%6,1*%:;;*$%/0 5$%7<*%8'9/0 ,-*%123%,'-+'*,%,'->%<?$*><->';;*$%>,->%&$>,-*%123%,'-><11<@
  • 36. !"#$%&'()*#+,-)(%.)#/01 !" !#$%&'( ,$%#$%&'(-. !)*%$+'($#$%&'( /$%)*%$+'($!0($1-20%. /$%320$4567$8-20%. ,$%)*%$+'($!0($19%:;$<9:09'==<$%9209%&$920<%85+%2'09:88:*
  • 37. !"#$%&'()*#+%(,-#.,/)0 !" !0-%45(%/'02/06$7 &'()*28)*5$205,9$4 #$%&'()*+),$-./0%12/0%3 :44);2'<20),$-
  • 39. Building a Call Graph buildCG(String jarFile) { // represents code to be analyzed AnalyisScope scope = AnalysisScopeReader (1) .makeJavaBinaryAnalysisScope(jarFile, null); // a class hierarchy for name resolution, etc. IClassHierarchy cha = ClassHierarchy.make(scope); (2) // what are the call graph entrypoints? Iterable<Entrypoint> e = (3) Util.makeMainEntrypoints(scope, cha); // encapsulates various analysis options AnalysisOptions o = new AnalysisOptions(scope, e); (4) // builds call graph via pointer analysis CallGraphBuilder builder = (5) Util.makeZeroCFABuilder(o, new AnalysisCache(), cha, scope); CallGraph cg = builder.makeCallGraph(o, null); (6) }
  • 40. AnalysisScope !  Represents a set of files to analyze !  To construct from classpath: AnalysisScopeReader.makeJavaBinaryAnalysisScope() !  To read info from scope text file: AnalysisScopeReader.readJavaScope() •  Each line of scope file gives loader, lang, type, val "  E.g., “Application,Java,jarFile,bcel-5.2.jar” "  Common types: classFile, sourceFile, binaryDir, jarFile "  Examples in com.ibm.wala.core.tests/dat !  Exclusions: exclude some classes from consideration •  Used to improve scalability of pointer analysis, etc. •  Also specified in text file; see, e.g., com.ibm.wala.core.tests/dat/GUIExclusions.txt
  • 41. Background: Class Loaders !  In Java, a class is identified by name and class loader •  E.g., < Primordial, java.lang.Object > !  Class loaders form a tree, rooted at Primordial !  Name lookup first delegates to parent class loader •  So, can’t write an app class java.lang.Object !  User-defined class loaders can provide isolation •  Used by Eclipse for plugins, J2EE app servers !  WALA naming models class loaders
  • 42. Multiple Names in Bytecode // this is java.lang.Object class Object { public String toString() { … } } // no overriding of toString() class B extends Object {} class A extends B {} Legal names in bytecode: <Application, A, toString()>, <Application, B, toString()>, <Application, java.lang.Object, toString()>, <Primordial, java.lang.Object, toString()> Resolved entity: <Primordial, java.lang.Object, toString()>
  • 43. WALA Name Resolution Entity references resolved via IClassHierarchy Entity Reference Type Resolved Resolver Method Type class TypeReference IClass lookupClass() method MethodReference IMethod resolveMethod() Field FieldReference IField resolveField()
  • 44. More on class hierarchies !  For Java class hierarchy: ClassHierarchy.make(scope) !  Supports Java-style subtyping (single inheritance, multiple interfaces) !  Necessary for constructing method IRs, since only resolved IMethods have bytecode info !  Watch out for memory leaks! •  Resolved entities (IClass, IMethod, etc.) keep pointers back to class hierarchy •  In general, use entity references in analysis results
  • 46. Tabulation-Based Analysis (Reps, Horwitz, Sagiv, POPL95) !  “Functional approach” to context-sensitive analysis (Sharir and Pnueli, 1981) !  Tabulates partial function summaries on demand !  Some enhancements in WALA’s implementation •  Multiple return sites for calls (for exceptions) •  Optional merge operators •  Handles partially balanced problems •  Customizable worklist orderings
  • 47. Tabulation Overview T TabulationDomain IFlowFunctionMap Provides numbering of Edge flow functions for domain facts supergraph ISupergraph Supergraph over Seeds which analysis is Initial path edges for TabulationProblem computed analysis TabulationSolver TabulationResult
  • 48. Supergraph !  Collection of “procedure graphs,” connected by calls •  ICFGSupergraph: procedure graphs are CFGs •  SDGSupergraph: procedure graphs are PDGs !  Example call representation for ICFGSupergraph •  (For general supergraph , possibly many calls, returns, entries, exits) CALLER CALLEE CALL ENTRY Normal call-to-ret Exc. call-to-ret Normal call-to-entry RETURN Normal ret-to-exit EXIT EXC. Exc. ret-to-exit EXIT HANDLER
  • 49. Domain / Flow Functions / Seeds !  TabulationDomain •  Maintains mapping from facts to integers •  Controls worklist priorities (optional) !  IFlowFunctionMap •  Flow functions on supergraph edges •  All functions map int (or two ints) to IntSet (via TabulationDomain) •  Function for each type of edge (normal, call->return, call->entry, exit->return) •  Also, call->return function for “missing” calls "  For handling missing code, CG expansion (Snugglebug) !  Seeds (TabulationProblem.initialSeeds()) •  Depends on problem / domain representation
  • 50. Partially Balanced Problems !  For when flows can start / end in a non-entrypoint •  E.g., slice from non-entrypoint statement !  PartiallyBalancedTabulationProblem •  Additional “unbalanced” return flow function for return without a call •  getFakeEntry(): source node for path edges of partially balanced flows !  Compute with PartiallyBalancedTabulationSolver !  Examples: ContextSensitiveReachingDefs, Slicer
  • 51. Debugging Your Analysis !  IFDSExplorer •  Gives GUI view of analysis result •  Needs paths to GraphViz dot executable and PDF viewer !  Set VM property com.ibm.wala.fixedpoint.impl.verbose to true for occasional lightweight info !  Increase TabulationSolver.DEBUG_LEVEL for detailed info
  • 52. Deep Dive: Reaching Defs !  The classic dataflow analysis, for Java static fields !  Three implementations available in com.ibm.wala.core.tests •  IntraprocReachingDefs "  Uses BitVectorSolver, a specialized DataflowSolver •  ContextInsensitiveReachingDefs "  Uses BitVectorSolver over interprocedural CFG •  ContextSensitiveReachingDefs "  Uses TabulationSolver !  We’ll focus on ContextSensitiveReachingDefs
  • 53. Example class StaticDataflow { static int f, g; static void m() { f = 2; } static void testInterproc() { f = 3; m(); (1) g = 4; m(); (2) } } Context-sensitive analysis should give different result after (1) and (2)
  • 54. The Domain and Supergraph !  Supergraph: ICFGSupergraph •  Procedures: call graph nodes (CGNode) "  In context-sensitive call graph, possibly many CGNodes for one Imethod •  Nodes: BasicBlockInContext<IExplodedBasicBlock> "  “exploded” basic block has at most one instruction (eases writing transfer functions) "  BasicBlockInContext pairs BB with enclosing CGNode !  Domain (static field writes): Pair<CGNode,Integer> •  Integer is index in IR instruction array; only valid way to uniquely identify IR instruction •  ReachingDefsDomain extends MutableMapping to maintain fact numbering
  • 55. Flow Functions (1) !  Normal flow for non-putstatics is IdentityFlowFunction.identity() !  Most call-related flow functions are also identity •  Since static fields are in global scope !  Call-to-return function is KillEverything.singleton() •  Defs must survive callee to reach return
  • 56. Flow Functions (2) Normal flow function for putstatic (modified for formatting / clarity) public IntSet getTargets(int d1) { IntSet result = MutableSparseIntSet.makeEmpty(); // first, gen this statement int factNum = domain.getMappedIndex(Pair.make(node, index)); result.add(factNum); // if incoming statement defs the same static field, kill it; // otherwise, keep it if (d1 != factNum) { // must be different statement IField sf = cha.resolveField(putInstr.getField()); Pair<CGNode, Integer> other = domain.getMappedObject(d1); SSAPutInstruction otherPut = getPutInstr(other); IField otherSF = cha.resolveField(otherPut.getField()); if (!sf.equals(otherSF)) { result.add(d1); } } return result; }
  • 57. Seeds !  Standard tabulation approach: special ‘0’ fact •  Add ‘0 -> 0’ edge to all flow functions •  Seed with (main_entry, 0) -> (main_entry, 0) !  Our approach: partially balanced tabulation •  For field write numbered n in basic block b of method m, add seed (m_entry, n) -> (b, n) "  (source fact doesn’t matter) •  Unbalanced flow function is just identity •  Advantage: keeps other flow functions cleaner •  See ReachingDefsProblem.collectInitialSeeds()
  • 58. Putting it all Together !  ReachingDefsProblem collects domain, supergraph, flow functions, seeds !  Running analysis (simplified): PartiallyBalancedTabulationSolver solver = new PartiallyBalancedTabulationSolver( new ReachingDefsProblem()); TabulationResult result = solver.solve(); !  In the real code: •  Lots of long generic type instantiations (sigh) •  Handling CancelException (enables cancelling running analysis from GUI)
  • 59. CALL GRAPHS / POINTER ANALYSIS
  • 60. Call Graph Builder Overview AnalysisOptions Heap Model Specifies entrypoints, How should Context Selector objects and What context to use how to handle reflection, pointers be when analyzing call to etc. abstracted? some method? CallGraphBuilder CallGraph PointerAnalysis
  • 61. Entrypoints !  What are entrypoint methods? •  main() method "  Util.makeMainEntrypoints() •  All application methods "  AllApplicationEntrypoints •  JavaEE Servlet methods, Eclipse plugin entries, … !  What types are passed to entrypoint arguments? •  Just declared parameter types (DefaultEntrypoint) •  Some concrete subtype (ArgumentTypeEntrypoint) •  All subtypes (SubtypesEntrypoint)
  • 62. Heap Model !  Controls abstraction of pointers and object instances !  InstanceKey: abstraction of an object •  All objects of some type (ConcreteTypeKey) •  Objects allocated by some statement in some calling context (AllocationSiteInNode) •  ZeroXInstanceKeys: customizable factory !  PointerKey: abstraction of a pointer •  Local variable in some calling context (LocalPointerKey) •  Several merged vars (offline substitution), etc.
  • 63. Context Selector !  Gives context to use for callee method at some call site !  Context examples •  The default context (Everywhere) •  A call string (CallStringContext) •  Receiver object (ReceiverInstanceContext) !  ContextSelector examples •  nCFAContextSelector: n-level call strings •  ContainerContextSelector: object sensitivity for containers
  • 64. Built-In Algorithms (Grove and Chambers, TOPLAS 2001) Rapid Type Analysis (RTA) 0-CFA Increasing precision context-insensitive, class-based heap 0-1-CFA context-insensitive, allocation-site-based heap 0-1-Container-CFA 0-1-CFA with object-sensitive containers For builders, see com.ibm.wala.ipa.callgraph.impl.Util
  • 65. Performance Tips !  Use AnalyisScope exclusions •  Often, much of standard library (e.g., GUI libraries) is irrelevant !  Analyze older libraries •  Java 1.4 libraries much smaller than Java 6 !  Tune context-sensitivity policy •  E.g., more sensitivity just for containers
  • 66. Code Modelling (Advanced) !  SSAContextInterpreter: builds SSA for a method •  Normally, based on bytecode •  Customized for reflection, Object.clone(), etc. !  MethodTargetSelector: determines method dispatch •  Normally, based on types / class hierarchy •  Customized for native methods, JavaEE, etc. !  ClassTargetSelector: determines types of allocated objects •  Normally, type referenced in new expression •  Customized for adding synthetic fields, JavaEE, etc.
  • 67. Refinement-Based Points-To Analysis !  Refines analysis precision as requested by client •  Computes results on demand •  See Sridharan and Bodik, PLDI 2006 !  Implemented in DemandRefinementPointsTo •  Baseline analysis is context insensitive "  Field sensitivity, on-the-fly call graph via refinement •  Context sensitivity (modulo recursion), other regular properties via additional state machines •  Refinement policy can be easily customized •  Can also compute “flows to” on demand !  Usage fairly well documented on wiki !  See DemandCastChecker for an example client
  • 69. Slicer Overview Statement DataDependenceOptions Seed for the slicer Which data deps to consider ControlDependenceOptions Which control deps to consider CallGraph PointerAnalysis Slicer.compute{Forward|Backward}Slice() Collection<Statement>
  • 70. Statement !  Identifies a node in the System Dependence Graph (SDG) [Horwitz,Reps,Binkley,TOPLAS’90] !  Key statement types •  NormalStatement "  Normal SSA IR instruction "  Represented by CGNode and instruction index •  ParamCaller, ParamCallee "  Extra nodes for modeling parameter passing "  SDG edges from def -> ParamCaller -> ParamCallee -> use •  NormalReturnCaller, NormalReturnCallee "  Analogous to ParamCaller, ParamCallee "  Also ExceptionalReturnCaller/Callee •  HeapParamCaller, HeapParamCallee, etc. "  For modeling interprocedural heap-based data deps "  Edges via interprocedural mod-ref analysis
  • 71. Dependence Options !  DataDependenceOptions •  FULL (all deps) and NONE (no deps) •  NO_BASE_PTRS: ignore dependencies for memory access base pointers "  E.g., exclude forward deps from defs of x to y=x.f •  NO_HEAP: ignore dependencies to/from heap locs •  NO_EXCEPTIONS: ignore deps from throw to catch •  Various combinations (e.g., NO_BASE_NO_HEAP) !  ControlDependenceOptions •  FULL (all deps) and NONE (no deps) •  NO_EXCEPTIONAL_EDGES: ignore exceptional control flow
  • 72. Thin Slicing !  Just “top-level” data dependencies (see [Sridharan-Fink-Bodik PLDI’07]) !  For context-sensitive thin slicing, use Slicer with DataDependenceOptions.NO_BASE_PTRS and ControlDependenceOptions.NONE !  For efficient context-insensitive thin slicing, use the CISlicer class
  • 73. Performance Tips !  Some configs do not scale to large programs •  E.g., context-sensitive slicing with heap deps •  Discussion in [SFB07] !  Run with minimum dependencies needed !  Apply pointer analysis scalability tips •  Exclusions, earlier Java libraries
  • 75. Key Shrike Features !  Patch-based instrumentation API •  Each instrumentation pass implemented as a patch •  Several patches can be applied simultaneously to original bytecode "  No worries about instrumenting the instrumentation •  Branch targets / exc. handlers automatically updated !  Efficient •  Unmodified class methods copied without parsing •  Efficient bytecode representation / parsing "  Array of immutable instruction objects "  Constant instrs represented with single shared object !  Some ugliness hidden •  JSRs, exception handler ranges, 64k method limit
  • 76. Key Shrike Classes ClassReader ClassWriter Immutable view Generates JVM of .class file info; representation of a ShrikeCT: reads data lazily class reading / writing .class files MethodEditor ClassInstrumenter Core class for Utility for instrumenting an transforming existing class (mutable) bytecodes via patches MethodData CTCompiler ShrikeBT: Mutable view of Compiles ShrikeBT method instrumenting method info into JVM bytecodes bytecodes
  • 77. Instrumenting A Method (1) instrument(byte[] orig, int i) { // mutable helper class ClassInstrumenter ci = (1) new ClassInstrumenter(orig); // mutable representation of method data MethodData md = ci.visitMethod(i); (2) // see next slide; mutates md, ci doInstrumentation(md); (3) // output instrumented class in JVM format ClassWriter w = ci.emitClass(); (4) byte[] modified = w.makeBytes(); (5) }
  • 78. Instrumenting A Method (2) doInstrumentation(MethodData md) { // manages the patching process MethodEditor me = new MethodEditor(md); (1) me.beginPass(); (2) // add patches me.insertAtStart(new Patch() { … }); (3) me.insertBefore(j, new Patch() { … }); (4) … // apply patches (simultaneously) me.applyPatches(); me.endPass(); (5) }
  • 79. Shrike Clients !  Small example: see com.ibm.wala.shrike.bench.Bench !  Dila (com.ibm.wala.dila in incubator) •  Dynamic call graph construction (CallGraphInstrumentation) •  Utilities for runtime instrumentation "  Instrumenting class loader "  Mechanisms for controlling what gets instrumented •  Work continues on better WALA integration / docs
  • 80. Java Annotation Support !  Supported features •  Reading .class file attributes •  Parsing some annotation info from attributes "  E.g., generics (com.ibm.wala.types.generics) •  Manipulating JVM class attributes directly "  See ClassWriter.addClassAttribute() !  Missing features •  Higher-level APIs for modifying known annotations •  Automatic fixing of StackMapTable attribute after instrumentation "  Speeds bytecode verification in Java 6
  • 81. Eclipse Support !  WALA projects are Eclipse plug-ins •  Easy to invoke from other plug-in !  Various utilities in com.ibm.wala.ide project •  EclipseProjectPath: creates AnalysisScope for Eclipse project •  JdtUtil: find all Java projects, code within projects, etc. !  JDT CAst frontend for Java source analysis !  Prototype utils in com.ibm.wala.eclipse project •  E.g., display call graph for selected project
  • 82. FRONT ENDS / CAST
  • 83. !"#"$%&'()$*(+ =/.+5.$> #/*:;/:5 '())(*$"+,$ -./*+0/,(. -($%&$-./*+0/,(. =/.+5.$? !"#"$%& 12.345$67,58(95 %*,5.*/0$12.345 -($%&$-./*+0/,(. 67,58(95 <(.)/, &;*,3)5$03@./.7
  • 84. !"#"$%&'()*+($,-*.'$/.+ !"#" 89:# <=>3$)?5/*@, '()*+,$ !"#"$%& -./,012, ;%# 31$%&$ 3)45674/1) 21/<A5? "-; 34@4)*5
  • 85. !"#$%&'()'*+,-.#/0.$+, !"#$%& )*+* !"#$%&'( 1&23&#4 ,-./&#0&# '50&6-3&4 '()*+,$%& !"#"$%&
  • 86. !"#$%&'(&)*&#+ !"#$%& !"#$%&,- '&()&#* ! ./01'23)4)'567&89*&: ;<='*97>?@A ! ,B1 +,-&./)&* ! 2$@7&#@)C'BD0'9@C6: EB<FG ! ,B1 ! 2BD0H'E$@*9I+'9@C6: 09J$CC)'-)K)#$@ ! LD,'2L87$9@!8#$M7: ! 2A@*&#'*&4&C9MK&@7:
  • 87. !"#"$%&'()*$+&,*$-(&./$0., 1/2/34.56, -./*+0/,5(* &C5*( 8D8 %9)$6/.+:. -./*+0/,5(* '())(*$"+,$ ;:*< EFG !"#"$%& -($%&$ -./*+0/,5(* -./*+0/,(. 8(0?@0(, 1/2/ -./*+0/,5(* =4056+:$1>- "7"8 -./*+0/,5(* 'A+,()$"B-#&
  • 88. !"#$%&'%()*)+,$-.* #)&%'+ 94;# >%+ <(.+# <%)= !7 !7 #)&%'+ !"#$%&'$()" *+"+%,$()" -)"$%)./0.)1 449/-)":+%#()" >,%#+% *%,23/-%+,$()" 4)&%'+/5,22("6 7+')%8("6
  • 89. !"#"$%&'()*+,-)&.%)'/,*01,1&")'/, -,.# #)&%'3 4%3 1%)2 1(03# !5 #)&%'3 4/%#3%* !"#$%&'$()"#* 56(").)-#$.%/"#0/$)% +,-#$.%/"#0/$)% 2&",-3")1*45',/*-)&.%).&1-*)/*6787*9/::/,*7$2*;97-)< ! 9/:=',")'/,*/>*?1,1&'%*",@*!"#"$%&'()*7$2*,/@1- ! A,3B*('1%1*/>*%/@1*)5")*.,@1&-)",@-*45',/ 2&",-3")1*97-)*7$2*)/*6787*C&1*+4*>/&: ! $5"&1@*=B*",/)51&*',)1&,"3*!"#"$%&'()*)&",-3")/& ! DE)1,@-*?1,1&'%*)&",-3")'/,*:"%5',1&B
  • 90. !"#$%&'$()"*+,",%-$()" *..%96#:(#; ####70827-#9$$:#<<#9=012345%:(> !"#$#%!&#$$#!'( ? )*#!" 70827- +"#$#,-!./0#012345#!&6!' << ,-!./0# 9#$$#: 9=012345%:( 70827-#!" @58A73-5438.7#,BC40B0-85#70D275,!0#@EA#8700#F34/
  • 91. !"#$%"&'(&")'*%+,-'!%.+$/"# E+,F4"G:#$&H&00(&) &&IJ/&E+,F4"G:#$14+!,(6 !""#$%&'(&) &&I=&/&E+,F4"G:#$1*8EK,(6 &&&&*+,-*.&$//'&00&$1+2-345#'(6 &&I@&/&.+L&F4"G:#( 7 &&IJ15-GG+55"*&/&I@6 &&I@13MM#NIJ19&//&,*-+O( &&I@1!345+&/&I=6 *+,-*. 7 && BCD 00 ;<= 8.9":+& @<A $&//&' $1+2-345#'( ><? P5,Q*3.543,"*&I-84M5&BCD&*+G-*589+4'&"9+*&,K+&PRQ
  • 92. !"#$%&'(")*+*",'-.//*,0 *..%96#:(#; ####70827-#9$$:#<<#9=012345%:(> !"#$#%!&#$$#!'( ? F&6&GH#I#F&6&KH )*#!" 70827- F&6&GH#I#F&6'JH +"#$#,-!./0#012345#!&6!' << F&6&JH#I#F&6'JH ,-!./0# 9#$$#: F&6'H#I#F&6'JH 9=012345%:( 70827-#!" @58A73-5438.7#B.C,05#5.27B0#C.5,8,.-5#*7.D#@EA
  • 93. !!"#$%&'()*+%& !"#$#%!&#$$#!'( !"#$#%!&#$$#!'( )*#!" )*#!" !"#$#+,!-./#/01234#!&5!' !8#$#+,!-./#/01234# !&5!' !9#$#:%!"5!8( 6/716,#!" 6/716,#!9 ;<=<#>-/4#?-@A#@6-@2B27+-,#+,#CC<#?-,!/64+-, ;<=<#+D@3/D/,74#*133AE@61,/>#CC<#F-,!/64+-, ! +G/G#@H+4#-,3A#+,4/67/>#*-6#3+!/#!231/4