There is increasing interest in understanding and analyzing the use of resources in software and hardware systems. Certifying memory consumption is vital to ensure safety in embedded systems as well as proper administration of their power consumption; understanding the number of messages sent through a network helps detect performance bottlenecks or reduce communication costs; and so on. Assessing resource usage is indeed a cornerstone in a wide variety of software-intensive systems, ranging from embedded to cloud computing. It is well known that inferring, and even checking, quantitative bounds is difficult (in fact undecidable). Memory consumption is a particularly challenging case of resource-usage analysis due to its non-accumulative nature: inferring memory consumption requires not only computing bounds for allocations but also accounting for the memory recovered by the garbage collector. In this talk I will present some of the work our group has been doing to automatically analyze heap memory requirements. In particular, I will show some basic ideas at the core of our techniques and how they were applied to different problems, ranging from inferring sizes of memory regions in real-time Java to analyzing heap memory requirements in Java/.NET. Then I will introduce our new compositional approach, which we use to analyze (infer/verify) Java and .NET programs. Finally, I will explain some limitations of our approach and discuss key challenges and directions for future research.
ByteCode 2012 Talk: Quantitative analysis of Java/.Net like programs to understand heap memory requirements
1. Quantitative analysis of Java/.Net like programs to understand heap memory requirements
Diego Garbervetsky
Departamento de Computación, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (UBA)
ByteCode 2012
3. Static analysis of heap memory in Java-like programs
• Analysis of memory allocations is very hard
– The problem is undecidable in general
– It is impossible to find an exact expression for the number of allocated objects
• Predicting actual heap memory requirements is even harder
– With garbage collection, memory required <= memory requested/allocated (live objects <= allocated objects)
– It requires analysis of object lifetimes
4. #Live objects != memory required
• Analyzing actual memory consumption also requires understanding the internals of the underlying VM, the memory manager (and potentially the operating system).
• We still believe that analyzing the number of allocations and live objects is the cornerstone.
5. Stack allocation vs. heap allocation
• Heap allocation: associated with data structures created by the application, controlled by the GC
– requires analysis of object lifetimes
• Stack allocation: frame bounding + stack depth
– requires analysis of recursive method calls
• Popeea [CNPQ08] (ISMM08), Albert [ISMM07], and approaches for functional languages
6. Example
public static D[][] init(int n, int m) {
  D[][] matrix = new D[n][m];
  for(int i = 0; i < n; i++)
    for(int j = 0; j < m; j++)
      matrix[i][j] = new D(i);
  return matrix;
}
• n*m objects of type D and an array of n*m references (D[][])
• Ignoring types, we can consider the total allocated as 2(n·m) or n·m+1 objects (depending on how we count arrays)
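A quick sanity check of the two counting conventions (my arithmetic, following the slide's choice of counting an array either as n·m references or as a single object):
  n·m (D objects) + n·m (array slots) = 2·n·m
  n·m (D objects) + 1 (the array itself) = n·m + 1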
7. Verification
public static D[][] init(int n, int m)
  ensures memoryAlloc <= 2*n*m;
{
  D[][] matrix = new D[n][m];
  for(int i = 0; i < n; i++)
    invariant memoryAlloc <= n*m + m*i;
    for(int j = 0; j < m; j++)
      invariant memoryAlloc <= n*m + m*i + j;
      matrix[i][j] = new D(i);
  return matrix;
}
• Non-linear SMT solver + invariants
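A minimal sketch of the proof obligations behind this slide, assuming the standard inductive-invariant rule (the decomposition is mine, not the tool's output):
  Initiation: after the array allocation, memoryAlloc = n·m <= n·m + m·0.
  Preservation (inner loop): each iteration allocates one D, so (n·m + m·i + j) + 1 <= n·m + m·i + (j+1).
  Exit: when i = n, memoryAlloc <= n·m + m·n = 2·n·m, which is exactly the postcondition.
The only non-linear reasoning is the product m·i, which is why a non-linear SMT solver is needed.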
8. Verification
public static D[][] init(int n, int m)
  ensures memoryReq <= 2*n*m + 1;
{
  escapes D[][] matrix = new D[n][m];
  for(int i = 0; i < n; i++)
    invariant memoryReq <= n*m + m*i;
    for(int j = 0; j < m; j++)
      invariant memoryReq <= n*m + m*i + j + 1;
      escapes matrix[i][j] = new D(i);
      collectable A a = new A();
  return matrix;
}
• Include lifetime/sharing/shape information
• Include lifetime/sharing/shape information
9. Inference of bounds in imperative languages
Some approaches for imperative languages:
• Abstract Interpretation [e.g., BCJP09]
• Recurrence equations [e.g., AGG07/09]
• Iteration patterns / Ranking functions [e.g., GZ10]
• Counting / Iteration spaces [BGY05/06,BFGY08,…]
10. Abstract interpretation [e.g., BCJP09]
• Use counters to represent memory allocation (and deallocation)
• Compute program invariants (using a lattice and fixpoint)
public static D[][] init(int n, int m)
{
  D[][] matrix = new D[n][m];   // objects += n*m
  for(int i = 0; i < n; i++)
    for(int j = 0; j < m; j++)
      matrix[i][j] = new D(i);  // objects += 1
  return matrix;
}
• Requires inferring non-linear invariants
– Rodríguez-Carbonell, Müller-Olm, Cachera, etc.
11. Recurrence equations [e.g., AGG07/09]
• Computes a set of recurrence equations
– Then tries to find a closed-form solution to the recurrence equations
public static D[][] init(int n, int m) {
  D[][] matrix = new D[n][m];
  for(int i = 0; i < n; i++)
    for(int j = 0; j < m; j++)
      matrix[i][j] = new D(i);
  return matrix;
}
init(n,m) = loop1(n,m,0)
loop1(n,m,i) = loop2(m,0) + loop1(n,m,i')   {i < n, i' = i+1}
loop1(n,m,i) = 0                            {i >= n}
loop2(m,j) = 4 + c(D) + loop2(m,j')         {j < m, j' = j+1}
loop2(m,j) = 0                              {j >= m}
Solution: init(n,m) = 2·n·m
- Albert and collaborators / COSTA analyzer
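Unfolding the equations gives the closed form directly; assuming, as the slide's solution suggests, that the per-iteration cost 4 + c(D) is normalized to 2 units (one for the array slot, one for the D instance):
  loop2(m,j) = (m − j)·(4 + c(D))   so loop2(m,0) = m·(4 + c(D))
  loop1(n,m,i) = (n − i)·loop2(m,0)   so loop1(n,m,0) = n·m·(4 + c(D))
  init(n,m) = n·m·(4 + c(D)) = 2·n·m   when 4 + c(D) = 2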
12. BGY05/06
for(i=0; i<n; i++)
  for(j=0; j<i; j++)
    new C();
{0 ≤ i < n, 0 ≤ j < i}: a set of constraints describing an iteration space
Dynamic memory requested =
  number of visits to new statements =
  number of possible variable assignments at its control location =
  number of integer solutions of a predicate constraining variable assignments at its control location (i.e., an invariant)
For linear invariants, # of integer solutions = # of integer points = Ehrhart polynomial
Here: size(C)·(½n² − ½n)
[Figure: the iteration space as the triangle of integer points {0 ≤ j < i < n} in the (i, j) plane]
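A quick way to sanity-check such Ehrhart counts is to enumerate the iteration space for small n and compare against the closed form. A throwaway sketch (the class and method names are mine; run with java -ea to enable the assertions):

import java.util.stream.LongStream;

public class EhrhartCheck {
    // Closed-form count of {0 <= i < n, 0 <= j < i}: (n^2 - n) / 2
    static long closedForm(long n) {
        return (n * n - n) / 2;
    }

    // Brute-force enumeration of the same iteration space
    static long bruteForce(int n) {
        long visits = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < i; j++)
                visits++;   // one visit to the `new C()` site
        return visits;
    }

    public static void main(String[] args) {
        LongStream.rangeClosed(0, 100).forEach(n -> {
            assert bruteForce((int) n) == closedForm(n) : "mismatch at n=" + n;
        });
        System.out.println("closed form matches enumeration for n = 0..100");
    }
}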
13. Outline
• Inferring parametric upper-bounds of heap
memory usage (or live objects)
• Our new compositional approach
• Verification of .NET programs
• Conclusions
15. Example
• How much memory is required to run m0?
void m0(int mc) {
1: m1(mc);
2: B[] m2Arr=m2(2 * mc);
}
void m1(int k) {
3: for (int i = 1; i <= k; i++){
4: A a = new A();
5: B[] dummyArr= m2(i);
}
}
B[] m2(int n) {
6: B[] arrB = new B[n];
7: for (int j = 1; j <= n; j++) {
8: arrB[j-1] = new B();
9: C c = new C();
10: c.value = arrB[j-1];
}
11: return arrB;
}
16. Example
• How much memory is required to run m0?
(Same code as the previous slide.)
[Charts: “ideal” consumption over time for m0(2), y-axis 0..14, and for m0(7), y-axis 0..80; the “ret m1” marker shows memory being released when m1 returns]
17. Our goal
An expression over-approximating the peak amount of memory consumed by a method:
• Parametric
• Easy to evaluate at run time
  E.g.: Required(m)(p1,p2) = 2p2 + p1
• Evaluation cost known “a priori”
Given a method m(p1,...,pn),
peak(m): an expression in terms of p1,...,pn for the max amount of memory consumed by m
18. In a nutshell
How: performing a good approximation of memory requirements using a region-based memory manager (RTSJ)
1. Infer total allocations by counting visits to new statements
2. Compute memory regions using escape analysis
3. Compute peak consumption using a region-based memory manager
19. BGY05/06
Inferring parametric upper-bounds of heap memory usage (or live objects)
I) Computing dynamic memory allocations
Víctor A. Braberman, Diego Garbervetsky, Sergio Yovine: A Static Analysis for Synthesizing Parametric Specifications of Dynamic Memory Consumption. Journal of Object Technology 5(5): 31-58 (2006)
20. Memory requested by a method
void m0(int mc) {
 1: m1(mc);
 2: B[] m2Arr = m2(2 * mc);
}
void m1(int k) {
 3: for (int i = 1; i <= k; i++){
 4:   A a = new A();
 5:   B[] dummyArr = m2(i);
    }
}
B[] m2(int n) {
 6: B[] arrB = new B[n];
 7: for (int j = 1; j <= n; j++) {
 8:   arrB[j-1] = new B();
 9:   C c = new C();
 10:  c.value = arrB[j-1];
    }
 11: return arrB;
}
1: Identify allocation sites. E.g., m0.1.m1.5.m2.8 is the creation site for new B with stack (m0.1.m1.5); m0.2.m2.8 is the same statement with stack (m0.2).
   Key of the technique: manipulate linear expressions, which are easier to handle and less expensive than generating non-linear expressions.
2: Find invariants for creation sites:
   I_m0(m0.1.m1.5.m2.8) = {k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
3: Count the number of solutions (in terms of the MUA parameters):
   #{(k,i,j,n) | k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n} = ½mc² + ½mc
4: Transform the number of visits into memory consumption:
   size(B)·(½mc² + ½mc)
5: Sum up the resulting expressions:
   (size(B[]) + size(B) + size(C))·(½mc² + 5/2·mc) + size(A)·mc
Creation sites reachable from m0:
CS_m0 = {m0.1.m1.4, m0.1.m1.5.m2.6, m0.1.m1.5.m2.8, m0.1.m1.5.m2.9, m0.2.m2.6, m0.2.m2.8, m0.2.m2.9}
21. Memory requested by a method
How much memory (in terms of m0's parameters) is requested/allocated by m0:
totAlloc(m0)(mc) = Σ_{cs ∈ CS_m0} S(m0, cs)
                 = (size(B[]) + size(B) + size(C))·(½mc² + 5/2·mc) + size(A)·mc
                 = 3/2·mc² + 17/2·mc   [considering size(T) = 1 for all types T]
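The last step is plain arithmetic; with every size set to 1:
  3·(½mc² + 5/2·mc) + mc = 3/2·mc² + 15/2·mc + mc = 3/2·mc² + 17/2·mc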
22. Problem
• Memory is released by a garbage collector
– Very difficult to predict when, where, and how many objects are collected
[Charts: ideal consumption of m0(2), y-axis 0..15, and of m0(7), y-axis 0..80]
• Our approach: approximate the GC using a scope-based region memory manager
24. RTSJ (Real-Time Specification for Java)
• Scoped memory management
– Dynamic memory organized in regions associated with particular scopes
• Methods, threads, etc.
• Advantages
– Better time predictability (compared with non-real-time GCs)
– More controlled object allocation and deallocation
• Useful for memory-consumption predictability
• But… you must respect the scoping restrictions
– Potential dangling references
25. Region-based memory management
void m1(int k) {
  SM.enter(Regions.rm1);
  for (int i = 1; i <= k; i++) {
    A a = SM.newInstance(CSs.m1_4, A.class);
    B[] dummyArr = m2(i);
  }
  SM.exit();
}
B[] m2(int n) {
  SM.enter(Regions.rm2);
  B[] arrB = (B[])SM.newAInstance(CSs.m2_6, B.class, n);
  for (int j = 1; j <= n; j++) {
    arrB[j - 1] = (B)SM.newInstance(CSs.m2_8, B.class);
    C c = (C)SM.newInstance(CSs.m2_9, C.class);
    c.value = arrB[j - 1];
  }
  SM.exit();
  return arrB;
}
• Region-based program
[Diagram: region Rm1 holds creation sites m1.4, m2.6, m2.8; region Rm2 holds m2.9]
Regions.rm1 = <"rm1", {CSs.m1_4, CSs.m2_6, CSs.m2_8}>
Regions.rm2 = <"rm2", {CSs.m2_9}>
Diego Garbervetsky, Chaker Nakhli, Sergio Yovine, Hichem Zorgati: Program Instrumentation and Run-Time Analysis of Scoped Memory in Java. Electr. Notes Theor. Comput. Sci. 113: 105-121 (2005)
26. Region-based memory management
• Memory organized using m-regions
void m0(int mc) {
1: m1(mc);
2: B[] m2Arr=m2(2 * mc);
}
void m1(int k) {
3: for (int i = 1; i <= k; i++){
4: A a = new A();
5: B[] dummyArr= m2(i);
}
}
B[] m2(int n) {
6: B[] arrB = new B[n];
7: for (int j = 1; j <= n; j++) {
8: arrB[j-1] = new B();
9: C c = new C();
10: c.value = arrB[j-1];
}
11: return arrB;
}
27. Region-based memory management
• Escape analysis to infer regions
• Escape(m): objects that live beyond m
– Escape(m0) = {}
– Escape(m1) = {}
– Escape(m2) = {m2.6, m2.8}
• Capture(m): objects that do not live longer than m
– Capture(m0) = {m0.2.m2.6, m0.2.m2.8}
– Capture(m1) = {m1.4, m0.1.m1.5.m2.6, m0.1.m1.5.m2.8}
– Capture(m2) = {m2.9}
• Region(m) = Capture(m)
(Same example code as before.)
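A minimal sketch of how Capture can be derived once an escape analysis has produced Escape sets. The set algebra follows the slide's definitions; the class name, method names, and the encoding of creation sites as strings are mine:

import java.util.*;

public class CaptureSketch {
    // allocatedDuring: creation sites reached while m runs;
    // escape: the subset that outlives m, as computed by some escape analysis.
    static Set<String> capture(Set<String> allocatedDuring, Set<String> escape) {
        Set<String> captured = new HashSet<>(allocatedDuring);
        captured.removeAll(escape);   // captured = allocated \ escaping
        return captured;              // Region(m) = Capture(m)
    }

    public static void main(String[] args) {
        // For m2: sites m2.6, m2.8 escape (returned array and its elements), m2.9 does not
        Set<String> allocM2 = new HashSet<>(Arrays.asList("m2.6", "m2.8", "m2.9"));
        Set<String> escM2   = new HashSet<>(Arrays.asList("m2.6", "m2.8"));
        System.out.println(capture(allocM2, escM2));  // prints [m2.9]
    }
}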
28. Region-based memory management
• Memory organized using m-regions
(Same example code as before, with the inferred m-regions highlighted.)
29. Obtaining region sizes
• Region(m) = Capture(m)
• memCap(m): an expression in terms of p1,…,pn for the amount of memory required for the region associated with m
• memCap(m) is totAlloc(m) applied only to captured allocations
• memCap(m0) = (size(B[]) + size(B))·2mc
• memCap(m1) = (size(B[]) + size(B))·(½k² + ½k) + size(A)·k
• memCap(m2) = size(C)·n
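These expressions follow from the counting step restricted to each capture set; for instance, m1 captures its A instances plus the B[] and B allocations escaping each call m2(i) for i = 1..k:
  memCap(m1) = size(A)·k + (size(B[]) + size(B))·Σ_{i=1..k} i
             = size(A)·k + (size(B[]) + size(B))·(½k² + ½k)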
30. Inferring parametric upper-bounds of heap memory usage (or live objects)
III) Approximating peak consumption
- Víctor A. Braberman, Federico Javier Fernández, Diego Garbervetsky, Sergio Yovine: Parametric Prediction of Heap Memory Requirements. ISMM 2008: 141-150
- Philippe Clauss, Federico Javier Fernández, Diego Garbervetsky, Sven Verdoolaege: Symbolic Polynomial Maximization Over Convex Sets and Its Application to Memory Requirement Estimation. IEEE Trans. VLSI Syst. 17(8): 983-996 (2009)
31. Approximating peak consumption
We over-approximate an ideal memory manager using scope-based memory regions
– m-regions: one region per method
• When & where:
– created at the beginning of the method
– destroyed at the end
• How much memory is allocated/deallocated in each region:
– memCap(m) >= actual region size of m for any call context
• How much memory is allocated in outer regions:
– memEsc(m) >= actual memory allocated in the callers' regions
32. Approximating peak(m)
Regions' stack evolution
Some region configurations cannot happen at the same time, e.g., m0.1.m1.m2 and m0.2.m2
[Diagram: evolution of the region stack (rm0, rm1, rm2) along the execution]
peak(m0) = max over reachable region configurations σ of Σ_{r ∈ σ} size(r)
33. Approximating peak(m)
Region sizes may vary according to the method's calling context
rsize(m2) = n   (assume size(C) = 1)
For the call chain m0.1.m1.5.m2 the binding invariant is {k = mc, 1 ≤ i ≤ k, n = i};
{k = mc = n} maximizes it, so
maxrsize(m0.1.m1.5.m2, m0) = mc   (in terms of m0's parameters!)
[Diagram: region stacks for the successive calls to m2 from m1, with rm2 growing up to its maximum]
34. Approximating peak(m)
3. Maximizing instantiated regions
maxrsize(π.m, m0)(P_m0) = maximize rsize(m) subject to I(P_m0, P_m, W)
• The m-region is expressed in terms of m's parameters
– rsize(m2)(n) = n
• A complex non-linear maximization problem
– Maximum according to the calling context and in terms of the MUA parameters, even when parameters are instantiated (at run time)
– maxrsize(m0.1.m1.5.m2, m0)(mc) = mc
– maxrsize(m0.2.m2, m0)(mc) = 2mc
• Too expensive
• Execution time difficult to predict
35. Solving maxrsize
• Solution: an approach based on Bernstein basis over polyhedral domains (Clauss et al. 2004)
– Enables bounding a polynomial over a parametric domain given as a set of linear constraints
– Obtains a parametric solution
• Bernstein(pol, I):
– Input: a polynomial pol and a set of linear (parametric) constraints I
– Returns a set of polynomials (candidates) that bound the maximum value of pol in the domain given by I
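A tiny worked instance (my example, not one from the talk): on a 1-D parametric domain the Bernstein coefficients themselves bound the polynomial. Take pol(n) = n on the domain I = {1 ≤ n ≤ mc}, for mc > 1. Writing n in the degree-1 Bernstein basis over [1, mc]:
  n = 1·B_0(n) + mc·B_1(n),  with B_0(n) = (mc − n)/(mc − 1), B_1(n) = (n − 1)/(mc − 1)
Since B_0, B_1 ≥ 0 and B_0 + B_1 = 1 on the domain, max pol ≤ max{1, mc} = mc, which matches maxrsize(m0.1.m1.5.m2, m0)(mc) = mc from the previous slide.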
36. maxrsize
maxrsize(m0, π.mk) = max{q(P_m0) ∈ C_1} if D_1(P_m0)
                     …
                     max{q(P_m0) ∈ C_k} if D_k(P_m0)
where {C_i, D_i} = Bernstein(rsize(mk), I_{π.mk}, P_m0)
• maxrsize(m0, m0)(mc) = (size(B[]) + size(B))·2mc
• maxrsize(m0.1.m1, m0)(mc) = (size(B[]) + size(B))·(½mc² + ½mc) + size(A)·mc
• maxrsize(m0.1.m1.5.m2, m0)(mc) = size(C)·mc
• maxrsize(m0.2.m2, m0)(mc) = size(C)·2mc
37. Approximating peak(m)
We consider the largest region for the same calling context:
peak(m0)(mc) ≤ max over call chains m0.m1…mk of Σ_{i=0..k} maxrsize(m0…mi, m0)(mc) = mem(m0)
[Diagram: candidate region stacks built from maxrm0, maxrm1 (via m0.1.m1) and maxrm2 (via m0.1.m1.5.m2 and m0.2); mem(m0) is the maximum over these stacks]
38. Dynamic memory required to run a method
memReq_m0(mc) = mc² + 7mc
[Chart: evolution of regions M0, M1, M2 and the “ideal” consumption along the trace of memRq(4): init, start m0, call m1, repeated call/ret of m2, ret m1, call m2, ret m2, end m0]
39. Inferring parametric upper-bounds of heap memory usage (or live objects)
Tool support
Diego Garbervetsky, Sergio Yovine, Víctor A. Braberman, Martín Rouaux, Alejandro Taboada: Quantitative dynamic-memory analysis for Java. Concurrency and Computation: Practice and Experience 23(14): 1665-1678 (2011)
43. Inferring total allocations
Counting:
• Ensures that variables concerning visits to statements are considered in the counting (to ensure soundness)
• Tries to compute a minimal set of variables (to filter out irrelevant variables, for better precision)
44. Inferring regions
Our tool:
1. Automatic inference of m-regions
– Using escape analysis
2. Translation to region-based bytecode
– RC (regions library)
– RTSJ
– JikesRVM
45. Refining memory regions
• Escape analysis over-approximates object lifetime (to be safe)
– It may impact the memory regions
• JScoper: a tool for region edition and visualization
– Call-graph visualization
– Region edition
– Interfacing with escape analysis
– Region-based code generation
– Region-based memory-manager simulator
Andrés Ferrari, Diego Garbervetsky, Víctor A. Braberman, Pablo Listingart, Sergio Yovine: JScoper: Eclipse support for research on scoping and instrumentation for real time Java applications. ETX 2005: 50-54
46. Peak memory computation component
• Non-linear maximization problem solved using an approach based on Bernstein basis over polyhedral domains (Clauss et al. 2004-2009)
• Enables bounding a polynomial over a parametric domain given as a set of linear constraints
• Yields a set of candidate polynomials
48. Limitations
• For loop-intensive programs the tool performs very well
– Not well suited for memory-allocating recursion
• Imprecision comes mainly from:
– Escape analysis
– Program invariants
– Inductive-variable analysis
– Approximations of maximum region sizes
49. More limitations
• The global approach makes the analysis and tuning of results a hard task
– Too many variables, parameters, and bindings
• Affects scalability and usability
• Sometimes it is necessary to provide bounds manually
– Recursion, non-analyzable methods, code that is easy to understand but has non-linear invariants
51. Why?
• Scalability
– The complexity of symbolic-manipulation algorithms heavily depends on the number of involved variables
• Usability
– Manual inspection and tuning of program invariants is much easier when dealing with local invariants
52. More reasons
• Dealing with non-analyzable methods
– User-provided annotations (applies also to mutually recursive components)
• Enables the use of other counting mechanisms
• Ability to analyze program fragments
• Better support for polymorphism
53. Compositional analysis = method summaries
B[] m2(int n) {
 6: B[] arrB = new B[n];
 7: for (int j = 1; j <= n; j++) {
 8:   arrB[j-1] = new B();
 9:   C c = new C();
 10:  c.value = arrB[j-1];
    }
 11: return arrB;
}
Summary: MR_m2(n) = 3n
 6: n (new B[n]); 8: n (new B(), count {j = 1..n} = n); 9: n (new C(), count {j = 1..n} = n)
void m1(int k) {
 3: for (int i = 1; i <= k; i++){
 4:   A a = new A();
 5:   B[] dummyArr = m2(i);
    }
}
Summary: MR_m1(k) = 3/2k² + 5/2k
 4: k (new A(), count {i = 1..k} = k)
 5: call to m2(i) with {i = 1..k}: sum{i=1..k} 3i = 3·k(k+1)/2 = 3/2(k² + k)   (symbolic operation on polynomials)
void m0(int mc) {
 1: m1(mc);
 2: B[] m2Arr = m2(2 * mc);
}
Summary: MR_m0(mc) = 3/2mc² + 17/2mc
 1: call to m1(mc) = 3/2mc² + 5/2mc
 2: call to m2(2*mc) = 3·(2mc) = 6mc
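To make the bottom-up composition concrete, here is a throwaway numeric sketch (the class and method names are mine) that evaluates the summaries and checks them against the closed forms from the slide; run with java -ea:

public class SummaryCheck {
    // Summary of m2: MR_m2(n) = 3n (one array slot, one B, one C per iteration; sizes = 1)
    static long mrM2(long n) { return 3 * n; }

    // Summary of m1: k allocations of A plus sum_{i=1..k} MR_m2(i)
    static long mrM1(long k) {
        long sum = k;                         // the k instances of A
        for (long i = 1; i <= k; i++) sum += mrM2(i);
        return sum;                           // closed form: 3/2 k^2 + 5/2 k
    }

    // Summary of m0: MR_m1(mc) + MR_m2(2*mc)
    static long mrM0(long mc) { return mrM1(mc) + mrM2(2 * mc); }

    public static void main(String[] args) {
        for (long mc = 0; mc <= 50; mc++) {
            assert 2 * mrM1(mc) == 3 * mc * mc + 5 * mc;    // 3/2 mc^2 + 5/2 mc
            assert 2 * mrM0(mc) == 3 * mc * mc + 17 * mc;   // 3/2 mc^2 + 17/2 mc
        }
        System.out.println("summaries match the closed forms for mc = 0..50");
    }
}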
54. Challenge
Do not lose too much precision!!
Compositional: 3/2mc² + 17/2mc vs. global: mc² + 7mc
• Specification of memory reclaiming in a compositional fashion
• Symbolic manipulation of summaries
– Maximize polynomials over iteration spaces
– Sum polynomials over iteration spaces
56. Modeling object reclaiming
B[] m2(int n) {
 6: B[] arrB = new B[n];
 7: for (int j = 1; j <= n; j++) {
 8:   arrB[j-1] = new B();
 9:   C c = new C();
 10:  c.value = arrB[j-1];
    }
 11: return arrB;
}
• 6 and 8 live longer than m2: a responsibility of the caller
• 9 can be safely collected when m2 finishes
Idea: enrich summaries in order to distinguish escaping objects from captured objects
• 6 and 8 are escaping (residual); 9 is auxiliary
57. Compositional analysis (simplified)
B[] m2(int n) {
 6: B[] arrB = new B[n];
 7: for (int j = 1; j <= n; j++) {
 8:   arrB[j-1] = new B();
 9:   C c = new C();
 10:  c.value = arrB[j-1];
    }
 11: return arrB;
}
Summary: MR_m2(n) = 3n, ME_m2(n) = 2n   (MT = 1)
 6, 8: n + n = 2n (escape); 9: n (does not escape)
Escaping memory is accumulative!
void m1(int k) {
 3: for (int i = 1; i <= k; i++){
 4:   A a = new A();
 5:   B[] dummyArr = m2(i);
    }
}
Summary: MR_m1(k) = k² + 3k, ME_m1(k) = 0
 4: k (does not escape)
 5: call to m2(i) with {i = 1..k}:
    sum{i=1..k} ME_m2(i) = 2·k(k+1)/2 = k² + k
    max{i=1..k} (MR − ME)_m2(i) = k
 MR_m1 = k + (k² + k) + k
void m0(int mc) {
 1: m1(mc);
 2: B[] m2Arr = m2(2 * mc);
}
Summary: MR_m0(mc) = mc² + 7mc, ME_m0(mc) = 0
 1: call to m1(mc) = mc² + 3mc
 2: call to m2(2*mc): from ME_m2, 2·(2mc) = 4mc; from (MR − ME)_m2, 2mc
 MR_m0 = max(mc² + 3mc, 2mc) + 4mc
(symbolic operations on polynomials)
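Spelling out the arithmetic (all sizes 1, as in the slide):
  escaping part of m1: Σ_{i=1..k} ME_m2(i) = Σ_{i=1..k} 2i = k(k+1) = k² + k
  peak of the collectable part: max_{i=1..k} (MR_m2 − ME_m2)(i) = max_{i=1..k} i = k
  MR_m1(k) = k (the A instances) + (k² + k) + k = k² + 3k
  MR_m0(mc) = max(mc² + 3mc, 2mc) + 4mc = mc² + 7mc   (for mc ≥ 0)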
58. Precision vs. granularity
• How can we help the caller identify whether escaping objects can be collected?
• Subheap: a set of objects that have similar lifetimes
– Subheap descriptor: an identifier for a subheap
• Esc can be organized in terms of subheaps
public B two() {
 1: this.g = new A();
 2: b = new B();
 3: b.f = new C();
 4: return b;
}
• Esc(all) = 3
• Esc({This}) = 1
• Esc({Ret}) = 2
• Or, refining: Esc({Ret}) = 1, Esc({Ret_f}) = 1
59. Subheap descriptors and Esc analysis
public B two() {
 1: this.g = new A();
 2: b = new B();
 3: b.f = new C();
 4: return b;
}
Salcianu's points-to graph: sh descriptor = inside node
• Esc({1}) = 1, Esc({2}) = 1, Esc({3}) = 1
Steensgaard-like equivalences: sh descriptor = equivalence class, e.g. {b~2~3~ret}, {this~1}
• Esc(ret) = 2, Esc(this) = 1
[Diagram: points-to graph with inside nodes 1, 2, 3 and fields f, g reachable from this and b]
60. From Code Contracts to Memory Contracts
static public int GCD(int x, int y) {
Contract.Requires(x > 0 && y > 0);
Contract.Ensures(Contract.Result<int>()>0);
while (true) {
if (x < y){
y %= x;
if (y == 0) return x;
} else {
x %= y;
if (x == 0) return y;
}
}
}
• Code Contracts: pre/postconditions, loop and object invariants
– Runtime checking
– Static checking using an abstract interpreter
• Automatic inference of loop invariants!!
61. From Code Contracts to Memory Contracts
• Memory contracts
– Contract.Memory.X / X = {MemReq, RSD, TMP}
• Annotations for specifying object lifetime
• DestRsd: whether an allocation escapes through a subheap
• AddRsd: determines the destination of a callee subheap in the caller
• BindRsd: associates a subheap name with an expression in code
62. [Annotated screenshot]
• DestTmp() indicates the next object is temporary: logger is a temporary object since it can be collected when the method finishes its execution
• Rsd(sh,n) (or Esc) specifies that the number of objects escaping through sh is at most n
• node is a residual object because its lifetime exceeds that of the method that creates it
• DestRsd(sh) indicates the next object is escaping, tagged with sh
• Squiggles mean errors
63. How annotations are checked?
• Code is instrumented to include counters
– Each subheap has its own counter
– Annotations determine which counter should be incremented
• Lifetime annotations are checked using a points-to and escape analysis
Jonathan Tapicer, Diego Garbervetsky, Martín Rouaux: "Resource Usage Contracts for .NET", TOPI 2011: 1st Workshop on Developing Tools as Plug-ins, 2011
64. How annotations are checked? (simplified)
Annotated code:
B[] m2(int n) {
  Contract.Memory.Rsd(ret, 2n)
  Contract.Memory.MR(3n)
  Contract.Memory.DestRsd();
 6: B[] arrB = new B[n];
 7: for (int j = 1; j <= n; j++) {
    Contract.Memory.DestRsd();
 8:   arrB[j-1] = new B();
 9:   C c = new C();
 10:  c.value = arrB[j-1];
    }
 11: return arrB;
}
Instrumented code:
B[] m2(int n) {
  Contract.Ensures(ret_m2 <= 2n)
  Contract.Ensures(mr <= 3n)
  ret_m2 += n; mr += n;
 6: B[] arrB = new B[n];
 7: for (int j = 1; j <= n; j++) {
    ret_m2++; mr++;
 8:   arrB[j-1] = new B();
    mr++;
 9:   C c = new C();
 10:  c.value = arrB[j-1];
    }
 11: return arrB;
}
• We leverage the automatic invariant inference of Clousot (the Code Contracts static checker) to prove this automatically
65. How annotations are checked? (problem)
Summary of m2:
B[] m2(int n) {
  Contract.Memory.Rsd(Ret, 2n)
  Contract.Memory.MR(3n)
  …
}
Annotated code:
void m1(int k) {
  Contract.Memory.Rsd(this, k*k+k)
  Contract.Memory.MR(k*k+3k)
 3: for (int i = 1; i <= k; i++){
 4:   A a = new A();
      Contract.Memory.AddRsd(This, Ret);
 5:   this.f[i] = m2(i);
    }
}
Instrumented code:
void m1(int k) {
  Contract.Ensures(this_m1 <= k*k+k)
  Contract.Ensures(mr <= k*k+3k)
 3: for (int i = 1; i <= k; i++){
    mr++;
 4:   A a = new A();
    this_m1 += 2i; mr += 2i;
    mr += max{1,i};
 5:   this.f[i] = m2(i);
    }
}
• To prove this_m1 <= k*k+k we need a non-linear invariant!
• Beyond the capabilities of Clousot
66. How annotations are checked?
Good news!
• We know how to count/sum solutions to iteration spaces
• We just need linear invariants
void m1(int k) {
  Contract.Ensures(this_m1 <= k*k+k)   // inferred by Clousot
  Contract.Ensures(mr <= k*k+3k)
 3: for (int i = 1; i <= k; i++){
    Contract.itSpace(1 <= i <= k)
    mr++;
 4:   A a = new A();
    loop = sum(itSpace, 2i);           // computed using the symbolic calculator
    mr += max{1,i};
 5:   this.f[i] = m2(i);
    }
  Contract.Assume(loop <= k*k+k)       // we force the checker to accept the bound
  this_m1 += loop; mr += loop;
}
68. Polymorphism
class A1 extends A {
  public int do(int n) { .. }          // ME_A1.do = 2
}
class A2 extends A {
  public int do(int n) { .. }          // ME_A2.do = n
}
public void test(List l) {
  foreach(e : l) {
    if (cond) a = new A1();
    else a = new A2();
    a.do(e);
  }
}
sum{i=1..size(l)} (max{ME_A1.do(l[i]), ME_A2.do(l[i])})
• Hard to solve symbolically…
– Need to over-approximate the max operation
69. Polymorphism
public void test(List l) {
  if (cond) a = new A1();
  else a = new A2();
  foreach(e : l) {
    a.do(e);
  }
}
• The object remains the same in all iterations
loop_A1 = sum{i=1..size(l)} ME_A1.do(l[i])
loop_A2 = sum{i=1..size(l)} ME_A2.do(l[i])
max(loop_A1, loop_A2)
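Why hoisting the receiver out of the loop is sound (and usually tighter): for any non-negative sequences a_i, b_i,
  max(Σ_i a_i, Σ_i b_i) ≤ Σ_i max(a_i, b_i)
The left side is this slide's max(loop_A1, loop_A2) = max(2·size(l), Σ_i l[i]); the right side is slide 68's per-iteration max. The inversion is only valid because the receiver cannot change inside the loop.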
70. About flow sensitivity
public void m1_m2() {          public void m1_m2() {
  m1();                          m2();
  m2();                          m1();
}                              }
[Diagram: the combined peak of m1_m2 is built from M1 then M2 in the first version, and M2 then M1 in the second]
It is more complicated with loops and conditionals…
public void for_m1_m2(int n) {        public void for_m1_m2(int n) {
  for(int i = 0; i < n; i++) {          for(int i = 0; i < n; i++) {
    m1(i);                                if (i%2) m1(i);
    m2(i);                                m2(i);
  }                                     }
}                                     }
[Sequences of active summaries: M1,M2,M1,M2,… vs. M1,M2 / M2,M1 / M2,M2,…]
71. From objects to actual memory consumption
• Consider all objects
– Model VM behavior
– VM memory vs. system memory
• Improve lifetime inference
– Reachable objects, live objects
– (parametric in the GC)
• Consumption patterns
– Lazy initialization, global fields that are constantly overwritten
• Specification language (independent of the EA)
– For non-analyzable code, interfaces
72. Current + future work
• Improving the interprocedural inference analysis
– “Pluggable” object-lifetime analysis
– Support for recursive method invocations
• Better invariant inference
– Tool support for annotations / checking
• A new specification language
73. Current + future work
• Experimenting with static verification of memory consumption
– Code Contracts, SMT solver + barvinok library
• An intermediate language (a sort of Boogie)
– With translations to Java, C#, C, etc.
• Inference for new memory models
– ISMM 2011: Short-term memory for self-collecting mutators
74. Conclusions
• The analysis of memory requirements is feasible (but very difficult…)
• We (and other groups) have made good progress
• We need to improve in order to analyze real programs
– Compute actual consumption
– Seriously improve scalability
• We believe compositional approaches are a promising direction
75. Credits
• Víctor Braberman
• Sergio Yovine
• Philippe Clauss
• Samuel Hym
• Daniel Gorin
• Federico Fernández
• Matías Grumberg
• Javier Tapicer
• Martín Rouaux
• Andrés Ferrari
• Alejandro Taboada
• Guillaume Salagnac
Editor's Notes
Nowadays we can see more and more embedded systems and mobile computers surrounding us. Even though they are becoming powerful devices, they have limited resources: limited battery life, limited memory, and they usually communicate with other computers, incurring communication costs. So it is becoming more and more necessary to understand and control the consumption of these resources. Even in other settings, like computer farms or clusters, it is vital for the business to make proper use of them.
In this talk the focus is on understanding memory consumption in Java-like languages, which provide automatic memory management that takes care of the deallocation of unused objects. These kinds of features make program development easier and less error-prone. However, in this setting, where objects are allocated dynamically, predicting memory utilization is very difficult. In fact, just predicting memory allocations is hard, indeed undecidable: it is a problem similar to the halting problem, assuming that every program statement consumes some amount of memory. Even harder is predicting actual memory requirements, taking into account that memory can be reused once unused objects are collected by the GC.
Even though we know it is not the complete solution, we believe that being able to understand things like the number of object allocations and the maximum number of live objects is one of the foundations for understanding real memory consumption. In some systems (like some region-based memory managers) this information is closer to what we need to produce real bounds.
In this work we do not handle recursive programs and do not generate bounds on the size of the stack. There are nice works like the… and for functional languages. Stack allocation: Chin Wei Ngan, Corneliu Popeea and colleagues from Singapore.
Consider this example that returns a matrix filled with objects of type D. It allocates an array of n times m references and n times m objects of type D. I will consider arrays not as one object but as a collection of many references to objects. Taking this into account, this method allocates 2 times n times m objects.
Suppose we want to verify this program, which simply specifies the number of objects allocated. We will need to generate the verification condition in order to prove that the program satisfies the spec. Since the program has loops, we will need at least to provide loop invariants describing how memory evolves in order to prove the spec.
In general, if we want to compute actual memory requirements, we will need to include several annotations to explain object lifetime and the shape of the data structures.
Related to symbolic complexity computation.
Requires computing non-linear invariants, like polynomial invariants, which requires the definition of a complex lattice; this tends to be expensive in terms of computational cost, or may need widening operators. More recent works (2011) show improvements in this topic.
Thus, look at the following code. It basically allocates several objects in the bodies of two loops. For instance, when method m0 calls m1, it allocates k objects of type A and then makes several calls to method m2, which allocates n objects of type B and C and an array of type B. Later, m0 calls m2 again. The pictures on the right show the memory consumption of two different executions of the same program using a sort of “ideal” GC which releases objects when they are no longer reachable. The amount of memory required for this program depends on the parameter “mc”, which is m0's parameter. Notice that not only the peak memory consumption of the program changes when mc changes; the place where this peak is reached also changes according to the calling context.
Our goal is to obtain an expression that over-approximates peak memory consumption. We want the expression to be in terms of the parameters of the method under analysis. We also want the expression to be easy to evaluate; that means the evaluation cost has to be low, or at least known beforehand. In this work we propose a technique to obtain such an expression. That is, given a method m, we obtain a parametric expression over-approximating the maximum amount of memory consumed by any run starting at m.
Then, to obtain the certificates that approximate the requested memory, we apply the following algorithm…
Remember that creation sites are paths that may involve several methods, so the invariants should predicate over the variables in those methods. Let's take a look at our running example. The first case is the creation site representing m0 calling m1 and creating an object of type A inside the loop. There, we have the following invariant representing the binding in the call from m0 to m1 and the iteration space of the loop where objects are created. The second case is when m0 calls m1 and m1 calls m2.
Recall that to count visits to statements we basically use invariants. However, as invariants only constrain variables, they lose information about the call-stack configuration, and the call stack is relevant for counting visits. For example: examining the example's call graph, we immediately observe that, from a static view, method m2 is called at least twice, so its allocation sites will be executed twice. To cope with this issue, we decided to distinguish program locations not only by their local control location, but also by their history, that is, the call chain that leads to that allocation site. We introduce the notion of a creation site: a path from the method under analysis to a new statement reachable from that method. The creation site denotes not only the new statement but also a control projection of the call stack when reaching this statement. For example, this is the set of creation sites for the program assuming that we are analyzing method m0. Let's take a quick tour of the parts of the algorithm. The first thing we do is identify the points where objects are created. An important aspect is that our technique distinguishes allocation points not only by the place in the method where the new is executed, but by the whole path from the method under analysis to that point. In the example, we can distinguish the objects created along the branch m0,m1,m2 from those of the branch m0,m2. We call the allocation points extended with the calling context creation sites.
Now we are ready to count the number of visits to a creation site. For example, we want an expression in terms of method m0's parameters (that is, the variable mc) that over-approximates the number of visits to this statement for the following stack configuration. To solve that, we take the invariant for that creation site and count the number of possible variable assignments satisfying it, assuming that the value of mc (the m0 parameter) is fixed. Assuming the invariant can be expressed as a polyhedron, using the technique I mentioned before, we can get a polynomial expressing the number of solutions of this formula. In this case k has only one possible value, “mc”, and n is determined by i; but i ranges between 1 and k, and j between 1 and i. So the number of possible solutions is the following expression. Note that counting the number of solutions of an invariant for a creation site cs = π.l over-approximates the number of visits to the new statement at l when the program stack is π.
We know how to approximate the number of visits to a statement that allocates memory. Now we have to take into account the size of the allocated objects. Following the same example, we already know the number of visits. To get the desired expression, we just multiply it by a symbolic expression that denotes the runtime size of the allocated object. (Optional) For array instantiations, we treat them as a set of nested loops that create single-instance objects (of the size of a reference to the type of the objects contained in the array). We do that by adding new constraints to the invariant describing these loops' iteration spaces.
For instance, we can synthesize the following expression approximating the total amount of dynamic memory allocated by m0. However, this expression is too conservative if we want to predict actual memory requirements, since object deallocation is not taken into account.
The problem is that it is difficult to predict when objects are actually released by the GC, and it is even more complicated to know how many. Thus, our approach is to approximate an ideal GC using a more coarse-grained one based on memory regions.
The Real-Time Specification for Java proposes a memory-management mechanism based on scoped memories, where objects are organized in regions associated with a particular scope. This approach allows better time predictability, compared with GCs, and more control over the way objects are allocated and deallocated.
Let's quickly see how a program works using the API.
Let's go back to our example. The picture shows a graph that approximates the references between the objects created by this program. Each node in this graph represents all the objects that may be created at that point. There are as many nodes as creation sites.
Thus, to synthesize memory regions we use escape analysis as a way to approximate object lifetimes. Basically, the region of each method is composed of the creation sites that do not escape the method but escape from some of its callees.
In this case we define a memory manager with one region associated with each method, so the lifetime of objects in a region is directly tied to the lifetime of the associated method. For instance, for our example we can build this region organization. Boxes represent creation sites, which are an abstraction of all objects created at a program location. Light green boxes correspond to objects created in method m2 along the call chain that passes through m1, and the dark green ones correspond to objects from m2 when it is called directly from m0. To respect scoping rules, an object has to be allocated in a region whose lifetime is equal to or longer than the object's lifetime. In this case, objects refer to objects in the same region or to objects with a longer lifetime.
A method's associated regions can be synthesized using escape analysis. Thus, we can adapt our technique for computing total allocations to consider only the creation sites captured by the method under analysis. Using this approach we are able to obtain parametric expressions that over-approximate region sizes.
OK… we can model a memory manager using scope-based regions, synthesize memory regions, and compute their sizes. In fact, using similar reasoning, we can also compute the amount of memory escaping a method. We will use all this information to model peak memory consumption in this setting…
Suppose we want to analyze the peak consumption of a method m. When m is executed, a region for m will be created, and also a region for each method that m calls. During the execution of those methods, some objects will be allocated in one of the newly created regions or, if their lifetime exceeds method m's lifetime, in a pre-existing region. So, when computing the peak consumption for m, we distinguish the maximum consumption produced in newly created regions from the consumption produced in already existing regions. With that approach, we focus on a technique to over-approximate the peak amount of memory consumed by newly created regions; the consumption in the pre-existing regions can be approximated using the estimation of the amount of memory escaping the method under analysis (memEsc(m)).
So what happens with the regions created by the method under analysis? This is how memory regions evolve following the execution of the methods. Observe that the maximum number of active regions is bounded by the length of the longest path in the call graph. Thus, the maximum consumption of newly created regions can be computed by taking the region configuration whose sum of region sizes is the largest. The problem is that the number of configurations can be infinite… and of course every region size depends on the state of the program when the associated method is called.
Notice that we have bounds on regions in terms of their associated method's parameters, and region sizes evolve according to their calling context. For instance, we can analyze the evolution of the regions for m2 when it is called from m1. The size of m2's region is actually bounded by the maximum value the loop in m1 can reach, which depends on the parameter used when m1 is invoked. All this is expressed in the invariant that models the calling context. We can approximate a set of region configurations within a calling context by considering the one with the largest region size. We denote this maximum “maxrsize”. In this case the largest region for m2 is produced when n is equal to mc. We want to express these expressions in terms of the root method's parameters in order to be able to compare them with other region sizes later.
Thus, the most important problem now is computing the largest region size for a given calling context. In particular, we need an expression representing the largest size a region for a method m can reach when called from the method under analysis (in this case m0) in a particular calling context given by the call chain leading to m. The calling context is given by a binding invariant, and the region size is a function of its associated method's parameters. Remember that we already have a method to approximate region sizes; we can use it or provide these specifications by other means. Since our technique provides polynomials in terms of the associated method's parameters, we are dealing with a non-linear optimization problem. Non-linear optimization problems are hard to solve and their execution time is difficult to predict, so we cannot solve this problem at run time; we need to solve it at compile time. This makes the problem even more complicated, because we need a parametric solution that can be instantiated when parameter values become known at run time.
Luckily, there is a solution to this problem, due to Clauss, which extends an original technique of Bernstein for bounding polynomials. Clauss's technique receives a polynomial and a domain expressed as a polytope, and produces a set of polynomials which are candidates for the maximum and minimum bounds of the input polynomial within that domain.
Thus, if we can describe binding invariants using linear constraints and region sizes using polynomials, we can use this technique to compute maxrsize. For our example we can produce the following expressions, which are in terms of method m0's parameters even though region sizes are expressed in terms of their own method's parameters. The whole problem is now reduced to max comparisons between polynomials over the same parameters. This variable renaming can be performed thanks to the Bernstein transformation and because the invariant binds the parameters of the method under analysis with the parameters of the method for which we want to perform the maximization.
For instance, consider the following polynomial, domain, and set of parameters. Bernstein applied to this polynomial returns the following two sets of candidates, determined by a parametric domain. Notice that the result is in terms of the selected parameters. This is a remarkable feature of this symbolic technique: the result is a parametric expression which doesn't have to be in terms of the parameters of the input polynomial. However, there are still some problems: the output is not a simple expression; it is a set of candidate polynomials that need to be evaluated to determine which is the largest. This set can be reduced by applying some symbolic techniques, or it can be done directly at run time. In any case the evaluation cost is known “a priori”.
Once we approximate the region sizes using maxrsize, for non-recursive programs we obtain a finite set of region stacks. Thus, we just need to compute the maximum among these configurations.
So far we approximate the peak consumption by considering the sum of the largest regions of every region configuration among the possible call chains starting from the method under analysis. Notice that is not n
Instead of comparing the region configurations for all paths in the program, we can compute the same maximum recursively by traversing the call graph and generating an evaluation tree whose leaves are non-linear maximization problems and whose nodes are sum or max operations. Max is used for branches in the call graph, and sums when we go deeper to generate a region configuration.
We can try to simplify the tree offline until we get an expression that we cannot, or don't want to, simplify further. This evaluation tree can then be translated into code that is executed when the parameters become available.
For instance, for our example we can compute the following expression for the peak consumption of method m0. If we analyze the peak consumption using a sort of ideal GC, we see that we are over-approximating. Nevertheless, once we assume our region-based memory manager, the obtained expression is quite accurate.
We integrated this technique into a tool that computes total allocations and region sizes, and also generates region-based code out of conventional Java code by synthesizing the memory regions using escape analysis.
These are the components of the tool performing this task.
In particular, an important part is the one taking care of computing the variables which are actually relevant in terms of the number of visits.
The regions are inferred automatically using escape analysis and a tool that allows manual editing. Once we have the regions, we automatically generate new code that has the same functionality but uses the regions. In our case the code uses an ad-hoc region-based memory manager, but, for example, another student generates code for RTSJ using the same idea.
So, we were able to compute the peak consumption for the application, but maybe we can do better. In particular, the computed expression strongly depends on the strategy used to allocate objects in regions. Since escape analysis is sometimes too coarse when approximating object lifetimes, it can lead to coarser regions. Thus, we provide a mechanism to manually refine the regions. We call this tool JScoper; it allows region refinement and also the generation of region-based code out of standard Java code. We also developed a tool that assists in the process of generating region-based programs. It can automatically convert a conventional Java program into a region-based one. To do so, it draws on region information that it can obtain by connecting to an escape-analysis tool or by allowing manual region editing. The tool is integrated into the Eclipse IDE, also allowing the debugging of applications using a region simulator.
Let me show you a detailed view of the components that implement the technique. The most important component is the one responsible for solving the non-linear maximization problem. For that we use a technique based on the Bernstein transformation which, given a polynomial and a restriction over its parameters, yields a set of candidate polynomials bounding the maximum subject to the restriction. The technique relies on having a call graph of the application, to generate the potential memory-region configurations, and a component that provides invariants, which are used as binding invariants when we model the non-linear maximization problem of obtaining the largest region sizes.
Too conservative for recursive methods or recursive data structures whose values affect future memory allocations.
The complexity of symbolic tools depends on the number of variables, and debugging invariants is hard when looking at global states.
To overcome the aforementioned problems we propose the use of a compositional analysis
Moreover, it enables the analysis of non-analyzable methods by providing summaries, and even of some fragments, by using summaries as stubs.
We start by analyzing the methods in a bottom-up fashion. We analyze each allocation using only local invariants or some counting mechanism, and generate a summary. When a method invokes another method, we use the summary. Notice we need a symbolic calculator capable of supporting arithmetic operations on polynomials within an iteration space.
The challenge here is to follow a compositional approach without losing too much precision. This implies we need some means to include certain lifetime information in the summaries. We also need a symbolic calculator to deal with non-linear constraints such as polynomials.
So now we take the example and include not only the amount of live objects required but also how many of them actually escape the method. Then we analyze the invocation of method m2 in the loop of m1: for each iteration we sum only the objects that escape. For the remaining objects we know they can be collected, so we only consider the iteration that consumes the most. A similar idea is applied to m0: the summary of m1 says that no object escapes, so we can assume that after the call to m1 that memory can be recovered and reused for the requirements of m2. So the consumption of m0 is solved by computing a max between the summaries of m1 and m2. (Should I mention this?) Actually we add the escaping objects from m2 because we are flow insensitive.
OK, I lied a little bit. In order to be more precise, we need to help the caller understand whether some objects escaping from the callee escape the caller's own scope or not. One way we propose to deal with this is by grouping sets of objects according to their expected lifetime. For instance, we can say that all objects escape, or that one object escapes through this and two through the return, or even split that subheap to give more detailed information.
In our analysis we treat the escape analysis as an oracle, and a subheap descriptor can actually be any element that can be mapped to the EA. Example: using Salcianu's PTG, a subheap descriptor can be what they call an inside node, which represents all objects allocated at a program point. Other analyses can use completely different descriptors, like an element of the equivalence classes representing possibly connected objects and references.
Squiggles mean errors: for instance, the method consumes more memory than specified, or an object is declared as non-escaping but we cannot prove it because the analysis says it escapes. (Remove the part about temporary.)
So, how do we check the annotations? Essentially by transforming them into Code Contracts assertions, and by checking lifetime annotations using a points-to and escape analysis.
The idea is to include a counter for each subheap descriptor that is updated at allocations.
For method invocations we use the information from the summary and add the amount to the subheap indicated in the annotation. The problem is that in order to prove the ensures clause we need non-linear arithmetic, which is beyond Clousot.
What we do to overcome this is use our inference technique. We only require linear invariants, which can be inferred or checked by Clousot.
If we think of Tmp as the complement of escaping, we should say 0 even though only one object is escaping; this semantics means that Tmp is a lower bound, which is complex to compute using counting. If we think of it as an over-approximation, we are saying that it is an upper bound.
Regarding polymorphism, in the case of a virtual call we should compute a summary representing all callees. It can be tricky when you have different subheaps, but it is essentially computing a symbolic maximum between polynomials and then applying the symbolic sum to the result. The problem is that the max operation sometimes cannot be resolved completely and cannot be expressed as a polynomial; in these cases it doesn't combine well with the sum operator.
What we do is try to discover whether the receiver actually changes during the loop. If it doesn't change, we can invert the operations: apply the sum for each receiver and then apply the max.
At some point I might say the analysis is flow insensitive. Actually, the analysis would be much more precise if we considered the flow of operations. For instance, having…. The problem is that in some situations it is not very easy to determine the order of execution statically.