AACIMP 2010 Summer School lecture by Leonidas Sakalauskas. "Applied Mathematics" stream. "Stochastic Programming and Applications" course. Part 3.
More info at http://summerschool.ssa.org.ua
1. Lecture 3
Stochastic Differentiation
Leonidas Sakalauskas
Institute of Mathematics and Informatics
Vilnius, Lithuania <sakal@ktl.mii.lt>
EURO Working Group on Continuous Optimization
2. Content
Concept of stochastic gradient
Analytical differentiation of expectation
Differentiation of the objective function of two-stage SLP
Finite difference approach
Simultaneous perturbation stochastic approximation
Likelihood approach
Differentiation of integrals given by inclusion
Simulation of stochastic gradient
Projection of Stochastic Gradient
3. Expected objective function
Stochastic programming deals with objective and/or constraint functions defined as the expectation of a random function:

F(x) = E f(x, ξ),   f : Rⁿ × Ω → R,

where ξ is an elementary event in the probability space (Ω, Σ, Pₓ), and Pₓ is the measure defined by the probability density function p : Rⁿ × Rⁿ → R.
4. Concept of stochastic gradient
The methods of nonlinear stochastic
programming are built using the concept of
stochastic gradient.
The stochastic gradient of the function F(x) is a random vector g(x, ξ) such that:

E g(x, ξ) = ∂F(x)/∂x
5. Methods of stochastic differentiation
Several estimators have been examined for the stochastic gradient:
Analytical approach (AA);
Finite difference approach (FD);
Likelihood ratio approach (LR);
Simultaneous perturbation stochastic approximation (SPSA).
6. Stochastic gradient:
an analytical approach
F(x) = ∫_{Rⁿ} f(x, z) p(x, z) dz

∂F(x)/∂x = ∫_{Rⁿ} [ ∇ₓ f(x, z) + f(x, z) ∇ₓ ln p(x, z) ] p(x, z) dz

g(x, ξ) = ∇ₓ f(x, ξ) + f(x, ξ) ∇ₓ ln p(x, ξ)
7. Analytical approach (AA)
Assume that the density of the random variable does not depend on the decision variable x. Then the analytical stochastic gradient coincides with the gradient of the random integrand:

g¹(x, ξ) = ∂f(x, ξ)/∂x
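As a minimal sketch of the analytical approach, assume the illustrative integrand f(x, ξ) = (x − ξ)² with ξ ~ N(0, 1) (not part of the lecture); then F(x) = x² + 1, the true gradient is 2x, and averaging g¹(x, ξ) = 2(x − ξ) over independent copies of ξ recovers it:

```python
import random

def f_grad(x, xi):
    # Analytical gradient of the integrand f(x, xi) = (x - xi)^2
    # with respect to x: the AA stochastic gradient g1(x, xi).
    return 2.0 * (x - xi)

def aa_estimate(x, n_samples=100_000, seed=0):
    # Average g1 over independent copies of xi ~ N(0, 1);
    # by unbiasedness this approximates dF/dx = 2x,
    # since F(x) = E (x - xi)^2 = x^2 + 1.
    rng = random.Random(seed)
    return sum(f_grad(x, rng.gauss(0.0, 1.0)) for _ in range(n_samples)) / n_samples

est = aa_estimate(1.5)   # true gradient at x = 1.5 is 3.0
```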
8. Analytical approach (AA)
Consider the two-stage SLP:

F(x) = cᵀx + E min_y qᵀy → min,
W y + T x ≥ h,   y ∈ Rᵐ₊,
A x ≥ b,   x ∈ X,

where the vectors q, h and the matrices W, T can be random in general.
9. Analytical approach (AA)
The stochastic analytical gradient is defined as

g¹(x, ξ) = c − Tᵀu*,

where u* belongs to the set of solutions of the dual problem

(h − T x)ᵀu* = max_u { (h − T x)ᵀu | Wᵀu − q ≤ 0, u ∈ Rᵐ₊ }
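A one-dimensional sketch with scalar stand-ins for q, W, T, h (all values here are illustrative assumptions): the second stage min{q·y : w·y ≥ h − t·x, y ≥ 0} has the dual max{(h − t·x)·u : w·u ≤ q, u ≥ 0}, whose maximiser is available in closed form for w > 0, q ≥ 0, and the stochastic gradient is then c − t·u*:

```python
def dual_solution(q, w, t, h, x):
    # Dual of the 1-D second stage min { q*y : w*y >= h - t*x, y >= 0 }:
    # max { (h - t*x)*u : w*u <= q, u >= 0 }.
    # With w > 0 and q >= 0, the maximiser is q/w when the recourse
    # constraint is active (h - t*x > 0), and 0 otherwise.
    return q / w if h - t * x > 0 else 0.0

def stochastic_gradient(c, t, u_star):
    # g1(x, xi) = c - T^T u* for the two-stage SLP (scalars here).
    return c - t * u_star

u = dual_solution(q=2.0, w=1.0, t=1.0, h=3.0, x=1.0)  # constraint active
g = stochastic_gradient(c=1.0, t=1.0, u_star=u)
```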
10. Finite difference (FD) approach
Let us approximate the gradient of the random function by finite differences. Then the i-th component of the stochastic gradient g²(x, ξ) is computed as

g²ᵢ(x, ξ) = ( f(x + δeⁱ, ξ) − f(x, ξ) ) / δ,

where eⁱ is the vector with all components zero except the i-th, which equals 1, and δ is some small value.
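A minimal sketch of the FD estimator, using an assumed test integrand f(x, ξ) = (x₁ − ξ)² + 2x₂² with ξ ~ N(0, 1), so the true gradient at x = (1, 0.5) is (2, 2); the function and sample size are illustrative, not from the lecture:

```python
import random

def f(x, xi):
    # Illustrative random integrand: F(x) = E f(x, xi), xi ~ N(0, 1).
    return (x[0] - xi) ** 2 + 2.0 * x[1] ** 2

def fd_gradient(f, x, xi, delta=1e-5):
    # g2_i(x, xi) = (f(x + delta*e_i, xi) - f(x, xi)) / delta,
    # one extra function evaluation per coordinate.
    base = f(x, xi)
    grad = []
    for i in range(len(x)):
        x_shift = list(x)
        x_shift[i] += delta
        grad.append((f(x_shift, xi) - base) / delta)
    return grad

rng = random.Random(1)
N = 50_000
acc = [0.0, 0.0]
for _ in range(N):
    gi = fd_gradient(f, [1.0, 0.5], rng.gauss(0.0, 1.0))
    acc = [a + b for a, b in zip(acc, gi)]
g_bar = [a / N for a in acc]   # approximates the true gradient (2, 2)
```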
11. Simultaneous perturbation stochastic approximation (SPSA)

g³ᵢ(x, ξ) = ( f(x + δΔ, ξ) − f(x − δΔ, ξ) ) / (2δΔᵢ),

where Δ is a random vector whose components take the values 1 or −1 with probabilities p = 0.5, and δ is some small value (Spall 1992).
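A sketch of the SPSA estimator on the same kind of assumed test integrand as above (f, x, and the sample size are illustrative assumptions); note that only two function evaluations are needed per estimate, regardless of the dimension of x:

```python
import random

def f(x, omega):
    # Illustrative random integrand: F(x) = E f(x, omega), omega ~ N(0, 1),
    # with true gradient (2*x1, 4*x2) after taking the expectation.
    return (x[0] - omega) ** 2 + 2.0 * x[1] ** 2

def spsa_gradient(f, x, omega, rng, delta=1e-3):
    # Rademacher perturbation: each Delta_i equals +1 or -1 with prob. 0.5.
    d = [rng.choice((-1.0, 1.0)) for _ in x]
    xp = [v + delta * di for v, di in zip(x, d)]
    xm = [v - delta * di for v, di in zip(x, d)]
    diff = f(xp, omega) - f(xm, omega)
    # g3_i = (f(x + delta*Delta) - f(x - delta*Delta)) / (2*delta*Delta_i).
    return [diff / (2.0 * delta * di) for di in d]

rng = random.Random(2)
N = 50_000
x0 = [1.0, 0.5]
acc = [0.0, 0.0]
for _ in range(N):
    g = spsa_gradient(f, x0, rng.gauss(0.0, 1.0), rng)
    acc = [a + b for a, b in zip(acc, g)]
g_bar = [a / N for a in acc]   # approximates the true gradient (2, 2)
```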
12. Likelihood ratio (LR) approach
F(x) = ∫_{Rⁿ} f(x + z) p(z) dz

g⁴(x, ξ) = −( f(x + ξ) − f(x) ) ∂ln p(ξ)/∂ξ
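A sketch of the LR estimator under the assumption ξ ~ N(0, σ²), for which the score is ∂ln p(ξ)/∂ξ = −ξ/σ²; the baseline term f(x) does not change the expectation (the score has zero mean) but lowers the variance. The integrand f(x) = x² is an illustrative choice, so F(x) = x² + σ² and the true gradient at x = 1 is 2:

```python
import random

def f(x):
    # Deterministic integrand; randomness enters through additive noise:
    # F(x) = E f(x + xi) with xi ~ N(0, sigma^2).
    return x ** 2

def lr_gradient(f, x, xi, sigma=1.0):
    # Gaussian score: d ln p(xi)/d xi = -xi / sigma^2, hence
    # g4(x, xi) = -(f(x + xi) - f(x)) * (-xi / sigma^2)
    #           = (f(x + xi) - f(x)) * xi / sigma^2.
    # Subtracting the baseline f(x) leaves the mean unchanged
    # because the expectation of the score is zero.
    return (f(x + xi) - f(x)) * xi / sigma ** 2

rng = random.Random(3)
N = 200_000
est = sum(lr_gradient(f, 1.0, rng.gauss(0.0, 1.0)) for _ in range(N)) / N
# F(x) = x^2 + 1, so the true gradient at x = 1 is 2.0.
```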
14. Stochastic differentiation of
integrals given by inclusion
For an integral over a set given by inclusion,

F(x) = ∫_{f(x,z)∈B} p(x, z) dz,

the gradient is defined as

G(x) = ∫_{f(x,z)∈B} q(x, z) dz,

where q(x, z) is defined through the derivatives ∂p(x, z)/∂x of p and of f (see Uryasev (1994), (2002)).
15. Simulation of stochastic gradient
We assume here that a Monte-Carlo sample of a certain size N is provided for any x ∈ Rⁿ:

Z = (z¹, z², ..., zᴺ),

where the zⁱ are independent random copies of ξ, i.e., distributed according to the density p(x, z).
16. Sampling estimators of the
objective function
The sampling estimator of the objective function

F̃(x) = (1/N) Σᵢ₌₁ᴺ f(x, zⁱ)

and the sampling variance

D²(x) = (1/N) Σᵢ₌₁ᴺ ( f(x, zⁱ) − F̃(x) )²

are computed from the sample.
17. Sampling estimator of the gradient
The gradient is evaluated using the same
random sample:
G̃(x) = (1/N) Σᵢ₌₁ᴺ g(x, zⁱ)
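The three sampling estimators can be computed from one Monte-Carlo sample, as the slide emphasises. A sketch with an assumed integrand f(x, z) = (x − z)² and its analytical stochastic gradient g(x, z) = 2(x − z), z ~ N(0, 1) (both illustrative):

```python
import random

def f(x, z):
    # Random integrand: F(x) = E (x - z)^2 = x^2 + 1 for z ~ N(0, 1).
    return (x - z) ** 2

def g(x, z):
    # Its analytical stochastic gradient.
    return 2.0 * (x - z)

def sample_estimators(x, N=20_000, seed=4):
    rng = random.Random(seed)
    Z = [rng.gauss(0.0, 1.0) for _ in range(N)]        # one sample Z = (z1..zN)
    F_vals = [f(x, z) for z in Z]
    F_tilde = sum(F_vals) / N                          # objective estimator
    D2 = sum((v - F_tilde) ** 2 for v in F_vals) / N   # sampling variance
    G_tilde = sum(g(x, z) for z in Z) / N              # gradient, same sample
    return F_tilde, D2, G_tilde

F_tilde, D2, G_tilde = sample_estimators(1.0)
# F(1) = 2 and dF/dx(1) = 2; D2 estimates Var f(1, z).
```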
18. Sampling estimator of the gradient
The sampling covariance matrix

A(x) = (1/(N − n)) Σᵢ₌₁ᴺ ( g(x, zⁱ) − G̃(x) )( g(x, zⁱ) − G̃(x) )ᵀ

is applied later on for normalising the gradient estimator. For instance, the Hotelling statistic can be used for testing whether the gradient is zero:

T² = (N − n) G̃(x)ᵀ A(x)⁻¹ G̃(x) / n
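A one-dimensional sketch of the Hotelling test (n = 1, so A(x) is a scalar and the statistic reduces to a squared t-statistic); the integrand and sample size are illustrative assumptions. At a stationary point the statistic stays small, away from it the statistic is large:

```python
import random

def g(x, z):
    # Analytical stochastic gradient of f(x, z) = (x - z)^2, one dimension.
    return 2.0 * (x - z)

def hotelling_T2(x, N=1_000, seed=5):
    rng = random.Random(5 if seed is None else seed)
    gs = [g(x, rng.gauss(0.0, 1.0)) for _ in range(N)]
    G = sum(gs) / N                                  # gradient estimator
    A = sum((v - G) ** 2 for v in gs) / (N - 1)      # covariance (scalar, n = 1)
    # T^2 = (N - n) * G^T A^{-1} G / n with n = 1.
    return (N - 1) * G * G / A

t2_stationary = hotelling_T2(0.0)   # true gradient is 0: small statistic
t2_away = hotelling_T2(1.0)         # true gradient is 2: large statistic
```

Comparing T² with a quantile of the corresponding Fisher distribution then gives a statistical stopping rule for gradient-based methods.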
19. Wrap-Up and conclusions
The methods of nonlinear stochastic programming are built using the concept of the stochastic gradient.
Several methods exist to obtain the stochastic gradient; the objective function and the stochastic gradient can be evaluated using the same random sample.