# Stochastic Differentiation


AACIMP 2010 Summer School lecture by Leonidas Sakalauskas. "Applied Mathematics" stream. "Stochastic Programming and Applications" course. Part 3.

Published in: Education

### Transcript

1. Lecture 3: Stochastic Differentiation. Leonidas Sakalauskas, Institute of Mathematics and Informatics, Vilnius, Lithuania <sakal@ktl.mii.lt>. EURO Working Group on Continuous Optimization.
2. Content
   - Concept of stochastic gradient
   - Analytical differentiation of expectation
   - Differentiation of the objective function of two-stage SLP
   - Finite difference approach
   - Stochastic perturbation approximation
   - Likelihood approach
   - Differentiation of integrals given by inclusion
   - Simulation of stochastic gradient
   - Projection of stochastic gradient
3. Expected objective function. Stochastic programming deals with objective and/or constraint functions defined as expectations of a random function:
   $F(x) = \mathbf{E}\, f(x, \xi)$, $f : \mathbb{R}^n \times \Omega \to \mathbb{R}$,
   where $\xi$ is an elementary event in the probability space $(\Omega, \Sigma, P_x)$, and $P_x$ is the measure defined by the probability density function $p : \mathbb{R}^n \to \mathbb{R}$.
4. Concept of stochastic gradient. The methods of nonlinear stochastic programming are built using the concept of the stochastic gradient. The stochastic gradient of the function $F(x)$ is a random vector $g(x, \xi)$ such that
   $\nabla F(x) = \mathbf{E}\, g(x, \xi)$.
5. Methods of stochastic differentiation. Several estimators of the stochastic gradient are examined:
   - Analytical approach (AA);
   - Finite difference approach (FD);
   - Likelihood ratio approach (LR);
   - Simulated perturbation approximation (SPSA).
6. Stochastic gradient: an analytical approach.
   $F(x) = \int_{\mathbb{R}^n} f(x, z)\, p(x, z)\, dz$
   $\nabla F(x) = \int_{\mathbb{R}^n} \left( \nabla_x f(x, z) + f(x, z)\, \nabla_x \ln p(x, z) \right) p(x, z)\, dz$
   $g(x, \xi) = \nabla_x f(x, \xi) + f(x, \xi)\, \nabla_x \ln p(x, \xi)$
7. Analytical approach (AA). Assume the density of the random variable does not depend on the decision variable. The analytical stochastic gradient then coincides with the gradient of the integrated random function:
   $g^1(x, \xi) = \nabla_x f(x, \xi)$
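The analytical approach can be illustrated with a minimal sketch. The test function $f(x, \xi) = \|x - \xi\|^2$ with $\xi \sim N(\mu, I)$ is an assumption for illustration, not from the lecture; for it, $F(x) = \|x - \mu\|^2 + n$, so the true gradient is $2(x - \mu)$ and the sample average of $g^1$ should approach it:

```python
import numpy as np

# Assumed test function (illustration only): f(x, xi) = ||x - xi||^2,
# with xi ~ N(mu, I). Then F(x) = E f(x, xi) = ||x - mu||^2 + n,
# so the true gradient is grad F(x) = 2*(x - mu).
def g1(x, xi):
    # analytical stochastic gradient: g1(x, xi) = grad_x f(x, xi) = 2*(x - xi)
    return 2.0 * (x - xi)

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
x = np.array([0.5, 0.5])

# Averaging g1 over a Monte-Carlo sample approximates grad F(x) = 2*(x - mu)
xi_sample = rng.normal(mu, 1.0, size=(200_000, 2))
grad_est = g1(x, xi_sample).mean(axis=0)   # close to 2*(x - mu) = [-1., 5.]
```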
8. Analytical approach (AA). Consider the two-stage SLP:
   $F(x) = c^T x + \mathbf{E} \min_y \left\{ q^T y \;:\; W y + T x = h,\ y \in \mathbb{R}^m_+ \right\}$,
   $A x = b$, $x \in X$,
   where, in general, the vectors $q$, $h$ and the matrices $W$, $T$ can be random.
9. Analytical approach (AA). The analytical stochastic gradient is defined as
   $g^1(x, \xi) = c - T^T u^*$,
   where $u^*$ belongs to the set of solutions of the dual problem
   $(h - T x)^T u^* = \max_u \left\{ (h - T x)^T u \;:\; W^T u - q \le 0,\ u \in \mathbb{R}^m \right\}$.
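A minimal sketch of this dual-based gradient, using assumed toy data ($W = T = I$ and fixed $c$, $q$) rather than anything from the lecture. With $W = T = I$ the dual reduces to a box-constrained problem whose solution is easy to check by hand:

```python
import numpy as np
from scipy.optimize import linprog

# Assumed illustrative recourse data: W = T = I, so the dual
#   max { (h - T x)^T u : W^T u <= q, u >= 0 }
# has the closed form u*_i = q_i if (h - x)_i > 0 else 0.
c = np.array([1.0, 1.0])
q = np.array([2.0, 3.0])
W = np.eye(2)
T = np.eye(2)

def stochastic_gradient_two_stage(x, h):
    # Solve the dual LP for one realization h; linprog minimizes, so negate.
    res = linprog(c=-(h - T @ x), A_ub=W.T, b_ub=q,
                  bounds=[(0, None)] * 2, method="highs")
    u_star = res.x
    # analytical stochastic gradient: g1(x, xi) = c - T^T u*
    return c - T.T @ u_star, u_star

x = np.array([0.0, 0.0])
h = np.array([1.0, -1.0])            # one realization of the random h
g, u = stochastic_gradient_two_stage(x, h)
# here (h - x) = (1, -1), so u* = (2, 0) and g = c - u* = (-1, 1)
```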
10. Finite difference (FD) approach. Let us approximate the gradient of the random function by finite differences. Each $i$-th component of the stochastic gradient $g^2(x, \xi)$ is computed as
    $g^2_i(x, \xi) = \frac{f(x + \delta e_i, \xi) - f(x, \xi)}{\delta}$,
    where $e_i$ is the vector with zero components except the $i$-th one, which equals 1, and $\delta$ is some small value.
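A sketch of the FD estimator, assuming a simple quadratic test function (not from the lecture). Note that both function evaluations use the same noise realization $\xi$ (common random numbers), so the difference estimates the gradient of the integrand rather than amplifying the noise:

```python
import numpy as np

# Finite-difference stochastic gradient: each component uses a forward
# difference of f in the SAME noise realization xi.
def fd_stochastic_gradient(f, x, xi, delta=1e-5):
    g = np.empty_like(x)
    base = f(x, xi)
    for i in range(len(x)):
        e_i = np.zeros_like(x)
        e_i[i] = 1.0                 # unit vector along coordinate i
        g[i] = (f(x + delta * e_i, xi) - base) / delta
    return g

# Assumed test function: f(x, xi) = ||x - xi||^2, so grad_x f = 2*(x - xi)
f = lambda x, xi: np.sum((x - xi) ** 2)
x = np.array([1.0, 2.0])
xi = np.array([0.0, 0.0])
g = fd_stochastic_gradient(f, x, xi)   # close to 2*(x - xi) = [2., 4.]
```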
11. Simulated perturbation stochastic approximation (SPSA).
    $g^3(x, \xi) = \Delta\, \frac{f(x + \delta \Delta, \xi) - f(x - \delta \Delta, \xi)}{2 \delta}$,
    where $\Delta$ is a random vector whose components take the values 1 or -1 with probabilities p = 0.5, and $\delta$ is some small value (Spall 1992).
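A sketch of the SPSA estimator under the same assumed quadratic test function as above. The key point is that one simultaneous $\pm\delta\Delta$ perturbation yields the whole gradient vector from only two function evaluations; a single estimate is noisy, but its expectation is correct:

```python
import numpy as np

# SPSA estimator (Spall 1992): one simultaneous +/- perturbation gives
# all n gradient components from two function evaluations.
def spsa_gradient(f, x, xi, rng, delta=1e-3):
    # Delta has i.i.d. +/-1 (Rademacher) components, each with probability 0.5
    Delta = rng.choice([-1.0, 1.0], size=x.shape)
    diff = f(x + delta * Delta, xi) - f(x - delta * Delta, xi)
    # since Delta_i = +/-1, dividing by Delta_i equals multiplying by it
    return diff / (2.0 * delta) * Delta

f = lambda x, xi: np.sum((x - xi) ** 2)   # assumed quadratic test function
rng = np.random.default_rng(1)
x = np.array([1.0, 2.0])
xi = np.zeros(2)
# a single estimate is noisy; averaging many recovers 2*(x - xi) = [2., 4.]
g_avg = np.mean([spsa_gradient(f, x, xi, rng) for _ in range(4000)], axis=0)
```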
12. Likelihood ratio (LR) approach.
    $F(x) = \int_{\mathbb{R}^n} f(x + z)\, p(z)\, dz$
    $g^4(x, \xi) = -\left( f(x + \xi) - f(x) \right) \frac{\partial \ln p(\xi)}{\partial \xi}$
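A sketch of the LR estimator for the assumed case of additive Gaussian noise $z \sim N(0, \sigma^2 I)$ (the test function below is also an assumption, not from the lecture). For a Gaussian, $\nabla_z \ln p(z) = -z/\sigma^2$, so the estimator needs no derivatives of $f$ at all:

```python
import numpy as np

# LR gradient for assumed additive Gaussian noise z ~ N(0, sigma^2 I):
# grad_z ln p(z) = -z / sigma^2, hence
#   g4(x, z) = -(f(x + z) - f(x)) * (-z / sigma^2)
#            = (f(x + z) - f(x)) * z / sigma^2
def lr_gradient(f, x, z, sigma):
    return (f(x + z) - f(x)) * z / sigma ** 2

f = lambda v: np.sum(v ** 2)        # assumed test function, grad f(x) = 2*x
rng = np.random.default_rng(2)
x = np.array([1.0, -1.0])
sigma = 0.1
zs = rng.normal(0.0, sigma, size=(100_000, 2))
g_avg = np.mean([lr_gradient(f, x, z, sigma) for z in zs], axis=0)
# for this quadratic f the expectation is exactly grad f(x) = [2., -2.]
```

Subtracting the baseline $f(x)$ does not change the expectation (since $\mathbf{E}\,\nabla_z \ln p(z) = 0$) but reduces the variance of the estimator.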
13. Stochastic differentiation of integrals given by inclusion. Consider the integral over a set given by an inclusion:
    $F(x) = \int_{B} f(x, z)\, p(x, z)\, dz$
14. Stochastic differentiation of integrals given by inclusion. The gradient of this function is defined as
    $G(x) = \int_{B} \left( \nabla_x \left[ f(x, z)\, p(x, z) \right] + q(x, z) \right) dz$,
    where $q(x, z)$ is defined through derivatives of $p$ and $f$ (see Uryasev (1994), (2002)).
15. Simulation of stochastic gradient. We assume here that a Monte-Carlo sample of a certain size $N$ is provided for any $x \in \mathbb{R}^n$:
    $Z = (z^1, z^2, \ldots, z^N)$,
    where the $z^i$ are independent random copies of $\xi$, i.e., distributed according to the density $p(x, z)$.
16. Sampling estimators of the objective function. The sampling estimator of the objective function is
    $\tilde{F}(x) = \frac{1}{N} \sum_{i=1}^{N} f(x, z^i)$,
    and the sampling variance is computed as
    $D^2(x) = \frac{1}{N} \sum_{i=1}^{N} \left( f(x, z^i) - \tilde{F}(x) \right)^2$.
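These two sampling estimators can be sketched directly, again with an assumed test function $f(x, z) = \|x - z\|^2$, $z \sim N(0, I)$, for which $F(x) = \|x\|^2 + n$ is known in closed form:

```python
import numpy as np

# Sampling estimators of F(x) and its variance from N Monte-Carlo copies.
def sampling_estimates(f, x, Z):
    vals = np.array([f(x, z) for z in Z])
    F_hat = vals.mean()                       # F~(x) = (1/N) sum f(x, z^i)
    D2 = np.mean((vals - F_hat) ** 2)         # slide's 1/N variance estimator
    return F_hat, D2

# Assumed test function: f(x, z) = ||x - z||^2 with z ~ N(0, I),
# so the true value is F(x) = ||x||^2 + n.
f = lambda x, z: np.sum((x - z) ** 2)
rng = np.random.default_rng(3)
x = np.array([1.0, 1.0])
Z = rng.normal(0.0, 1.0, size=(100_000, 2))
F_hat, D2 = sampling_estimates(f, x, Z)       # F_hat close to 4.0
```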
17. Sampling estimator of the gradient. The gradient is evaluated using the same random sample:
    $\tilde{G}(x) = \frac{1}{N} \sum_{i=1}^{N} g(x, z^i)$
18. Sampling estimator of the gradient. The sampling covariance matrix
    $A(x) = \frac{1}{N - n} \sum_{i=1}^{N} \left( g(x, z^i) - \tilde{G}(x) \right) \left( g(x, z^i) - \tilde{G}(x) \right)^T$
    is applied later on to normalise the gradient estimator. For instance, the Hotelling statistic
    $T^2 = \frac{N - n}{n}\, \tilde{G}(x)^T A(x)^{-1}\, \tilde{G}(x)$
    can be used for testing the zero value of the gradient.
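A sketch of this test on synthetic gradient samples (the Gaussian samples below are assumptions for illustration). When the true gradient is zero, $T^2$ typically stays below an F-distribution quantile; when it is far from zero, $T^2$ exceeds it by a wide margin:

```python
import numpy as np
from scipy.stats import f as f_dist

# Hotelling-type T^2 test of the hypothesis grad F(x) = 0, with the
# covariance normalisation 1/(N - n) from the slide.
def hotelling_T2(G_samples):
    N, n = G_samples.shape
    G_bar = G_samples.mean(axis=0)            # G~(x)
    centered = G_samples - G_bar
    A = centered.T @ centered / (N - n)       # sampling covariance matrix
    return (N - n) / n * G_bar @ np.linalg.solve(A, G_bar)

rng = np.random.default_rng(4)
N, n = 500, 2
G0 = rng.normal(0.0, 1.0, size=(N, n))   # true gradient zero: small T^2
G1 = rng.normal(1.0, 1.0, size=(N, n))   # true gradient nonzero: large T^2
crit = f_dist.ppf(0.95, n, N - n)        # 95% quantile for comparison
```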
19. Wrap-up and conclusions
    - The methods of nonlinear stochastic programming are built using the concept of the stochastic gradient.
    - Several methods exist to obtain the stochastic gradient by evaluating the objective function and the stochastic gradient from the same random sample.