Qualifier

An oral presentation I gave for my PhD qualifier examination

Speaker notes
  • Ball and beam, Navier-Stokes, stability of an airplane with respect to pilot commands. Nonlinear systems: the number of variables equals the number of equations. A solution means values of the variables/parameters that strictly satisfy the given relationships (equations). Such systems are impossible to solve by hand, even when they are homogeneous.
  • Direct solve – reuse the factorization (quasi-implicit Newton, chord method)
  • Computational fluid dynamics – millions of variables (Chesapeake Bay)
  • Examples of cost functions; talk about differentiability.
  • Convergence rate: linear. We want B_k to mimic the behavior of Newton's method => symmetric positive definite (at least in a neighborhood).
  • Trust regions determine a region around the current iterate where the approximate model of the objective function is accurate. The step is taken to be the minimizer on the trust region.
  • Explain symmetry, positive definiteness. Limited-memory quasi-Newton.
  • Z_k can be taken to be the
  • Transcript

    • 1. Qualifier Exam in HPC, February 10th, 2010
    • 2. Quasi-Newton methods Alexandru Cioaca
    • 3. Quasi-Newton methods (nonlinear systems)
      • Nonlinear systems:
      • F(x) = 0, F : R^n → R^n
      • F(x) = [f_i(x_1, …, x_n)]^T
      • Such systems appear in the simulation of processes (physical, chemical, etc.)
      • Iterative algorithm to solve nonlinear systems
      • Newton’s method != Nonlinear least-squares
    • 4. Quasi-Newton methods (nonlinear systems)
      • Standard assumptions
      • F – continuously differentiable in an open convex set D
      • F' – Lipschitz continuous on D
      • There is x* in D s.t. F(x*) = 0 and F'(x*) is nonsingular
      • Newton's method:
      • Starting from x_0 (initial iterate)
      • x_{k+1} = x_k - F'(x_k)^{-1} F(x_k), {x_k} → x*
      • Until termination criterion is satisfied
    • 5. Quasi-Newton methods (nonlinear systems)
      • Linear model around x_n:
      • M_n(x) = F(x_n) + F'(x_n)(x - x_n)
      • M_n(x) = 0 ⇒ x_{n+1} = x_n - F'(x_n)^{-1} F(x_n)
      • Iterates are computed as:
      • F'(x_n) s_n = F(x_n)
      • x_{n+1} = x_n - s_n
    • 6. Quasi-Newton methods (nonlinear systems)
      • Evaluate F'(x_n):
      • Symbolically
      • Numerically with finite differences
      • Automatic differentiation
      • Solve the linear system F'(x_n) s_n = F(x_n)
      • Direct solve: LU, Cholesky
      • Iterative methods: GMRES, CG (a minimal sketch of this procedure follows the list)
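As an illustration (not part of the original deck), here is a minimal Python/NumPy sketch of slides 4-6: Newton's method with a finite-difference Jacobian and a dense direct solve. The test system F is a made-up example.

```python
import numpy as np

def fd_jacobian(F, x, eps=1e-7):
    """Approximate F'(x) column by column with forward finite differences."""
    Fx = F(x)
    J = np.empty((x.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (F(xp) - Fx) / eps
    return J

def newton(F, x0, tol=1e-10, maxit=50):
    """Newton's method: solve F'(x_k) s_k = F(x_k), then set x_{k+1} = x_k - s_k."""
    x = x0.astype(float)
    for _ in range(maxit):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:          # termination criterion
            break
        J = fd_jacobian(F, x)                 # could also be symbolic or AD
        s = np.linalg.solve(J, Fx)            # dense direct solve (LU)
        x = x - s
    return x

# Toy system: x0^2 + x1^2 = 1 and x0 = x1; root at (1/sqrt(2), 1/sqrt(2))
F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
print(newton(F, np.array([1.0, 0.5])))
```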
    • 7. Quasi-Newton methods (nonlinear systems)
      • Computation:
      • F(x_k): n scalar function evaluations
      • F'(x_k): n^2 scalar function evaluations
      • LU: O(2n^3/3)
      • Cholesky: O(n^3/3)
      • Krylov methods (depends on condition number)
    • 8. Quasi-Newton methods (nonlinear systems)
      • LU and Cholesky are useful when we want to reuse the factorization (quasi-implicit)
      • Difficult to parallelize and balance the workload
      • Cholesky is faster and more stable but needs SPD (!)
      • For n large, factorization is very impractical (n ~ 10^6)
      • Krylov methods contain elements easily parallelizable (updates, inner products, matrix-vector products)
      • CG is faster and more stable but needs SPD
    • 9. Quasi-Newton methods (nonlinear systems)
      • Advantages:
      • Under standard assumptions, Newton’s method converges locally and quadratically
      • There exists a domain of attraction S which contains the solution
      • Once the iterates enter S, they stay in S and eventually converge to x*
      • The algorithm is memoryless (self-corrective)
    • 10. Quasi-Newton methods (nonlinear systems)
      • Disadvantages:
      • Convergence depends on the choice of x_0
      • F'(x_k) has to be evaluated at each iterate
      • Computation can be expensive: F(x_k), F'(x_k), s_k
    • 11. Quasi-Newton methods (nonlinear systems)
      • Implicit schemes for ODEs
      • y' = f(t, y)
      • Forward Euler: y_{n+1} = y_n + h f(t_n, y_n) (explicit)
      • Backward Euler: y_{n+1} = y_n + h f(t_{n+1}, y_{n+1}) (implicit)
      • Implicit schemes need the solution of a nonlinear system at every step (see the sketch after this list)
      • (also Crank-Nicolson, implicit Runge-Kutta, linear multistep formulas)
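A minimal scalar sketch of that point, assuming a user-supplied f and its derivative df/dy: each backward Euler step solves its nonlinear equation with Newton's method. The stiff test problem is illustrative only.

```python
import numpy as np

def backward_euler(f, dfdy, t0, y0, h, nsteps, tol=1e-12, newton_maxit=20):
    """Backward Euler: solve y_{n+1} - y_n - h f(t_{n+1}, y_{n+1}) = 0 with Newton."""
    t, y = t0, y0
    ys = [y0]
    for _ in range(nsteps):
        t_next = t + h
        g = lambda z: z - y - h * f(t_next, z)   # nonlinear residual in z = y_{n+1}
        y_next = y                               # initial Newton iterate
        for _ in range(newton_maxit):
            r = g(y_next)
            if abs(r) < tol:
                break
            # g'(z) = 1 - h * df/dy(t_next, z)
            y_next -= r / (1.0 - h * dfdy(t_next, y_next))
        t, y = t_next, y_next
        ys.append(y)
    return np.array(ys)

# Stiff test problem y' = -50 (y - cos t); backward Euler stays stable for large h
f = lambda t, y: -50.0 * (y - np.cos(t))
dfdy = lambda t, y: -50.0
print(backward_euler(f, dfdy, 0.0, 1.0, 0.1, 10))
```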
    • 12. Quasi-Newton methods (nonlinear systems)
      • How to circumvent evaluating F'(x_k)?
      • Broyden's method:
      • B_{k+1} = B_k + (y_k - B_k s_k) s_k^T / <s_k, s_k>
      • x_{k+1} = x_k - B_k^{-1} F(x_k)
      • Inverse update (Sherman-Morrison formula):
      • H_{k+1} = H_k + (s_k - H_k y_k) s_k^T H_k / <s_k, H_k y_k>
      • x_{k+1} = x_k - H_k F(x_k)
      • (s_k = x_{k+1} - x_k, y_k = F(x_{k+1}) - F(x_k); a sketch follows this list)
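A sketch of Broyden's method with the inverse (Sherman-Morrison) update from this slide, assuming NumPy and the common starting choice H_0 = I; the test system is the same toy example as before.

```python
import numpy as np

def broyden(F, x0, tol=1e-10, maxit=100):
    """Broyden's method with the inverse update:
    H_{k+1} = H_k + (s_k - H_k y_k) s_k^T H_k / <s_k, H_k y_k>."""
    x = x0.astype(float)
    H = np.eye(x.size)                # approximation of F'(x)^{-1}
    Fx = F(x)
    for _ in range(maxit):
        if np.linalg.norm(Fx) < tol:
            break
        s = -H @ Fx                   # quasi-Newton step: x_{k+1} = x_k - H_k F(x_k)
        x = x + s
        Fnew = F(x)
        y = Fnew - Fx                 # y_k = F(x_{k+1}) - F(x_k)
        Hy = H @ y
        H += np.outer(s - Hy, s @ H) / (s @ Hy)   # rank-one inverse update
        Fx = Fnew
    return x

F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
print(broyden(F, np.array([1.0, 0.5])))
```

Note that no Jacobian evaluation and no linear solve appear in the loop: only matrix-vector products and a rank-one update.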
    • 13. Quasi-Newton methods (nonlinear systems)
      • Advantages:
      • No need to compute F'(x_k)
      • For the inverse update, no linear system to solve
      • Disadvantages:
      • Only superlinear convergence (vs. quadratic for Newton)
      • No longer memoryless
    • 14. Quasi-Newton methods (unconstrained optimization)
      • Problem:
      • Find the global minimizer of a cost function
      • f : R^n → R, x* = arg min f
      • f differentiable means the problem can be attacked by looking for zeros of the gradient
    • 15. Quasi-Newton methods (unconstrained optimization)
      • Descent methods
      • x_{k+1} = x_k - λ_k P_k ∇f(x_k)
      • P_k = I_n : steepest descent
      • P_k = ∇²f(x_k)^{-1} : Newton's method
      • P_k = B_k^{-1} : quasi-Newton
      • Angle between P_k ∇f(x_k) and ∇f(x_k) less than 90°, so the step is a descent direction
      • B_k has to mimic the behavior of the Hessian
    • 16. Quasi-Newton methods (unconstrained optimization)
      • Global convergence
      • Line search
      • Step length: backtracking, interpolation (see the sketch after this list)
      • Sufficient decrease: Wolfe conditions
      • Trust regions
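A backtracking line search enforcing the Armijo sufficient-decrease condition (the first Wolfe condition), paired here with steepest descent. This is a minimal sketch; the constants rho = 0.5 and c = 1e-4 are typical textbook values, not taken from the slides.

```python
import numpy as np

def backtracking(f, grad, x, p, alpha0=1.0, rho=0.5, c=1e-4):
    """Shrink the step until the Armijo sufficient-decrease condition holds:
    f(x + a p) <= f(x) + c a <grad f(x), p>."""
    a, fx, slope = alpha0, f(x), grad(x) @ p   # slope < 0 for a descent direction
    while f(x + a * p) > fx + c * a * slope:
        a *= rho
    return a

# Steepest descent (P_k = I) with backtracking on the Rosenbrock function;
# steepest descent converges slowly here, the run only illustrates the line search
f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
grad = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                           200 * (x[1] - x[0]**2)])
x = np.array([-1.2, 1.0])
for _ in range(2000):
    p = -grad(x)
    x = x + backtracking(f, grad, x, p) * p
print(x)
```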
    • 17. Quasi-Newton methods (unconstrained optimization)
      • For quasi-Newton, B_k has to resemble ∇²f(x_k)
      • Single-rank (SR1): B_{k+1} = B_k + (y_k - B_k s_k)(y_k - B_k s_k)^T / <y_k - B_k s_k, s_k>
      • Symmetry: PSB, DFP updates
      • Positive def.: BFGS, B_{k+1} = B_k - (B_k s_k s_k^T B_k) / <s_k, B_k s_k> + (y_k y_k^T) / <y_k, s_k>
      • Inverse update: via the Sherman-Morrison-Woodbury formula
    • 18. Quasi-Newton methods (unconstrained optimization)
      • Computation
      • Matrix updates, inner products
      • DFP, PSB: 3 matrix-vector products
      • BFGS: 2 matrix-matrix products
      • Storage
      • Limited-memory versions (L-BFGS)
      • Store {s_k, y_k} for the last m iterations and recompute the action of H (two-loop recursion; a sketch follows this list)
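A sketch of the L-BFGS two-loop recursion the slide alludes to: the action of the inverse-Hessian approximation H_k on a vector is recomputed from the stored pairs {s_i, y_i}, never forming a matrix. The initial scaling gamma = <s, y>/<y, y> is a common default, not something the slide specifies.

```python
import numpy as np

def lbfgs_direction(grad_k, s_list, y_list):
    """Two-loop recursion: apply the implicit inverse-Hessian approximation
    H_k to grad_k using the last m pairs {s_i, y_i} (oldest first in the lists)."""
    q = grad_k.copy()
    rhos = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]
    alphas = []
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        a = rho * (s @ q)
        alphas.append(a)
        q -= a * y
    if s_list:                                    # initial scaling H_k^0 = gamma * I
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)
    for (s, y, rho), a in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        b = rho * (y @ q)
        q += (a - b) * s
    return q          # q = H_k grad_k; the search direction is -q
```

Only O(mn) storage and work per iteration are needed, instead of the O(n^2) of a dense H.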
    • 19. Further improvements
      • Preconditioning the linear system
      • For faster convergence one may solve the preconditioned system K B_k p_k = K F(x_k)
      • If B_k is SPD (and sparse) we can use sparse approximate inverses to generate the preconditioner
      • This preconditioner can be refined on a subspace of B k using an algebraic multigrid technique
      • We need to solve the eigenvalue problem
    • 20. Further improvements
      • Model reduction
      • Sometimes the dimension of the system is very large
      • Smaller model that captures the essence of the original
      • An approximation of the model variability can be retrieved from an ensemble of forward simulations
      • The covariance matrix gives the subspace
      • We need to solve the eigenvalue problem (an SVD-based sketch follows this list)
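A hedged illustration of extracting the dominant subspace from an ensemble of forward simulations: rather than forming the covariance matrix and solving its eigenvalue problem explicitly, this sketch takes the SVD of the centered ensemble, whose left singular vectors are the covariance eigenvectors. The sizes and data are invented.

```python
import numpy as np

def dominant_subspace(X, r):
    """Given an ensemble X (n x m, columns = simulation snapshots), return an
    orthonormal basis for the r-dimensional subspace capturing most of the
    ensemble variability (leading eigenvectors of the sample covariance)."""
    Xc = X - X.mean(axis=1, keepdims=True)        # center the ensemble
    U, svals, _ = np.linalg.svd(Xc, full_matrices=False)
    # eigenvalues of the sample covariance are svals**2 / (m - 1)
    return U[:, :r]

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 50))               # toy ensemble: n=1000, m=50 runs
V = dominant_subspace(X, 10)                      # reduced basis with 10 modes
print(V.shape)
```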
    • 21. QR/QL algorithms for symmetric matrices
      • Solves the eigenvalue problem
      • Iterative algorithm
      • Uses QR/QL factorization at each step
      • (A = Q R, Q unitary, R upper triangular)
      • for k = 1, 2, …
      • A_k = Q_k R_k
      • A_{k+1} = R_k Q_k
      • end
      • Diagonal of A_k converges to the eigenvalues of A (a plain sketch follows this list)
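A plain NumPy sketch of the unshifted iteration above; for clarity it skips the Hessenberg/tridiagonal reduction the next slide describes, and the comparison against numpy.linalg.eigvalsh is only a sanity check.

```python
import numpy as np

def qr_eigenvalues(A, iters=500):
    """Unshifted QR iteration: factor A_k = Q_k R_k, set A_{k+1} = R_k Q_k.
    Each step is a similarity transform Q_k^T A_k Q_k, so eigenvalues are kept;
    for symmetric A the iterates converge to a diagonal matrix."""
    Ak = A.copy()
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
    return np.sort(np.diag(Ak))

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                        # symmetric test matrix
print(qr_eigenvalues(A))
print(np.sort(np.linalg.eigvalsh(A)))    # reference values
```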
    • 22. QR/QL algorithms for symmetric matrices
      • The matrix A is reduced to upper Hessenberg form before starting the iterations
      • Householder reflections (U = I - 2 v v^T / <v, v>)
      • Reduction is made column-wise
      • If A is symmetric, it is reduced to tridiagonal form
    • 23. QR/QL algorithms for symmetric matrices
      • Convergence to a triangular form can be slow
      • Origin shifts are used to accelerate it
      • for k = 1,2,..
      • A_k - z_k I = Q_k R_k
      • A_{k+1} = R_k Q_k + z_k I
      • end
      • Wilkinson shift
      • QR makes heavy use of matrix-matrix products (a shifted variant of the earlier sketch follows this list)
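A sketch of the shifted iteration with a Wilkinson shift and simple deflation, again skipping the tridiagonal reduction. The deflation tolerance and the use of the whole last row in the convergence test are implementation choices of this sketch, not from the slides.

```python
import numpy as np

def wilkinson_shift(T):
    """Eigenvalue of the trailing 2x2 block of symmetric T closest to T[-1, -1]."""
    a, b, c = T[-2, -2], T[-2, -1], T[-1, -1]
    d = (a - c) / 2.0
    denom = d + np.copysign(np.hypot(d, b), d if d != 0 else 1.0)
    return c - b * b / denom

def shifted_qr_eigenvalues(A, tol=1e-12, maxit=500):
    """QR with origin shifts: A_k - z_k I = Q_k R_k, A_{k+1} = R_k Q_k + z_k I.
    Deflates the last row/column once its off-diagonal part is negligible."""
    T = A.astype(float).copy()
    eigs = []
    while T.shape[0] > 1:
        n = T.shape[0]
        for _ in range(maxit):
            if np.linalg.norm(T[-1, :-1]) < tol * np.linalg.norm(np.diag(T)):
                break
            z = wilkinson_shift(T)
            Q, R = np.linalg.qr(T - z * np.eye(n))
            T = R @ Q + z * np.eye(n)
        eigs.append(T[-1, -1])
        T = T[:-1, :-1]                   # deflate and continue on the rest
    eigs.append(T[0, 0])
    return np.sort(np.array(eigs))

M = np.random.default_rng(1).standard_normal((6, 6))
A = (M + M.T) / 2
print(shifted_qr_eigenvalues(A))
print(np.sort(np.linalg.eigvalsh(A)))     # reference values
```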
    • 24. Alternatives to quasi-Newton
      • Inexact Newton methods
      • Inner iteration – determine a search direction by solving the linear system with a certain tolerance
      • Only Hessian-vector products are necessary
      • Outer iteration – line search on the search direction
      • Nonlinear CG
      • Residual replaced by gradient of cost function
      • Line search
      • Different flavors: Fletcher-Reeves, Polak-Ribiere, etc. (a Fletcher-Reeves sketch follows this list)
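A minimal Fletcher-Reeves nonlinear CG sketch: the residual of linear CG is replaced by the gradient of the cost function, with Armijo backtracking as the line search. The steepest-descent restart is a common practical safeguard, not something the slide mentions.

```python
import numpy as np

def nonlinear_cg(f, grad, x0, iters=200):
    """Fletcher-Reeves nonlinear CG with Armijo backtracking."""
    x = x0.astype(float)
    g = grad(x)
    p = -g
    for _ in range(iters):
        if g @ p >= 0:                 # safeguard: restart with steepest descent
            p = -g
        a = 1.0
        while f(x + a * p) > f(x) + 1e-4 * a * (g @ p):
            a *= 0.5
        x = x + a * p
        g_new = grad(x)
        beta = (g_new @ g_new) / (g @ g)      # Fletcher-Reeves coefficient
        p = -g_new + beta * p
        g = g_new
    return x

f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
grad = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                           200 * (x[1] - x[0]**2)])
print(nonlinear_cg(f, grad, np.array([-1.2, 1.0])))
```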
    • 25. Alternatives to quasi-Newton
      • Direct search
      • Does not involve derivatives of the cost function
      • Uses a structure called simplex to search for decrease in f
      • Stops when further progress cannot be achieved
      • Can get stuck in a local minimum (an off-the-shelf usage example follows this list)
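Direct search is available off the shelf; for example, SciPy's Nelder-Mead simplex. This usage example assumes SciPy is installed and is not part of the original deck.

```python
import numpy as np
from scipy.optimize import minimize

# Derivative-free direct search (Nelder-Mead simplex) on the Rosenbrock function
f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
res = minimize(f, x0=np.array([-1.2, 1.0]), method="Nelder-Mead")
print(res.x, res.fun)
```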
    • 26. More alternatives
      • Monte Carlo
      • Computational method relying on random sampling
      • Can be used for optimization (MDO) and for inverse problems, via random walks
      • When we have multiple correlated variables, the correlation matrix is SPD, so we can factorize it with Cholesky (see the sampling sketch below)
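A sketch of that last point: to draw correlated Gaussian samples for Monte Carlo, factor the SPD covariance/correlation matrix with Cholesky, C = L L^T, and map independent standard normals through L. The 2x2 matrix is a made-up example.

```python
import numpy as np

rng = np.random.default_rng(0)
C = np.array([[1.0, 0.8],
              [0.8, 1.0]])              # SPD correlation matrix
L = np.linalg.cholesky(C)               # C = L L^T
Z = rng.standard_normal((2, 100000))    # independent N(0, 1) samples
X = L @ Z                               # samples with covariance ~ C
print(np.cov(X))                        # empirical covariance, close to C
```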
    • 27. Conclusions
      • Newton's method is a very powerful method with many applications and uses (solving nonlinear systems, finding minima of cost functions). It can be combined with many other numerical algorithms (factorizations, linear solvers).
      • Optimizing and parallelizing matrix-vector and matrix-matrix products, decompositions, and other numerical kernels can have a significant impact on overall performance.
    • 28.
      • Thank you for your time!
