2. Problem with LQR Controller
LQR Controller:
• Only considers the current state at step t*
• Unable to track the reference trajectory correctly
MPC Controller:
• Considers states from t to t+T
*To be more specific, LQR does consider infinitely many future steps, but it solves the optimization problem once, before running the simulation
3. What is a Linear Quadratic Regulator (LQR)?
• Objective:
• Minimize the quadratic cost over an infinite time horizon
Model
Analytically computes a stationary feedback gain
The solution is optimal over the infinite time horizon
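For reference, a standard discrete-time statement of the LQR problem reads as follows; the symbols x, u, A, B, Q, R are assumed notation, since the slide equations did not survive extraction:

```latex
\min_{u_0,u_1,\dots}\; J=\sum_{k=0}^{\infty}\left(x_k^{\top}Q\,x_k+u_k^{\top}R\,u_k\right)
\quad\text{s.t.}\quad x_{k+1}=Ax_k+Bu_k
```

The optimal input is the stationary feedback law u_k = -K x_k, with K obtained from the discrete algebraic Riccati equation before the simulation starts.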
Disadvantages:
• Cannot take the reference trajectory into account
• The input depends only on the current state at step t
4. What is a Model Predictive Controller (MPC)?
• Objective:
• Minimize the quadratic cost over a finite time horizon
This allows the controller to take future steps into account at every step t
Model
Only optimizes inputs over a finite time window
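A typical finite-horizon objective of this kind, again with assumed notation (reference trajectory x^ref, u^ref and horizon T, not taken verbatim from the slides), would be:

```latex
\min_{u_t,\dots,u_{t+T-1}}\;
\sum_{k=t}^{t+T-1}\left[(x_k-x_k^{\mathrm{ref}})^{\top}Q\,(x_k-x_k^{\mathrm{ref}})
+(u_k-u_k^{\mathrm{ref}})^{\top}R\,(u_k-u_k^{\mathrm{ref}})\right]
\quad\text{s.t.}\quad x_{k+1}=Ax_k+Bu_k
```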
5. LQR vs. MPC
• Standard Controller (PID, LQR)
• Infinite Time Optimal Control → Stationary Policy
• Uses the current error
• Model Predictive Controller
• Finite Time Optimal Control → non-Stationary Policy
• Uses the model to “predict” future states and computes the error over time
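The stationary-vs-non-stationary distinction can be made concrete with a short Python sketch. The double-integrator model and weights are illustrative placeholders, not the drone model from the slides; the unconstrained finite-horizon MPC is solved here with a backward Riccati recursion:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[1.0, 0.1], [0.0, 1.0]])  # placeholder double-integrator model
B = np.array([[0.0], [0.1]])
Q, R = np.diag([10.0, 1.0]), np.array([[0.1]])

# LQR: one Riccati solve before the simulation, then the same gain forever.
P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def lqr_policy(x, t):
    return -K @ x  # stationary: identical K at every step t

def mpc_policy(x, t, T=20):
    # Finite-horizon problem re-solved at every step: backward Riccati
    # recursion over T steps, then only the first gain is applied.
    P_k = Q.copy()
    for _ in range(T):
        K_k = np.linalg.solve(R + B.T @ P_k @ B, B.T @ P_k @ A)
        P_k = Q + A.T @ P_k @ (A - B @ K_k)
    return -K_k @ x  # non-stationary: recomputed from the current x at each t
```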
6. MPC Objective
• Future states can be predicted using the model dynamics (rollouts)
• Derive a dynamical model from Newton’s second law
• Minimize the error between predicted and reference states & inputs
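A rollout is just repeated application of the model; a minimal sketch, assuming a linear discrete model x_{k+1} = A x_k + B u_k:

```python
import numpy as np

def rollout(A, B, x0, inputs):
    """Predict the future states x_1..x_T by iterating the model."""
    states, x = [], x0
    for u in inputs:
        x = A @ x + B @ u  # one step of the discrete dynamics
        states.append(x)
    return np.array(states)
```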
8. MPC Formulation
• Discrete System
• Model
• Augmented Model
• Objective
(Equation labels: state, input, disturbance, observed state, weight matrix)
Here, the gravity vector is added as a disturbance
For now, everything is assumed to be measurable
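The slide equations were images and did not survive extraction; the following is a hedged reconstruction of the usual form, with assumed symbols (state x, input u, disturbance d, observed state y, horizon T):

```latex
\begin{aligned}
\text{Model:}\quad & x_{k+1}=Ax_k+Bu_k+d,\qquad y_k=Cx_k\\[2pt]
\text{Augmented model:}\quad &
X=\begin{bmatrix}x_1\\ \vdots\\ x_T\end{bmatrix}
 =S_x\,x_0+S_u\,U+S_d\,d,\qquad
U=\begin{bmatrix}u_0\\ \vdots\\ u_{T-1}\end{bmatrix}\\[2pt]
\text{Observed states:}\quad & Y=\bar{C}\,X
\end{aligned}
```

Here S_x stacks the powers of A, S_u is the block lower-triangular matrix of terms A^i B, and S_d collects the accumulated disturbance terms.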
9. Formulate Linear MPC as a Quadratic Programming Problem
• Quadratic Programming: rewrite the objective as a quadratic cost in the stacked input vector U, subject to linear constraints
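A minimal sketch of the condensed QP construction in Python; the double-integrator model, horizon, weights, and reference are placeholders, and constraints are omitted so the minimizer can be found with a single linear solve:

```python
import numpy as np

dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])  # placeholder model, not the drone
B = np.array([[0.0], [dt]])
n, m, T = 2, 1, 20

# Prediction matrices so that the stacked states obey X = Sx @ x0 + Su @ U
Sx = np.vstack([np.linalg.matrix_power(A, k + 1) for k in range(T)])
Su = np.zeros((n * T, m * T))
for i in range(T):
    for j in range(i + 1):
        Su[n*i:n*(i+1), m*j:m*(j+1)] = np.linalg.matrix_power(A, i - j) @ B

Qbar = np.kron(np.eye(T), np.diag([10.0, 1.0]))  # stacked state weights
Rbar = np.kron(np.eye(T), np.array([[0.1]]))     # stacked input weights

x0 = np.array([0.0, 0.0])
Xref = np.tile(np.array([1.0, 0.0]), T)          # hypothetical reference

# J(U) = (Sx x0 + Su U - Xref)' Qbar (.) + U' Rbar U  is quadratic in U
P = Su.T @ Qbar @ Su + Rbar
q = Su.T @ Qbar @ (Sx @ x0 - Xref)
U = np.linalg.solve(P, -q)  # unconstrained minimizer
print("first optimal input:", U[:m])
```

With input or state constraints added (see the constraints discussion in the notes below), the same P and q would go to a QP solver instead of a plain linear solve.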
16. Add your own policy, controller, environment
Just inherit from the base classes
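A purely hypothetical sketch of what inheriting from a base class might look like; the class name Controller and the method control are illustrative stand-ins, since the repository's actual base-class API is not shown here:

```python
import numpy as np

class Controller:  # stand-in for the repository's base class
    def control(self, state, t):
        raise NotImplementedError

class HoverController(Controller):
    """Toy example: split the gravity-compensating thrust over 4 rotors."""
    def __init__(self, mass=1.0, g=9.81):
        self.hover_thrust = mass * g / 4.0

    def control(self, state, t):
        return np.full(4, self.hover_thrust)  # equal rotor forces, no feedback
```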
Editor's Notes
LQR controller is often compared with MPC because they have similar formulations.
Both controllers compute an optimal input u with respect to an objective function.
These controllers are called Optimal Controllers.
Throughout this implementation, I used a drone as the example model.
In that case, the input u consists of a force and torques applied to the drone.
The drone uses 4 rotors to control its upward force and torques.
On the left, we have an LQR controller.
LQR only considers the state at the current time t.
Therefore, it fails to plan ahead, so it is unable to track the planned trajectory.
However, the MPC controller considers future steps from the current step t to step t+T.
Therefore, it can plan ahead and compute an optimal policy so that it can track the reference trajectory.
Now I will talk about the Linear Quadratic Regulator.
Its objective is to minimize the quadratic state cost and input cost over an infinite time horizon.
The state is the system's state, whose motion is governed by the physical model.
From this, the solution should be optimal over the infinite time horizon.
However, it is difficult to take every future reference state into account.
Thus, by fixing the reference state and solving the problem as a linear quadratic regulator, an optimal solution can be computed analytically.
This feedback gain is stationary and is optimal under this assumption.
The disadvantages are:
1. It cannot take the reference trajectory into account.
2. The input depends only on the current state at step t.
As opposed to the LQR controller, the model predictive controller minimizes the quadratic cost over a finite time horizon.
In a graph, it can be illustrated as follows: it only optimizes inputs over a finite window at every time step t.
This allows the controller to take future steps into account.
To conclude, LQR yields a stationary policy while MPC yields a non-stationary policy;
in other words, the MPC policy changes over time.
Also, it can be stated that MPC takes future steps into consideration.
The objective is to minimize the error between predicted and reference states & inputs.
The predicted states can be computed by rolling out the dynamics model.
The dynamics model can be derived from Newton's second law and rewritten in a similar manner as shown on the right.
One should be careful, since this is a continuous-time model; it needs to be discretized based on the sample time.
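A minimal discretization sketch using SciPy's zero-order hold; the continuous-time matrices are placeholders, not the drone model derived in the slides:

```python
import numpy as np
from scipy.signal import cont2discrete

# Placeholder continuous-time double integrator: x_dot = Ac x + Bc u
Ac = np.array([[0.0, 1.0], [0.0, 0.0]])
Bc = np.array([[0.0], [1.0]])
C = np.eye(2)
D = np.zeros((2, 1))

dt = 0.1  # sample time
Ad, Bd, Cd, Dd, _ = cont2discrete((Ac, Bc, C, D), dt, method='zoh')
# Discrete model: x[k+1] = Ad x[k] + Bd u[k]
```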
Once the dynamics model is derived, the future states can be predicted by rolling it out.
Then the model can be augmented as in the equation on the bottom left.
The final model then looks like the one on the bottom right.
Substituting Y = HX into the original objective function,
the equation can be rewritten.
At the end, we get a linear quadratic cost.
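As a hedged reconstruction of this step (the notes' Y = HX presumably denotes the stacked prediction; writing it as Y = F x_0 + H U with assumed symbols F, H, Q̄, R̄):

```latex
J(U)=(Y-Y^{\mathrm{ref}})^{\top}\bar{Q}\,(Y-Y^{\mathrm{ref}})
     +(U-U^{\mathrm{ref}})^{\top}\bar{R}\,(U-U^{\mathrm{ref}}),
\qquad Y=Fx_0+HU
```

and substituting Y gives a cost that is quadratic in U:

```latex
J(U)=\tfrac{1}{2}\,U^{\top}PU+q^{\top}U+\text{const},\qquad
P=2\left(H^{\top}\bar{Q}H+\bar{R}\right),\quad
q=2\left(H^{\top}\bar{Q}\,(Fx_0-Y^{\mathrm{ref}})-\bar{R}\,U^{\mathrm{ref}}\right)
```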
Wrap up.
The discrete system model can be written as the first equation.
Since we have the model, we can predict the future states by feeding in arbitrary inputs.
The augmented model can be rewritten in a similar way.
Thus we get the predicted measured states Y.
In a linear quadratic problem, our objective is to minimize the quadratic cost of states and inputs.
As Y can be computed from U, everything can be rewritten with respect to U.
This is clearly a quadratic problem.
Since this is a quadratic problem, we can add linear constraints.
For example, constraints on the inputs can easily be added.
If you want the force to stay within a range of 10 N, you can set a constraint like this.
Likewise, you can add constraints to the inputs very easily.
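Concretely, such an input bound stacks into linear inequalities on U (a hedged reconstruction of the lost slide equation, using the ±10 N force limit from the example):

```latex
-10 \le u_k \le 10 \;\;\forall k
\quad\Longleftrightarrow\quad
\begin{bmatrix}I\\-I\end{bmatrix}U
\le
\begin{bmatrix}10\cdot\mathbf{1}\\ 10\cdot\mathbf{1}\end{bmatrix}
```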
Constraints on the state can also be added, but they need some tweaking.
Since the whole objective is written with respect to U,
we need to rewrite the constraint.
For example, if the velocity has a limit of 10 m/s, one can write it like this.
But since X can be written with respect to U,
we can substitute X into the constraint.
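Again as a hedged reconstruction using the stacked notation from above (X = S_x x_0 + S_u U + S_d d, with G and h encoding the 10 m/s velocity limit):

```latex
G\,X \le h,\qquad X=S_x\,x_0+S_u\,U+S_d\,d
\;\;\Longrightarrow\;\;
G\,S_u\,U \;\le\; h-G\left(S_x\,x_0+S_d\,d\right)
```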