Optimal Geometric Control Theory and Pontryagin's Maximum Principle (PMP)

Per [1], let $Q$ be a closed manifold (say, the configuration space, or c-space, of a robot or a machine), $T Q$ the tangent bundle of $Q$ , and $T T Q$ the double tangent bundle of $Q$ . Let $L$ be a Lagrangian on $T Q$ , $g$ the inertial Riemannian metric on $Q$ induced by $L$ , and $X \in Λ^{1} (T Q)$ the Lagrangian 1-form corresponding to $L$ , $X = \frac{d}{d t} \frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q}$ . Let $R$ be a Rayleigh dissipation term, $F$ an external force, $D$ a control distribution that is a sub-vector bundle of $T T Q$ , and $B : U \to D$ a parameterization of $D$ (where $U \subseteq \mathbb{R}^{m}$ ), so that $X (\dot{γ} (t)) + \frac{\partial R}{\partial \dot{q}} (\dot{γ} (t)) - B (u (t)) - F (t) = 0$ is the equation of motion (EoM). Note that this is an equation of 1-forms; to convert it to a vector field, one would use the canonical symplectic form $ω$ on $T^{*} Q$ . Finally, let $M$ be a quadratic cost on $T Q$ , $N$ a quadratic cost on $U$ , and $P$ a quadratic cost on $T Q \times U$ .
Let $(q_{0}, {\dot{q}}_{0}) \in T Q$ be given, and consider an initial cost function $L_{0} ((q_{0}, {\dot{q}}_{0}), u_{0})$ , a running cost function $L ((q, \dot{q}), u) = M (q, \dot{q}) + N (u) + P ((q, \dot{q}), u)$ (written $L$ by the usual optimal-control convention; it is distinct from the Lagrangian), a final cost function $L_{F} ((q_{F}, {\dot{q}}_{F}), u_{F})$ , and a (total) cost function

$$J (T) = L_{0} (\dot{γ} (0), u (0)) + \int_{0}^{T} \left[ L (\dot{γ} (t), u (t)) - λ \left( X (\dot{γ} (t)) + \frac{\partial R}{\partial \dot{q}} (\dot{γ} (t)) - B (u (t)) - F (t) \right) \right] d t + L_{F} (\dot{γ} (T), u (T)).$$

Here $γ : [0, T] \to Q$ with $\dot{γ} (0) = (q_{0}, {\dot{q}}_{0})$ , $u : [0, T] \to U$ with $π_{T Q} (B (u (t))) = \dot{γ} (t)$ , and $λ : [0, T] \to T^{*} [Λ^{1} (T Q)]$ with $π_{T Q} (λ (t)) = \dot{γ} (t)$ (the co-state variable, which enforces the dynamics of the system in the manner of a Lagrange multiplier) are to be determined. Typically either the final time $T \in \mathbb{R}$ or the final point $(q_{F}, {\dot{q}}_{F})$ of $\dot{γ}$ is specified.

Then we call $(Q, L, R, F, U, D, B, M, N, P, L_{0}, L_{F})$ a nonlinear quadratic regulator (NQR) problem. It is typically desired to find a deterministic function $K : T Q \to U$ with $u = K (q, \dot{q})$ "automatically" producing the minimum of $J$ (for a then uniquely-determined $λ$ ). We call

$$H : [0, T] \times (T T Q \oplus D \oplus T^{*} (T T Q)) \to \mathbb{R}$$

given by

$$H (t, q, \dot{q}, \ddot{q}, u, λ) = L ((q, \dot{q}), u) - λ \left( X (q, \dot{q}) + \frac{\partial R}{\partial \dot{q}} - B (u) - F \right)$$

the Hamiltonian of the NQR problem.
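To make the cost functional concrete, here is a minimal Python sketch that evaluates $J$ on a sampled trajectory. It rests on assumptions that are not in the text: the quadratic costs are represented by matrices `M`, `N`, `P` with the normalization $\frac{1}{2} x^{T} M x + \frac{1}{2} u^{T} N u + x^{T} P u$ (where $x = (q, \dot{q})$ in a coordinate patch), the integral is approximated by the trapezoidal rule, and the trajectory is assumed to satisfy the EoM, so the multiplier term in $J$ vanishes. The function names are hypothetical.

```python
def running_cost(x, u, M, N, P):
    """Quadratic running cost 0.5*x'Mx + 0.5*u'Nu + x'Pu (assumed normalization)."""
    def bilinear(a, A, b):
        # Computes a' A b for plain nested lists.
        return sum(a[i] * A[i][j] * b[j]
                   for i in range(len(a)) for j in range(len(b)))
    return 0.5 * bilinear(x, M, x) + 0.5 * bilinear(u, N, u) + bilinear(x, P, u)


def total_cost(ts, xs, us, M, N, P, initial_cost=0.0, final_cost=0.0):
    """Trapezoidal approximation of J on a sampled trajectory.

    The trajectory is assumed to satisfy the equation of motion, so the
    Lagrange-multiplier term contributes nothing to the integral.
    """
    values = [running_cost(x, u, M, N, P) for x, u in zip(xs, us)]
    integral = sum(0.5 * (values[k] + values[k + 1]) * (ts[k + 1] - ts[k])
                   for k in range(len(ts) - 1))
    return initial_cost + integral + final_cost
```

With this normalization the Hessian of the running cost with respect to $(x, u)$ is the block matrix with rows $(M, P)$ and $(P^{T}, N)$, which is one reason to prefer it over a convention without the factors of one half.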

Per [4], the NQR problem is solved in a coordinate patch by solving the following system of simultaneous equations:

$$\begin{cases} \frac{\partial H}{\partial (q, \dot{q})} & = \frac{d}{d t} \frac{\partial H}{\partial (\dot{q}, \ddot{q})} \\ \frac{\partial H}{\partial u} & = \frac{d}{d t} \frac{\partial H}{\partial \dot{u}} \\ \frac{\partial H}{\partial λ} & = \frac{d}{d t} \frac{\partial H}{\partial \dot{λ}} \end{cases}$$

Recall that we have

$$X (\dot{γ} (t)) + \frac{\partial R}{\partial \dot{q}} (\dot{γ} (t)) - B (u (t)) - F (t) = 0$$

as the EoM. Per [3], this may be rewritten

$$B (u (t)) + F (t) = g [\ddot{γ} (t)] + Γ (\dot{q} (t), \dot{q} (t)) + \frac{\partial R}{\partial \dot{q}} (\dot{γ} (t)) - \frac{\partial L}{\partial q} (\dot{γ} (t))$$

where $Γ$ is the Christoffel symbol of the Riemannian metric. Solving for the acceleration, the EoM becomes

$$\ddot{γ} (t) = g^{- 1} \left[ B (u (t)) + F (t) - Γ (\dot{q} (t), \dot{q} (t)) - \frac{\partial R}{\partial \dot{q}} (\dot{γ} (t)) + \frac{\partial L}{\partial q} (\dot{γ} (t)) \right].$$

Write the resulting first-order system on $T Q$ as $(\dot{q}, \ddot{q}) = f (t, (q, \dot{q}), u)$ , that is,

$$f (t, \dot{γ} (t), u (t)) = \left( \dot{q} (t), \ g^{- 1} \left[ B (u (t)) + F (t) - Γ (\dot{q} (t), \dot{q} (t)) - \frac{\partial R}{\partial \dot{q}} (\dot{γ} (t)) + \frac{\partial L}{\partial q} (\dot{γ} (t)) \right] \right).$$

The Hamiltonian then becomes

$$H (t, q, \dot{q}, \ddot{q}, u, λ) = L ((q, \dot{q}), u) - λ ((\dot{q}, \ddot{q}) - f (t, (q, \dot{q}), u))$$
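As an illustration of the forward-dynamics map $f$, the following Python sketch solves the EoM $B (u) + F = g [\ddot{γ}] + Γ (\dot{q}, \dot{q}) + \frac{\partial R}{\partial \dot{q}} - \frac{\partial L}{\partial q}$ for the acceleration in the simplest case, a damped pendulum driven by a torque. The model is an assumption chosen for illustration and does not come from the text: mechanical Lagrangian $\frac{1}{2} m l^{2} \dot{q}^{2} + m g_{0} l \cos q$, Rayleigh term $R = \frac{1}{2} c \dot{q}^{2}$, and $B (u) = u$ . The metric is then the constant $m l^{2}$, so the Christoffel term vanishes. The name `pendulum_f` and its parameters are hypothetical.

```python
import math


def pendulum_f(t, state, u, mass=1.0, length=1.0, damping=0.1,
               gravity=9.81, external_torque=lambda t: 0.0):
    """Return (q_dot, q_ddot) for a damped, torque-driven pendulum.

    Assumed model (illustrative only):
      Lagrangian  0.5*m*l^2*q_dot^2 + m*g0*l*cos(q)
      Rayleigh    0.5*c*q_dot^2
      B(u) = u, F(t) = external_torque(t)
    """
    q, q_dot = state
    inertia = mass * length ** 2                     # metric g (constant)
    christoffel = 0.0                                # vanishes for a constant metric
    dR_dqdot = damping * q_dot                       # dR/dq_dot
    dL_dq = -mass * gravity * length * math.sin(q)   # dL/dq of the Lagrangian
    # q_ddot = g^{-1} [ B(u) + F - Gamma - dR/dq_dot + dL/dq ]
    q_ddot = (u + external_torque(t) - christoffel - dR_dqdot + dL_dq) / inertia
    return (q_dot, q_ddot)
```

For example, `pendulum_f(0.0, (math.pi / 2, 0.0), 0.0)` describes the pendulum released horizontally with no torque, and applying the torque `mass * gravity * length` at that configuration holds it still.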

and the equations to be solved then become

$$\begin{cases} \frac{\partial L}{\partial (q, \dot{q})} + {\left( \frac{\partial f}{\partial (q, \dot{q})} \right)}^{T} λ & = - \dot{λ} \\ \frac{\partial L}{\partial u} + {\left( \frac{\partial f}{\partial u} \right)}^{T} λ & = 0 \\ (\dot{q}, \ddot{q}) - f (t, (q, \dot{q}), u) & = 0 \end{cases}$$
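A small worked example may help show how such a state/co-state system is solved in practice. The sketch below makes several assumptions that are not in the text: a scalar linear system $\dot{x} = a x + b u$ , running cost $\frac{1}{2} (m x^{2} + n u^{2})$ , a fixed final time, a free final state, and zero final cost, which gives the boundary condition $λ (T) = 0$ . Taking variations of $H = L - λ (\dot{x} - f)$ for this problem gives $\dot{x} = a x + b u$ , $- \dot{λ} = m x + a λ$ , and $n u + b λ = 0$ . The resulting two-point boundary value problem is solved by shooting on $λ (0)$ with a classical Runge-Kutta integrator; all function names are hypothetical.

```python
import math


def rk4_step(rhs, t, y, h):
    """One classical Runge-Kutta step for y' = rhs(t, y), with y a tuple."""
    k1 = rhs(t, y)
    k2 = rhs(t + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k1)))
    k3 = rhs(t + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k2)))
    k4 = rhs(t + h, tuple(yi + h * ki for yi, ki in zip(y, k3)))
    return tuple(yi + h / 6 * (p + 2 * q + 2 * r + s)
                 for yi, p, q, r, s in zip(y, k1, k2, k3, k4))


def integrate(x0, lam0, T, a, b, m, n, steps=2000):
    """Integrate the state/co-state pair from (x0, lam0) at t = 0 to t = T."""
    def rhs(t, y):
        x, lam = y
        u = -b * lam / n              # stationarity in u: n*u + b*lam = 0
        return (a * x + b * u,        # state equation: x' = f(x, u)
                -m * x - a * lam)     # co-state equation: -lam' = m*x + a*lam
    y, h = (x0, lam0), T / steps
    for k in range(steps):
        y = rk4_step(rhs, k * h, y, h)
    return y


def solve_initial_costate(x0, T, a=0.0, b=1.0, m=1.0, n=1.0):
    """Shooting: choose lam(0) so that lam(T) = 0."""
    # The system is linear, so lam(T) is affine in lam(0): two trial shots suffice.
    end0 = integrate(x0, 0.0, T, a, b, m, n)[1]
    end1 = integrate(x0, 1.0, T, a, b, m, n)[1]
    return -end0 / (end1 - end0)
```

For the default parameters ( $a = 0$ , $b = m = n = 1$ ) the exact answer is $λ (0) = x_{0} \tanh T$ , so the optimal control at $t = 0$ is the feedback $u = - \tanh (T) \, x_{0}$ , which gives a simple check of the numerical result. For a nonlinear $f$ the map from $λ (0)$ to $λ (T)$ is no longer affine, and the two trial shots would have to be replaced by an iterative root finder.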

The type of solution to the above system of equations may be checked by applying the second derivative test, via Sylvester's criterion, to the symmetric matrix

$$\begin{bmatrix} M & P \\ P^{T} & N \end{bmatrix}.$$

Writing $Δ_{k}$ for the $k$-th leading principal minor (the determinant of the upper-left $k \times k$ submatrix), the matrix is (a) positive definite if and only if $Δ_{k} > 0$ for all $k$ , in which case all the eigenvalues are positive and the solution is a local minimum; (b) negative definite if and only if ${(- 1)}^{k} Δ_{k} > 0$ for all $k$ (that is, $Δ_{1} < 0$ and the signs of the remaining minors strictly alternate), in which case all the eigenvalues are negative and the solution is a local maximum; (c) indefinite if neither sign pattern holds but the determinant of the full matrix is non-zero, in which case some eigenvalues are positive and some are negative but none is 0, and the solution is a saddle point; or (d) singular if the determinant of the full matrix is zero, in which case at least one eigenvalue is 0 and the second derivative test is inconclusive for this solution.
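The four cases of the second derivative test can be sketched in a few lines of Python. This is an illustrative sketch, not a method taken from the references: it assumes the block matrix has already been assembled as a nested list of numbers, computes each leading principal minor exactly with rational arithmetic, and uses hypothetical function names.

```python
from fractions import Fraction


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n, det = len(a), Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def classify(matrix):
    """Classify a symmetric matrix from its leading principal minors."""
    n = len(matrix)
    minors = [determinant([row[:k] for row in matrix[:k]])
              for k in range(1, n + 1)]
    if all(d > 0 for d in minors):
        return "positive definite"          # case (a): local minimum
    if all((d < 0) if k % 2 == 1 else (d > 0)
           for k, d in enumerate(minors, start=1)):
        return "negative definite"          # case (b): local maximum
    if minors[-1] != 0:
        return "indefinite"                 # case (c): saddle point
    return "inconclusive"                   # case (d): the test fails
```

Exact rational arithmetic is used so that a zero determinant is detected reliably; with floating-point entries from a numerical solver, a tolerance-based check or an eigenvalue computation would be the more practical choice.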

Per [2], let $\tilde{Q} = [0, T] \times (T T Q \oplus D \oplus T^{*} (T T Q)) \times \mathbb{R}$ , with points given by $(t, (q (t), \dot{q} (t)), (\dot{q} (t), \ddot{q} (t)), u (t), λ (t), H (t, (q (t), \dot{q} (t)), (\dot{q} (t), \ddot{q} (t)), u (t), λ (t)))$ . For a solution $(γ (t), u (t), λ (t))$ to the NQR problem, write $c (t) = (t, \dot{γ} (t), \ddot{γ} (t), u (t), λ (t), H (t, \dot{γ} (t), \ddot{γ} (t), u (t), λ (t)))$ for the corresponding curve in $\tilde{Q}$ . By Pontryagin's Maximum Principle (PMP), for any optimal (cost-minimizing) solution, the variations of $c (t)$ in $\tilde{Q}$ form an upwards-pointing cone at each point. Hence, for any optimal solution, there is a "path" of hyperplanes (contact structure) $(H_{t})$ along $c (t)$ , each orthogonal to the symmetry line of the cone and passing through the cone point, and hence a 1-form $η$ along $c (t)$ on $\tilde{Q}$ , called the Hamiltonian evolution of the NQR problem, with $H_{t} = \ker (η (t))$ . If $η (t) (- \frac{\partial}{\partial H}) \neq 0$ for all $t \in [0, T]$ (so that $- \frac{\partial}{\partial H}$ does not lie in any of the hyperplanes $H_{t}$ ), we call $η$ normal; otherwise, we call $η$ abnormal.

Bibliography:
[1] Agrachev, A. A. & Sachkov, Y. L. (2004). Control Theory from the Geometric Viewpoint. Springer. https://doi.org/10.1007/978-3-662-06404-7
[2] Jóźwikowski, M. & Respondek, W. (2016). A contact covariant approach to optimal control with applications to sub-Riemannian geometry. Mathematics of Control, Signals, and Systems, 28, 27. https://doi.org/10.1007/s00498-016-0176-3
[3] Lynch, K. M. & Park, F. C. (2017). Modern Robotics: Mechanics, Planning, and Control. Cambridge University Press. https://doi.org/10.1017/9781316661239
[4] Vrabie, D. L., Vamvoudakis, K. G. & Lewis, F. L. (2012). Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles. https://doi.org/10.1049/PBCE081