Per [1], let \(Q\) be a closed manifold (say, the configuration space
[c-space] of a robot or a machine [see, for instance, this webpage or this webpage]),
\(TQ\) the tangent bundle of \(Q\),
\(TTQ\) the double tangent bundle of \(Q\), \(\mathcal{L}\) a
Lagrangian on \(TQ\), \(g\) the inertial Riemannian
metric on \(Q\) induced by \(\mathcal{L}\), \(X \in \Gamma(TQ)\) the
Lagrangian vector field corresponding to \(\mathcal{L}\), \(\displaystyle X
= \frac{d}{dt}\left\{\frac{\partial
\mathcal{L}}{\partial \dot{q}}\right\} - \frac{\partial \mathcal{L}}{\partial
q}\), \(R\) a Rayleigh dissipative term, \(F\) an
external force, \(\mathcal{D}\) a control distribution that is a
sub-vector bundle of
\(TTQ\), \(B: U \to \mathcal{D}\) a parameterization of \(\mathcal{D}\)
(where \(U \subseteq \mathbb{R}^m\)), so that
\(\displaystyle X(\dot{\gamma}(t)) + \frac{\partial R}{\partial
\dot{q}}\left(\dot{\gamma}(t)\right) - B(u(t)) - F(t) = 0\) is the
equation of motion (EoM),
\(M\) a quadratic cost on \(TQ\), \(N\) a quadratic cost on \(U\), and
\(P\) a quadratic cost on \(TQ \times U\). Let \((q_0, \dot{q}_0) \in TQ\)
be given, and consider an initial cost function
\(L_0((q_0, \dot{q}_0), u_0)\), a running cost function
\(L((q, \dot{q}), u) = M(q, \dot{q}) + N(u) + P((q,\dot{q}), u)\), a final cost
function \(L_F((q_F, \dot{q}_F), u_F)\), and a (total) cost function
\(\displaystyle J(T) = L_0(\dot{\gamma}(0), u(0)) + \int_0^T \left[
L(\dot{\gamma}(t), u(t)) - \lambda(t)\left(X(\dot{\gamma}(t)) +
\frac{\partial R}{\partial \dot{q}}(\dot{\gamma}(t)) - B(u(t)) -
F(t)\right) \right] dt + L_F(\dot{\gamma}(T), u(T))\),
where \(\gamma: [0, T] \to Q\) with
\(\dot{\gamma}(0) = (q_0, \dot{q}_0)\), \(u: [0, T] \to U\) with
\(\pi_{TQ}(B(u(t))) = \dot{\gamma}(t)\),
and \(\lambda: [0, T] \to T^*(TTQ)\) with \(\pi_{TQ}(\lambda(t)) =
\dot{\gamma}(t)\) (the co-state variable, which encodes the dynamics of
the system as a Lagrange multiplier problem of sorts) are to be
determined, and typically either \(T \in \mathbb{R}\) (the final time)
or \((q_F, \dot{q}_F)\) (the final point of \(\dot{\gamma}\)) is
specified.
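To make the setup concrete, here is a minimal numerical sketch in Python. Everything specific in it is an assumption chosen for illustration and is not taken from [1] or [2]: the system is a planar pendulum (\(Q = S^1\), a closed manifold) with \(\mathcal{L} = \tfrac{1}{2} m l^2 \dot{q}^2 + m g l \cos q\), Rayleigh term \(R = \tfrac{1}{2} c \dot{q}^2\), \(F = 0\), a scalar torque control with \(B(u) = u\), costs \(M = \tfrac{1}{2}(w_q q^2 + w_{\dot{q}} \dot{q}^2)\) in the chart around \(q = 0\), \(N = \tfrac{1}{2} w_u u^2\), \(P = 0\), and \(L_0 = L_F = 0\). All numerical values and function names are hypothetical. Along any trajectory that satisfies the EoM the multiplier term in \(J\) vanishes, so the sketch only integrates the running cost.

```python
import math

# Hypothetical parameters (not from the text): pendulum mass, length, gravity,
# Rayleigh damping coefficient c, and the quadratic cost weights.
MASS, LENGTH, GRAVITY, DAMPING = 1.0, 1.0, 9.81, 0.5
W_Q, W_QD, W_U = 1.0, 0.1, 0.01


def acceleration(q, qd, u, force=0.0):
    """Solve the EoM  X + dR/dqdot - B(u) - F = 0  for the acceleration.

    Assumed here: X = m l^2 qddot + m g l sin(q), dR/dqdot = c qdot, B(u) = u.
    """
    inertia = MASS * LENGTH ** 2
    gravity_torque = MASS * GRAVITY * LENGTH * math.sin(q)
    return (u + force - DAMPING * qd - gravity_torque) / inertia


def running_cost(q, qd, u):
    """Running cost L = M(q, qdot) + N(u), with P = 0 (an assumption)."""
    return 0.5 * (W_Q * q ** 2 + W_QD * qd ** 2) + 0.5 * W_U * u ** 2


def dynamics(t, state, control):
    """Time derivative of the augmented state (q, qdot, accumulated cost)."""
    q, qd, _ = state
    u = control(t, q, qd)
    return (qd, acceleration(q, qd, u), running_cost(q, qd, u))


def rk4_step(t, state, dt, control):
    """One classical fourth-order Runge-Kutta step on the augmented state."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = dynamics(t, state, control)
    k2 = dynamics(t + dt / 2, shifted(k1, dt / 2), control)
    k3 = dynamics(t + dt / 2, shifted(k2, dt / 2), control)
    k4 = dynamics(t + dt, shifted(k3, dt), control)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(q0, qd0, control, T=5.0, steps=2000):
    """Integrate from (q0, qd0) over [0, T]; return (q(T), qdot(T), J(T))."""
    dt = T / steps
    state = (q0, qd0, 0.0)
    for i in range(steps):
        state = rk4_step(i * dt, state, dt, control)
    return state


def zero_control(t, q, qd):
    """No actuation: u(t) = 0."""
    return 0.0


def pd_control(t, q, qd):
    """A hand-picked (not optimal) feedback law, used only for comparison."""
    return -5.0 * q - 2.0 * qd


if __name__ == "__main__":
    print("uncontrolled:", simulate(0.5, 0.0, zero_control))
    print("PD feedback: ", simulate(0.5, 0.0, pd_control))
```

The cost is carried as a third state component so that the same integrator handles both the dynamics and \(J\). The feedback law is not a solution of the optimal control problem; it only illustrates that different admissible controls \(u\) yield different values of \(J(T)\) for the same initial point \((q_0, \dot{q}_0)\).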
Per [2], if we set \(\tilde{Q} = [0, T] \times (TTQ \oplus
\mathcal{D} \oplus T^*(TTQ))
\times \mathbb{R}\) with points given by \((t, (q(t),\dot{q}(t)),
(\dot{q}(t),\ddot{q}(t)), u(t),
\lambda(t), H(t, (q(t),\dot{q}(t)), (\dot{q}(t),\ddot{q}(t)), u(t),
\lambda(t)))\), then by Pontryagin's Maximum Principle (PMP), for
any optimal (cost-minimizing) solution
\((\gamma(t), u(t), \lambda(t))\) to the NQR problem,
the variations of \((t, \dot{\gamma}(t), \ddot{\gamma}(t), u(t),
\lambda(t), H(t,
\dot{\gamma}(t), \ddot{\gamma}(t), u(t), \lambda(t)))\) in
\(\tilde{Q}\) form an
upwards-pointing cone at each point. Hence, for any
optimal solution, there is a
"path" of hyperplanes (a contact structure) \((\mathcal{H}_t)\) along
\((t, \dot{\gamma}(t), \ddot{\gamma}(t), u(t), \lambda(t), H(t,
\dot{\gamma}(t), \ddot{\gamma}(t), u(t), \lambda(t)))\), each orthogonal to
the symmetry axis of the cone and passing through its apex, and hence a 1-form
\(\eta\) along \((t, \dot{\gamma}(t), \ddot{\gamma}(t), u(t),
\lambda(t), H(t,
\dot{\gamma}(t),
\ddot{\gamma}(t),
u(t), \lambda(t)))\) on \(\tilde{Q}\), called the Hamiltonian
evolution
of the NQR problem, with \(\mathcal{H}_t =
\ker(\eta(t))\). If \(\displaystyle
\eta(t)\left(-\frac{\partial}{\partial H}\right) \neq 0\) for all \(t
\in [0, T]\) (so that \(\displaystyle -\frac{\partial}{\partial H}\)
does not lie in any of the hyperplanes \(\mathcal{H}_t \)), we call
\(\eta\) normal; otherwise,
we call \(\eta\) abnormal.
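As a point of comparison, and as a standard fact about the classical coordinate form of PMP rather than something taken from [2], consider dynamics \(\dot{x} = f(x, u)\) with running cost \(L(x, u)\) (here \(x\), \(f\), and \(\lambda_0\) are notation introduced only for this remark). In the convention where the Hamiltonian is maximized, one writes
\[ H_{\lambda_0}(x, \lambda, u) = \langle \lambda, f(x, u) \rangle + \lambda_0 L(x, u), \qquad \lambda_0 \le 0 \text{ constant}, \quad (\lambda(t), \lambda_0) \neq 0, \]
and an extremal is called normal if \(\lambda_0 \neq 0\) (in which case one may rescale to \(\lambda_0 = -1\)) and abnormal if \(\lambda_0 = 0\). Assuming the \(\mathbb{R}\)-factor of \(\tilde{Q}\) plays the role of the cost coordinate, the condition \(\displaystyle \eta(t)\left(-\frac{\partial}{\partial H}\right) \neq 0\) would be the coordinate-free counterpart of \(\lambda_0 \neq 0\): the cost multiplier does not vanish, so the cost actually enters the extremal equations.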
Bibliography:
[1] Agrachev, A. A. & Sachkov, Y. L. (2004). Control Theory from the Geometric Viewpoint. Springer. https://doi.org/10.1007/978-3-662-06404-7
[2] Jóźwikowski, M. & Respondek, W. (2016). A contact covariant approach to optimal control with applications to sub-Riemannian geometry. Math. Control Signals Syst. 28, 27. https://doi.org/10.1007/s00498-016-0176-3
[3] Lynch, K. M. & Park, F. C. (2017). Modern Robotics: Mechanics, Planning, and Control. Cambridge University Press. https://doi.org/10.1017/9781316661239
[4] Vrabie, D. L., Vamvoudakis, K. G. & Lewis, F. L. (2012). Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles. https://doi.org/10.1049/PBCE081