Chapter 8 Adaptive Control

8.1 Model-Reference Adaptive Control

Basic flow for designing an adaptive controller

  1. Design a control law with variable parameters

  2. Design an adaptation law for adjusting the control parameters

  3. Analyze the convergence of the closed-loop system

The control law design in the first step typically requires the designer to know what a good controller would be if the true parameters were actually known, e.g., from feedback linearization (Appendix F) or sliding control (Appendix G).

The design of the adaptation law typically comes from analyzing the dynamics of the tracking error, which as we will see often appears in the form of Lemma 8.1.

The convergence of the closed-loop system is usually analyzed with the help of a Lyapunov-like function introduced in Chapter 5.

Lemma 8.1 (Basic Lemma) Let two signals \(e(t)\) and \(\phi(t)\) be related by \[\begin{equation} e(t) = H(p)[k \phi(t)^T v(t)] \tag{8.1} \end{equation}\] where \(e(t)\) is a scalar output signal, \(H(p)\) is a strictly positive real (SPR) transfer function, \(k\) is an unknown real number with known sign, \(\phi(t) \in \mathbb{R}^m\) is a control signal, and \(v(t) \in \mathbb{R}^m\) is a measurable input signal.

If the control signal \(\phi(t)\) satisfies \[\begin{equation} \dot{\phi}(t) = - \mathrm{sgn}(k) \gamma e(t) v(t) \tag{8.2} \end{equation}\] with \(\gamma > 0\) a positive constant, then \(e(t)\) and \(\phi(t)\) are globally bounded. Moreover, if \(v(t)\) is bounded, then \[ \lim_{t \rightarrow \infty} e(t) = 0. \]

Proof. Let the state-space representation of (8.1) be \[\begin{equation} \dot{x} = A x + b [k \phi^T v], \quad e = c^T x. \tag{8.3} \end{equation}\] Since \(H(p)\) is SPR, it follows from the Kalman-Yakubovich Lemma E.1 that there exist \(P,Q \succ 0\) such that \[ A^T P + P A = -Q, \quad Pb = c. \] Let \[ V(x,\phi) = x^T P x + \frac{|k|}{\gamma} \phi^T \phi; \] clearly \(V\) is positive definite (i.e., \(V(0,0)=0\), and \(V(x,\phi) > 0\) for all \((x,\phi) \neq (0,0)\)). The time derivative of \(V\) along the trajectory defined by (8.3) with \(\phi\) chosen as in (8.2) is \[\begin{align} \dot{V} & = \frac{\partial V}{\partial x} \dot{x} + \frac{\partial V}{\partial \phi} \dot{\phi} \\ &= x^T (PA + A^T P) x + 2 x^T P b (k \phi^T v) + \frac{2|k|}{\gamma} \phi^T (- \mathrm{sgn}(k) \gamma e v) \\ & = - x^T Q x + 2 (x^T c)(k\phi^T v) - 2k \phi^T (e v) \\ & = - x^T Q x \leq 0, \end{align}\] where we used \(|k| \, \mathrm{sgn}(k) = k\), and the two cross terms cancel because \(x^T c = e\). As a result, we know \(x\) and \(\phi\) must be bounded (\(V(x(t),\phi(t)) \leq V(x(0),\phi(0))\) is bounded). Since \(e = c^T x\), we know \(e\) must be bounded as well.

If the input signal \(v\) is also bounded, then \(\dot{x}\) is bounded as seen from (8.3). Because \(\ddot{V} = -2x^T Q \dot{x}\) is now bounded, we know \(\dot{V}\) is uniformly continuous. Therefore, by Barbalat’s stability certificate (Theorem 5.7), \(\dot{V}\) tends to zero as \(t\) tends to infinity; since \(Q \succ 0\), this implies \(\lim_{t \rightarrow \infty} x(t) = 0\) and hence \(\lim_{t \rightarrow \infty} e(t) = 0\).
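To build intuition, Lemma 8.1 is easy to verify numerically. The sketch below simulates (8.1)–(8.2) with forward Euler; the SPR filter \(H(p) = 1/(p+1)\), the gain \(k = 2\), and the bounded input \(v(t) = [\sin t, 1]^T\) are all illustrative assumptions, not choices made in the text:

```python
import numpy as np

# Numerical sanity check of Lemma 8.1 (a sketch; H(p) = 1/(p + 1),
# k = 2, and v(t) are arbitrary illustrative choices).
dt, T = 1e-3, 60.0
gamma, k = 1.0, 2.0                  # adaptation gain; unknown gain with known sign
x = 1.0                              # state of H(p) = 1/(p + 1), so e = x
phi = np.array([1.0, -1.0])          # initial control signal phi(0)

for step in range(int(T / dt)):
    t = step * dt
    v = np.array([np.sin(t), 1.0])   # bounded, measurable input signal
    e = x                            # output of the SPR filter
    x += dt * (-x + k * (phi @ v))   # xdot = -x + k phi^T v, realizing (8.1)
    phi += dt * (-np.sign(k) * gamma * e * v)   # adaptation law (8.2)

print(f"|e(T)| = {abs(x):.2e}")      # e(t) -> 0, as the lemma predicts
```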

8.1.1 First-Order Systems

Consider the first-order single-input single-output (SISO) system \[\begin{equation} \dot{x} = - a x + b u \tag{8.4} \end{equation}\] where \(a\) and \(b\) are unknown ground-truth parameters. However, we do assume that the sign of \(b\) is known. (What if the sign of \(b\) is unknown too?)

Let \(r(t)\) be a reference trajectory, e.g., a step function or a sinusoidal function, and \(x_d(t)\) be a desired system trajectory that tracks the reference \[\begin{equation} \dot{x}_d = - a_d x_d + b_d r(t), \tag{8.5} \end{equation}\] where \(a_d,b_d > 0\) are user-defined constants. Note that the transfer function from \(r\) to \(x_d\) is \[ x_d = \frac{b_d}{p + a_d} r, \] which is a stable filter since \(a_d > 0\).

The goal of adaptive control is to design a control law and an adaptation law such that the tracking error of the system \(x(t) - x_d(t)\) converges to zero.

Control law. We design the control law as \[\begin{equation} u = \hat{a}_r(t) r + \hat{a}_x(t) x \tag{8.6} \end{equation}\] where \(\hat{a}_r(t)\) and \(\hat{a}_x(t)\) are time-varying feedback gains that we wish to adapt. The closed-loop dynamics of system (8.4) with the controller (8.6) is \[ \dot{x} = - a x + b (\hat{a}_r r + \hat{a}_x x) = - (a - b \hat{a}_x) x + b \hat{a}_r r. \] With the equation above, the reason for choosing the control law (8.6) is clear: if the system parameters \((a,b)\) were known, then choosing \[\begin{equation} a_r^\star = \frac{b_d}{b}, \quad a_x^\star = \frac{a - a_d}{b} \tag{8.7} \end{equation}\] leads to the closed-loop dynamics \(\dot{x} = - a_d x + b_d r\) that is exactly what we want in (8.5).

However, in adaptive control, since the true parameters \((a,b)\) are not revealed to the control designer, an adaptation law is needed to dynamically adjust the gains \(\hat{a}_r\) and \(\hat{a}_x\) based on the tracking error \(x(t) - x_d(t)\).

Adaptation law. Let \(e(t) = x(t) - x_d(t)\) be the tracking error, and we develop its time derivative \[\begin{align} \dot{e} &= \dot{x} - \dot{x}_d \\ &= - a_d (x - x_d) + (a_d - a + b\hat{a}_x)x + (b \hat{a}_r - b_d) r \\ & = - a_d e + b\underbrace{(\hat{a}_x - a_x^\star)}_{=:\tilde{a}_x} x + b \underbrace{(\hat{a}_r - a_r^\star )}_{=:\tilde{a}_r} r \\ & = - a_d e + b (\tilde{a}_x x + \tilde{a}_r r) \tag{8.8} \end{align}\] where \(\tilde{a}_x\) and \(\tilde{a}_r\) are the gain errors w.r.t. the optimal gains in (8.7) if the true parameters were known. The error dynamics (8.8) is equivalent to the following transfer function \[\begin{equation} e = \frac{1}{p + a_d} b(\tilde{a}_x x + \tilde{a}_r r) = \frac{1}{p + a_d} \left(b \begin{bmatrix} \tilde{a}_x \\ \tilde{a}_r \end{bmatrix}^T \begin{bmatrix} x \\ r \end{bmatrix} \right), \tag{8.9} \end{equation}\] which is in the form of (8.1). Therefore, we choose the adaptation law \[\begin{equation} \begin{bmatrix} \dot{\tilde{a}}_x \\ \dot{\tilde{a}}_r \end{bmatrix} = - \mathrm{sgn}(b) \gamma e \begin{bmatrix} x \\ r \end{bmatrix}. \tag{8.10} \end{equation}\] Note that since the optimal gains in (8.7) are constants, \(\dot{\tilde{a}}_x = \dot{\hat{a}}_x\) and \(\dot{\tilde{a}}_r = \dot{\hat{a}}_r\), so (8.10) can be implemented directly on the adjustable gains \(\hat{a}_x, \hat{a}_r\) using only measurable signals.

Tracking convergence. With the control law (8.6) and the adaptation law (8.10), we can prove that the tracking error converges to zero, using Lemma 8.1. With \(\tilde{a}=[\tilde{a}_x, \tilde{a}_r]^T\), let \[\begin{equation} V(e,\tilde{a}) = e^2 + \frac{|b|}{\gamma} \tilde{a}^T \tilde{a} \tag{8.11} \end{equation}\] be a positive definite Lyapunov function candidate; using (8.8) and (8.10), its time derivative is \[ \dot{V} = - 2a_d e^2 \leq 0. \] Clearly, \(e\) and \(\tilde{a}\) are both bounded. Assuming the reference trajectory \(r\) is bounded, we know \(x_d\) is bounded (due to (8.5)) and hence \(x\) is bounded (due to \(e = x - x_d\) being bounded). Consequently, from the error dynamics (8.8) we know \(\dot{e}\) is bounded, which implies \(\ddot{V} = -4a_d e \dot{e}\) is bounded and \(\dot{V}\) is uniformly continuous. By Barbalat’s stability certificate (Theorem 5.7), we conclude \(e(t) \rightarrow 0\) as \(t \rightarrow \infty\).
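The first-order scheme is compact enough to simulate in a few lines. The following sketch implements the control law (8.6) and the adaptation law (8.10) with forward Euler; the plant parameters \(a = -1\), \(b = 2\), the model gains \(a_d = b_d = 4\), \(\gamma = 2\), and the sinusoidal reference are all illustrative assumptions:

```python
import numpy as np

# Minimal simulation of first-order MRAC: control law (8.6) + adaptation (8.10).
dt, T = 1e-3, 50.0
a, b = -1.0, 2.0                 # unknown plant xdot = -a x + b u (open-loop unstable)
a_d, b_d, gamma = 4.0, 4.0, 2.0  # reference model (8.5) and adaptation gain
x, x_d = 0.0, 0.0
ax_hat, ar_hat = 0.0, 0.0        # adjustable gains, arbitrary initialization

for step in range(int(T / dt)):
    t = step * dt
    r = np.sin(2.0 * t)              # sinusoidal reference
    e = x - x_d
    u = ar_hat * r + ax_hat * x      # control law (8.6)
    x += dt * (-a * x + b * u)       # plant (8.4)
    x_d += dt * (-a_d * x_d + b_d * r)             # reference model (8.5)
    ax_hat += dt * (-np.sign(b) * gamma * e * x)   # adaptation law (8.10)
    ar_hat += dt * (-np.sign(b) * gamma * e * r)

print(f"tracking error e(T) = {x - x_d:+.2e}")     # -> 0
print(f"gain errors: {ax_hat - (a - a_d)/b:+.2e}, {ar_hat - b_d/b:+.2e}")
```

With this sinusoidal reference, the gain errors should also shrink, previewing the persistent excitation discussion below.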

It is always better to combine mathematical analysis with intuitive understanding. Can you explain intuitively why the adaptation law (8.10) makes sense? (Hint: think about how the control should react to a negative/positive tracking error.)

Parameter convergence. We have shown that the control law (8.6) and the adaptation law (8.10) guarantee tracking of the reference trajectory. However, is it also guaranteed that the gains of the controller (8.6) converge to the optimal gains in (8.7)?

We will now show that the answer depends on the reference trajectory \(r(t)\). Because the tracking error \(e\) converges to zero, and \(e\) is the output of a stable filter (8.9), we know the input \(b(\tilde{a}_x x + \tilde{a}_r r)\) must also converge to zero. On the other hand, the adaptation law (8.10) shows that both \(\dot{\tilde{a}}_x\) and \(\dot{\tilde{a}}_r\) converge to zero (due to \(e\) converging to zero and \(x\), \(r\) being bounded). As a result, we know \(\tilde{a} = [\tilde{a}_x,\tilde{a}_r]^T\) converges to a constant vector that, in the limit, satisfies \[\begin{equation} v^T \tilde{a} = 0, \quad v = \begin{bmatrix} x \\ r \end{bmatrix}, \tag{8.12} \end{equation}\] which is a single linear equation in \(\tilde{a}\) with time-varying coefficients.

  • Constant reference: no guaranteed convergence. Suppose \(r(t) \equiv r_0 \neq 0\) for all \(t\). From (8.5) we know \(x = x_d = \alpha r_0\) as \(t \rightarrow \infty\), where \(\alpha = b_d / a_d\) is the DC gain of the stable filter. Therefore, the linear equation (8.12) reduces to \[ \alpha \tilde{a}_x + \tilde{a}_r = 0. \] This implies that \(\tilde{a}\) does not necessarily converge to zero; it is only guaranteed to converge to some point on this straight line in the parameter space.

  • Persistent excitation: guaranteed convergence. However, when the signal \(v\) satisfies the so-called persistent excitation condition, which states that there exist \(T, \beta > 0\) such that for all \(t\) \[\begin{equation} \int_{t}^{t+T} v v^T d\tau \succeq \beta I, \tag{8.13} \end{equation}\] then \(\tilde{a}\) is guaranteed to converge to zero. To see this, we left-multiply (8.12) by \(v\) and integrate from \(t\) to \(t+T\), which gives \[ \left( \int_{t}^{t+T} vv^T d\tau \right) \tilde{a} = 0. \] By the persistent excitation condition (8.13), the matrix on the left is invertible, so \(\tilde{a} = 0\) is the only solution.

It remains to understand under what conditions on the reference trajectory \(r(t)\) the persistent excitation of \(v\) is guaranteed. We leave it as an exercise for the reader to show that, if \(r(t)\) contains at least one sinusoidal component, then the persistent excitation condition on \(v\) is guaranteed.
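The persistent excitation condition (8.13) can also be checked numerically. The sketch below (with \(a_d = b_d = 1\) and a window of length 10, both illustrative choices) approximates \(x \approx x_d\), valid asymptotically since \(e \rightarrow 0\), and compares the smallest eigenvalue of the windowed integral \(\int_t^{t+T} v v^T d\tau\) for a constant versus a sinusoidal reference:

```python
import numpy as np

# Windowed excitation integral (8.13) for v = [x, r], approximating x by x_d.
def pe_min_eig(r_fun, T_w=10.0, dt=1e-3, t0=50.0):
    x_d, M = 0.0, np.zeros((2, 2))
    for step in range(int((t0 + T_w) / dt)):
        t = step * dt
        r = r_fun(t)
        x_d += dt * (-x_d + r)          # reference model (8.5) with a_d = b_d = 1
        if t >= t0:                     # accumulate over the window [t0, t0 + T_w]
            v = np.array([x_d, r])
            M += np.outer(v, v) * dt
    return np.linalg.eigvalsh(M)[0]     # a valid beta in (8.13) if positive

print("constant r:   min eig =", pe_min_eig(lambda t: 1.0))        # ~ 0: not PE
print("sinusoidal r: min eig =", pe_min_eig(lambda t: np.sin(t)))  # > 0: PE
```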

Exercise 8.1 (Extension to Nonlinear Systems) Design a control law and an adaptation law for the following system \[ \dot{x} = - a x - c f(x) + b u \] with unknown true parameters \((a,b,c)\) (assume the sign of \(b\) is known) and known nonlinearity \(f(x)\) to track a reference trajectory \(r(t)\). Analyze the convergence of tracking error and parameter estimation error.

8.1.2 High-Order Systems

Consider an \(n\)-th order nonlinear system \[\begin{equation} q^{(n)} + \sum_{i=1}^n \alpha_i f_i(x,t) = bu \tag{8.14} \end{equation}\] where \(x=[q,\dot{q},\ddot{q},\dots,q^{(n-1)}]^T\) is the state of the system, \(f_i\)’s are known nonlinearities, \((\alpha_1,\dots,\alpha_n,b)\) are unknown parameters of the system (with \(\mathrm{sgn}(b)\) known).

The goal of adaptive control is to make the trajectory of the system (8.14) follow a desired trajectory \(q_d(t)\) despite not knowing the true parameters.

To facilitate the derivation of the adaptive controller, let us divide both sides of (8.14) by \(b\) \[\begin{equation} h q^{(n)} + \sum_{i=1}^n a_i f_i(x,t) = u \tag{8.15} \end{equation}\] where \(h = 1 / b\) and \(a_i = \alpha_i / b\).

Control law. Recall that the choice of the control law is typically inspired by the control design if the true system parameters were known. We will borrow ideas from sliding control (Appendix G).

  • Known parameters. Let \(e = q(t) - q_d(t)\) be the tracking error, and define the following combined error \[ s = e^{(n-1)} + \lambda_{n-2} e^{(n-2)} + \dots + \lambda_0 e = \Delta(p) e \] where \(\Delta(p) = p^{n-1} + \lambda_{n-2} p^{n-2} + \dots + \lambda_0\) is a stable polynomial with user-chosen coefficients \(\lambda_0,\dots,\lambda_{n-2}\). The rationale for defining the combined error \(s\) is that the convergence of \(e\) to zero can be guaranteed by the convergence of \(s\) to zero (when \(\Delta(p)\) is stable). Note that \(s\) can be equivalently written as \[\begin{align} s & = (q^{(n-1)} - q_d^{(n-1)}) + \lambda_{n-2} e^{(n-2)} + \dots + \lambda_0 e \\ & = q^{(n-1)} - \underbrace{ \left( q_d^{(n-1)} - \lambda_{n-2} e^{(n-2)} - \dots - \lambda_0 e \right) }_{q_r^{(n-1)}}. \end{align}\] Now consider the control law \[\begin{equation} u = h q_r^{(n)} - ks + \sum_{i=1}^n a_i f_i(x,t) \tag{8.16} \end{equation}\] where \[ q_r^{(n)} = q_d^{(n)} - \lambda_{n-2} e^{(n-1)} - \dots - \lambda_0 \dot{e} \] and \(k\) is a design constant that has the same sign as \(h\). This choice of control, plugged into the system dynamics (8.15), leads to \[\begin{align} h q^{(n)} + \sum_{i=1}^n a_i f_i(x,t) = h q_r^{(n)} - ks + \sum_{i=1}^n a_i f_i(x,t) \Longleftrightarrow \\ h \left( q^{(n)} - q_r^{(n)} \right) + ks = 0 \Longleftrightarrow \\ h \dot{s} + ks = 0, \end{align}\] which guarantees the exponential convergence of \(s\) to zero (note that \(h\) and \(k\) have the same sign), and hence the convergence of \(e\) to zero.

  • Unknown parameters. Inspired by the control law with known parameters in (8.16), we design the adaptive control law as \[\begin{equation} u = \hat{h} q_r^{(n)} - ks + \sum_{i=1}^n \hat{a}_i f_i(x,t), \tag{8.17} \end{equation}\] where the time-varying gains \(\hat{h},\hat{a}_1,\dots,\hat{a}_n\) will be adjusted by an adaptation law.

Adaptation law. Inserting the adaptive control law (8.17) into the system dynamics (8.15), we obtain \[\begin{align} h \dot{s} + ks = \tilde{h} q_r^{(n)} + \sum_{i=1}^n \tilde{a}_i f_i (x,t) \Longleftrightarrow \\ s = \frac{1}{p + k/h} \frac{1}{h} \underbrace{ \left( \begin{bmatrix} \tilde{h} \\ \tilde{a}_1 \\ \vdots \\ \tilde{a}_n \end{bmatrix}^T \begin{bmatrix} q_r^{(n)} \\ f_1(x,t) \\ \vdots \\ f_n(x,t) \end{bmatrix} \right)}_{=:\phi^T v} \tag{8.18} \end{align}\] where \(\tilde{h} = \hat{h} - h\) and \(\tilde{a}_i = \hat{a}_i - a_i,i=1,\dots,n\). Again, (8.18) is in the familiar form of (8.1), which naturally leads to the following adaptation law with \(\gamma > 0\) a chosen constant \[\begin{equation} \dot{\phi} = \begin{bmatrix} \dot{\tilde{h}} \\ \dot{\tilde{a}}_1 \\ \vdots \\ \dot{\tilde{a}}_n \end{bmatrix} = - \mathrm{sgn}(h) \gamma s \begin{bmatrix} q_r^{(n)} \\ f_1(x,t) \\ \vdots \\ f_n(x,t) \end{bmatrix}. \tag{8.19} \end{equation}\]

Tracking and parameter convergence. With the following Lyapunov function \[\begin{equation} V(s,\phi) = |h| s^2 + \frac{1}{\gamma} \phi^T \phi, \quad \dot{V}(s,\phi) = -2|k| s^2, \tag{8.20} \end{equation}\] the global convergence of \(s\) to zero can be easily shown. For parameter convergence, it is easy to see that when \(v\) satisfies the persistent excitation condition, \(\phi\) converges to zero. (However, the relationship between the reference trajectory \(q_d(t)\) and the persistent excitation of \(v\) becomes nontrivial due to the nonlinearities \(f_i\).)
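To make the recipe concrete, here is a simulation sketch for a second-order instance (\(n = 2\)) of (8.14), with assumed nonlinearities \(f_1 = \dot{q}|\dot{q}|\), \(f_2 = \sin q\) and placeholder true parameters (none of these choices come from the text):

```python
import numpy as np

# Adaptive control of  qddot + alpha1*f1 + alpha2*f2 = b*u  (n = 2 case of (8.14)),
# using control law (8.17) and adaptation law (8.19).
dt, T = 1e-4, 40.0
alpha1, alpha2, b = 2.0, 1.0, 3.0        # unknown; only sgn(b) = sgn(h) is used
lam0, k, gamma = 2.0, 1.0, 5.0           # k shares the (known) sign of h = 1/b
q, qd = 1.0, 0.0                         # plant state (q, qdot)
h_hat, a1_hat, a2_hat = 0.0, 0.0, 0.0    # adjustable parameters

for step in range(int(T / dt)):
    t = step * dt
    q_des, qd_des, qdd_des = np.sin(t), np.cos(t), -np.sin(t)   # desired q_d(t)
    e, ed = q - q_des, qd - qd_des
    s = ed + lam0 * e                    # combined error, Delta(p) = p + lam0
    qr_dd = qdd_des - lam0 * ed          # q_r^(2)
    f1, f2 = qd * abs(qd), np.sin(q)
    u = h_hat * qr_dd - k * s + a1_hat * f1 + a2_hat * f2       # control law (8.17)
    # adaptation law (8.19); sgn(h) = sgn(b)
    h_hat  += dt * (-np.sign(b) * gamma * s * qr_dd)
    a1_hat += dt * (-np.sign(b) * gamma * s * f1)
    a2_hat += dt * (-np.sign(b) * gamma * s * f2)
    qdd = b * u - alpha1 * f1 - alpha2 * f2                     # plant (8.14)
    q, qd = q + dt * qd, qd + dt * qdd

print(f"|s(T)| = {abs(s):.2e}, |e(T)| = {abs(e):.2e}")          # both -> 0
```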

8.1.3 Robotic Manipulator

So far our focus has been on systems with a single input (\(u \in \mathbb{R}\)). In the following, we will show that similar techniques can be applied to adaptive control of systems with multiple inputs, in particular, trajectory control of a robotic manipulator.

Let \(q \in \mathbb{R}^n\) be the joint angles of a multi-link robotic arm, and \(\dot{q} \in \mathbb{R}^n\) be the joint velocities. The dynamics of a robotic manipulator reads \[\begin{equation} H(q) \ddot{q} + C(q,\dot{q})\dot{q} + g(q) = \tau, \tag{8.21} \end{equation}\] where \(H(q) \in \mathbb{S}^{n}_{++}\) is the manipulator inertia matrix (that is positive definite), \(C(q,\dot{q})\dot{q}\) is a vector of centripetal and Coriolis torques (with \(C(q,\dot{q}) \in \mathbb{R}^{n \times n}\)), and \(g(q)\) denotes gravitational torques.


Figure 8.1: Planar two-link manipulator

Example 8.1 (Planar Two-link Manipulator) The dynamics of a planar two-link manipulator in Fig. 8.1 is \[\begin{equation} \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix} \begin{bmatrix} \ddot{q}_1 \\ \ddot{q}_2 \end{bmatrix} + \begin{bmatrix} - h \dot{q}_2 & -h (\dot{q}_1 + \dot{q}_2) \\ h \dot{q}_1 & 0 \end{bmatrix} \begin{bmatrix} \dot{q}_1 \\ \dot{q}_2 \end{bmatrix} = \begin{bmatrix} \tau_1 \\ \tau_2 \end{bmatrix}, \tag{8.22} \end{equation}\] where \[\begin{align} H_{11} & = a_1 + 2 a_3 \cos q_2 + 2 a_4 \sin q_2 \\ H_{12} & = H_{21} = a_2 + a_3 \cos q_2 + a_4 \sin q_2 \\ H_{22} &= a_2 \\ h &= a_3 \sin q_2 - a_4 \cos q_2 \end{align}\] with \[\begin{align} a_1 &= I_1 + m_1 l_{c1}^2 + I_e + m_e l_{ce}^2 + m_e l_1^2 \\ a_2 &= I_e + m_e l_{ce}^2 \\ a_3 &= m_e l_1 l_{ce} \cos \delta_e \\ a_4 &= m_e l_1 l_{ce} \sin \delta_e. \end{align}\]
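The model (8.22) is straightforward to code up. The helpers below (with placeholder numeric values for \(a_1,\dots,a_4\)) transcribe \(H(q)\) and \(C(q,\dot{q})\) from Example 8.1 and numerically check the skew-symmetry property formalized in Lemma 8.2 below:

```python
import numpy as np

# H(q) and C(q, qdot) of the planar two-link arm, transcribed from (8.22).
def two_link_H(q, a):
    a1, a2, a3, a4 = a
    H11 = a1 + 2 * a3 * np.cos(q[1]) + 2 * a4 * np.sin(q[1])
    H12 = a2 + a3 * np.cos(q[1]) + a4 * np.sin(q[1])
    return np.array([[H11, H12], [H12, a2]])

def two_link_C(q, qd, a):
    h = a[2] * np.sin(q[1]) - a[3] * np.cos(q[1])
    return np.array([[-h * qd[1], -h * (qd[0] + qd[1])],
                     [ h * qd[0], 0.0]])

# Check (Lemma 8.2): Hdot - 2C should be skew-symmetric.
a = np.array([3.0, 1.0, 0.5, 0.2])            # placeholder parameters
q, qd = np.array([0.3, -0.7]), np.array([1.0, 2.0])
eps = 1e-6                                    # finite-difference Hdot along qdot
Hdot = (two_link_H(q + eps * qd, a) - two_link_H(q, a)) / eps
S = Hdot - 2 * two_link_C(q, qd, a)
print(np.round(S + S.T, 4))                   # ~ zero matrix
```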

As seen from the above example, the parameters \(a_i\) (which are nonlinear functions of the physical parameters such as mass and length) enter linearly in \(H\) and \(C\) (\(g(q)\) is ignored because the manipulator moves in a horizontal plane).

The goal of the control design is to have the manipulator track a desired trajectory \(q_d(t)\).

Known parameters. When the parameters are known, we follow the sliding control design framework. Let \(\tilde{q} = q(t) - q_d(t)\) be the tracking error, and define the combined error \[ s = \dot{\tilde{q}} + \Lambda \tilde{q} = \dot{q} - \underbrace{\left( \dot{q}_d - \Lambda \tilde{q} \right)}_{\dot{q}_r} \] where \(\Lambda \in \mathbb{S}^n_{++}\) is a user-chosen positive definite matrix (more generally, any \(\Lambda\) such that \(-\Lambda\) is Hurwitz works). In this case, \(s \rightarrow 0\) implies \(\tilde{q} \rightarrow 0\) as \(t \rightarrow \infty\). Choosing the control law (coming from feedback linearization, Appendix F) \[\begin{equation} \tau = H \ddot{q}_r - K_D s + C \dot{q} + g(q) \tag{8.23} \end{equation}\] with \(K_D \in \mathbb{S}^n_{++}\) positive definite leads to the closed-loop dynamics \[ H \dot{s} + K_D s = 0 \Longleftrightarrow \dot{s} = - H^{-1} K_D s. \] Because the matrix \(H^{-1} K_D\) is the product of two positive definite matrices (recall \(H\) is positive definite and so is \(H^{-1}\)), it has strictly positive real eigenvalues.¹ Hence, \(- H^{-1} K_D\) is Hurwitz and \(s\) is guaranteed to converge to zero.

Control law. A closer look at the controller (8.23) allows us to write it in the following form \[\begin{align} \tau &= H \ddot{q}_r + C(s + \dot{q}_r) + g(q) - K_D s \\ &= H \ddot{q}_r + C \dot{q}_r + g(q) + (C - K_D) s \\ &= Y (q,\dot{q},\dot{q}_r,\ddot{q}_r) a + (C - K_D) s \end{align}\] where \(a \in \mathbb{R}^m\) contains all the parameters and \(Y \in \mathbb{R}^{n \times m}\) is the matrix that collects all the coefficients of \(a\) in \(H \ddot{q}_r + C \dot{q}_r + g(q)\). As a result, we design the adaptive control law to be \[\begin{equation} \tau = Y \hat{a} - K_D s, \tag{8.24} \end{equation}\] with \(\hat{a}\) the time-varying parameter estimate that we wish to adapt. Note that here we have done something strange: the adaptive control law does not exactly follow the controller (8.23) in the known-parameter case.² We first separated \(s\) from \(\dot{q}\) and wrote \(Ya = H \ddot{q}_r + C \dot{q}_r + g\) instead of \(Ya = H \ddot{q}_r + C \dot{q} + g\); then we dropped the “\(C\)” matrix in front of \(s\) in the adaptive control law. The reason for doing this will soon become clear when we analyze the tracking convergence.

Adaptation law and tracking convergence. Recall that the key to adaptive control is to design a control law and an adaptation law such that global convergence of the tracking error \(s\) can be guaranteed by a Lyapunov function. Looking at the previous Lyapunov functions in (8.11) and (8.20), we see that they both contain a positive definite term in the tracking error \(s\) (or \(e\) in first-order systems) and another positive definite term in the parameter error. This hints that we may try a Lyapunov candidate function of the following form \[\begin{equation} V = \frac{1}{2} \left( s^T H s + \tilde{a}^T \Gamma^{-1} \tilde{a} \right), \tag{8.25} \end{equation}\] where \(\Gamma \in \mathbb{S}^m_{++}\) is a constant positive definite matrix, and \(\tilde{a} = \hat{a} - a\) is the parameter error.

The next step would be to derive the time derivative of \(V\), which, as we can expect, will contain a term that involves \(\dot{H}\) and complicates our analysis. Fortunately, the following lemma will help us.

Lemma 8.2 For the manipulator dynamics (8.21), there exists a way to define \(C\) such that \(\dot{H} - 2C\) is skew-symmetric.

Proof. See Section 9.1, pages 399–402 in (Slotine, Li, et al. 1991). You should also check that this holds for the planar two-link manipulator dynamics in Example 8.1.

With Lemma 8.2, the time derivative of \(V\) in (8.25) reads \[\begin{align} \dot{V} & = s^T H \dot{s} + \frac{1}{2} s^T \dot{H} s + \tilde{a}^T \Gamma^{-1} \dot{\tilde{a}} \\ &= s^T (H \ddot{q} - H \ddot{q}_r) + \frac{1}{2} s^T \dot{H} s + \tilde{a}^T \Gamma^{-1} \dot{\tilde{a}} \\ &= s^T (\tau - C \dot{q} - g - H \ddot{q}_r ) + \frac{1}{2} s^T \dot{H} s + \tilde{a}^T \Gamma^{-1} \dot{\tilde{a}} \tag{8.26} \\ &= s^T (\tau - H \ddot{q}_r - C (s + \dot{q}_r) - g) + \frac{1}{2} s^T \dot{H} s + \tilde{a}^T \Gamma^{-1} \dot{\tilde{a}} \\ &= s^T (\tau - H \ddot{q}_r - C \dot{q}_r - g) + \frac{1}{2} s^T (\dot{H}- 2C)s + \tilde{a}^T \Gamma^{-1} \dot{\tilde{a}} \tag{8.27}\\ &= s^T (\tau - H \ddot{q}_r - C \dot{q}_r - g) + \tilde{a}^T \Gamma^{-1} \dot{\tilde{a}} \\ &=s^T(Y\hat{a} - K_D s - Ya) + \tilde{a}^T \Gamma^{-1} \dot{\tilde{a}} \tag{8.28}\\ &=s^T Y \tilde{a} + \tilde{a}^T \Gamma^{-1} \dot{\tilde{a}} - s^T K_D s, \tag{8.29} \end{align}\] where we used the manipulator dynamics (8.21) to rewrite \(H\ddot{q}\) in (8.26), used the skew-symmetry of \(\dot{H} - 2C\) in (8.27), and invoked the adaptive control law (8.24) together with \(Ya = H \ddot{q}_r + C \dot{q}_r + g(q)\) in (8.28). The derivation above explains why the choice of the control law in (8.24) did not exactly follow its counterpart when the parameters are known: we need \(s^T Cs\) to cancel \(\frac{1}{2} s^T \dot{H} s\) in (8.27).

Can we design \(\dot{\tilde{a}}\) such that \(\dot{V}\) in (8.29) is negative semidefinite? This turns out to be straightforward with the adaptation law \[\begin{equation} \dot{\tilde{a}} = -\Gamma Y^T s \tag{8.30} \end{equation}\] (equivalently, \(\dot{\hat{a}} = -\Gamma Y^T s\), since \(a\) is constant), which makes \(s^T Y \tilde{a} + \tilde{a}^T \Gamma^{-1} \dot{\tilde{a}}\) vanish and so \[ \dot{V} = - s^T K_D s \leq 0. \]

We are not done yet. To show \(s\) converges to zero (which is implied by \(\dot{V}\) converging to zero), by Barbalat’s stability certificate (Theorem 5.7), it suffices to show \[ \ddot{V} = -2 s^T K_D \dot{s} \] is bounded. We already know \(s\) and \(\tilde{a}\) are bounded, due to the fact that \(V\) in (8.25) is nonincreasing and hence bounded. Moreover, since \(s = \dot{\tilde{q}} + \Lambda \tilde{q}\) defines a stable filter from \(s\) to \(\tilde{q}\), boundedness of \(s\) implies \(\tilde{q}\) and \(\dot{\tilde{q}}\) are bounded; assuming the desired trajectory and its derivatives are bounded, \(q\) and \(\dot{q}\), and hence \(C\) and \(Y\), are bounded. Therefore, we only need to show \(\dot{s}\) is bounded. To do so, we plug the adaptive control law (8.24) into the manipulator dynamics (8.21) and obtain \[ H \dot{s} + (C + K_D) s = Y\tilde{a}, \] which implies the boundedness of \(\dot{s}\) (note that \(H\) is uniformly positive definite, i.e., \(H \succeq \alpha I\) for some \(\alpha > 0\)). This concludes the analysis of the tracking convergence \(s \rightarrow 0\) as \(t \rightarrow \infty\).
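Putting the pieces together, one step of the controller can be sketched as follows; the regressor function Y_fun, the dimensions, and all gain matrices are assumptions to be supplied by the user:

```python
import numpy as np

# One step of the adaptive manipulator controller: control law (8.24)
# plus (Euler-discretized) adaptation law (8.30).
def adaptive_step(q, qdot, q_des, qd_des, qdd_des, a_hat,
                  Y_fun, Lam, K_D, Gamma, dt):
    q_tilde = q - q_des
    s = (qdot - qd_des) + Lam @ q_tilde       # combined error
    qdot_r = qd_des - Lam @ q_tilde
    qddot_r = qdd_des - Lam @ (qdot - qd_des)
    Y = Y_fun(q, qdot, qdot_r, qddot_r)       # n x m regressor
    tau = Y @ a_hat - K_D @ s                 # control law (8.24)
    a_hat = a_hat + dt * (-Gamma @ Y.T @ s)   # adaptation law (8.30)
    return tau, a_hat
```

For Example 8.1, a regressor can be read off by expanding \(H \ddot{q}_r + C \dot{q}_r\) linearly in \((a_1,\dots,a_4)\) (recall \(g = 0\) on the horizontal plane). The expansion below, continuing the snippet above, was done by hand for this sketch and should be verified against (8.22):

```python
def two_link_Y(q, qd, qdr, qddr):
    # Coefficients of (a1, a2, a3, a4) in H*qddot_r + C*qdot_r for (8.22).
    c2, s2 = np.cos(q[1]), np.sin(q[1])
    Y = np.zeros((2, 4))
    Y[0, 0] = qddr[0]
    Y[0, 1] = qddr[1]
    Y[0, 2] = 2*c2*qddr[0] + c2*qddr[1] - s2*qd[1]*qdr[0] - s2*(qd[0] + qd[1])*qdr[1]
    Y[0, 3] = 2*s2*qddr[0] + s2*qddr[1] + c2*qd[1]*qdr[0] + c2*(qd[0] + qd[1])*qdr[1]
    Y[1, 1] = qddr[0] + qddr[1]
    Y[1, 2] = c2*qddr[0] + s2*qd[0]*qdr[0]
    Y[1, 3] = s2*qddr[0] - c2*qd[0]*qdr[0]
    return Y
```

Passing two_link_Y as Y_fun to adaptive_step, together with positive definite Lam, K_D, and Gamma, yields the adaptive controller for the two-link arm.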

8.2 Certainty-Equivalent Adaptive Control

References

Slotine, Jean-Jacques E., Weiping Li, et al. 1991. Applied Nonlinear Control. Vol. 199. Englewood Cliffs, NJ: Prentice Hall.

  1. Consider two positive definite matrices \(A\) and \(B\), and let \(B = B^{1/2}B^{1/2}\). The product \(AB\) can be written as \(AB = A B^{1/2}B^{1/2} = B^{-1/2} (B^{1/2} A B^{1/2}) B^{1/2}\). Therefore \(AB\) is similar to the positive definite matrix \(B^{1/2} A B^{1/2}\), and hence has strictly positive real eigenvalues (even though \(AB\) itself need not be symmetric).

  2. In fact, one can show that the controller (8.24) with known parameters, i.e., \(\tau = Y a - K_D s\), also guarantees the convergence of \(s\) towards zero, though it is different from the feedback linearization controller (8.23). Try proving the convergence with a Lyapunov candidate \(V = \frac{1}{2}s^T H s\).