arXiv:1605.07834v1 [cs.RO] 25 May 2016

Dynamic analysis of simultaneous adaptation of force, impedance and trajectory

Y. Li and E. Burdet

The authors are with the Department of Bioengineering, Imperial College of Science, Technology and Medicine, London SW7 2AZ, UK.

When carrying out tasks in contact with the environment, humans have been found to concurrently adapt force, impedance and trajectory. Here we develop a robotic model of this human mechanism and analyse the underlying dynamics. Using Lyapunov theory, we derive a general adaptive controller for the interaction of a robot with an environment characterised solely by its stiffness and damping.

I. SYSTEM DYNAMICS

The dynamics of an n-degree-of-freedom (n-DOF) robot in the operational space are given by

M(q)\,\ddot{x} + C(q,\dot{q})\,\dot{x} + G(q) = u + f ,    (1)

where x is the position of the robot and q the vector of joint angles. M(q) denotes the inertia matrix, C(q,\dot{q})\,\dot{x} the Coriolis and centrifugal forces, and G(q) the gravitational force, all of which can be identified using e.g. nonlinear adaptive control [1]. u is the control input and f the interaction force. As in [2], the control input u is composed of two parts:

u = v + w ,    (2)

where v tracks the reference trajectory x_r by compensating for the robot's dynamics, i.e.

v = M(q)\,\ddot{x}_e + C(q,\dot{q})\,\dot{x}_e + G(q) - \Gamma\varepsilon ,    (3)

with

\dot{x}_e = \dot{x}_r - \alpha e , \quad e \equiv x - x_r , \quad \alpha > 0 ,    (4)

Γ a symmetric positive-definite matrix with minimal eigenvalue \lambda_{\min}(\Gamma) \geq \lambda_\Gamma > 0, and

\varepsilon \equiv \dot{e} + \alpha e    (5)

the tracking error. The second component w adapts impedance and force in order to compensate for the unknown interaction dynamics.

II. FORCE AND IMPEDANCE ADAPTATION

Suppose that the interaction force can be expanded as

f = F_0^* + K_S^*(x - x_0^*) + K_D^*\,\dot{x} ,    (6)

where the force F_0^*(t), stiffness K_S^*(t) and damping K_D^*(t) are feedforward components of the interaction force, x_0^*(t) is the rest position of the environment visco-elasticity, and all of these functions are unknown but periodic with period T:

F_0^*(t+T) = F_0^*(t) , \quad K_S^*(t+T) = K_S^*(t) ,    (7)

K_D^*(t+T) = K_D^*(t) , \quad x_0^*(t+T) = x_0^*(t) .    (8)

To simplify the analysis, we rewrite the interaction force as

f \equiv F^* + K_S^*\,x + K_D^*\,\dot{x} ,    (9)

where F^* \equiv F_0^* - K_S^*\,x_0^* is also periodic with T. The component w in Eq. (2) is then defined as

w = -F - K_S\,x - K_D\,\dot{x} ,    (10)

where K_S and K_D are stiffness and damping matrices, respectively, and F is the feedforward force. Substituting the control input u into Eq. (1), the closed-loop system dynamics become

M(q)\,\dot{\varepsilon} + C(q,\dot{q})\,\varepsilon + \Gamma\varepsilon = \tilde{F} + \tilde{K}_S\,x + \tilde{K}_D\,\dot{x} ,    (11)

\tilde{F} \equiv F^* - F , \quad \tilde{K}_S \equiv K_S^* - K_S , \quad \tilde{K}_D \equiv K_D^* - K_D .

This equation shows that the feedforward force F, stiffness K_S and damping K_D ensure contact stability by compensating for the interaction dynamics. The objective of force and impedance adaptation is therefore to minimise these residual errors, which can be done by minimising the cost function

J_c(t) \equiv \frac{1}{2}\int_{t-T}^{t} \tilde{F}^T Q_F^{-1}\tilde{F} + \mathrm{vec}^T(\tilde{K}_S)\,Q_S^{-1}\mathrm{vec}(\tilde{K}_S) + \mathrm{vec}^T(\tilde{K}_D)\,Q_D^{-1}\mathrm{vec}(\tilde{K}_D)\; d\tau ,    (12)

where Q_F, Q_S and Q_D are symmetric positive-definite matrices and vec(·) stands for the column vectorisation operation. This objective is achieved through the following update laws:

\delta F(t) \equiv F(t) - F(t-T) = Q_F[\varepsilon(t) - \beta(t)F(t)] ,    (13)

\delta K_S(t) \equiv K_S(t) - K_S(t-T) = Q_S[\varepsilon(t)\,x(t)^T - \beta(t)K_S(t)] ,

\delta K_D(t) \equiv K_D(t) - K_D(t-T) = Q_D[\varepsilon(t)\,\dot{x}(t)^T - \beta(t)K_D(t)] ,

where F, K_S and K_D are initialised as zero vectors/matrices of appropriate dimensions for t ∈ [0, T).
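To make the periodic structure of these update laws concrete, the following Python sketch stores one value of F, K_S and K_D per sample of the period T and applies Eq. (13) together with the controller component w of Eq. (10). It is only an illustration under our own assumptions: the class and variable names are hypothetical, the period is assumed to be sampled on a fixed grid of N instants, β is taken constant, and the right-hand side of Eq. (13) is evaluated with the values stored from the previous period, which turns the implicit update into an explicit one.

import numpy as np

class ForceImpedanceAdapter:
    """Sketch of the periodic update laws of Eq. (13) and the controller
    component w of Eq. (10). One slot of F, K_S, K_D is kept per sample of
    the period T; all slots start at zero, as required for t in [0, T)."""

    def __init__(self, n, N, Q_F, Q_S, Q_D, beta):
        self.N = N                      # samples per period T (assumed fixed grid)
        self.Q_F, self.Q_S, self.Q_D = Q_F, Q_S, Q_D   # gain matrices of Eq. (13)
        self.beta = beta                # forgetting factor beta(t), here constant
        self.F = np.zeros((N, n))       # feedforward force, one slot per sample
        self.K_S = np.zeros((N, n, n))  # stiffness
        self.K_D = np.zeros((N, n, n))  # damping

    def control(self, k, x, x_dot):
        """w = -F - K_S x - K_D x_dot, Eq. (10), at sample k of the period."""
        i = k % self.N
        return -(self.F[i] + self.K_S[i] @ x + self.K_D[i] @ x_dot)

    def update(self, k, eps, x, x_dot):
        """Eq. (13): value(t) = value(t - T) + delta, applied slot-wise."""
        i = k % self.N
        self.F[i] += self.Q_F @ (eps - self.beta * self.F[i])
        self.K_S[i] += self.Q_S @ (np.outer(eps, x) - self.beta * self.K_S[i])
        self.K_D[i] += self.Q_D @ (np.outer(eps, x_dot) - self.beta * self.K_D[i])

At each control sample, the tracking error ε of Eq. (5) and the measured x, ẋ drive update(), while control() returns the w applied at the same phase of the period.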
Now that the interaction dynamics have been dealt with, stable trajectory control can be obtained by minimising the cost function

J_e(t) \equiv \frac{1}{2}\,\varepsilon(t)^T M(q)\,\varepsilon(t) .    (14)

Consequently, we use a combined cost function J_ce \equiv J_c + J_e, which yields concurrent minimisation of tracking error and control effort.

III. TRAJECTORY ADAPTATION

In a typical interaction task, the contact between the robot and the environment is maintained through a desired interaction force F_d. Assuming that there exists a desired trajectory x_d yielding F_d, i.e. from Eq. (6)

F_d = F_0^* + K_S^*(x_d - x_0^*) + K_D^*\,\dot{x}_d = F^* + K_S^*\,x_d + K_D^*\,\dot{x}_d , \quad F^* = F_0^* - K_S^*\,x_0^* ,    (15)

we propose to adapt the reference x_r in order to track x_d. However, x_d is unknown, as the parameters F^*, K_S^* and K_D^* of the interaction force are unknown. Nevertheless, we know that x_d is periodic with T, since F^*, K_S^* and K_D^* are periodic with T and we also set F_d to be periodic with T. In the following, we develop an update law to learn the desired trajectory x_d. First, we define

\xi_d \equiv K_S^*\,x_d + K_D^*\,\dot{x}_d , \quad \xi_r \equiv K_S\,x_r + K_D\,\dot{x}_r .    (16)

Then, we introduce the update law

\delta\xi_r(t) \equiv \xi_r(t) - \xi_r(t-T) = L^{-T} Q_r\,(F_d(t) - F(t) - \xi_r(t)) ,    (17)

where Q_r and L are positive-definite constant gain matrices. This update law minimises the error between ξ_d and ξ_r, which is described by the cost function

J_r \equiv \frac{1}{2}\int_{t-T}^{t} (\xi_r - \xi_d)^T Q_r^T (\xi_r - \xi_d)\; d\tau .    (18)

Because the adaptation of force and impedance is coupled with trajectory adaptation, we modify the adaptation of the feedforward force in Eq. (13) to

\delta F(t) \equiv Q_F[\varepsilon(t) - \beta(t)F(t) + Q_r^T \delta\xi_r(t)] .    (19)

As a result, the update laws of Eqs. (17) and (19) minimise the overall cost J = J_c + J_e + J_r, as shown in Appendix A. We then obtain the update law for trajectory adaptation,

\delta x_r \equiv x_r(t) - x_r(t-T) ,    (20)

by solving

\delta\xi_r = K_S\,\delta x_r + K_D\,\delta\dot{x}_r = K_S\,\delta x_r + K_D\,\frac{d}{dt}(\delta x_r) ,    (21)

using δξ_r(t) from Eq. (17). According to the convergence of δξ_r, K_S and K_D shown in Appendix A, x_r converges, as by Eq. (21)

\delta\xi_r = K_S\,\delta x_r + K_D\,\delta\dot{x}_r .    (22)

Upon convergence, the desired interaction force F_d is maintained between the robot and the environment according to Eq. (17). At the same time, the properties of force and impedance adaptation, namely trajectory tracking and control effort minimisation, are preserved. However, from the analysis in Appendix A we cannot conclude that F, K_S, K_D and x_r converge to F^*, K_S^*, K_D^* and x_d, respectively; this would require a condition of persistent excitation (PE), as in classical adaptive control theory [3].

IV. DISCUSSION

A. No contact

In the special case where the environment applies no force and F_d is also zero, the controller component w converges to zero. According to the update law of Eq. (17), the reference trajectory then does not adapt, as expected.

B. No damping

If we neglect the damping component in the interaction force f of Eq. (9), the trajectory adaptation described by Eqs. (17) and (21) simplifies to

\delta x_r = L^{-T} Q_r\,(F_d - F - K_S\,x_r) .    (23)

Correspondingly, the update laws for force and impedance of Eq. (13) need to be modified to

\delta F \equiv Q_F(\varepsilon - \beta F + Q_r^T \delta x_r) ,    (24)

\delta K_S \equiv Q_S(\varepsilon\,x^T - \beta K_S + Q_r^T \delta x_r\,x_r^T) .

The stability analysis is similar to the case with damping and is briefly explained in Appendix B.
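The damping-free simplification can be illustrated with a short numerical sketch of Eqs. (23) and (24). This is only a sketch under our own assumptions: the function name and arguments are hypothetical, β is taken constant, the right-hand sides are evaluated with the values of the previous period, and the coupling term of δK_S is implemented as the outer product (Q_r^T δx_r) x_r^T, consistent with the dimensions of K_S.

import numpy as np

def no_damping_update(x_r, F, K_S, eps, x, F_d, L, Q_r, Q_F, Q_S, beta):
    """One periodic update of the damping-free case, Eqs. (23)-(24).
    Each delta is the difference between consecutive periods; the returned
    values are those to be used one period T later."""
    # Eq. (23): delta x_r = L^{-T} Q_r (F_d - F - K_S x_r)
    dx_r = np.linalg.solve(L.T, Q_r @ (F_d - F - K_S @ x_r))
    # Eq. (24): force and stiffness updates, coupled to the trajectory update
    dF = Q_F @ (eps - beta * F + Q_r.T @ dx_r)
    dK_S = Q_S @ (np.outer(eps, x) - beta * K_S + np.outer(Q_r.T @ dx_r, x_r))
    return x_r + dx_r, F + dF, K_S + dK_S

As for Eq. (13), these updates would be applied once per period T at each phase of the trajectory.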
C. Force sensing

As in [2], force sensing is not required in the proposed framework, in contrast to traditional methods for surface following, where force feedback is used to regulate the interaction force [4]. In particular, in a first phase, force and impedance adaptation is used to compensate for the interaction force from the environment. During this process, the unknown interaction force is estimated as the tracking error ε goes to zero, as can be seen from Eq. (11): when ε = 0, we have

w = -f .    (25)

Using this estimated interaction force, a desired force as in Eq. (15) can then be rendered by adapting the reference trajectory x_r.

In this sense, it is important to note that trajectory adaptation should be conducted only once force and impedance adaptation has taken effect, which guarantees compensation of the interaction force and tracking of the current reference trajectory. Nevertheless, as shown in the above stability analysis, adaptation of force, impedance and trajectory can be realised simultaneously. This also suggests that a force sensor should be used if available, as force and impedance adaptation could then be replaced by force feedback. In this way, trajectory adaptation would not depend on the force estimation process and could in principle proceed faster than when force and impedance adaptation is needed. However, the potential advantages of a force sensor depend on the quality of the signal it provides, its cost, and the complexity of its installation and use.
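As an illustration of how the trajectory update of Eqs. (17), (20) and (21) could be realised in discrete time, the sketch below solves Eq. (21) for δx_r using a backward-Euler approximation of d/dt(δx_r), and gates the update on a small tracking error to reflect the phased use discussed above. The discretisation, the gating threshold and the function name are our own assumptions, not prescriptions of the paper.

import numpy as np

def trajectory_update(xi_r, F, F_d, K_S, K_D, dxr_prev, eps,
                      L, Q_r, dt, eps_tol=1e-2):
    """One sample of reference-trajectory adaptation.

    Eq. (17): delta xi_r = L^{-T} Q_r (F_d - F - xi_r)
    Eq. (21): delta xi_r = K_S delta x_r + K_D d/dt(delta x_r), solved here
    with d/dt(delta x_r) ~ (delta x_r - dxr_prev) / dt (backward Euler,
    an assumption of this sketch)."""
    if np.linalg.norm(eps) > eps_tol:
        # wait until force and impedance adaptation has taken effect
        return np.zeros_like(xi_r), np.zeros_like(dxr_prev)
    d_xi_r = np.linalg.solve(L.T, Q_r @ (F_d - F - xi_r))
    A = K_S + K_D / dt
    b = d_xi_r + (K_D / dt) @ dxr_prev
    dx_r = np.linalg.solve(A, b)        # delta x_r(t) = x_r(t) - x_r(t - T)
    return d_xi_r, dx_r

The new reference at this phase of the next period is then x_r(t) = x_r(t − T) + δx_r.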
V. APPENDIX

A. Proof of minimisation of the overall cost J

Considering the definition of J_r in Eq. (18), we have

\delta J_r(t) \equiv J_r(t) - J_r(t-T)

= \frac{1}{2}\int_{t-T}^{t} [\xi_r(\tau) - \xi_d(\tau)]^T Q_r^T [\xi_r(\tau) - \xi_d(\tau)]\, d\tau - \frac{1}{2}\int_{t-T}^{t} [\xi_r(\tau) - \xi_d(\tau)]^T Q_r^T [\xi_r(\tau-T) - \xi_d(\tau-T)]\, d\tau

+ \frac{1}{2}\int_{t-T}^{t} [\xi_r(\tau) - \xi_d(\tau)]^T Q_r^T [\xi_r(\tau-T) - \xi_d(\tau-T)]\, d\tau - \frac{1}{2}\int_{t-T}^{t} [\xi_r(\tau-T) - \xi_d(\tau-T)]^T Q_r^T [\xi_r(\tau-T) - \xi_d(\tau-T)]\, d\tau

= \frac{1}{2}\int_{t-T}^{t} [\xi_r(\tau) - \xi_d(\tau)]^T Q_r^T \delta\xi_r(\tau)\, d\tau + \frac{1}{2}\int_{t-T}^{t} [\xi_r(\tau-T) - \xi_d(\tau-T)]^T Q_r^T \delta\xi_r(\tau)\, d\tau

= \int_{t-T}^{t} [\xi_r - \xi_d - \tfrac{1}{2}\delta\xi_r]^T Q_r^T \delta\xi_r\, d\tau \quad (\text{as } \xi_d(\tau) = \xi_d(\tau-T))

\leq \int_{t-T}^{t} [Q_r(\xi_r(\tau) - \xi_d(\tau))]^T \delta\xi_r(\tau)\, d\tau .    (26)

According to Eqs. (15) to (17), we rewrite this inequality as

\delta J_r \leq \int_{t-T}^{t} [Q_r(\xi_r - F_d + F + \tilde{F})]^T \delta\xi_r\, d\tau = \int_{t-T}^{t} (-L^T \delta\xi_r + Q_r\tilde{F})^T \delta\xi_r\, d\tau .    (27)

Consider the difference of J_c over two consecutive periods:

\delta J_c \equiv J_c(t) - J_c(t-T) = \frac{1}{2}\int_{t-T}^{t} \Big[ \tilde{F}^T(\tau) Q_F^{-1}\tilde{F}(\tau) - \tilde{F}^T(\tau-T) Q_F^{-1}\tilde{F}(\tau-T) + \mathrm{tr}\big(\tilde{K}_S^T(\tau) Q_S^{-1}\tilde{K}_S(\tau) - \tilde{K}_S^T(\tau-T) Q_S^{-1}\tilde{K}_S(\tau-T)\big) + \mathrm{tr}\big(\tilde{K}_D^T(\tau) Q_D^{-1}\tilde{K}_D(\tau) - \tilde{K}_D^T(\tau-T) Q_D^{-1}\tilde{K}_D(\tau-T)\big) \Big]\, d\tau ,    (28)

where tr(·) stands for the trace of a matrix. We consider that

\tilde{F}^T(\tau) Q_F^{-1}\tilde{F}(\tau) - \tilde{F}^T(\tau-T) Q_F^{-1}\tilde{F}(\tau-T)

= [\tilde{F}^T(\tau) Q_F^{-1}\tilde{F}(\tau) - \tilde{F}^T(\tau) Q_F^{-1}\tilde{F}(\tau-T)] + [\tilde{F}^T(\tau) Q_F^{-1}\tilde{F}(\tau-T) - \tilde{F}^T(\tau-T) Q_F^{-1}\tilde{F}(\tau-T)]

= -\tilde{F}^T(\tau) Q_F^{-1}\delta F(\tau) - \tilde{F}^T(\tau-T) Q_F^{-1}\delta F(\tau) = -[2\tilde{F}(\tau) + \delta F(\tau)]^T Q_F^{-1}\delta F(\tau)

\leq -2\tilde{F}^T(\tau) Q_F^{-1}\delta F(\tau) = -2\tilde{F}^T(\tau)[\varepsilon(\tau) - \beta(\tau)F(\tau) + Q_r^T\delta\xi_r(\tau)] .    (29)

Then, similarly, we have

\mathrm{tr}[\tilde{K}_S^T(\tau) Q_S^{-1}\tilde{K}_S(\tau) - \tilde{K}_S^T(\tau-T) Q_S^{-1}\tilde{K}_S(\tau-T)] \leq -2\,\mathrm{tr}\{\tilde{K}_S^T(\tau)[\varepsilon(\tau)\,x^T(\tau) - \beta(\tau)K_S(\tau)]\} ,

\mathrm{tr}[\tilde{K}_D^T(\tau) Q_D^{-1}\tilde{K}_D(\tau) - \tilde{K}_D^T(\tau-T) Q_D^{-1}\tilde{K}_D(\tau-T)] \leq -2\,\mathrm{tr}\{\tilde{K}_D^T(\tau)[\varepsilon(\tau)\,\dot{x}^T(\tau) - \beta(\tau)K_D(\tau)]\} .    (30)

Substituting Ineqs. (29) and (30) into Eq. (28) and considering Ineq. (27) yields

\delta J_r + \delta J_c \leq \int_{t-T}^{t} -\delta\xi_r^T L\,\delta\xi_r - \tilde{F}^T(\varepsilon - \beta F) - \mathrm{tr}[\tilde{K}_S^T(\varepsilon x^T - \beta K_S)] - \mathrm{tr}[\tilde{K}_D^T(\varepsilon\dot{x}^T - \beta K_D)]\, d\tau .    (31)

It remains to deal with the residual terms in this inequality, similarly to [2]. For completeness, we outline the argument in the following. In particular, we consider the time derivative of J_e,

\dot{J}_e = \varepsilon^T M(q)\,\dot{\varepsilon} + \tfrac{1}{2}\varepsilon^T \dot{M}(q)\,\varepsilon = \varepsilon^T M(q)\,\dot{\varepsilon} + \varepsilon^T C(q,\dot{q})\,\varepsilon ,    (32)

since [5]

z^T \dot{M}(q)\,z \equiv 2\,z^T C(q,\dot{q})\,z \quad \forall z .    (33)

Considering the closed-loop dynamics of Eq. (11), the above equation can be written as

\dot{J}_e(t) = \varepsilon^T(\tilde{F} + \tilde{K}_S\,x + \tilde{K}_D\,\dot{x} - \Gamma\varepsilon) .    (34)

Integrating \dot{J}_e from t-T to t and considering Ineq. (31), we obtain

\delta J = \delta J_c + \delta J_r + \delta J_e \leq \int_{t-T}^{t} -\varepsilon^T\Gamma\varepsilon - \delta\xi_r^T L\,\delta\xi_r + \beta[\tilde{F}^T F + \mathrm{tr}(\tilde{K}_S^T K_S + \tilde{K}_D^T K_D)]\, d\tau

= \int_{t-T}^{t} -\varepsilon^T\Gamma\varepsilon - \delta\xi_r^T L\,\delta\xi_r - \beta[\tilde{F}^T\tilde{F} + \mathrm{tr}(\tilde{K}_S^T\tilde{K}_S + \tilde{K}_D^T\tilde{K}_D)] + \beta[\tilde{F}^T F^* + \mathrm{tr}(\tilde{K}_S^T K_S^* + \tilde{K}_D^T K_D^*)]\, d\tau .    (35)

A sufficient condition for \delta J \leq 0 is

\lambda_\Gamma\|\varepsilon\|^2 + \lambda_L\|\delta\xi_r\|^2 + \beta(\|\tilde{F}\|^2 + \|\tilde{K}_S\|^2 + \|\tilde{K}_D\|^2) - \beta(\|\tilde{F}\|\,\|F^*\| + \|\tilde{K}_S\|\,\|K_S^*\| + \|\tilde{K}_D\|\,\|K_D^*\|) \geq 0 ,    (36)

where λ_Γ and λ_L are the minimal eigenvalues of Γ and L, respectively. Therefore, ‖ε‖, ‖δξ_r‖, ‖F̃‖, ‖K̃_S‖ and ‖K̃_D‖ are bounded. In particular, they satisfy

\lambda_\Gamma\|\varepsilon\|^2 + \lambda_L\|\delta\xi_r\|^2 + \tfrac{\beta}{2}(\|\tilde{F}\|^2 + \|\tilde{K}_S\|^2 + \|\tilde{K}_D\|^2) \leq \tfrac{\beta}{2}(\|F^*\|^2 + \|K_S^*\|^2 + \|K_D^*\|^2) .    (37)

By choosing large λ_Γ and λ_L, ‖ε‖ and ‖δξ_r‖ can be made small.

B. Proof of minimisation of the overall cost when neglecting damping

Consider the cost function

J'_r \equiv \frac{1}{2}\int_{t-T}^{t} (x_r - x_d)^T K_S^{*T} Q_r^T (x_r - x_d)\, d\tau .    (38)

Following procedures similar to Ineqs. (26) and (27), we obtain

\delta J'_r \leq \int_{t-T}^{t} [-L^T\delta x_r + Q_r(\tilde{F} + \tilde{K}_S\,x_r)]^T \delta x_r\, d\tau .    (39)

Considering further the cost function

J'_c \equiv \frac{1}{2}\int_{t-T}^{t} \tilde{F}^T Q_F^{-1}\tilde{F} + \mathrm{vec}^T(\tilde{K}_S)\,Q_S^{-1}\mathrm{vec}(\tilde{K}_S)\, d\tau    (40)

and following procedures similar to Ineqs. (28) to (31), we obtain

\delta J'_r + \delta J'_c \leq \int_{t-T}^{t} -\delta x_r^T L\,\delta x_r - \tilde{F}^T(\varepsilon - \beta F) - \mathrm{tr}[\tilde{K}_S^T(\varepsilon\,x^T - \beta K_S)]\, d\tau .    (41)

The rest is similar to the case with damping and is thus omitted.

REFERENCES

[1] E. Burdet, A. Codourey and L. Rey (1998), Experimental evaluation of nonlinear adaptive controllers. IEEE Control Systems Magazine 18(2): 39-47.

[2] C. Yang, G. Ganesh, S. Haddadin, S. Parusel, A. Albu-Schaeffer and E. Burdet (2011), Human-like adaptation of force and impedance in stable and unstable interactions. IEEE Transactions on Robotics 27(5): 918-30.

[3] K. J. Astrom and B. Wittenmark (1995), Adaptive Control. Reading, MA: Addison-Wesley.

[4] S. Jung, T. C. Hsia and R. G. Bonitz (2001), Force tracking impedance control for robot manipulators with an unknown environment: theory, simulation, and experiment. The International Journal of Robotics Research 20(9): 765-74.

[5] C. Canudas de Wit, B. Siciliano and G. Bastin (1996), Theory of Robot Control. Springer.