arXiv:1605.07834v1 [cs.RO] 25 May 2016

Dynamic analysis of simultaneous adaptation of force, impedance and trajectory

Y. Li and E. Burdet

The authors are with the Department of Bioengineering, Imperial College of Science, Technology and Medicine, London SW7 2AZ, UK.

When carrying out tasks in contact with the environment, humans are found to concurrently adapt force, impedance and trajectory. Here we develop a robotic model of this mechanism in humans and analyse the underlying dynamics. We derive a general adaptive controller for the interaction of a robot with an environment solely characterised by its stiffness and damping, using Lyapunov theory.

I. SYSTEM DYNAMICS

The dynamics of an $n$-degree-of-freedom ($n$-DOF) robot in the operational space are given by

$$ M(q)\ddot{x} + C(q,\dot{q})\dot{x} + G(q) = u + f \tag{1} $$

where $x$ is the position of the robot and $q$ the vector of joint angles. $M(q)$ denotes the inertia matrix, $C(q,\dot{q})\dot{x}$ the Coriolis and centrifugal forces, and $G(q)$ the gravitational force, which can be identified using e.g. nonlinear adaptive control [1]. $u$ is the control input and $f$ the interaction force.

In [2], we have described the control input $u$ in two parts:

$$ u = v + w , \tag{2} $$

where $v$ tracks the reference trajectory $x_r$ by compensating for the robot's dynamics, i.e.

$$ v = M(q)\ddot{x}_e + C(q,\dot{q})\dot{x}_e + G(q) - \Gamma\varepsilon \tag{3} $$

where

$$ \dot{x}_e = \dot{x}_r - \alpha e , \quad e \equiv x - x_r , \quad \alpha > 0 , \tag{4} $$

$\Gamma$ is a symmetric positive-definite matrix with minimal eigenvalue $\lambda_{\min}(\Gamma) > \lambda_\Gamma > 0$, and

$$ \varepsilon \equiv \dot{e} + \alpha e \tag{5} $$

is the tracking error. The term $w$ adapts impedance and force in order to compensate for the unknown interaction dynamics.

II. FORCE AND IMPEDANCE ADAPTATION

Suppose that the interaction force can be expanded as

$$ f = F_0^* + K_S^* (x - x_0^*) + K_D^* \dot{x} , \tag{6} $$

where the force $F_0^*(t)$, stiffness $K_S^*(t)$ and damping $K_D^*(t)$ are feedforward components of the interaction force, and $x_0^*(t)$ is the rest position of the environment visco-elasticity. All of these functions are unknown but periodic with period $T$:

$$ F_0^*(t+T) \equiv F_0^*(t) , \quad K_S^*(t+T) \equiv K_S^*(t) , \tag{7} $$

$$ K_D^*(t+T) \equiv K_D^*(t) , \quad x_0^*(t+T) \equiv x_0^*(t) . \tag{8} $$

To simplify the analysis, we rewrite the interaction force as

$$ f = F^* + K_S^* x + K_D^* \dot{x} \tag{9} $$

where $F^* \equiv F_0^* - K_S^* x_0^*$ is also periodic with period $T$. The term $w$ in Eq. (2) is then defined as

$$ w = -F - K_S x - K_D \dot{x} \tag{10} $$

where $K_S$ and $K_D$ are stiffness and damping matrices, respectively, and $F$ is the feedforward force.

By substituting the control input $u$ into Eq. (1), and noting that $\dot{x} - \dot{x}_e = \varepsilon$, the closed-loop system dynamics are described by

$$ M(q)\dot{\varepsilon} + C(q,\dot{q})\varepsilon + \Gamma\varepsilon = \tilde{F} + \tilde{K}_S x + \tilde{K}_D \dot{x} , \tag{11} $$

$$ \tilde{F} \equiv F^* - F , \quad \tilde{K}_S \equiv K_S^* - K_S , \quad \tilde{K}_D \equiv K_D^* - K_D . $$

In this equation, we see that the feedforward force $F$, stiffness $K_S$ and damping $K_D$ ensure contact stability by compensating for the interaction dynamics. Therefore, the objective of force and impedance adaptation is to minimise the residual errors $\tilde{F}$, $\tilde{K}_S$ and $\tilde{K}_D$, which can be carried out by minimising the cost function

$$ J_c(t) \equiv \frac{1}{2}\int_{t-T}^{t} \left[ \tilde{F}^T Q_F^{-1}\tilde{F} + \mathrm{vec}^T(\tilde{K}_S)\, Q_S^{-1}\, \mathrm{vec}(\tilde{K}_S) + \mathrm{vec}^T(\tilde{K}_D)\, Q_D^{-1}\, \mathrm{vec}(\tilde{K}_D) \right] d\tau , \tag{12} $$

where $Q_F$, $Q_S$ and $Q_D$ are symmetric positive-definite matrices, and $\mathrm{vec}(\cdot)$ stands for the column vectorization operation.
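As an illustration, the control law of Eqs. (2) to (5) and (10) can be evaluated at a single time instant as in the following minimal sketch. The function name `control_input`, its argument names and the use of NumPy arrays are assumptions made here for illustration and are not part of the original formulation. The acceleration $\ddot{x}_e$ is obtained by differentiating Eq. (4), assuming a constant $\alpha$.

```python
import numpy as np

def control_input(M, C, G, x, dx, x_r, dx_r, ddx_r, F, K_S, K_D, Gamma, alpha):
    """Evaluate u = v + w at one time instant (hypothetical helper).

    M, C, K_S, K_D, Gamma are (n, n) arrays; all other vector arguments
    have shape (n,); alpha is a positive scalar assumed constant.
    """
    e = x - x_r                    # position error, Eq. (4)
    de = dx - dx_r                 # velocity error
    eps = de + alpha * e           # tracking error, Eq. (5)
    dx_e = dx_r - alpha * e        # auxiliary velocity, Eq. (4)
    ddx_e = ddx_r - alpha * de     # derivative of Eq. (4), constant alpha assumed
    v = M @ ddx_e + C @ dx_e + G - Gamma @ eps   # dynamics compensation, Eq. (3)
    w = -F - K_S @ x - K_D @ dx                  # force and impedance term, Eq. (10)
    return v + w, eps

# Example with a 2-DOF system and arbitrary illustrative values.
n = 2
u, eps = control_input(
    M=2.0 * np.eye(n), C=np.zeros((n, n)), G=np.zeros(n),
    x=np.array([1.0, 0.0]), dx=np.zeros(n),
    x_r=np.zeros(n), dx_r=np.zeros(n), ddx_r=np.zeros(n),
    F=np.zeros(n), K_S=3.0 * np.eye(n), K_D=np.zeros((n, n)),
    Gamma=np.eye(n), alpha=1.0,
)
print(u, eps)
```

In this sketch the adapted quantities $F$, $K_S$ and $K_D$ are passed in as plain arguments; how they are updated from one period to the next is a separate matter.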
This objective is achieved through the following update laws:

$$ \begin{aligned} \delta F(t) &\equiv F(t) - F(t-T) = Q_F\,[\varepsilon(t) - \beta(t) F(t)] \\ \delta K_S(t) &\equiv K_S(t) - K_S(t-T) = Q_S\,[\varepsilon(t)\, x(t)^T - \beta(t) K_S(t)] \\ \delta K_D(t) &\equiv K_D(t) - K_D(t-T) = Q_D\,[\varepsilon(t)\, \dot{x}(t)^T - \beta(t) K_D(t)] \end{aligned} \tag{13} $$

where $F$, $K_S$ and $K_D$ are initialised as zero vectors/matrices of appropriate dimensions for $t \in [0, T)$.

Now that we have dealt with the interaction dynamics, stable trajectory control can be obtained by minimising the cost function

$$ J_e(t) \equiv \frac{1}{2}\, \varepsilon(t)^T M(q)\, \varepsilon(t) . \tag{14} $$

Consequently, we use a combined cost function $J_{ce} \equiv J_c + J_e$ that yields concurrent minimisation of tracking error and control effort.

III. TRAJECTORY ADAPTATION

In a typical interaction task, the contact between the robot and the environment is maintained through a desired interaction force $F_d$. Assuming that there exists a desired trajectory $x_d$ yielding $F_d$, i.e. from Eq. (6)

$$ \begin{aligned} F_d &= F_0^* + K_S^* (x_d - x_0^*) + K_D^* \dot{x}_d \\ &= F^* + K_S^* x_d + K_D^* \dot{x}_d , \quad F^* = F_0^* - K_S^* x_0^* , \end{aligned} \tag{15} $$

we propose to adapt the reference $x_r$ in order to track $x_d$. However, $x_d$ is unknown, as the parameters $F^*$, $K_S^*$ and $K_D^*$ in the interaction force are unknown. Nevertheless, we know that $x_d$ is periodic with period $T$, as $F^*$, $K_S^*$ and $K_D^*$ are periodic with period $T$ and we also set $F_d$ to be periodic with period $T$.

In the following, we develop an update law to learn the desired trajectory $x_d$. First, we define

$$ \xi_d \equiv K_S^* x_d + K_D^* \dot{x}_d , \quad \xi_r \equiv K_S x_r + K_D \dot{x}_r . \tag{16} $$

Then, we develop the following update law

$$ \delta\xi_r(t) \equiv \xi_r(t) - \xi_r(t-T) = L^{-T} Q_r \left( F_d(t) - F(t) - \xi_r(t) \right) \tag{17} $$

where $Q_r$ and $L$ are positive-definite constant gain matrices. This update law minimises the error between $\xi_d$ and $\xi_r$, which is described by the following cost function

$$ J_r \equiv \frac{1}{2}\int_{t-T}^{t} (\xi_r - \xi_d)^T Q_r^T (\xi_r - \xi_d)\, d\tau . \tag{18} $$

Because force and impedance adaptation is coupled with trajectory adaptation, we modify the adaptation of the feedforward force in Eq. (13) to

$$ \delta F(t) = Q_F\,[\varepsilon(t) - \beta(t) F(t) + Q_r^T \delta\xi_r(t)] . \tag{19} $$

As a result, the update laws of Eqs. (17) and (19) minimise the overall cost $J = J_c + J_e + J_r$, as shown in Appendix A.

Then, we obtain the update law for trajectory adaptation

$$ \delta x_r \equiv x_r(t) - x_r(t-T) \tag{20} $$

by solving

$$ \delta\xi_r = K_S\, \delta x_r + K_D\, \delta\dot{x}_r = K_S\, \delta x_r + K_D \frac{d}{dt}(\delta x_r) \tag{21} $$

using $\delta\xi_r(t)$ from Eq. (17). According to the convergence of $\delta\xi_r$, $K_S$ and $K_D$ as shown in Appendix A, $x_r$ will converge, as

$$ \delta\xi_r - \xi_d = K_S\, \delta x_r + K_D\, \delta\dot{x}_r . \tag{22} $$

Upon convergence, the desired interaction force $F_d$ is maintained between the robot and the environment according to Eq. (17). At the same time, the properties of force and impedance adaptation, namely trajectory tracking and control effort minimisation, are preserved.

However, from the analysis in Appendix A, we cannot conclude that $F$, $K_S$, $K_D$ and $x_r$ converge to $F^*$, $K_S^*$, $K_D^*$ and $x_d$, respectively. This would require the condition of persistent excitation (PE), as in classical adaptive control theory [3].

IV. DISCUSSION

A. No contact

In the special case where no force is applied by the environment and $F_d$ is also zero, the controller component $w$ will converge to zero. According to the update law of Eq. (17), the reference trajectory will not adapt, as expected.

B. No damping

If we neglect the damping component in the interaction force $f$ of Eq. (9), the trajectory adaptation described by Eqs. (17) and (21) can be simplified to

$$ \delta x_r = L^{-T} Q_r \left( F_d - F - K_S x_r \right) . \tag{23} $$

Correspondingly, the update laws for force and impedance in Eq. (13) need to be modified as

$$ \begin{aligned} \delta F &= Q_F \left( \varepsilon - \beta F + Q_r^T \delta x_r \right) , \\ \delta K_S &= Q_S \left( \varepsilon x^T - \beta K_S + x_r^T Q_r^T \delta x_r \right) . \end{aligned} \tag{24} $$

The stability analysis is similar to the case with damping and is briefly explained in Appendix B.

C. Force sensing

As in [2], force sensing is not required in the proposed framework, in contrast to traditional methods for surface following where force feedback is used to regulate the interaction force [4].

In particular, in a first phase force and impedance adaptation is used to compensate for the interaction force from the environment. During this process, the unknown actual interaction force is estimated when the tracking error $\varepsilon$ goes to zero, as can be seen from Eq. (11): when $\varepsilon = 0$, we have

$$ w = -f . \tag{25} $$

Using this estimated interaction force, the desired force in Eq. (15) can then be rendered by adaptation of the reference trajectory $x_r$.

In this sense, it is important to note that trajectory adaptation should be conducted only once force and impedance adaptation has taken effect, which guarantees compensation of the interaction force and tracking of the current reference trajectory. Nevertheless, as shown in the stability analysis of Appendix A, adaptation of force, impedance and trajectory can be realised simultaneously.

This also suggests that a force sensor should be used if available, as force and impedance adaptation could then be replaced by force feedback. In this way, trajectory adaptation would not depend on the force estimation process and could in principle proceed faster than when force and impedance adaptation is needed. However, the potential advantages of a force sensor depend on the quality of the signal it can provide, its cost, and the complexity of its installation and use.

V. APPENDIX

A. Proof for minimisation of overall cost J

Considering the definition of $J_r$ in Eq. (18), we have

$$ \begin{aligned} \delta J_r(t) &\equiv J_r(t) - J_r(t-T) \\ &= \frac{1}{2}\int_{t-T}^{t} [\xi_r(\tau) - \xi_d(\tau)]^T Q_r^T [\xi_r(\tau) - \xi_d(\tau)]\, d\tau \\ &\quad - \frac{1}{2}\int_{t-T}^{t} [\xi_r(\tau) - \xi_d(\tau)]^T Q_r^T [\xi_r(\tau-T) - \xi_d(\tau-T)]\, d\tau \\ &\quad + \frac{1}{2}\int_{t-T}^{t} [\xi_r(\tau) - \xi_d(\tau)]^T Q_r^T [\xi_r(\tau-T) - \xi_d(\tau-T)]\, d\tau \\ &\quad - \frac{1}{2}\int_{t-T}^{t} [\xi_r(\tau-T) - \xi_d(\tau-T)]^T Q_r^T [\xi_r(\tau-T) - \xi_d(\tau-T)]\, d\tau \\ &= \frac{1}{2}\int_{t-T}^{t} [\xi_r(\tau) - \xi_d(\tau)]^T Q_r^T\, \delta\xi_r(\tau)\, d\tau + \frac{1}{2}\int_{t-T}^{t} [\xi_r(\tau-T) - \xi_d(\tau-T)]^T Q_r^T\, \delta\xi_r(\tau)\, d\tau \\ &= \int_{t-T}^{t} \left[ \xi_r - \xi_d - \tfrac{1}{2}\delta\xi_r \right]^T Q_r^T\, \delta\xi_r\, d\tau \quad (\text{as } \xi_d(t) = \xi_d(t-T)) \\ &\le \int_{t-T}^{t} \left[ Q_r (\xi_r(\tau) - \xi_d(\tau)) \right]^T \delta\xi_r(\tau)\, d\tau . \end{aligned} \tag{26} $$

According to Eqs. (15) to (17), we rewrite this inequality as

$$ \delta J_r \le \int_{t-T}^{t} \left[ Q_r (\xi_r - F_d + F + \tilde{F}) \right]^T \delta\xi_r\, d\tau = \int_{t-T}^{t} \left( -L^T \delta\xi_r + Q_r \tilde{F} \right)^T \delta\xi_r\, d\tau . \tag{27} $$

Consider the difference between $J_c$ over two consecutive periods:

$$ \begin{aligned} \delta J_c &\equiv J_c(t) - J_c(t-T) \\ &= \frac{1}{2}\int_{t-T}^{t} \Big[ \tilde{F}^T Q_F^{-1} \tilde{F} - \tilde{F}^T(\tau-T)\, Q_F^{-1} \tilde{F}(\tau-T) \\ &\qquad + \mathrm{tr}\big( \tilde{K}_S^T Q_S^{-1} \tilde{K}_S - \tilde{K}_S^T(\tau-T)\, Q_S^{-1} \tilde{K}_S(\tau-T) \\ &\qquad\quad + \tilde{K}_D^T Q_D^{-1} \tilde{K}_D - \tilde{K}_D^T(\tau-T)\, Q_D^{-1} \tilde{K}_D(\tau-T) \big) \Big]\, d\tau \end{aligned} \tag{28} $$

where $\mathrm{tr}(\cdot)$ stands for the trace of a matrix.
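The step from the period-to-period difference of a quadratic form to its upper bound, used in Ineq. (26) and again for the terms of $\delta J_c$ in Ineq. (29), rests on the identity $a^T Q a - b^T Q b = (2a - \delta)^T Q\, \delta \le 2 a^T Q\, \delta$ with $\delta \equiv a - b$ and $Q$ symmetric positive-definite. The following short numerical check illustrates it. The helper name `period_difference_terms` and the random test data are assumptions introduced here, with $a$ and $b$ standing for a quantity at $\tau$ and at $\tau - T$.

```python
import numpy as np

def period_difference_terms(a, b, Q):
    """Return (a'Qa - b'Qb, (2a - delta)'Q delta, 2 a'Q delta) with delta = a - b.

    Hypothetical helper: a and b play the role of an error term at tau and
    tau - T; Q is assumed symmetric positive-definite.
    """
    delta = a - b
    difference = a @ Q @ a - b @ Q @ b      # difference of the quadratic forms
    exact = (2 * a - delta) @ Q @ delta     # exact rewriting of the difference
    bound = 2 * (a @ Q @ delta)             # upper bound after dropping -delta'Q delta
    return difference, exact, bound

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
Q = A @ A.T + 3.0 * np.eye(3)               # symmetric positive-definite by construction
a = rng.standard_normal(3)
b = rng.standard_normal(3)

difference, exact, bound = period_difference_terms(a, b, Q)
print(np.isclose(difference, exact), difference <= bound)
```

The gap between the bound and the exact value is $\delta^T Q\, \delta \ge 0$, which is why positive-definiteness of the weighting matrices is needed for the inequalities to hold.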
We consider that

$$ \begin{aligned} &\tilde{F}^T(\tau)\, Q_F^{-1} \tilde{F}(\tau) - \tilde{F}^T(\tau-T)\, Q_F^{-1} \tilde{F}(\tau-T) \\ &= [\tilde{F}^T(\tau)\, Q_F^{-1} \tilde{F}(\tau) - \tilde{F}^T(\tau)\, Q_F^{-1} \tilde{F}(\tau-T)] + [\tilde{F}^T(\tau)\, Q_F^{-1} \tilde{F}(\tau-T) - \tilde{F}^T(\tau-T)\, Q_F^{-1} \tilde{F}(\tau-T)] \\ &= -\tilde{F}^T(\tau)\, Q_F^{-1} \delta F(\tau) - \tilde{F}^T(\tau-T)\, Q_F^{-1} \delta F(\tau) \\ &= -\left( 2\tilde{F}^T(\tau) + \delta F^T(\tau) \right) Q_F^{-1} \delta F(\tau) \\ &\le -2 \tilde{F}^T(\tau)\, Q_F^{-1} \delta F(\tau) \\ &= -2 \tilde{F}^T(\tau)\, [\varepsilon(\tau) - \beta(\tau) F(\tau) + Q_r^T \delta\xi_r(\tau)] , \end{aligned} \tag{29} $$

where we have used $\tilde{F}(\tau) - \tilde{F}(\tau-T) = -\delta F(\tau)$, which holds because $F^*$ is periodic with period $T$. Then, similarly, we have

$$ \begin{aligned} \mathrm{tr}\,[\tilde{K}_S^T(\tau)\, Q_S^{-1} \tilde{K}_S(\tau) - \tilde{K}_S^T(\tau-T)\, Q_S^{-1} \tilde{K}_S(\tau-T)] &\le -2\, \mathrm{tr}\{ \tilde{K}_S^T(\tau)\, [\varepsilon(\tau)\, x^T(\tau) - \beta(\tau) K_S(\tau)] \} \\ \mathrm{tr}\,[\tilde{K}_D^T(\tau)\, Q_D^{-1} \tilde{K}_D(\tau) - \tilde{K}_D^T(\tau-T)\, Q_D^{-1} \tilde{K}_D(\tau-T)] &\le -2\, \mathrm{tr}\{ \tilde{K}_D^T(\tau)\, [\varepsilon(\tau)\, \dot{x}^T(\tau) - \beta(\tau) K_D(\tau)] \} \end{aligned} \tag{30} $$

Substituting Ineqs. (29) and (30) into Eq. (28) and considering Ineq. (27) yields

$$ \delta J_r + \delta J_c \le \int_{t-T}^{t} \Big\{ -\delta\xi_r^T L\, \delta\xi_r - \tilde{F}^T (\varepsilon - \beta F) - \mathrm{tr}[\tilde{K}_S^T (\varepsilon x^T - \beta K_S)] - \mathrm{tr}[\tilde{K}_D^T (\varepsilon \dot{x}^T - \beta K_D)] \Big\}\, d\tau . \tag{31} $$

What remains is to deal with the residual in the above inequality, which is similar to the treatment in [2]. For completeness, we outline it in the following. In particular, we consider the time derivative of $J_e$

$$ \dot{J}_e = \varepsilon^T M(q)\, \dot{\varepsilon} + \frac{1}{2}\, \varepsilon^T \dot{M}(q)\, \varepsilon = \varepsilon^T M(q)\, \dot{\varepsilon} + \varepsilon^T C(q,\dot{q})\, \varepsilon \tag{32} $$

as [5]

$$ z^T \dot{M} z \equiv 2\, z^T C z \quad \forall z . \tag{33} $$

Considering the closed-loop dynamics of Eq. (11), the above equation can be written as

$$ \dot{J}_e(t) = \varepsilon^T \left( \tilde{F} + \tilde{K}_S x + \tilde{K}_D \dot{x} - \Gamma \varepsilon \right) . \tag{34} $$

Integrating $\dot{J}_e$ from $t-T$ to $t$ and considering Ineq. (31), we obtain

$$ \begin{aligned} \delta J &= \delta J_c + \delta J_r + \delta J_e \\ &\le \int_{t-T}^{t} \Big\{ -\varepsilon^T \Gamma \varepsilon - \delta\xi_r^T L\, \delta\xi_r + \beta\, [\tilde{F}^T F + \mathrm{tr}(\tilde{K}_S^T K_S + \tilde{K}_D^T K_D)] \Big\}\, d\tau \\ &= \int_{t-T}^{t} \Big\{ -\varepsilon^T \Gamma \varepsilon - \delta\xi_r^T L\, \delta\xi_r - \beta\, [\tilde{F}^T \tilde{F} + \mathrm{tr}(\tilde{K}_S^T \tilde{K}_S + \tilde{K}_D^T \tilde{K}_D)] + \beta\, [\tilde{F}^T F^* + \mathrm{tr}(\tilde{K}_S^T K_S^* + \tilde{K}_D^T K_D^*)] \Big\}\, d\tau . \end{aligned} \tag{35} $$

A sufficient condition for $\delta J \le 0$ is

$$ \lambda_\Gamma \|\varepsilon\|^2 + \lambda_L \|\delta\xi_r\|^2 + \beta \left( \|\tilde{F}\|^2 + \|\tilde{K}_S\|^2 + \|\tilde{K}_D\|^2 \right) - \beta \left( \|\tilde{F}\| \|F^*\| + \|\tilde{K}_S\| \|K_S^*\| + \|\tilde{K}_D\| \|K_D^*\| \right) \ge 0 , \tag{36} $$

where $\lambda_\Gamma$ and $\lambda_L$ are the minimal eigenvalues of $\Gamma$ and $L$, respectively. Therefore, $\|\varepsilon\|$, $\|\delta\xi_r\|$, $\|\tilde{F}\|$, $\|\tilde{K}_S\|$ and $\|\tilde{K}_D\|$ are bounded. In particular, they satisfy

$$ \lambda_\Gamma \|\varepsilon\|^2 + \lambda_L \|\delta\xi_r\|^2 + \frac{\beta}{2} \left( \|\tilde{F}\|^2 + \|\tilde{K}_S\|^2 + \|\tilde{K}_D\|^2 \right) \le \frac{\beta}{2} \left( \|F^*\|^2 + \|K_S^*\|^2 + \|K_D^*\|^2 \right) . \tag{37} $$

By choosing large $\lambda_\Gamma$ and $\lambda_L$, $\|\varepsilon\|$ and $\|\delta\xi_r\|$ can be made small.

B. Proof for minimisation of overall cost when neglecting damping

Consider the cost function

$$ J_r' \equiv \frac{1}{2}\int_{t-T}^{t} (x_r - x_d)^T K_S^{*T} Q_r^T (x_r - x_d)\, d\tau . \tag{38} $$

Following procedures similar to those for Ineqs. (26) and (27), we obtain

$$ \delta J_r' \le \int_{t-T}^{t} \left[ -L^T \delta x_r + Q_r (\tilde{F} + \tilde{K}_S x_r) \right]^T \delta x_r\, d\tau . \tag{39} $$

Considering further the cost function

$$ J_c' \equiv \frac{1}{2}\int_{t-T}^{t} \left[ \tilde{F}^T Q_F^{-1} \tilde{F} + \mathrm{vec}^T(\tilde{K}_S)\, Q_S^{-1}\, \mathrm{vec}(\tilde{K}_S) \right] d\tau \tag{40} $$

and following procedures similar to those from Eq. (28) to Ineq. (31), we obtain

$$ \delta J_r' + \delta J_c' \le \int_{t-T}^{t} \Big\{ -\delta x_r^T L\, \delta x_r - \tilde{F}^T (\varepsilon - \beta F) - \mathrm{tr}[\tilde{K}_S^T (\varepsilon x^T - \beta K_S)] \Big\}\, d\tau . \tag{41} $$

The rest is similar to the case with damping and is thus omitted.

REFERENCES

[1] E Burdet, A Codourey and L Rey (1998), Experimental evaluation of nonlinear adaptive controllers. IEEE Control Systems Magazine 18(2): 39-47.
[2] C Yang, G Ganesh, S Haddadin, S Parusel, A Albu-Schaeffer and E Burdet (2011), Human-like adaptation of force and impedance in stable and unstable interactions. IEEE Transactions on Robotics 27(5): 918-30.
[3] KJ Åström and B Wittenmark (1995), Adaptive control. Reading, MA: Addison-Wesley.
[4] S Jung, TC Hsia and RG Bonitz (2001), Force tracking impedance control for robot manipulators with an unknown environment: theory, simulation, and experiment. The International Journal of Robotics Research 20(9): 765-74.
[5] C de Wit, B Siciliano and G Bastin (1996), Theory of robot control. Springer.