A Predictive Momentum-Based Whole-Body Torque Controller: Theory and Simulations for the iCub Stepping Stefano Dafarra, Francesco Romano, Gabriele Nava, Francesco Nori Abstract— When balancing, a humanoid robot can be easily subjected to unexpected disturbances like external pushes. In these circumstances, reactive movements as steps become a necessary requirement in order to avoid potentially harmful falling states. In this paper we conceive a Model Predictive Controller which determines a desired set of contact wrenches by predicting the future evolution of the robot, while taking into account constraints switching in case of steps. The control inputs computed by this strategy, namely the desired contact wrenches, are directly obtained on the robot through a modifi- cation of the momentum-based whole-body torque controller currently implemented on iCub. The proposed approach is validated through simulations in a stepping scenario, revealing high robustness and reliability when executing a recovery strategy. I. INTRODUCTION The unpredictable nature of the real world is one of the leading obstacles to the widespread diffusion of robots out- side labs. These complicated mechanical systems ought to be robust against a wide spectrum of disturbances. Considering humanoid robots, the aim for robustness translates into pre- venting stumbling. Despite the apparent straightforwardness of such an objective, complications arise when facing it from an algorithmic point of view. Considering humanoid robots, problems arise due to their intrinsic underactuation [1] which, in essence, means that by exploiting all theirs actuated degrees-of-freedom they can control their internal configuration, but they cannot affect directly their global pose. For example, an astronaut in the space is free to move all his limbs, but without exploiting any hooking he cannot move inside the spaceship. On the contrary, by exploiting contacts with the environment it is possible to circumvent this limitation. Additional difficulties arise by the fact that contacts may change over time. This results in a different evolution of the constrained dynamical system making the overall system hybrid [2], i.e. it possesses both a continuous and discrete time dynamics. An appealing research problem consists in applying Model Predictive Control (MPC) techniques to these particular systems. MPC is a particular optimal control method which enables the introduction of the feedback into the optimization procedure thanks to the “Receding Horizon Principle” [3], [4], [5]. The basic concept consists in solving, at each time step, a new finite-horizon optimal control problem initialized at the current plant state. At every time instant This work was supported by the FP7 EU project CoDyCo (No. 600716 ICT 2011.2.1 Cognitive Systems and Robotics) and Horizon 2020 EU project An.Dy. (No. 731540 Research and Innovation Programme) The authors are with the iCub Facility Department, Istituto Italiano di Tecnologia, 16163 Genova, Italy (e-mail: name.surname@iit.it) only the first control input is then applied to the plant. Model predictive controllers are appealing for controlling hybrid systems [6], [7] since the full hybrid model can be exploited. Indeed, thanks to the prediction capabilities, it is possible to include inside the formulation both time- and state-dependent switching, performing anticipatory actions for the imminent variation in the dynamics. However MPC does not solve those problems related to the numerical integration of hybrid systems, which indeed is an open research problem. In literature the original underactuated hybrid dynamics is usually scaled down to simplified models, making the problem easier to be handled. Simple models like the Linear Inverted Pendulum (LIP) [8] are widely adopted. In this context, MPC has been applied in order to stabilize walking patterns [9], [10], [11], especially of position controlled robots. The result is usually a slow and steady walking style characterized by a nearly constant Center of Mass (CoM) height. The LIP model has been applied also in [12] with a robot equipped with force controlled joints. While being easily applicable, this simple model provides only a limited amount of informations about the actual dynamics of the robot. A related popular approach is based on the Capture Point framework [13]. While keeping the model complexity low, this technique allows to evaluate the possibility of the robot to stay in the upright position. Starting from the simple Linear Inverted Pendulum, the model has been progressively enriched [14] to catch different robot peculiarities. This method is particularly interesting due to the possibility of drawing stability criteria [15] and in [16] authors applied MPC techniques based on the Capture Point formulation. A recent trend seems to move in a different direction. The CoM dynamic model is substituting the simple pendulum, especially when planning complex trajectories. In [17], [18], [19] a kinematic planner is merged with one based on the CoM dynamics to accomplish complex and feasible motions, while taking into account different contact locations. The main issues of this approach are usually related to the computational complexity, while the optimization problem is generally non-convex, introducing problems of local min- ima. In other applications, only the momentum dynamics is considered. In [20] only the 3D CoM acceleration is taken into consideration, while in [21] authors propose a convex upper bound of the angular momentum to be minimized. In [22] the derivative of the angular momentum is approximated by using quadratic constraints together with slack variables necessary to keep the approximation error low. Nevertheless, in this approach, it is not possible to directly penalize the use of the angular momentum, while introducing many additional arXiv:1705.10635v2 [cs.RO] 28 Jul 2017 variables into the formulation. Following this direction, we present here a momentum- based whole-body torque controller based on a MPC for- mulation. In particular, the dynamic evolutions of the robot linear and angular momentum are taken into account. We thus make use of a reduced model: differently from simpli- fied models used in literature, the robot momentum is an exact model that captures the global behaviour of the robot. We deal with the complications introduced by the derivative of the angular momentum by resorting to a Taylor expansion. The presented controller allows to deal directly with the intrinsic hybrid dynamic of the system by considering time- varying constraints. The peculiarity of our approach also resides in the fact that the computed control inputs, i.e. the contact wrenches, are directly applied on the robot, rather than used to define a joint position reference trajectory. In particular, the presented scheme inherits its structure from the momentum-based whole body torque controller [23], [24], [25] implemented on iCub. Torque control is particularly suitable for our application given that it permits to absorb the impacts efficiently, maintaining the balance also in case of robot positioning errors. We tested this approach on the iCub humanoid robot while performing a step recovery strategy. II. BACKGROUND A. Notation Throughout this paper we adopt the following notation. • I represents an inertial reference frame with the origin placed on the ground, while the orientation is so that the z−axis points against the gravity and the x−axis is oriented frontally with respect to the robot. • 1n represents a n×n identity matrix. 0n×m ∈Rn×m is a zero matrix while 0n = 0n×1 is a zero column vector of size n. • xCoM ∈R3 is the position of the center of mass with respect to I. • ∥x∥2 W = x⊤Wx is the weighted square norm of x. • x∧∈R3×3 denotes the skew-symmetric matrix such that x∧y = x × y, where × denotes the cross product operator in R3. • ⌈·⌉denotes the ceiling operator which outputs the rounding of the argument toward +∞. B. Whole-body torque control The predictive controller we propose in this paper, relies on a whole-body optimization-based torque controller for the stabilization of the desired contact wrenches. The original whole-body control algorithm is a momentum-based hier- archical controller composed of two control objectives [24], [25]. The first, and most priority objective, is the tracking of a desired robot momentum while the second is the stabilization of the zero dynamics. In this paper we slightly modify the original controller by directly commanding desired contact wrenches. We thus “substitute” the first control objective, i.e. generation of contact wrenches to track a desired robot momentum, with our controller proposed in this paper. The second objective, which is left as in the original controller, is responsible for constraining the joint variables and avoid internal divergent behaviours [25]. When described as an optimization problem, this second objective can be formulated as: min τ ∥τ −ψ∥2 (1a) s.t. M(q) ˙ν + h(q, ν) −J⊤f ref = Bτ (1b) J ˙ν + ˙Jν = 0 (1c) ψ := hj(q, 0) −J(j),⊤f ref −Kj p(qj −qref j ) −Kj d ˙qj (1d) Eq.(1b) describes the free-floating dynamics of the mechani- cal system [26, Ch. 7], where (q, ν) is the free-floating state of the robot. Eq.(1c) is the constraint equation describing the kinematic constraints associated with the contacts. Eq.(1d), which resembles a PD plus gravity and contact wrenches compensation, plays the role of a desired joint torque ref- erence where hj and J(j) denotes the joint space bias term and Jacobian respectively. Note that the controller depends on two inputs: f ref and qref j . As we will show later, our predictive controller is responsible of choosing the former quantity, while the latter is the output of a inverse kinematics algorithm. III. PROBLEM FORMULATION A. The model In this work, we examine the scenario where the only contacts available are those expressed at the feet. Without loss of generality, we consider as a test case the robot balancing on single support and performing a step adopting the right foot as swing limb. The rate of change of the robot momentum, when expressed in a frame located at the center of mass with the same orientation of I, has the following expression: " ¨xCoM ˙Hang # =  m−113 03×3 03×3 13  h CoMXl CoMXr i  fl fr  +¯g (2) where m ∈R is the mass of the robot and ¯g the 6D gravity acceleration. The indexes l and r refer to left and right foot respectively. fi ∈R6, with i either l or r, is the contact wrench, composed of 3D forces and torques applied at the contact point, namely fi =  f⊤ i , τ ⊤ i  . Matrices CoMXi, have the following particular structure: CoMXi =  13 03×3 (xi −xCoM)∧ 13  . (3) The presence of xi, highlights the dependency of the mo- mentum from the foot position, but the presented model does not carry any information about the robot kinematics. The variables involved in the formulation do not affect directly xi, thus it is not possible to define its feasible values properly. In addition, due to the very dynamic task the robot is going to perform, the tracking of a reference for the swing foot may not be perfect. Consequently, local modifications of a preplanned foot trajectory might not have a relevant role. In literature the foot positioning problem is usually ad- dressed by exploiting the LIP model, as in [27], [28] or [29]. The kinematics limitations can be introduced through simple box constraints, which may not have a direct connection with the actual robot kinematic limits. As a consequence we rely on external planners to decide where to place the foot. This information will be considered as a datum for the presented strategy, thus keeping its computational complexity low. For similar reasons we can assume to know the instant where the impact is going to happen, here called timpact. Given the application we consider, the time necessary to take the step should be as little as possible. Consequently, it has been chosen to be the minimum time the robot needs to physically perform the step given its limits. This happened to be a common assumption in literature ([27], [29], [18]), since it also avoids to introduce integer variables into the optimal control formulation, thus avoiding np-hard complexity. B. The angular momentum The second set of rows of Eq.(2), represents the derivative of angular momentum ˙Hang, i.e.: ˙Hang = l,r X i (xi −xCoM)∧fi + τi. (4) If we consider both xCoM and fi as two optimization vari- ables, its product renders the formulation non-convex and thus much harder to be solved, mainly due to possible local minima. In our application, we can admit a certain level of approximation, since during the step this quantity does not need to be precisely controlled to zero. On the other hand, angular momentum can be used to penalize the usage of contact wrenches. For that purpose the Taylor expansion (around the last available values of fi and xCoM) truncated to the first order represents an acceptable approximation: ˙Hang ≈ {l,r} X i τi +  xi −x0 CoM ∧ f0 i + +  xi −x0 CoM ∧ fi −f0 i  + +  f0 i ∧ xCoM −x0 CoM  ⇒˙Hang ≈ {l,r} X i τi +  xi −x0 CoM ∧ fi + +  f0 i ∧ xCoM −x0 CoM  . (5) Notice that the anticommutative property of the cross prod- uct, i.e. x∧y = −y∧x, has been exploited. The superscript 0 refers to the last feedback retrieved from the robot, which will be used as the point around which we compute the Taylor expansion. By truncating at the first order, the ap- proximation error is o fi −f0 i  xCoM −x0 CoM  . Thus, the less the CoM moves from the feedback position the better the approximation will be. This is usually the case given short prediction windows. Finally, we define γ as the state and f as the control variable for the MPC problem, γ := h x⊤ CoM ˙x⊤ CoM H⊤ ang i⊤ ∈R9, (6a) f := h f ⊤ l f ⊤ r i⊤ ∈R6 (6b) then, we can rewrite Eq.(2) as: ˙γ = ˜Evγγ + ˜Fγf + ˜Gγ + ˜S0 γ (7) where ˜Evγ =   03×3 13 03×3 03×3 03×3 03×3 f0 l + f0 r ∧ 03×3 03×3  , ˜Fγ =   03×3 03×3 03×3 03×3 m−113 03×3 m−113 03×3 xl −x0 CoM ∧ 13 xr −x0 CoM ∧ 13  , ˜Gγ =   05 −g 03  , ˜S0 γ = " 06 − f0 l + f0 r ∧x0 CoM # . Here, ˜S0 γ introduces the constant terms resulting from the Taylor approximation. Under the above considerations, the model described by Eq.(7) is affine and can be easily inserted into a QP formulation, as described in Sec. V. C. Constraints definition The constraints introduced in the formulation are mainly physical limitations induced by contacts. Considering con- tacts located at the feet, those constraints enforce the fea- sibility of the exerted wrenches. For example, the applied wrenches must be within the friction cone (approximated through a polyhedral convex cone) while their point of application (i.e. the center of pressure, CoP) should lie inside the support polygon. Formally, we can write these constraints linearly as follows: Aclfl ≤bcl ∀t : t ≤tf. (8) Notice that these constraints are consistent only if the foot is in contact. Thus, they have not to be applied on the swing limb, as the corresponding required wrench should be null when t < timpact. The constraints applied on the swing foot attempt to catch the hybrid nature of the system by varying in time. Formally: ( fr = 06 ∀t : t < timpact Acrfr ≤bcr ∀t : timpact ≤t ≤tf. (9) By considering the CoM dynamics only, we elude the complications introduced by time-varying constraints on the robot configuration after the establishment of a new contact (e.g. the velocity of the corresponding foot should be zero). D. Cost function definition The cost function Γ possesses different components that act only before or after the impact. Formally, it has the following expression: Γ = 1 2 Z tf 0 γ(τ) −γd(τ) 2 Kγ dτ+ (10a) + Z tf ¯timp γ(τ) −γd(τ) 2 Kimp γ dτ+ (10b) + Z tf 0 f(τ) 2 Kf dτ+ (10c) + γ(tf) −γd(tf) 2 Kimp γ ! . (10d) K(·) (·) is a real positive semi-definite matrix of gains with suitable dimensions, accounting also for unit of measurement mismatches inside Γ. ¯timp is the minimum between timpact and tf (to avoid negative integrals when the impact is expected to occur after tf) while, for the sake of simplicity, the initial time instant is set to zero. Note that it is possible to vary the cost applied to the state γ after the impact through the use of the matrix Kimp γ . Finally, a terminal cost term, weighted by the same matrix Kimp γ , models the finiteness of the control horizon. Particular attention has been payed to the choice of the references, here expressed with the superscript d. Indeed, even if the model carries no information on how the step is made, the CoM dynamics highly influences the resultant motion. The choice of the references are therefore crucial for performing a step correctly, since they are the only insight we could give to the controller about the desired posture of the robot. On the other hand, to provide a precise CoM trajectory would reduce the benefits and the efficiency of the MPC controller. After the completion of the step, we would like to have the CoM over the convex hull centroid. Nevertheless, this objective does not constraint the CoM trajectory during the step. Ideally, we could leave the optimizer to come out with an optimal trajectory, given initial and final conditions. For this reason, we decide to weight the traverse components (i.e. x and y) of the CoM position only in the terminal cost Eq.(10d) and after the step is made, Eq.(10b). On the other hand, the z−component of the CoM position is continuously weighted, allowing to have some authority over the CoM trajectory, useful to perform the step movement correctly. The CoM velocity and the angular momentum are always minimized. As a last term, we weight the requested reaction forces with the goal of avoiding impulsive responses. Additionally, having smooth references for the desired wrenches allows to reduce the tracking error introduced by the underlying momentum controller. E. References definition for the whole-body torque controller The outputs of the MPC strategy are used as references for the whole-body torque controller described in Section II-B. In particular, f ref = f, i.e. the contact wrenches are directly used as references. Regarding the desired joint positions qref j , we solve an inverse kinematics problem, viz., we impose a desired pose for the swing foot, following as close as possible the center of mass trajectory defined by the MPC controller. IV. MPC AS A STEP TRIGGER Let us consider the scenario where the robot has to perform a step because of an external perturbation. In principle we could leverage the presented MPC formulation to define the moment in which to start the motion, up to now considered as a datum. This moment can be defined as the instant where the controller is not able to bring the CoM back to a desired equilibrium point, given the present support configuration. In other words, the constraints related to the balancing configu- ration prevent the controller to recover from the push. Thus, it is necessary to step and change balancing configuration in order to avoid a fall. The availability of a prediction horizon particularly suits this idea. Hypothetically, if tf = ∞, as soon as the robot is pushed, we could predict whether or not the controller will be able to absorb the disturbance completely. In practice this is not true. The finiteness of the prediction horizon hides the actual recovery capabilities of the controller, thus it is still necessary to define an heuristic. By considering the very last predicted state, we can set the following condition: ∥xCoM(tf) −xd CoM∥+kv∥˙xCoM(tf) −˙xd CoM∥< ¯d. (11) When Eq.(11) is violated, the robot performs the step. Both kv and ¯d ∈R are user-defined parameter affecting the sensitivity to pushes. The smaller ¯d the more the robot will be inclined to take a step, while kv allows to regulate the relative importance of the two errors. Notice that Eq.(11) does not depend directly on the feet dimensions. In order to be effective, this heuristic needs the MPC strategy to be in charge of sending references to the robot, even if the step will not be performed at all. Indeed, Eq.(11) is based on a prediction which assumes that all the computed wrenches are directly applied on the system. Nevertheless, notice that here the goal is different from what has been presented in Sec. III-D. In a sense, the robot should do its best to avoid a step. As a consequence, looking at Eq.(10), ¯timp will be equal to tf (i.e. the step will not occur at all), while Kγ should be equal to Kimp γ . In practice, the cost function should resemble the case where the impact already happened, in order to weight correctly the whole state terms. V. QUADRATIC PROGRAMMING TRANSCRIPTION We solve the finite-horizon optimal control problem pre- sented in the previous section by using direct numerical methods. In particular we convert the original optimal control problem into a single Quadratic Programming (QP) problem to be solved at each time step. The general form of a QP problem is the following: minimize χ 1 2χ⊤Hχ + χ⊤g (12a) subject to: Aχ ≤b (12b) where χ ∈Rn is the set of optimization variables, H ∈ Rn×n a positive semi-definite matrix (the “Hessian” matrix), g ∈Rn the “gradient” vector, while A ∈Rng×n and b ∈Rng define the set of ng constraints applied to the optimization variables. A. Model transcription We discretize the model using forward Euler formulation. Different approaches may have been chosen, as described in [30], but we decided that Euler integration was suitable thanks to its simplicity. Discretizing Eq.(7), we obtain: γ(k + 1) = Evγγ(k) + Fγf(k) + Gγ + S0 γ (13) where Evγ = 19 + dt ˜Evγ, Fγ = dt ˜Fγ, Gγ = dt˜gγ, S0 γ = dt ˜S0 γ while k ∈N, k = 0, · · · , N −1 is the discrete time, N = ⌈tf/dt⌉and dt ∈R is the time step length. We can now define the optimization vector χ as the collection of all the states and control variables over the horizon. Formally: χ =   γ(1) f(0) ... γ(k) f(k −1) ... γ(N) f(N −1)   , χ ∈R9N+12N. (14) We thus opted for leaving the state γ as an optimization variable. This choice allows for an easy reformulation of the cost function, while preserving a particular sparse structure which may be useful for the optimizer. According to the receding horizon principle, at each time step, f(0) will be applied to the system, through a modified version of the balancing controller implemented on iCub, as presented in Sec. III-A. A crucial point to be considered, is the definition of the time instant at which the impact occurs. We assume the impact to be impulsive and taking place at the beginning of a time step which we denote with kimpact. We further assume that the control input f(k) is applied piecewise constantly, i.e. constant from instant k to k + 1. Note that, in view of the above two assumptions, setting kimpact = 0 implies that both feet are at contact already at the initial time k = 0. As mentioned previously, we do not directly model the distance between the swing foot and the ground, but we com- pute the position and the expected impact time at the begin- ning of the push strategy. The following procedure describes how we update the expected impact instance throughout the proposed MPC controller. Assuming the initial time instant to be always set to zero, we start with a value of kimpact equal to ⌈timpact/dt⌉. At each controller execution, we set the new value to max{kimpact−1, 1}, which implies that, if the impact has not occurred yet, we saturate kimpact to 1, i.e. we expect the impact to occur at the second time step in order to avoid requiring wrenches on a swinging limb. If the impact has occurred, instead, kimpact is correctly set to 0. Note that the formal definition of the impact time and the trajectory control of mechanical system with unilateral constraints (as it is our case), are still an open research problem [31], [32]. It is now possible to write the discretized dynamics as an equality constraints, transforming the initial continuous time optimal control problem into Eq.(12). By expanding Eq.(13) over the horizon we get the following constraint: Ev −eγ  χ = −G −Evγ0 −S0 (15) where Ev ∈R9N×21N and G, Evγ0, S0 ∈R9N×1. The matrix eγ ∈R9N×21N selects the terms related to state γ from vector χ, while Ev has a peculiar banded structure due to the choice of vector χ. In particular, different repetitions of Evγ and Fγ are included, replicating Eq.(13) for each time step in the control horizon. The dependency from γ(0) is considered through matrix Evγ0, constituted by only zeros, except from the first block of nine rows, which are equal to Evγ. Finally G considers the gravitational effects, while S0 contains the constant terms related to the angular momentum. A similar approach can be used to transcribe the con- straints presented in III-C: Ahor χ ≤bhor. (16) Matrices Ahor and bhor condense the constraints of Eq.(8) and Eq.(9) for each time instant. In order to avoid a varying number of constraints from one MPC execution to the other, we adopted a simple expedient. An upper and lower bound on the normal force fz are added, among friction and CoP constraints. In order to set the wrench on the swing foot to zero, it is simply necessary to set both the previously mentioned upper and lower bound to zero. All the other components of the wrench are automatically set to zero thanks to the friction and CoP constraints. Thus, it is simply necessary to correctly handle the terms inside bhor without changing the number of constraints. B. Cost function transcription The last stage of the transcription process involves the cost function of Eq.(10). As before, this process consists in a discretization stage followed by a reformulation phase necessary to express the cost function in terms of χ. This latter stage is omitted here, but it simply exploits the appro- priate extractor matrices, as it has been done for the model transcription through eγ. Starting from the first term, Eq.(10a), the discretization corresponds to (10a) ⇒Γγ = 1 2 N X k=1 γ(k) −γd(k) 2 Kγ (17) noticing that the cost is evaluated starting from 1, avoiding to consider the feedback γ(0). Similarly, the other contributions can be discretized as: (10b)+(10d) ⇒Γimp γ = = 1 2 N X k=¯kimp γ(k) −γd(k) 2 Kimp γ , (18a) (10c) ⇒Γf = 1 2 N−1 X k=0 f(k) 2 Kf , (18b) where ¯kimp is the minimum between N and kimpact, so that Γimp γ is composed by at least one term, as if there was a terminal cost like in Γ. We can exploit the simple structure of the cost function to further weight the difference of desired reaction forces between two subsequent time-steps, in order to avoid abrupt changes in the control input. This new cost term in the discrete case has the following form: Γdf = 1 2 N−1 X k=0 f(k) −f(k −1) 2 Kdf (19) with Kdf a suitable positive semi-definite matrix of gains and initialized with f(−1), considered as a datum, e.g. at the beginning it corresponds to the wrench necessary to satisfy the equilibrium of the robot momentum. Finally, the overall cost function Γ is: Γ = Γγ + Γimp γ + Γf + Γdf. (20) VI. SIMULATION RESULTS The presented MPC approach has been tested in the Gazebo simulator [33] by using the iCub model. The iCub humanoid robot [34] has been conceived to study develop- mental cognitive systems. It possesses 53 actuated joints, but only a total of 23 degrees of freedom (DoF) are used for balancing tasks, i.e. we do not consider those located in the eyes and in the hands. Driven by the need of fast prototyping, the presented con- troller has been developed using the MATLAB R ⃝/Simulink R ⃝ environment, taking advantage of WBToolbox library [35] and YARP Plugins [36] to establish connection with the simulator. In order to test the presented MPC scheme, we conceived a simple stepping scenario, where the robot, balancing on its left foot, will use the right foot to take a step. This simple scenario allows us to test the performance of the proposed controller with a single contact activation. We present the results of different pushes, applied on the traverse plane, with an angle with respect to the lateral axis (pointing to the right of the robot), of 20◦(Fig. 1a), −20◦ (Fig. 1b) and 45◦(Fig. 1c). We choose a time step of 10ms, coincident with the rate of the whole-body torque controller, and a controller horizon of N = 25. We noticed that the chosen value of N is sufficient to allow the effectiveness of the strategy when the robot is pushed from different directions. The push is nearly impulsive, applied on the chest with a magnitude around 100N, which is about one third of the robot weight force. Figure 1 shows the CoM evolution for the three exper- iments, i.e. with different directions of pushes. It can be noticed in both Figures 1a and 1c that two pushes occur. The first one does not violate the condition in Eq.(11), thus it does not force the robot to take a step. The second one, instead, triggers a change in the support feet configuration, and as a consequence, a new desired configuration for the CoM. Figure 2 represents the tracking of the desired vertical forces output by the MPC controller for the side-push experiment. Remarkably, the normal force on the right foot appears to be tracked also across the step. Figure 3 shows one of the benefits of torque control. The tracking of the joint position reference on the right knee undergoes a strong perturbation after the step. When hitting the ground, the intrinsic compliance introduced by torque control allows to absorb the impact, especially on this joint. This induce a peak of 30◦of tracking error, but the robot is still able to balance. In addition, torque control helps avoiding problems related to a not perfect placement of the swing foot before the impact. Summarizing, the presented controller allows the robot to recover from pushes of various intensity and directions, while remaining able to perform involved step movements. VII. CONCLUSIONS AND FUTURE WORK The proposed controller adopts a slightly approximated model of the robot linear and angular momentum dynamics in a predictive framework. It allows to take into account step movements by varying the structure of constraints and cost functions across the change of contacts configuration. The uncertainty on its actual time instant is considered by a “shift” in the prediction window. An heuristic is also employed to determine a condition for stepping. The contact wrenches are assumed to be control inputs and realized by the robot through a modified version of the iCub momentum- based whole-body torque controller. Taking [37] into a comparison, this approach avoids the definition of a CoM trajectory along the step, leaving the responsibility to the optimizer rather than to the designer. As a return, the proposed strategy presents higher robustness properties with the drawback of an increased computational complexity. Since this approach has been meant to provide reaction in real-time, efforts have to be payed in order to reduce the time needed to get a solution. At the present time it takes almost 0.1 seconds to solve an instance of the presented formulation on a machine running Ubuntu 16.04. The PC is equipped with a quad-core Intel R ⃝Core i5@2.30GHz and 16GB of RAM. MOSEK R ⃝is the selected solver, accessed through the MATLAB R ⃝interface CVX [38]. An optimized C++ version of this strategy is currently under development. In addition, for a practical implementation on the real robot, the delay between the feedback and the actual generation of references should be reduced as much 0 5 10 -0.2 0 0.2 0.4 0.6 (a) Side step 0 5 10 -0.2 0 0.2 0.4 0.6 (b) Back step 0 5 10 -0.4 -0.2 0 0.2 0.4 (c) Forward step Fig. 1: CoM evolution during three different experiments of push recovery. The single dashed lines in (a) and (c) show a not-enough-strong push to violate the condition in Eq. (11). The external push force has been applied, with respect to the lateral axis, at 20◦, −20◦and 45◦for the cases (a), (b) and (c) respectively. 0 5 10 0 100 200 300 400 Fig. 2: Vertical components of the contact forces during the side-step experiment. Measured forces are plotted with dashed lines, while desired forces with the solid lines. as possible. Alternatively, we might consider a one step delay for the application of the control action, directly in the model. An additional entry on the checklist of future improve- ments consists in providing the MPC controller with infor- mation about the leg kinematics. This would allow to insert the step peculiar characteristics (step duration and location) directly inside the optimization. REFERENCES [1] M. W. Spong, Control Problems in Robotics and Automation. Berlin, Heidelberg: Springer Berlin Heidelberg, 1998, ch. Underactuated mechanical systems, pp. 135–150. [2] J. Lygeros, C. Tomlin, and S. Sastry, “Hybrid systems: modeling, analysis and control,” preprint, 1999. [3] D. Mayne and H. Michalska, “Receding horizon control of nonlinear systems,” IEEE Transactions on Automatic Control, vol. 35, no. 7, pp. 814–824, Jul 1990. [4] G. De Nicolao, L. Magni, and R. Scattolini, “Stability and robustness of nonlinear receding horizon control,” in Nonlinear model predictive control. Springer, 2000, pp. 3–22. [5] D. Mayne, J. Rawlings, C. Rao, and P. Scokaert, “Constrained model predictive control: Stability and optimality,” Automatica, vol. 36, no. 6, pp. 789 – 814, 2000. 0 5 10 -100 -50 0 Fig. 3: Position tracking error of the right knee joint. The knee absorbs the impact with the ground, as it can be noticed by the peak in the error right after the impact. [6] M. Lazar, W. Heemels, S. Weiland, and A. Bemporad, “Stabilizing model predictive control of hybrid systems,” IEEE Transactions on Automatic Control, vol. 51, no. 11, pp. 1813–1818, 2006. [7] A. Bemporad, W. M. H. Heemels, and B. De Schutter, “On hybrid sys- tems and closed-loop MPC systems,” IEEE Transactions on Automatic Control, vol. 47, no. 5, pp. 863–869, 2002. [8] S. Kajita, F. Kanehiro, K. Kaneko, K. Yokoi, and H. Hirukawa, “The 3d linear inverted pendulum mode: a simple modeling for a biped walking pattern generation,” in Intelligent Robots and Systems, 2001. Proceedings. 2001 IEEE/RSJ International Conference on, vol. 1, 2001, pp. 239–246 vol.1. [9] P.-B. Wieber, “Trajectory free linear model predictive control for stable walking in the presence of strong perturbations,” in 2006 6th IEEE- RAS International Conference on Humanoid Robots. IEEE, 2006, pp. 137–142. [10] H. Diedam, D. Dimitrov, P.-B. Wieber, K. Mombaur, and M. Diehl, “Online walking gait generation with adaptive foot positioning through linear model predictive control,” in 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2008, pp. 1121– 1126. [11] M. Missura and S. Behnke, “Balanced walking with capture steps,” in Robot Soccer World Cup. Springer, 2014, pp. 3–15. [12] B. J. Stephens and C. G. Atkeson, “Push recovery by stepping for humanoid robots with force controlled joints,” in Humanoid Robots (Humanoids), 2010 10th IEEE-RAS International Conference on. IEEE, 2010, pp. 52–59. [13] J. Pratt, J. Carff, S. Drakunov, and A. Goswami, “Capture point: A step toward humanoid push recovery,” in Humanoid Robots, 2006 6th IEEE-RAS International Conference on, Dec 2006, pp. 200–207. [14] J. Pratt, T. Koolen, T. De Boer, J. Rebula, S. Cotton, J. Carff, M. Johnson, and P. Neuhaus, “Capturability-based analysis and con- trol of legged locomotion, part 2: Application to M2V2, a lower body humanoid,” The International Journal of Robotics Research, p. 0278364912452762, 2012. [15] J. Englsberger, C. Ott, M. A. Roa, A. Albu-Sch¨affer, and G. Hirzinger, “Bipedal walking control based on capture point dynamics,” in 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2011, pp. 4420–4427. [16] M. Krause, J. Englsberger, P.-B. Wieber, and C. Ott, “Stabilization of the capture point dynamics for bipedal walking based on model predictive control,” IFAC Proceedings Volumes, vol. 45, no. 22, pp. 165–171, 2012. [17] H. Dai, A. Valenzuela, and R. Tedrake, “Whole-body motion planning with centroidal dynamics and full kinematics,” in 2014 IEEE-RAS International Conference on Humanoid Robots. IEEE, 2014, pp. 295–302. [18] A. Herzog, N. Rotella, S. Schaal, and L. Righetti, “Trajectory gen- eration for multi-contact momentum control,” in Humanoid Robots (Humanoids), 2015 IEEE-RAS 15th International Conference on. IEEE, 2015, pp. 874–880. [19] J. Carpentier, S. Tonneau, M. Naveau, O. Stasse, and N. Mansard, “A versatile and efficient pattern generator for generalized legged locomo- tion,” in Robotics and Automation (ICRA), 2016 IEEE International Conference on. IEEE, 2016, pp. 3555–3561. [20] S. Caron and A. Kheddar, “Multi-contact walking pattern genera- tion based on model preview control of 3D COM accelerations,” in Humanoid Robots (Humanoids), 2016 IEEE-RAS International Conference on, Nov 2016. [21] H. Dai and R. Tedrake, “Planning robust walking motion on uneven terrain via convex optimization,” in Humanoid Robots (Humanoids), 2016 IEEE-RAS International Conference on, Nov 2016. [22] B. Ponton, A. Herzog, S. Schaal, and L. Righetti, “A convex model of momentum dynamics for multi-contact motion generation,” in Humanoid Robots (Humanoids), 2016 IEEE-RAS International Con- ference on, Nov 2016. [23] F. Nori, S. Traversaro, J. Eljaik, F. Romano, A. Del Prete, and D. Pucci, “iCub whole-body control through force regulation on rigid noncoplanar contacts,” Frontiers in Robotics and AI, vol. 2, no. 6, 2015. [24] D. Pucci, F. Romano, S. Traversaro, and F. Nori, “Highly dynamic balancing via force control,” in IEEE-RAS International Conference on Humanoid Robots (Humanoids),. IEEE, 2016, pp. 141–141. [25] G. Nava, F. Romano, F. Nori, and D. Pucci, “Stability analysis and design of momentum-based controllers for humanoid robots,” Intelli- gent Robots and Systems (IROS) 2016. IEEE International Conference on, 2016. [26] B. Siciliano, L. Sciavicco, L. Villani, and G. Oriolo, Robotics: Mod- elling, Planning and Control, 2009. [27] S. Feng, X. Xinjilefu, C. G. Atkeson, and J. Kim, “Robust dynamic walking using online foot step optimization,” in Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference on. IEEE, 2016, pp. 5373–5378. [28] O. E. Ramos, N. Mansard, and P. Soueres, “Whole-body motion integrating the capture point in the operational space inverse dynamics control,” in Humanoid Robots (Humanoids), 2014 14th IEEE-RAS International Conference on. IEEE, 2014, pp. 707–712. [29] A. Herdt, H. Diedam, P.-B. Wieber, D. Dimitrov, K. Mombaur, and M. Diehl, “Online walking motion generation with automatic footstep placement,” Advanced Robotics, vol. 24, no. 5-6, pp. 719–737, 2010. [30] J. Betts, Practical Methods for Optimal Control and Estimation Using Nonlinear Programming. Society for Industrial and Applied Mathematics, 2010. [31] M. Rijnen, A. Saccon, and H. Nijmeijer, “On optimal trajectory tracking for mechanical systems with unilateral constraints,” in 2015 54th IEEE Conference on Decision and Control (CDC). IEEE, 2015, pp. 2561–2566. [32] J. B. Biemond, N. van de Wouw, W. M. H. Heemels, and H. Nijmeijer, “Tracking control for hybrid systems with state-triggered jumps,” IEEE Transactions on Automatic Control, vol. 58, no. 4, pp. 876–890, 2013. [33] N. Koenig and A. Howard, “Design and use paradigms for gazebo, an open-source multi-robot simulator,” Intelligent Robots and Systems, 2004. (IROS 2004). Proceedings. 2004 IEEE/RSJ International Con- ference on, pp. 2149 – 2154, 2004. [34] G. Metta, G. Sandini, D. Vernon, D. Caldwell, N. Tsagarakis, R. Beira, J. Santos-Victor, A. Ijspeert, L. Righetti, G. Cappiello, et al., “The RobotCub project-an open framework for research in embodied cog- nition,” Humanoids Workshop, IEEE–RAS International Conference on Humanoid Robots, December, 2005. [35] F. Romano, S. Traversaro, D. Pucci, A. D. Prete, J. Eljaik, and F. Nori, “A whole-body software abstraction layer for control design of free-floating mechanical systems,” in 2017 IEEE 1st International Conference on Robotic Computing, 2017. [36] E. Mingo, S. Traversaro, A. Rocchi, M. Ferrati, A. Settimi, F. Romano, L. Natale, A. Bicchi, F. Nori, and N. G. Tsagarakis, “Yarp based plugins for gazebo simulator,” in 2014 Modelling and Simulation for Autonomous Systems Workshop (MESAS), Roma, Italy, 5 -6 May 2014, 2014. [37] S. Dafarra, F. Romano, and F. Nori, “Torque-controlled stepping- strategy push recovery: Design and implementation on the iCub Humanoid robot,” in Humanoid Robots (Humanoids), 2016 IEEE-RAS International Conference on, Nov 2016. [38] I. CVX Research, “CVX: Matlab software for disciplined convex programming, version 2.0,” Aug. 2012. A Predictive Momentum-Based Whole-Body Torque Controller: Theory and Simulations for the iCub Stepping Stefano Dafarra, Francesco Romano, Gabriele Nava, Francesco Nori Abstract— When balancing, a humanoid robot can be easily subjected to unexpected disturbances like external pushes. In these circumstances, reactive movements as steps become a necessary requirement in order to avoid potentially harmful falling states. In this paper we conceive a Model Predictive Controller which determines a desired set of contact wrenches by predicting the future evolution of the robot, while taking into account constraints switching in case of steps. The control inputs computed by this strategy, namely the desired contact wrenches, are directly obtained on the robot through a modifi- cation of the momentum-based whole-body torque controller currently implemented on iCub. The proposed approach is validated through simulations in a stepping scenario, revealing high robustness and reliability when executing a recovery strategy. I. INTRODUCTION The unpredictable nature of the real world is one of the leading obstacles to the widespread diffusion of robots out- side labs. These complicated mechanical systems ought to be robust against a wide spectrum of disturbances. Considering humanoid robots, the aim for robustness translates into pre- venting stumbling. Despite the apparent straightforwardness of such an objective, complications arise when facing it from an algorithmic point of view. Considering humanoid robots, problems arise due to their intrinsic underactuation [1] which, in essence, means that by exploiting all theirs actuated degrees-of-freedom they can control their internal configuration, but they cannot affect directly their global pose. For example, an astronaut in the space is free to move all his limbs, but without exploiting any hooking he cannot move inside the spaceship. On the contrary, by exploiting contacts with the environment it is possible to circumvent this limitation. Additional difficulties arise by the fact that contacts may change over time. This results in a different evolution of the constrained dynamical system making the overall system hybrid [2], i.e. it possesses both a continuous and discrete time dynamics. An appealing research problem consists in applying Model Predictive Control (MPC) techniques to these particular systems. MPC is a particular optimal control method which enables the introduction of the feedback into the optimization procedure thanks to the “Receding Horizon Principle” [3], [4], [5]. The basic concept consists in solving, at each time step, a new finite-horizon optimal control problem initialized at the current plant state. At every time instant This work was supported by the FP7 EU project CoDyCo (No. 600716 ICT 2011.2.1 Cognitive Systems and Robotics) and Horizon 2020 EU project An.Dy. (No. 731540 Research and Innovation Programme) The authors are with the iCub Facility Department, Istituto Italiano di Tecnologia, 16163 Genova, Italy (e-mail: name.surname@iit.it) only the first control input is then applied to the plant. Model predictive controllers are appealing for controlling hybrid systems [6], [7] since the full hybrid model can be exploited. Indeed, thanks to the prediction capabilities, it is possible to include inside the formulation both time- and state-dependent switching, performing anticipatory actions for the imminent variation in the dynamics. However MPC does not solve those problems related to the numerical integration of hybrid systems, which indeed is an open research problem. In literature the original underactuated hybrid dynamics is usually scaled down to simplified models, making the problem easier to be handled. Simple models like the Linear Inverted Pendulum (LIP) [8] are widely adopted. In this context, MPC has been applied in order to stabilize walking patterns [9], [10], [11], especially of position controlled robots. The result is usually a slow and steady walking style characterized by a nearly constant Center of Mass (CoM) height. The LIP model has been applied also in [12] with a robot equipped with force controlled joints. While being easily applicable, this simple model provides only a limited amount of informations about the actual dynamics of the robot. A related popular approach is based on the Capture Point framework [13]. While keeping the model complexity low, this technique allows to evaluate the possibility of the robot to stay in the upright position. Starting from the simple Linear Inverted Pendulum, the model has been progressively enriched [14] to catch different robot peculiarities. This method is particularly interesting due to the possibility of drawing stability criteria [15] and in [16] authors applied MPC techniques based on the Capture Point formulation. A recent trend seems to move in a different direction. The CoM dynamic model is substituting the simple pendulum, especially when planning complex trajectories. In [17], [18], [19] a kinematic planner is merged with one based on the CoM dynamics to accomplish complex and feasible motions, while taking into account different contact locations. The main issues of this approach are usually related to the computational complexity, while the optimization problem is generally non-convex, introducing problems of local min- ima. In other applications, only the momentum dynamics is considered. In [20] only the 3D CoM acceleration is taken into consideration, while in [21] authors propose a convex upper bound of the angular momentum to be minimized. In [22] the derivative of the angular momentum is approximated by using quadratic constraints together with slack variables necessary to keep the approximation error low. Nevertheless, in this approach, it is not possible to directly penalize the use of the angular momentum, while introducing many additional arXiv:1705.10635v2 [cs.RO] 28 Jul 2017 variables into the formulation. Following this direction, we present here a momentum- based whole-body torque controller based on a MPC for- mulation. In particular, the dynamic evolutions of the robot linear and angular momentum are taken into account. We thus make use of a reduced model: differently from simpli- fied models used in literature, the robot momentum is an exact model that captures the global behaviour of the robot. We deal with the complications introduced by the derivative of the angular momentum by resorting to a Taylor expansion. The presented controller allows to deal directly with the intrinsic hybrid dynamic of the system by considering time- varying constraints. The peculiarity of our approach also resides in the fact that the computed control inputs, i.e. the contact wrenches, are directly applied on the robot, rather than used to define a joint position reference trajectory. In particular, the presented scheme inherits its structure from the momentum-based whole body torque controller [23], [24], [25] implemented on iCub. Torque control is particularly suitable for our application given that it permits to absorb the impacts efficiently, maintaining the balance also in case of robot positioning errors. We tested this approach on the iCub humanoid robot while performing a step recovery strategy. II. BACKGROUND A. Notation Throughout this paper we adopt the following notation. • I represents an inertial reference frame with the origin placed on the ground, while the orientation is so that the z−axis points against the gravity and the x−axis is oriented frontally with respect to the robot. • 1n represents a n×n identity matrix. 0n×m ∈Rn×m is a zero matrix while 0n = 0n×1 is a zero column vector of size n. • xCoM ∈R3 is the position of the center of mass with respect to I. • ∥x∥2 W = x⊤Wx is the weighted square norm of x. • x∧∈R3×3 denotes the skew-symmetric matrix such that x∧y = x × y, where × denotes the cross product operator in R3. • ⌈·⌉denotes the ceiling operator which outputs the rounding of the argument toward +∞. B. Whole-body torque control The predictive controller we propose in this paper, relies on a whole-body optimization-based torque controller for the stabilization of the desired contact wrenches. The original whole-body control algorithm is a momentum-based hier- archical controller composed of two control objectives [24], [25]. The first, and most priority objective, is the tracking of a desired robot momentum while the second is the stabilization of the zero dynamics. In this paper we slightly modify the original controller by directly commanding desired contact wrenches. We thus “substitute” the first control objective, i.e. generation of contact wrenches to track a desired robot momentum, with our controller proposed in this paper. The second objective, which is left as in the original controller, is responsible for constraining the joint variables and avoid internal divergent behaviours [25]. When described as an optimization problem, this second objective can be formulated as: min τ ∥τ −ψ∥2 (1a) s.t. M(q) ˙ν + h(q, ν) −J⊤f ref = Bτ (1b) J ˙ν + ˙Jν = 0 (1c) ψ := hj(q, 0) −J(j),⊤f ref −Kj p(qj −qref j ) −Kj d ˙qj (1d) Eq.(1b) describes the free-floating dynamics of the mechani- cal system [26, Ch. 7], where (q, ν) is the free-floating state of the robot. Eq.(1c) is the constraint equation describing the kinematic constraints associated with the contacts. Eq.(1d), which resembles a PD plus gravity and contact wrenches compensation, plays the role of a desired joint torque ref- erence where hj and J(j) denotes the joint space bias term and Jacobian respectively. Note that the controller depends on two inputs: f ref and qref j . As we will show later, our predictive controller is responsible of choosing the former quantity, while the latter is the output of a inverse kinematics algorithm. III. PROBLEM FORMULATION A. The model In this work, we examine the scenario where the only contacts available are those expressed at the feet. Without loss of generality, we consider as a test case the robot balancing on single support and performing a step adopting the right foot as swing limb. The rate of change of the robot momentum, when expressed in a frame located at the center of mass with the same orientation of I, has the following expression: " ¨xCoM ˙Hang # =  m−113 03×3 03×3 13  h CoMXl CoMXr i  fl fr  +¯g (2) where m ∈R is the mass of the robot and ¯g the 6D gravity acceleration. The indexes l and r refer to left and right foot respectively. fi ∈R6, with i either l or r, is the contact wrench, composed of 3D forces and torques applied at the contact point, namely fi =  f⊤ i , τ ⊤ i  . Matrices CoMXi, have the following particular structure: CoMXi =  13 03×3 (xi −xCoM)∧ 13  . (3) The presence of xi, highlights the dependency of the mo- mentum from the foot position, but the presented model does not carry any information about the robot kinematics. The variables involved in the formulation do not affect directly xi, thus it is not possible to define its feasible values properly. In addition, due to the very dynamic task the robot is going to perform, the tracking of a reference for the swing foot may not be perfect. Consequently, local modifications of a preplanned foot trajectory might not have a relevant role. In literature the foot positioning problem is usually ad- dressed by exploiting the LIP model, as in [27], [28] or [29]. The kinematics limitations can be introduced through simple box constraints, which may not have a direct connection with the actual robot kinematic limits. As a consequence we rely on external planners to decide where to place the foot. This information will be considered as a datum for the presented strategy, thus keeping its computational complexity low. For similar reasons we can assume to know the instant where the impact is going to happen, here called timpact. Given the application we consider, the time necessary to take the step should be as little as possible. Consequently, it has been chosen to be the minimum time the robot needs to physically perform the step given its limits. This happened to be a common assumption in literature ([27], [29], [18]), since it also avoids to introduce integer variables into the optimal control formulation, thus avoiding np-hard complexity. B. The angular momentum The second set of rows of Eq.(2), represents the derivative of angular momentum ˙Hang, i.e.: ˙Hang = l,r X i (xi −xCoM)∧fi + τi. (4) If we consider both xCoM and fi as two optimization vari- ables, its product renders the formulation non-convex and thus much harder to be solved, mainly due to possible local minima. In our application, we can admit a certain level of approximation, since during the step this quantity does not need to be precisely controlled to zero. On the other hand, angular momentum can be used to penalize the usage of contact wrenches. For that purpose the Taylor expansion (around the last available values of fi and xCoM) truncated to the first order represents an acceptable approximation: ˙Hang ≈ {l,r} X i τi +  xi −x0 CoM ∧ f0 i + +  xi −x0 CoM ∧ fi −f0 i  + +  f0 i ∧ xCoM −x0 CoM  ⇒˙Hang ≈ {l,r} X i τi +  xi −x0 CoM ∧ fi + +  f0 i ∧ xCoM −x0 CoM  . (5) Notice that the anticommutative property of the cross prod- uct, i.e. x∧y = −y∧x, has been exploited. The superscript 0 refers to the last feedback retrieved from the robot, which will be used as the point around which we compute the Taylor expansion. By truncating at the first order, the ap- proximation error is o fi −f0 i  xCoM −x0 CoM  . Thus, the less the CoM moves from the feedback position the better the approximation will be. This is usually the case given short prediction windows. Finally, we define γ as the state and f as the control variable for the MPC problem, γ := h x⊤ CoM ˙x⊤ CoM H⊤ ang i⊤ ∈R9, (6a) f := h f ⊤ l f ⊤ r i⊤ ∈R6 (6b) then, we can rewrite Eq.(2) as: ˙γ = ˜Evγγ + ˜Fγf + ˜Gγ + ˜S0 γ (7) where ˜Evγ =   03×3 13 03×3 03×3 03×3 03×3 f0 l + f0 r ∧ 03×3 03×3  , ˜Fγ =   03×3 03×3 03×3 03×3 m−113 03×3 m−113 03×3 xl −x0 CoM ∧ 13 xr −x0 CoM ∧ 13  , ˜Gγ =   05 −g 03  , ˜S0 γ = " 06 − f0 l + f0 r ∧x0 CoM # . Here, ˜S0 γ introduces the constant terms resulting from the Taylor approximation. Under the above considerations, the model described by Eq.(7) is affine and can be easily inserted into a QP formulation, as described in Sec. V. C. Constraints definition The constraints introduced in the formulation are mainly physical limitations induced by contacts. Considering con- tacts located at the feet, those constraints enforce the fea- sibility of the exerted wrenches. For example, the applied wrenches must be within the friction cone (approximated through a polyhedral convex cone) while their point of application (i.e. the center of pressure, CoP) should lie inside the support polygon. Formally, we can write these constraints linearly as follows: Aclfl ≤bcl ∀t : t ≤tf. (8) Notice that these constraints are consistent only if the foot is in contact. Thus, they have not to be applied on the swing limb, as the corresponding required wrench should be null when t < timpact. The constraints applied on the swing foot attempt to catch the hybrid nature of the system by varying in time. Formally: ( fr = 06 ∀t : t < timpact Acrfr ≤bcr ∀t : timpact ≤t ≤tf. (9) By considering the CoM dynamics only, we elude the complications introduced by time-varying constraints on the robot configuration after the establishment of a new contact (e.g. the velocity of the corresponding foot should be zero). D. Cost function definition The cost function Γ possesses different components that act only before or after the impact. Formally, it has the following expression: Γ = 1 2 Z tf 0 γ(τ) −γd(τ) 2 Kγ dτ+ (10a) + Z tf ¯timp γ(τ) −γd(τ) 2 Kimp γ dτ+ (10b) + Z tf 0 f(τ) 2 Kf dτ+ (10c) + γ(tf) −γd(tf) 2 Kimp γ ! . (10d) K(·) (·) is a real positive semi-definite matrix of gains with suitable dimensions, accounting also for unit of measurement mismatches inside Γ. ¯timp is the minimum between timpact and tf (to avoid negative integrals when the impact is expected to occur after tf) while, for the sake of simplicity, the initial time instant is set to zero. Note that it is possible to vary the cost applied to the state γ after the impact through the use of the matrix Kimp γ . Finally, a terminal cost term, weighted by the same matrix Kimp γ , models the finiteness of the control horizon. Particular attention has been payed to the choice of the references, here expressed with the superscript d. Indeed, even if the model carries no information on how the step is made, the CoM dynamics highly influences the resultant motion. The choice of the references are therefore crucial for performing a step correctly, since they are the only insight we could give to the controller about the desired posture of the robot. On the other hand, to provide a precise CoM trajectory would reduce the benefits and the efficiency of the MPC controller. After the completion of the step, we would like to have the CoM over the convex hull centroid. Nevertheless, this objective does not constraint the CoM trajectory during the step. Ideally, we could leave the optimizer to come out with an optimal trajectory, given initial and final conditions. For this reason, we decide to weight the traverse components (i.e. x and y) of the CoM position only in the terminal cost Eq.(10d) and after the step is made, Eq.(10b). On the other hand, the z−component of the CoM position is continuously weighted, allowing to have some authority over the CoM trajectory, useful to perform the step movement correctly. The CoM velocity and the angular momentum are always minimized. As a last term, we weight the requested reaction forces with the goal of avoiding impulsive responses. Additionally, having smooth references for the desired wrenches allows to reduce the tracking error introduced by the underlying momentum controller. E. References definition for the whole-body torque controller The outputs of the MPC strategy are used as references for the whole-body torque controller described in Section II-B. In particular, f ref = f, i.e. the contact wrenches are directly used as references. Regarding the desired joint positions qref j , we solve an inverse kinematics problem, viz., we impose a desired pose for the swing foot, following as close as possible the center of mass trajectory defined by the MPC controller. IV. MPC AS A STEP TRIGGER Let us consider the scenario where the robot has to perform a step because of an external perturbation. In principle we could leverage the presented MPC formulation to define the moment in which to start the motion, up to now considered as a datum. This moment can be defined as the instant where the controller is not able to bring the CoM back to a desired equilibrium point, given the present support configuration. In other words, the constraints related to the balancing configu- ration prevent the controller to recover from the push. Thus, it is necessary to step and change balancing configuration in order to avoid a fall. The availability of a prediction horizon particularly suits this idea. Hypothetically, if tf = ∞, as soon as the robot is pushed, we could predict whether or not the controller will be able to absorb the disturbance completely. In practice this is not true. The finiteness of the prediction horizon hides the actual recovery capabilities of the controller, thus it is still necessary to define an heuristic. By considering the very last predicted state, we can set the following condition: ∥xCoM(tf) −xd CoM∥+kv∥˙xCoM(tf) −˙xd CoM∥< ¯d. (11) When Eq.(11) is violated, the robot performs the step. Both kv and ¯d ∈R are user-defined parameter affecting the sensitivity to pushes. The smaller ¯d the more the robot will be inclined to take a step, while kv allows to regulate the relative importance of the two errors. Notice that Eq.(11) does not depend directly on the feet dimensions. In order to be effective, this heuristic needs the MPC strategy to be in charge of sending references to the robot, even if the step will not be performed at all. Indeed, Eq.(11) is based on a prediction which assumes that all the computed wrenches are directly applied on the system. Nevertheless, notice that here the goal is different from what has been presented in Sec. III-D. In a sense, the robot should do its best to avoid a step. As a consequence, looking at Eq.(10), ¯timp will be equal to tf (i.e. the step will not occur at all), while Kγ should be equal to Kimp γ . In practice, the cost function should resemble the case where the impact already happened, in order to weight correctly the whole state terms. V. QUADRATIC PROGRAMMING TRANSCRIPTION We solve the finite-horizon optimal control problem pre- sented in the previous section by using direct numerical methods. In particular we convert the original optimal control problem into a single Quadratic Programming (QP) problem to be solved at each time step. The general form of a QP problem is the following: minimize χ 1 2χ⊤Hχ + χ⊤g (12a) subject to: Aχ ≤b (12b) where χ ∈Rn is the set of optimization variables, H ∈ Rn×n a positive semi-definite matrix (the “Hessian” matrix), g ∈Rn the “gradient” vector, while A ∈Rng×n and b ∈Rng define the set of ng constraints applied to the optimization variables. A. Model transcription We discretize the model using forward Euler formulation. Different approaches may have been chosen, as described in [30], but we decided that Euler integration was suitable thanks to its simplicity. Discretizing Eq.(7), we obtain: γ(k + 1) = Evγγ(k) + Fγf(k) + Gγ + S0 γ (13) where Evγ = 19 + dt ˜Evγ, Fγ = dt ˜Fγ, Gγ = dt˜gγ, S0 γ = dt ˜S0 γ while k ∈N, k = 0, · · · , N −1 is the discrete time, N = ⌈tf/dt⌉and dt ∈R is the time step length. We can now define the optimization vector χ as the collection of all the states and control variables over the horizon. Formally: χ =   γ(1) f(0) ... γ(k) f(k −1) ... γ(N) f(N −1)   , χ ∈R9N+12N. (14) We thus opted for leaving the state γ as an optimization variable. This choice allows for an easy reformulation of the cost function, while preserving a particular sparse structure which may be useful for the optimizer. According to the receding horizon principle, at each time step, f(0) will be applied to the system, through a modified version of the balancing controller implemented on iCub, as presented in Sec. III-A. A crucial point to be considered, is the definition of the time instant at which the impact occurs. We assume the impact to be impulsive and taking place at the beginning of a time step which we denote with kimpact. We further assume that the control input f(k) is applied piecewise constantly, i.e. constant from instant k to k + 1. Note that, in view of the above two assumptions, setting kimpact = 0 implies that both feet are at contact already at the initial time k = 0. As mentioned previously, we do not directly model the distance between the swing foot and the ground, but we com- pute the position and the expected impact time at the begin- ning of the push strategy. The following procedure describes how we update the expected impact instance throughout the proposed MPC controller. Assuming the initial time instant to be always set to zero, we start with a value of kimpact equal to ⌈timpact/dt⌉. At each controller execution, we set the new value to max{kimpact−1, 1}, which implies that, if the impact has not occurred yet, we saturate kimpact to 1, i.e. we expect the impact to occur at the second time step in order to avoid requiring wrenches on a swinging limb. If the impact has occurred, instead, kimpact is correctly set to 0. Note that the formal definition of the impact time and the trajectory control of mechanical system with unilateral constraints (as it is our case), are still an open research problem [31], [32]. It is now possible to write the discretized dynamics as an equality constraints, transforming the initial continuous time optimal control problem into Eq.(12). By expanding Eq.(13) over the horizon we get the following constraint: Ev −eγ  χ = −G −Evγ0 −S0 (15) where Ev ∈R9N×21N and G, Evγ0, S0 ∈R9N×1. The matrix eγ ∈R9N×21N selects the terms related to state γ from vector χ, while Ev has a peculiar banded structure due to the choice of vector χ. In particular, different repetitions of Evγ and Fγ are included, replicating Eq.(13) for each time step in the control horizon. The dependency from γ(0) is considered through matrix Evγ0, constituted by only zeros, except from the first block of nine rows, which are equal to Evγ. Finally G considers the gravitational effects, while S0 contains the constant terms related to the angular momentum. A similar approach can be used to transcribe the con- straints presented in III-C: Ahor χ ≤bhor. (16) Matrices Ahor and bhor condense the constraints of Eq.(8) and Eq.(9) for each time instant. In order to avoid a varying number of constraints from one MPC execution to the other, we adopted a simple expedient. An upper and lower bound on the normal force fz are added, among friction and CoP constraints. In order to set the wrench on the swing foot to zero, it is simply necessary to set both the previously mentioned upper and lower bound to zero. All the other components of the wrench are automatically set to zero thanks to the friction and CoP constraints. Thus, it is simply necessary to correctly handle the terms inside bhor without changing the number of constraints. B. Cost function transcription The last stage of the transcription process involves the cost function of Eq.(10). As before, this process consists in a discretization stage followed by a reformulation phase necessary to express the cost function in terms of χ. This latter stage is omitted here, but it simply exploits the appro- priate extractor matrices, as it has been done for the model transcription through eγ. Starting from the first term, Eq.(10a), the discretization corresponds to (10a) ⇒Γγ = 1 2 N X k=1 γ(k) −γd(k) 2 Kγ (17) noticing that the cost is evaluated starting from 1, avoiding to consider the feedback γ(0). Similarly, the other contributions can be discretized as: (10b)+(10d) ⇒Γimp γ = = 1 2 N X k=¯kimp γ(k) −γd(k) 2 Kimp γ , (18a) (10c) ⇒Γf = 1 2 N−1 X k=0 f(k) 2 Kf , (18b) where ¯kimp is the minimum between N and kimpact, so that Γimp γ is composed by at least one term, as if there was a terminal cost like in Γ. We can exploit the simple structure of the cost function to further weight the difference of desired reaction forces between two subsequent time-steps, in order to avoid abrupt changes in the control input. This new cost term in the discrete case has the following form: Γdf = 1 2 N−1 X k=0 f(k) −f(k −1) 2 Kdf (19) with Kdf a suitable positive semi-definite matrix of gains and initialized with f(−1), considered as a datum, e.g. at the beginning it corresponds to the wrench necessary to satisfy the equilibrium of the robot momentum. Finally, the overall cost function Γ is: Γ = Γγ + Γimp γ + Γf + Γdf. (20) VI. SIMULATION RESULTS The presented MPC approach has been tested in the Gazebo simulator [33] by using the iCub model. The iCub humanoid robot [34] has been conceived to study develop- mental cognitive systems. It possesses 53 actuated joints, but only a total of 23 degrees of freedom (DoF) are used for balancing tasks, i.e. we do not consider those located in the eyes and in the hands. Driven by the need of fast prototyping, the presented con- troller has been developed using the MATLAB R ⃝/Simulink R ⃝ environment, taking advantage of WBToolbox library [35] and YARP Plugins [36] to establish connection with the simulator. In order to test the presented MPC scheme, we conceived a simple stepping scenario, where the robot, balancing on its left foot, will use the right foot to take a step. This simple scenario allows us to test the performance of the proposed controller with a single contact activation. We present the results of different pushes, applied on the traverse plane, with an angle with respect to the lateral axis (pointing to the right of the robot), of 20◦(Fig. 1a), −20◦ (Fig. 1b) and 45◦(Fig. 1c). We choose a time step of 10ms, coincident with the rate of the whole-body torque controller, and a controller horizon of N = 25. We noticed that the chosen value of N is sufficient to allow the effectiveness of the strategy when the robot is pushed from different directions. The push is nearly impulsive, applied on the chest with a magnitude around 100N, which is about one third of the robot weight force. Figure 1 shows the CoM evolution for the three exper- iments, i.e. with different directions of pushes. It can be noticed in both Figures 1a and 1c that two pushes occur. The first one does not violate the condition in Eq.(11), thus it does not force the robot to take a step. The second one, instead, triggers a change in the support feet configuration, and as a consequence, a new desired configuration for the CoM. Figure 2 represents the tracking of the desired vertical forces output by the MPC controller for the side-push experiment. Remarkably, the normal force on the right foot appears to be tracked also across the step. Figure 3 shows one of the benefits of torque control. The tracking of the joint position reference on the right knee undergoes a strong perturbation after the step. When hitting the ground, the intrinsic compliance introduced by torque control allows to absorb the impact, especially on this joint. This induce a peak of 30◦of tracking error, but the robot is still able to balance. In addition, torque control helps avoiding problems related to a not perfect placement of the swing foot before the impact. Summarizing, the presented controller allows the robot to recover from pushes of various intensity and directions, while remaining able to perform involved step movements. VII. CONCLUSIONS AND FUTURE WORK The proposed controller adopts a slightly approximated model of the robot linear and angular momentum dynamics in a predictive framework. It allows to take into account step movements by varying the structure of constraints and cost functions across the change of contacts configuration. The uncertainty on its actual time instant is considered by a “shift” in the prediction window. An heuristic is also employed to determine a condition for stepping. The contact wrenches are assumed to be control inputs and realized by the robot through a modified version of the iCub momentum- based whole-body torque controller. Taking [37] into a comparison, this approach avoids the definition of a CoM trajectory along the step, leaving the responsibility to the optimizer rather than to the designer. As a return, the proposed strategy presents higher robustness properties with the drawback of an increased computational complexity. Since this approach has been meant to provide reaction in real-time, efforts have to be payed in order to reduce the time needed to get a solution. At the present time it takes almost 0.1 seconds to solve an instance of the presented formulation on a machine running Ubuntu 16.04. The PC is equipped with a quad-core Intel R ⃝Core i5@2.30GHz and 16GB of RAM. MOSEK R ⃝is the selected solver, accessed through the MATLAB R ⃝interface CVX [38]. An optimized C++ version of this strategy is currently under development. In addition, for a practical implementation on the real robot, the delay between the feedback and the actual generation of references should be reduced as much 0 5 10 -0.2 0 0.2 0.4 0.6 (a) Side step 0 5 10 -0.2 0 0.2 0.4 0.6 (b) Back step 0 5 10 -0.4 -0.2 0 0.2 0.4 (c) Forward step Fig. 1: CoM evolution during three different experiments of push recovery. The single dashed lines in (a) and (c) show a not-enough-strong push to violate the condition in Eq. (11). The external push force has been applied, with respect to the lateral axis, at 20◦, −20◦and 45◦for the cases (a), (b) and (c) respectively. 0 5 10 0 100 200 300 400 Fig. 2: Vertical components of the contact forces during the side-step experiment. Measured forces are plotted with dashed lines, while desired forces with the solid lines. as possible. Alternatively, we might consider a one step delay for the application of the control action, directly in the model. An additional entry on the checklist of future improve- ments consists in providing the MPC controller with infor- mation about the leg kinematics. This would allow to insert the step peculiar characteristics (step duration and location) directly inside the optimization. REFERENCES [1] M. W. Spong, Control Problems in Robotics and Automation. Berlin, Heidelberg: Springer Berlin Heidelberg, 1998, ch. Underactuated mechanical systems, pp. 135–150. [2] J. Lygeros, C. Tomlin, and S. Sastry, “Hybrid systems: modeling, analysis and control,” preprint, 1999. [3] D. Mayne and H. Michalska, “Receding horizon control of nonlinear systems,” IEEE Transactions on Automatic Control, vol. 35, no. 7, pp. 814–824, Jul 1990. [4] G. De Nicolao, L. Magni, and R. Scattolini, “Stability and robustness of nonlinear receding horizon control,” in Nonlinear model predictive control. Springer, 2000, pp. 3–22. [5] D. Mayne, J. Rawlings, C. Rao, and P. Scokaert, “Constrained model predictive control: Stability and optimality,” Automatica, vol. 36, no. 6, pp. 789 – 814, 2000. 0 5 10 -100 -50 0 Fig. 3: Position tracking error of the right knee joint. The knee absorbs the impact with the ground, as it can be noticed by the peak in the error right after the impact. [6] M. Lazar, W. Heemels, S. Weiland, and A. Bemporad, “Stabilizing model predictive control of hybrid systems,” IEEE Transactions on Automatic Control, vol. 51, no. 11, pp. 1813–1818, 2006. [7] A. Bemporad, W. M. H. Heemels, and B. De Schutter, “On hybrid sys- tems and closed-loop MPC systems,” IEEE Transactions on Automatic Control, vol. 47, no. 5, pp. 863–869, 2002. [8] S. Kajita, F. Kanehiro, K. Kaneko, K. Yokoi, and H. Hirukawa, “The 3d linear inverted pendulum mode: a simple modeling for a biped walking pattern generation,” in Intelligent Robots and Systems, 2001. Proceedings. 2001 IEEE/RSJ International Conference on, vol. 1, 2001, pp. 239–246 vol.1. [9] P.-B. Wieber, “Trajectory free linear model predictive control for stable walking in the presence of strong perturbations,” in 2006 6th IEEE- RAS International Conference on Humanoid Robots. IEEE, 2006, pp. 137–142. [10] H. Diedam, D. Dimitrov, P.-B. Wieber, K. Mombaur, and M. Diehl, “Online walking gait generation with adaptive foot positioning through linear model predictive control,” in 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2008, pp. 1121– 1126. [11] M. Missura and S. Behnke, “Balanced walking with capture steps,” in Robot Soccer World Cup. Springer, 2014, pp. 3–15. [12] B. J. Stephens and C. G. Atkeson, “Push recovery by stepping for humanoid robots with force controlled joints,” in Humanoid Robots (Humanoids), 2010 10th IEEE-RAS International Conference on. IEEE, 2010, pp. 52–59. [13] J. Pratt, J. Carff, S. Drakunov, and A. Goswami, “Capture point: A step toward humanoid push recovery,” in Humanoid Robots, 2006 6th IEEE-RAS International Conference on, Dec 2006, pp. 200–207. [14] J. Pratt, T. Koolen, T. De Boer, J. Rebula, S. Cotton, J. Carff, M. Johnson, and P. Neuhaus, “Capturability-based analysis and con- trol of legged locomotion, part 2: Application to M2V2, a lower body humanoid,” The International Journal of Robotics Research, p. 0278364912452762, 2012. [15] J. Englsberger, C. Ott, M. A. Roa, A. Albu-Sch¨affer, and G. Hirzinger, “Bipedal walking control based on capture point dynamics,” in 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2011, pp. 4420–4427. [16] M. Krause, J. Englsberger, P.-B. Wieber, and C. Ott, “Stabilization of the capture point dynamics for bipedal walking based on model predictive control,” IFAC Proceedings Volumes, vol. 45, no. 22, pp. 165–171, 2012. [17] H. Dai, A. Valenzuela, and R. Tedrake, “Whole-body motion planning with centroidal dynamics and full kinematics,” in 2014 IEEE-RAS International Conference on Humanoid Robots. IEEE, 2014, pp. 295–302. [18] A. Herzog, N. Rotella, S. Schaal, and L. Righetti, “Trajectory gen- eration for multi-contact momentum control,” in Humanoid Robots (Humanoids), 2015 IEEE-RAS 15th International Conference on. IEEE, 2015, pp. 874–880. [19] J. Carpentier, S. Tonneau, M. Naveau, O. Stasse, and N. Mansard, “A versatile and efficient pattern generator for generalized legged locomo- tion,” in Robotics and Automation (ICRA), 2016 IEEE International Conference on. IEEE, 2016, pp. 3555–3561. [20] S. Caron and A. Kheddar, “Multi-contact walking pattern genera- tion based on model preview control of 3D COM accelerations,” in Humanoid Robots (Humanoids), 2016 IEEE-RAS International Conference on, Nov 2016. [21] H. Dai and R. Tedrake, “Planning robust walking motion on uneven terrain via convex optimization,” in Humanoid Robots (Humanoids), 2016 IEEE-RAS International Conference on, Nov 2016. [22] B. Ponton, A. Herzog, S. Schaal, and L. Righetti, “A convex model of momentum dynamics for multi-contact motion generation,” in Humanoid Robots (Humanoids), 2016 IEEE-RAS International Con- ference on, Nov 2016. [23] F. Nori, S. Traversaro, J. Eljaik, F. Romano, A. Del Prete, and D. Pucci, “iCub whole-body control through force regulation on rigid noncoplanar contacts,” Frontiers in Robotics and AI, vol. 2, no. 6, 2015. [24] D. Pucci, F. Romano, S. Traversaro, and F. Nori, “Highly dynamic balancing via force control,” in IEEE-RAS International Conference on Humanoid Robots (Humanoids),. IEEE, 2016, pp. 141–141. [25] G. Nava, F. Romano, F. Nori, and D. Pucci, “Stability analysis and design of momentum-based controllers for humanoid robots,” Intelli- gent Robots and Systems (IROS) 2016. IEEE International Conference on, 2016. [26] B. Siciliano, L. Sciavicco, L. Villani, and G. Oriolo, Robotics: Mod- elling, Planning and Control, 2009. [27] S. Feng, X. Xinjilefu, C. G. Atkeson, and J. Kim, “Robust dynamic walking using online foot step optimization,” in Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference on. IEEE, 2016, pp. 5373–5378. [28] O. E. Ramos, N. Mansard, and P. Soueres, “Whole-body motion integrating the capture point in the operational space inverse dynamics control,” in Humanoid Robots (Humanoids), 2014 14th IEEE-RAS International Conference on. IEEE, 2014, pp. 707–712. [29] A. Herdt, H. Diedam, P.-B. Wieber, D. Dimitrov, K. Mombaur, and M. Diehl, “Online walking motion generation with automatic footstep placement,” Advanced Robotics, vol. 24, no. 5-6, pp. 719–737, 2010. [30] J. Betts, Practical Methods for Optimal Control and Estimation Using Nonlinear Programming. Society for Industrial and Applied Mathematics, 2010. [31] M. Rijnen, A. Saccon, and H. Nijmeijer, “On optimal trajectory tracking for mechanical systems with unilateral constraints,” in 2015 54th IEEE Conference on Decision and Control (CDC). IEEE, 2015, pp. 2561–2566. [32] J. B. Biemond, N. van de Wouw, W. M. H. Heemels, and H. Nijmeijer, “Tracking control for hybrid systems with state-triggered jumps,” IEEE Transactions on Automatic Control, vol. 58, no. 4, pp. 876–890, 2013. [33] N. Koenig and A. Howard, “Design and use paradigms for gazebo, an open-source multi-robot simulator,” Intelligent Robots and Systems, 2004. (IROS 2004). Proceedings. 2004 IEEE/RSJ International Con- ference on, pp. 2149 – 2154, 2004. [34] G. Metta, G. Sandini, D. Vernon, D. Caldwell, N. Tsagarakis, R. Beira, J. Santos-Victor, A. Ijspeert, L. Righetti, G. Cappiello, et al., “The RobotCub project-an open framework for research in embodied cog- nition,” Humanoids Workshop, IEEE–RAS International Conference on Humanoid Robots, December, 2005. [35] F. Romano, S. Traversaro, D. Pucci, A. D. Prete, J. Eljaik, and F. Nori, “A whole-body software abstraction layer for control design of free-floating mechanical systems,” in 2017 IEEE 1st International Conference on Robotic Computing, 2017. [36] E. Mingo, S. Traversaro, A. Rocchi, M. Ferrati, A. Settimi, F. Romano, L. Natale, A. Bicchi, F. Nori, and N. G. Tsagarakis, “Yarp based plugins for gazebo simulator,” in 2014 Modelling and Simulation for Autonomous Systems Workshop (MESAS), Roma, Italy, 5 -6 May 2014, 2014. [37] S. Dafarra, F. Romano, and F. Nori, “Torque-controlled stepping- strategy push recovery: Design and implementation on the iCub Humanoid robot,” in Humanoid Robots (Humanoids), 2016 IEEE-RAS International Conference on, Nov 2016. [38] I. CVX Research, “CVX: Matlab software for disciplined convex programming, version 2.0,” Aug. 2012.