A Convex Model of Humanoid Momentum Dynamics for Multi-Contact Motion Generation Brahayam Ponton1, Alexander Herzog1, Stefan Schaal12 and Ludovic Righetti1 Abstract— Linear models for control and motion generation of humanoid robots have received significant attention in the past years, not only due to their well known theoretical guar- antees, but also because of practical computational advantages. However, to tackle more challenging tasks and scenarios such as locomotion on uneven terrain, a more expressive model is required. In this paper, we are interested in contact interaction- centered motion optimization based on the momentum dynam- ics model. This model is non-linear and non-convex; however, we find a relaxation of the problem that allows us to formulate it as a single convex quadratically-constrained quadratic program (QCQP) that can be very efficiently optimized. Furthermore, experimental results suggest that this relaxation is tight and therefore useful for multi-contact planning. This convex model is then coupled to the optimization of end-effector contacts location using a mixed integer program, which can be solved in realtime. This becomes relevant e.g. to recover from external pushes, where a predefined stepping plan is likely to fail and an online adaptation of the contact location is needed. The performance of our algorithm is demonstrated in several multi- contact scenarios for a humanoid robot. I. INTRODUCTION One of the major challenges that a humanoid robot faces when walking or running is keeping its balance. The diffi- culty arises from the fact that to move its Center of Mass (CoM) in a direction other than that of gravity, it needs to generate external forces by dynamically interacting with the environment through the creation of intermittent physical contacts. However, these contact forces, that allow to gen- erate and control locomotion, are limited by the mechanical laws of unilateral contact [3]: feet can only push and not pull on the ground. This means that arbitrary motions are not possible, and therefore being able to continuously answer questions such as: where to place the feet, how hard to push, or in which direction to move the body? are important for generating safe and stable motions, even more in the case of uneven terrain or strong perturbations. On one hand, simplified models (usually based on re- searchers’ intuition about the simplified dynamics of hu- manoid robots, such as the linear inverted pendulum model (LIPM) [15], [21]) provide an answer to these questions by formulating an optimal control problem, that exploits the linearity of the model and can be repeatedly solved in a receding horizon fashion as a quadratic program [9], [25]. This research was mainly supported by the Max-Planck-Society, the European Research Council (ERC) under the European Union Horizon 2020 research and innovation programme (grant agreement No 637935) and the Max-Planck ETH Center for Learning Systems. 1Max Planck Institute for Intelligent Systems - Autonomous Motion De- partment, Tuebingen, Germany firstname.lastname@tuebingen.mpg.de 2University of Southern California - Computational Learning and Motor Control Lab, Los Angeles, USA This model predictive control scheme endows robustness to the control strategy by bringing viability as a side effect of the optimization [18] and has made these approaches very successful in controlling locomotion of bipedal robots [10], [22]. However, simplified models have limitations, because of its assumptions such as co-planar footsteps, constant CoM height, zero angular momentum, among others, which might be undesirable in more dynamic maneuvers. On the other hand, the benefits of using a full dynamics model have been demonstrated with the optimization of more complex behaviors as in [19], [23]. While these approaches generate sophisticated whole body behaviors by making use of full rigid body dynamics models and simultaneous optimization of contact forces and robot motion, they are computationally expensive. Besides, in these formulations, the optimization landscape is very high-dimensional, prone to local-minima, non-convex and discontinuous due to contacts, characteristics that make the problem very hard to optimize. Between these two extremes, there is a full range of model choices. An interesting approach is [13], [7], where a full kinematics model and the centroidal momentum dynamics model are combined. In this approach, the momentum gen- erated kinematically and dynamically match, ensuring equiv- alence to the full dynamics model, provided that there exists enough torque authority. [13] further shows that only the momentum equations are necessary to reason about dynam- ics, which allows to solve the problem iteratively between kinematics and dynamics optimization. It is also shown that the optimal control problem using the centroidal momentum dynamics can be solved very efficiently (by exploiting the problem structure) as a sequence of convex QCQPs. [6], [11] show experimentally the utility of controlling momentum to stabilize and generate dynamic motion in a humanoid robot. Other interesting line of research is the use of mixed- integer programs to optimize not only continuous postural adjustments and contact forces, but also discrete changes in the contact state. [14] shows the benefits of simultaneous adaptation of gait pattern and posture in a humanoid walking on flat ground. This work considers a simplified dynamics model (LIPM) to find footsteps that help the robot to main- tain stability and locomote. [8] does not consider a dynamics model, however it presents a method for footsteps planning on uneven terrain with obstacles using a mixed-integer quadratically-constrained quadratic program (MIQCQP). In- teger decision variables are used to select out of a set of safe regions, the region over which to step. Another unique feature is the piecewise affine approximation of rotation, which keeps the form of the problem as a MIQCQP and makes it efficiently solvable to its global minimum. arXiv:1607.08644v1 [cs.RO] 28 Jul 2016 In this paper, we are interested in planning: CoM mo- tion and momentum, contact interaction forces, and a short sequence of contacts consistent with the desired dynamic motion (Fig. 1). Specifically our contributions are: 1) We use the result presented in [13] (namely, that the torque contribution of each end-effector to the angular momentum rate can be analytically decomposed into a convex and a concave part) to find a convex relaxation of the angular momentum dynamics. This allows us to formulate the problem as a single convex QCQP. Our approach significantly improves computational complexity and can solve the momentum optimization problem in realtime. Moreover, experimental results suggest that the relaxation is tight: solutions of the relaxed convex problem correspond to the global opti- mum of the original, non-convex, problem. 2) Using our convex model, we extend [8] by including a dynamic model and hand contacts. It allows to plan together contact locations, contact forces and momen- tum dynamics. Therefore, the contact plan is not blind to the dynamic evolution of the robot. Furthermore, it is fast enough to be used online for a short preview sequence of contacts. The remainder of this paper is structured as follows. In Sec. II, we present the problem formulation. Then, in Sec. III, we show how to obtain a convex model of the momentum dynamics and present the extension of the contact planner to incorporate a dynamics model. We show experimental results in Sec. IV and conclude the paper in Sec. V. II. PROBLEM FORMULATION The dynamic model of a floating-base rigid body system is H(q)¨q+C(q, ˙q) = STτj +JT e λ , where q = qT j xTT denotes the robot state. x ∈SE(3) is the position and orientation of the floating base frame of the robot with respect to the inertial frame and qj ∈Rnj are joints positions. H(q) ∈Rn j+6×n j+6 is the inertia matrix; C(q, ˙q) ∈ Rn j+6 the vector of Coriolis, centrifugal, gravity forces; S = In j×n j 0 ∈Rn j×n j+6 the selection matrix and represents the under-actuation of the system. τ j ∈Rn j is the vector of joint torques; Je the Jacobian of the contact constraints and λ = ··· fe τe ··· the vector of generalized forces, composed of forces fe and torques τe acting at contact e. The equations of motion can be decomposed into an actuated a and an un-actuated u part as follows: Ha(q)¨q+Ca(q, ˙q) = τj +JT e,aλ (1a) Hu(q)¨q+Cu(q, ˙q) = JT e,uλ (1b) Equation (1b) can be interpreted as the Newton-Euler equations of the system. It expresses the change of momen- tum of a robot as a function of external forces. Under the assumption of enough torque authority, any combination of forces λ and accelerations ¨q can be realized, as shown by the actuated part of the equations of motion (1a), if they are consistent with the underactuated dynamics [24], [11]. This suggests a natural decomposition for planning dynamically Fig. 1: Capabilities of the proposed algorithm. consistent multi-contact motions for legged robots: the mo- mentum equations are sufficient to ensure dynamic feasibility and Equation (1a) is only necessary to ensure kinematic feasibility and torque limits [12]. The centroidal dynamics equations, when expressed at the robot CoM, are: ˙h =   ˙r ˙l ˙k  =   1 Ml Mg+∑e fe ∑e(pe +ze −r)×fe +τe   (2) where r, l and k denote the CoM position, linear and angular momenta, respectively. M is the robot total mass and g the gravity vector. ze is the center of pressure (CoP) position within the end-effector support region with respect to the point pe, denoting the position of the e end-effector. Addi- tionally, these dynamics are subject to physical constraints such as friction cones, CoPs within its support region in order to avoid tilting and torque limits. In this paper, we concentrate on finding a convex formu- lation of the centroidal momentum dynamics (2), to plan optimal dynamic motions and contact locations in realtime, which could then be realized by a low-level controller such as an inverse dynamics one [12]. Formally, we would like to find a solution that min pe,ze,fe,τe φN(hN)+ N−1 ∑ t=1 ℓt(h,pe,ze,fe,τe)∆t (3) minimizes the sum of a terminal cost φN(hN) and a running cost ℓt(h,pe,ze,fe,τe), as will be defined later, over the avail- able controls (namely contact locations pe, CoP locations ze, and contact wrenches fe,τe), under the discretized centroidal momentum dynamics:   rt lt kt ˙lt ˙kt   =   rt−1 + ∆t M lt lt−1 +˙lt∆t kt−1 + ˙kt∆t Mg+∑e fe,t ∑e κe,t   (4) where the variable κe,t (end-effector contribution to angular momentum rate ˙kt) has been defined as κe,t = (pe,φ(t) +ze,t −rt)×fe,t +τe,t = ℓe,t ×fe,t +τe,t =   0 −ℓz e,t ℓy e,t ℓz e,t 0 −ℓx e,t −ℓy e,t ℓx e,t 0     f x e,t f y e,t f z e,t  +τe,t (5) For notational simplicity we used the change of variable ℓe,t = (pe,φ(t) + ze,t −rt). The variable φ(t) has been intro- duced to denote that the end-effector position pe,φ(t) remains fixed during a phase or sequence of predefined time steps. Physical constraints such as friction cone, CoP within region of support and torque limits are given by Lf x e,t + Lf y e,t ≤µf Lf z e,t, Lf z e,t ≥0, (6a) Lz x e,t ∈[z x min,z x max], Lz y e,t ∈  z y min,z y max  , (6b) Lτz e,t ≤µτ Lf z e,t (6c) where the left super-script L, denotes that the variables are expressed in local coordinate frames. (6a) expresses that forces belong to a friction cone with coefficient µf. (6b) expresses that the CoP should be within a conservative region with respect to the real physical available region. (6c) constraints the torque to a cone with torsional coefficient µτ [5]. Torques Lτx e,t and Lτy e,t in local coordinate frames are zero. Local variables are mapped to the inertial frame through an appropriate rotation matrix Re,t. We use right super-script on the rotation matrix to denote a particular set of its columns. fe,t = Re,t Lfe,t τe,t = Rz e,t Lτz e,t ze,t = Rx,y e,t Lzx,y e,t (7) Friction cones are usually approximated by a set of hy- perplanes in pyramid shape. However, as we will write the dynamics as a convex QCQP, we do not require this approximation and keep the friction cone as a second order cone constraint on the contact forces. The only non-convexity in the described dynamics model comes from the variable κe,t, which will be our focus in the next section. III. APPROACH A. Reformulation of the dynamics Due to the practical and theoretical efficiency of solving a linear (LP) or a quadratic program, they have been exploited in many previous works, such as those where the angular momentum dynamics are neglected or the CoM height is kept constant ([1], [9], [10], [14], [15] to name a few). With few exceptions, the usual approach when the model complexity increases is to directly resort to general nonlinear solvers such as sequential quadratic programming or interior point methods, which might or might not find a solution for the non-convex problem [6], [19], and also importantly, if they find one, the required time is not comparable to our formulation. In this paper, we find a convex approximation of the angular momentum dynamics, that allows to find solutions very efficiently. We will start by using the result presented in [13] that formulates the angular momentum 1 0 x x Ty -1 -1 0 y 1 -1 -0.5 0 0.5 1 Nonconvex 1 0 x -Q- Q+ -1 -1 0 y 1 0.5 0 -0.5 -1 1 Difference of Convex Fig. 2: Difference of convex functions decomposition. dynamics as a difference of convex functions, and then we will analyze different alternatives to cope with the non- convex term. Difference of Convex Functions Decomposition: The set of difference of convex functions C ± can be defined as: C ± =  C +(x)−C −(x) | x ∈Rn, C +,C −: Rn →R, are convex functions  This set is dense in the space of continuous functions [4], which means that it can approximate with arbitrary accuracy any continuous function. Of particular interest are functions ∈C ± that can be analytically decomposed, as is the case of scalar and cross products. Figure 2 shows an example decomposition of a scalar product xTy = Q+ −Q−, where Q+ = 1 4 ∥x+y∥2 , and Q−= 1 4 ∥x−y∥2 Using the example decomposition, it is easy to express each element of a cross product as an element of C ±. First, each element is defined as a scalar product ℓ×f =   −ℓz ℓy  | {z } aTcvx ′ dcvx ′ z}|{  fy fz  , ℓz −ℓx  | {z } bTcvx ′ ecvx ′ z}|{  fx fz  , −ℓy ℓx  | {z } cTcvx ′ fcvx ′ z}|{  fx fy    T (8) and then each scalar product is defined as an element of C ±. ℓ×f = α 4   acvx +dcvx 2 − acvx −dcvx 2 bcvx +ecvx 2 − bcvx −ecvx 2 ccvx + fcvx 2 − ccvx −fcvx 2   (9) Remark: When performing the decomposition, it is impor- tant not to forget the underlying physics of the system at hand. While the decomposition is mathematically correct and exploits the structure of the problem, the units of each of the terms of the decomposition do not match. Therefore, normalizing its variables is not only aesthetically pleasing, but also improves the conditioning of the problem, on which rate and region of quadratic convergence of Newton’s method depend. Remember that the decomposition squares forces and lengths, which further increases their ratio. In this problem, e.g. the variables a ′ cvx and d ′ cvx are defined in terms of lengths and forces, respectively. However, the variables acvx and dcvx are defined in terms of normalized lengths and forces, respectively. The numerical constant α encodes the normalization of lengths by the nominal length of end-effectors (shoulder to hand distance for hands, and CoM to foot distance for feet) and forces by gravity. Now that the cross product has been decomposed, we will analyze alternatives to cope with its non-convexity C −. We will first introduce the standard method and then show how it can be improved for our purposes. 1) Iterative Linearization: This is the standard method to solve problems with constraints ∈C ±, and the one used in [13]. It reformulates the non-convex constraint as a convex one, by using a first-order Taylor expansion of the concave term evaluated at the current guess of the optimal vector. In the case of equality constraints, a slack variable is used to relax it to an inequality constraint, and also a penalty term over the slack variable is added to the cost. This procedure is performed iteratively until convergence to a solution. In a simple example, this procedure would look as follows: min x∈Rn f(x) s.t. g(x) ≤0, h(x) = 0, x ∈Ω where f(x) is the objective to be minimized, g(x) is an in- equality constraint, h(x) ∈C ± = h+(x)−h−(x) is an equality constraint, and Ωis the feasible set, including upper and lower bounds on variables. Then, the problem to be solved iteratively would take the form: min x∈Rn f(x)+ µ∑si s.t. g(x) ≤0, x ∈Ω, si ≥0 h+(x)− h−(xk)+H− k (x−xk)  ≤si where µ is a penalty over the slack variables si (if more than one constraint needed to be relaxed), and H− k = ∇x h−(xk) is the gradient of the concave term. To sum up, this method approximates at each iteration the concave term C −by a locally valid hyperplane, and neglects its second-order infor- mation (source of non-convexity). It is important to note that it does not neglect the entire concave part, because this would drastically change the constraint, instead it approximates it locally by a convex form. In the following, we present a new alternative procedure, that does not require to solve a sequence of convex programs, but a single one. 2) Semidefinite Programming (SDP) Approach: In this section, we present our first idea at trying to reformulate the non-convexity of the problem. The main idea is to upper bound the positive and negative definite components of the angular momentum dynamics. These upper bounds would then linearly define the angular momentum rate ˙k and would belong to a convex set. It is well known that, under some mild assumptions, convex problems can be solved efficiently in theory. However, convexity alone is not enough to guarantee the existence of efficient solution algorithms in practice. One type of convex set, namely the positive semidefinite cone Sn +, is amenable for efficient optimization and commonly used to compute lower bounds of integer problems. In our setting, we will use SDP to compute every time step upper bounds for the positive and negative definite components of the angular momentum dynamics. In this way, the dynamics would be linear, and the upper bounds would be elements of the cone Sn +. For simplicity of presentation, we denote p = acvx + dcvx and q = acvx −dcvx (p,q ∈R2, see (9)) . Using this notation the x component of the torque contribution of an end-effector to the angular momentum rate dynamics (5) becomes κx = α 4  pTp−qTq  +τx , which is equivalent to the following formulation κx = α 4  Tr ppT −Tr qqT +τx , using the invariance of the trace under cyclic permutations: Tr ppT = Tr pTp  = pTp . At this point, we introduce the variables P, Q ∈S2 + (they are 2 × 2 positive semidefinite matrices, which means that e.g. xTPx ≥0) and perform the following change of variables κx = α 4  Tr P −Tr Q +τx (10a) where : P = ppT, Q = qqT (10b) Now, we can point out exactly to the non-convexity and know its shape. Equation (10b) means that the feasible set is a cone surface (which is non-convex). The convex relaxation consist in using as feasible set the convex hull of this cone surface. This means that (10b) becomes: ppT ⪯Sn+ P, qqT ⪯Sn+ Q , (11) In our specific problem, we can interpret the convex relax- ation as follows: We have introduced new variables P and Q, whose traces are upper bounds of pTp and qTq (that represent the positive and negative definite components of the contribution of each end-effector to the angular momentum rate). Then our work is to find conditions under which the gap (value difference between Tr P and Tr Q and the scalar products pTp and qTq, respectively) is as small as possible. Making the gap between the upper bounds and the actual quantities zero means that the constraints in the non- convex problem are also satisfied exactly. In our problem, this would mean that the values of the torque contribution of each end-effector κe,t computed using the difference of upper bounds and using cross products of lengths and forces match. We defer this for the next section and will concentrate now on analyzing the current formulation. In summary, we have relaxed the equality constraints to (ppT ⪯Sn+ P, qqT ⪯Sn+ Q), or equivalently P −ppT ∈Sn + ⇔ xT(P −ppT)x ≥0 and Q −qqT ∈Sn + ⇔xT(Q −qqT)x ≥ 0. If we were to actually solve the problem using this formulation, the convex approximation of the end-effector torque contribution to the angular momentum rate dynamics using linear matrix inequalities (LMI) would be κx = α 4  Tr P −Tr Q +τx (12a) P p pT 1  ⪰0, Q q qT 1  ⪰0 (12b) The transformation of inequalities (11) to (12b) is per- formed using Schur’s complement. What we can notice under the current formulation is that we have introduced new optimization variables P and Q, which happen to be matrices, and it seems that we have only inflated the problem. This is true indeed; however, we have learned that • A convex approximation of the angular momentum rate dynamics can be found using upper bounds of its positive and negative definite components. • We are approximating a scalar quantity (either pTp or qTq), therefore using a matrix for it is unnecessary. A scalar quantity would suffice our purposes. • Positive semidefinite cones are too general, there are other cones, special cases of Sn + for example, that we could use and more amenable to faster optimization. 3) Quadratically-Constrained Quadratic Program Ap- proach: As mentioned in the last subsection, SDP includes e.g. as special cases LP (when the symmetric matrices in- volved are diagonal) or SOCP (when the symmetric matrices have an arrow form) [2]. In this subsection, we simplify the formulation of the previous subsection to a convex quadratic inequality constraint. By applying the linear trace operator on the previously defined inequalities (11), we would get Tr(ppT) ≤Tr(P) , Tr(qqT) ≤Tr(Q) pTp ≤Tr(P) , qTq ≤Tr(Q) pTp ≤¯p , qTq ≤¯q where we have introduced the scalar variables ¯p, ¯q ∈R+ as the upper bounds of the positive and negative definite components of our dynamics constraint, which becomes κx = α 4 ¯p−¯q +τx where : pTp ⪯R+ ¯p, qTq ⪯R+ ¯q Using this idea, the angular momentum dynamics of our problem can be reformulated as convex quadratic constraints: κe,t = ℓe,t ×fe,t +τe,t = α 4   u+ x −u− x u+ y −u− y u+ z −u− z   e,t +τe,t = α 4  U+ e,t −U− e,t  +τe,t (13) where: acvx +dcvx 2 e,t ⪯R+ u+ x e,t, acvx −dcvx 2 e,t ⪯R+ u− x e,t, bcvx +ecvx 2 e,t ⪯R+ u+ y e,t, bcvx −ecvx 2 e,t ⪯R+ u− y e,t, ccvx + fcvx 2 e,t ⪯R+ u+ z e,t, ccvx −fcvx 2 e,t ⪯R+ u− z e,t. (14) where U+ e,t and U− e,t are upper bounds for each end-effector e and time-step t of the positive and negative definite components of the end-effector torque contribution κe,t to the angular momentum rate dynamics ˙kt. Notice that, in this formulation, differently from the iterative linearization, the concave part is not limited to take values within a hyperplane, but on the exact quadratic function, which is then upper bounded using a convex quadratic inequality constraint, instead of a LMI as in the SDP case. B. Cost function and summary In this subsection, we define the cost function to be optimized and summarize the optimization problem. The running cost ℓd t of the dynamics optimization is given by: ℓd t =∥lt∥2 Ql +∑ e U+ t,e 2 Qk + U− t,e 2 Qk + ∑ e ∥fe,t∥Qfe + Lτz e,t Qτe + Lzx,y e,t Qze (15) were a cost of the form ∥x∥2 Qx represents a quadratic cost xTQxx, with Qx a positive semi-definite matrix. This cost penalizes high forces fe,t and torques Lτz e,t, deviations of the CoP Lzx,y e,t from the end-effector position pe,φ(t) (such that it stays close to the center of the support region). This cost also includes a capturability penalty by penalizing a derivative of the CoM position [18], namely, the linear momentum lt. The terminal cost φN(hN) is usually defined as the point where we would like to be at the end of the time horizon or a velocity we would like to track. Remark In (15), we did not include directly a cost over the angular momentum kt (e.g. ∥kt∥2 Qk), because this variable is defined in terms of its rate ˙kt; which in turn is defined as a sum of per end-effector upper bounds of positive U+ e,t and negative U− e,t definite components (13); therefore, if we were to penalize either kt or ˙kt directly (e.g. ˙kt 2 Q˙k), they could be made trivially zero by the upper bounds (U+ e,t, U− e,t) taking any value higher than the scalar product they upper bound. In this case, there would be a gap between upper bounds and the actual positive and negative definite components defined by the scalar products (14). Therefore, in practice we penalize angular momentum indirectly by adding a cost over the upper bounds (U+ e,t and U− e,t), which is a penalization over the contribution to angular momentum rate per end- effector, separately for its positive and negative definite part. In this case, we have empirically found that the gap is, up to numerical precision, zero. Intuitively, you can think of the upper bounds as free variables that can take any value higher than the quantity they upper bound. However, as they are penalized, they will not be higher than needed by the angular momentum and will actually be on the cone surface, making the approximation tight. To ensure physical consistency at execution time, we also include the constraint pe,φ(t) −rt ≤ℓmax e , (16) that constraints the distance from the CoM to the end- effector position by the maximum end-effector length ℓmax e (previously called nominal length, used for normalization). This constraint holds as is for feet, but in the next section, we will show how it is adapted for hands. The following convex QCQP program summarizes the optimization problem assuming a fixed set of contacts: minimize φN(hN)+∑ t ℓd t ∆t subject to (4),(6),(7),(8),(13),(14),(16) ∀t,e . (17) The optimization variables are: CoM rt, linear momentum and its rate lt, ˙lt, angular momentum and its rate kt, ˙kt, wrenches in local and inertial coordinate frames fe,t, τe,t, Lfe,t, Lτz e,t, CoP in local and inertial coordinate frames ze,t, Lzx,y e,t , upper bounds U− e,t, U+ e,t, and per time-step and per end- effector auxiliary variables acvx, bcvx, ccvx, dcvx, ecvx, fcvx. From (8), we only use the definition of auxiliary variables acvx, bcvx, and ccvx in terms of lengths ℓe,t, and dcvx, ecvx, and fcvx in terms of forces in world coordinate frame. They are appropriately normalized, therefore, the primed variables can be replaced by non-primed ones (e.g a ′ cvx →acvx). Equation (15) defines the running cost ℓd t . C. Contacts planning Up to this point, we have described an optimization problem able to efficiently plan CoM motion and interaction forces of the robot with the environment given a plan of non-coplanar contacts. This optimization problem, formally described as a convex QCQP, can be easily embedded in the footstep planner algorithm [8]. The main adaptation would be in the definition of the cost, where the goal would no longer be finding a possibly large sequence of footsteps such that a desired final feet configuration (position and orientation) is achieved, but instead finding a short sequence of contacts that support the achievement of a dynamic motion. The optimization problem would be to minimize the cost given by ℓd t (15) plus a regularization of the distance between footsteps under the constraints imposed by the dynamics model (17) and the contacts planner [8]. Of course, a simplification of the contacts planner constraints is possible, given that its purpose differs from the original formulation. Exploiting the fact that, in this formulation the CoM of the robot is a decision variable, we make an extension of the algorithm, by including in the optimization, the search for contacts using hands. In the following, we briefly introduce algorithm [8] and its extension. 1) Original formulation: In [8], the terrain description consists of a set of convex, obstacle free regions r ∈{1,R}, and the optimization considers a sequence of j = φ(t) ∈ {1,n} footsteps. For each footstep j, a piecewise affine (PWA) approximation of sine and cosine is used in order to handle footstep rotation (see Fig.3). Su,j =⇒ ( φu ≤θj ≤φu+1 sj = guθj +hu u = 1,...,U Cv, j =⇒ ( φv ≤θ j ≤φv+1 cj = gvθ j +hv v = 1,...,V u ∈{1,U} and v ∈{1,V} are the number of linear functions used in the approximation. Su,j and Cv, j are integer decision variables that define the active sine and cosine linear function for footstep j. The intersection of θj ∈[φu,φu+1] and θj ∈ [φv,φv+1] is the region of validity of the approximation and, sj = guθj +hu and cj = gvθj +hv are the linear approxima- tions. Another set of integer decision variables Hr, j defines the region r ∈{1,R}, whose domain contains footstep j Hr, j =⇒Arpj ≤br r = 1,...,R , where the inequality constraint Arpj ≤br, constraints the footstep position pj to not only belong to the epigraph defined by region r, but also to be lying on it, because all footsteps in the plan are active. Additionally, integrality θ -3 -2 -1 0 1 2 3 -1 -0.5 0 0.5 1 guθj + hu gvθj + hv Cos(θ) Sin( θ) PWA Fig. 3: Piecewise affine approximation of sine and cosine. constraints are imposed over the integer decision variables, which renders the optimization efficient: ∑r Hr, j = ∑u Su, j = ∑v Cv, j = 1 Hr, j, Su, j, Cv, j ∈{0,1} This constraints each footstep to belong only to one region and to use only one linear model to approximate the rotation. Finally, another remarkable feature of [8], is the expression of reachability constraints between footstep positions as the intersection of SOCP constraints: px j py j  − px j−1 py j−1  +  cj −sj sj cj  ri  ≤di, i = 1,2 , where ri ∈R2 for i = 1,2 are distances in opposite direction from the last end-effector position pj−1 and can also be rotated by θ. They define new points, whose distance to the next footstep position pj cannot exceed di. 2) Extension to hand end-effectors: In the following, we present the extension of the algorithm, consisting in defining reachability and safe region constraints for hands, and including an additional set of integer decision variables that define the activation of hand end-effectors. Hands, in the same way as feet, are also constrained to belong to a unique safe region r at each phase j; however, as they cannot always be in contact, for them belonging to a safe region has the meaning of being in the epigraph of the region. Let’s denote by nr the outward-pointing normal of the surface of a region, by sr any point on the surface, and by Ah r, bh r the hyperplanes that define the borders of the region. For a hand the safe region constraint is defined as: Arpj ≤br where: Ar =  Ah r −nr  , and br =  bh r −nT r sr  , while for a foot, it also contains the constraint nT r pj ≤nT r sr, that constraints the contact to be lying on the surface. To define reachability constraints for hands, we introduce an additional vector rtr, defined as the difference between the position of the upper torso and the CoM. This vector has constant length, but encodes the current orientation of the torso. This vector is updated every time we re-optimize the contact plan, but is kept constant during the optimization. With this in mind, the reachability constraint becomes: pj −  rj +rtr +   c j −sj 0 s j cj 0 0 0 1  rsh   ≤ℓmax arm . pj is the position of a hand, rsh is a constant vector pointing from the upper torso to the shoulder of the corre- sponding hand, and ℓmax arm is the maximum distance between shoulder and hand. Therefore, the term within the norm is the relative position of a hand with respect to its shoulder. In the objective function, we include regularization of this term from a default relative position. A set of integer decision variables J ∈R2×n per hand is also used to define contact activations. J1, j =⇒ ( pe,j = pe,j−1 nT r pj ≤nT r sr J2, j =⇒ ( Lfe,t = 0 pe,j ̸= pe,j−1 valid ∀t ∈φ(t) = j. The columns of J are constrained to sum one: ∑i Ji,j = 1,∀j. Either the end-effector is in contact or not. The problem lies on the fact that, it is not possible to determine how many hand activations are necessary during a task. Therefore, the sum of the first row of J has to be left free ∑j J1, j ≤nmax. This is the fact that limits the planner to only a short sequence of contacts, usually n = nmax = (3 or 4), in order to obtain results in a reasonable time. Finally, the decision variables on safe regions, also affect the mapping of quantities from local to world frame, by the selection of the appropriate rotation matrix in (7). Hr, j =⇒      fe,t = Re,t∈φ(t) Lfe,t ze,t = Rx,y e,t∈φ(t) Lzx,y e,t τe,t = Rz e,t∈φ(t) Lτz e,t The mapping of discrete quantities j to continuous time is fixed by predefining the mapping j = φ(t), that defines the timing of each discrete phase in terms of the time t. IV. EXPERIMENTAL RESULTS In this section, we present experimental evaluations of our algorithm (Figure 1) on several multi-contact scenarios for a simulated humanoid robot: climbing on an uneven terrain using only feet, walking on an uneven terrain, walking on the same terrain with an external disturbance, climbing using hands and feet, traversing monkey bars and avoiding an obstacle using hands and feet. The results of the experiments are visible in the accompanying video 1. All the tasks have a time duration of around 10 seconds. The contact planner first finds a sequence of dynamically consistent contact locations. We use 5 timesteps between each contact change (timestep duration between 200 and 400ms). In a second stage the CoM, momentum and contact forces are optimized using the contact sequence. The time discretization in this case is finer (100ms). The whole plan is computed without any initial guess (only the final desired CoM location is specified). For all the tasks the complete plan is found at once, except for the push recovery task where the complete plan for the next two contacts is computed using a receding horizon of 200ms. Both optimization problems were solved using the parallel barrier algorithm implemented in Gurobi which is particularly efficient for second order 1It can also be found under https://youtu.be/qLrftO0w5g4 cone constraints and MIQCQP2. We use inverse kinematics (tracking both momentum and end-effector positions) to visualize the whole-body motion of the robot following the dynamic plan. Examples of the linear and momentum trajectories for a walking motion are shown in Fig. 6. Green lines de- pict trajectories optimized using the dynamic formulation presented in this paper. Blue lines show the momentum trajectories optimized kinematically to track the desired dynamic trajectories (used for videos). As can be seen, in this figure and in the videos, momentum trajectories are non- trivial. In this particular case, we show the linear momentum in the direction responsible for moving the CoM laterally between footsteps, and the angular momentum in the forward direction of walking, responsible for the motion of arms. A. Solution Time and Computational Complexity The computational complexity C of solving a dense but convex QCQP using a primal-dual interior point method with an m-self-concordant function is polynomial and of the order O(m 1 2 [m + n]n2) [17], where m is the number of convex quadratic inequalities and n the size of the optimization vector. However, in the case of a sparse problem (which is our case) we can expect to have a better computational complexity. Our formulation requires 9 quadratic inequalities per end-effector and time-step (6 to upper and lower bound the end-effector contribution to the angular momentum rate (14), 1 to constraint the distance from the CoM (16), 2 for friction and torque constraints (6a), (6c)). Note that they are part of the optimization, only if end-effector e is active at timestep t. All the tasks were optimized in around 1 sec (including both contact and momentum optimization). This is, to the best of our knowledge, much faster (at least one order of magnitude and much more in other cases) than other approaches to optimize motion for a humanoid robot under non-trivial conditions [6], [7], [13], [16], [19]. For the contact planner (MIQCQP), as mentioned previ- ously, we look ahead for 2 to 4 contacts and use a rough granularity for the dynamics. Depending on the number of terrain regions, the time required to find a contact plan varies between 200 and 500 milliseconds (See Fig. 1). The quadratic inequality constraint (16) helped to quickly discard infeasible stepping regions. Table 5 reports average time required in our formulation to build and solve the dynamics optimization problem (convex QCQP) given a fixed set of contacts for different horizon lengths. We used the ”Climbing an uneven terrain using hands” tasks because it is the task with the highest com- putational complexity. As can be seen in the table and the 2We found the same solutions using Snopt and Ipopt but the required time was orders of magnitude larger. Fig. 4: Climbing an uneven terrain using hands 0 20 40 60 80 100 Linear Momentum X -15 -10 -5 0 5 10 Dyn Kin Time [sec] 0 20 40 60 80 100 Angular Momentum Y -0.4 -0.2 0 0.2 Dyn Kin Fig. 6: Example momentum trajectories for a walking motion.n graph, thanks to the problem sparsity, the complexity does not deteriorate too quickly and for the operating conditions that we use, complexity increases close to linearly. If we use a dynamics granularity of 200ms (40 timesteps), it takes 142ms to find a 8 second long plan involving 4 contact changes. If we increase the dynamics granularity to 100ms (80 timesteps), it takes us around 421ms to solve the problem and we could solve it 2 times in a second. Another way to interpret this result is as follows: if we were to plan for a time horizon of 2 sec (maybe 2 footsteps), using a discretization step of 100ms (20 timesteps) we could solve the problem in around 85ms using the same granularity of 100ms, we could plan for an horizon of half a minute (320 timesteps) in 3 sec. Timesteps Time[ms] 10 40 20 85 40 142 80 421 Timesteps [n] 20 40 60 80 Timing [msec] 100 200 300 400 Fig. 5: Average time required to construct and solve the dynamics optimization problem given a fixed set of footsteps. B. Tightness of the solution As mentioned earlier, using the cost as defined in section III-B, all our numerical experiments found that the constraint relaxation is tight: the gap between upper, lower bounds and their corresponding true values is zero. It means that in these cases we found solutions that are also an optimal solution to the original (non-convex) problem. This result is potentially very interesting because if the relaxation was always tight, it would mean that it is always possible to find the global solution of the non-convex mo- mentum optimization problem by solving a convex QCQP. In the optimization literature, there are a few results about strong duality in non-convex quadratic optimization such as [20]. There is for example one strong result in the case when the Hessian of the objective has a null-space where it is shown for a simple case with a quadratic objective and a quadratic equality constraint, that because of the presence of this nullspace, the relaxation is tight (zero gap) and optimal. Since our objective function does not minimize for the angular momentum, there is also a nullspace, it might be possible that similar arguments could be used. However, a formal proof remains for future work. V. CONCLUSION We have proposed a convex relaxation of the momentum dynamics for leg robots that can be used to efficiently plan contact forces, CoM motion and momentum trajectories. Moreover, we have proposed an extension of a contact planning algorithm including our convex model to find dy- namically consistent contact sequences. Computation times of the algorithm are small enough to be used in a receding horizon fashion, moreover, numerical experiments suggest that the convex relaxation is tight. While a formal proof is missing, our result suggests that momentum and contact forces optimization, a non-convex problem, could be solved exactly using our convex relaxation, therefore removing the need for further model simplification. REFERENCES [1] H. Audren, J. Vaillant, A. Kheddar, A. Escande, K. Kaneko, and Yoshida. E. Model preview control in multi-contact motion-application to a humanoid robot. In IROS, pages 4030–4035, 2014. [2] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, New York, 2009. [3] B. Brogliato. Nonsmooth Mechanics - Models, Dynamics and Control. Communications and Control Engineering. Springer-Verlag London, 2 edition, 1999. [4] F. Callier and C. Desoer. Linear Systems Theory. Springer, New York, 1994. [5] S. Caron, Q.C. Pham, and Y. Nakamura. Stability of surface contacts for humanoid robots: Closed-form formulae of the contact wrench cone for rectangular support areas. In ICRA, pages 5107–5112. [6] J. Carpentier, S. Tonneau, M. Naveau, O. Stasse, and N. Mansard. A versatile and efficient pattern generator for generalized legged locomotion. In ICRA, 2016. [7] H. Dai, A. Valenzuela, and R. Tedrake. Whole-body motion planning with centroidal dynamics and full kinematics. In Humanoids, 2014. [8] R. Deits and R. Tedrake. Footstep planning on uneven terrain with mixed-integer convex optimization. In Humanoids, Madrid-Spain, pages 279–286, 2014. [9] D. Dimitrov, A. Paolillo, and P.B. Wieber. Walking motion generation with online foot position adaptation based on 1- and -norm penalty formulations. In ICRA, pages 3523–3529, 2011. [10] J. Englsberger, C. Ott, and A. Albu-Schffer. Three-dimensional bipedal walking control based on divergent component of motion. IEEE Trans. Robotics, 31(2):355–368, 2015. [11] A. Herzog, N. Rotella, S. Mason, F. Grimminger, S. Schaal, and L. Righetti. Momentum control with hierarchical inverse dynamics on a torque-controlled humanoid. Auton. Robots, 40(3):473–491, 2016. [12] A Herzog, N Rotella, S Schaal, and L Righetti. Trajectory generation for multi-contact momentum-control. In Humanoids, 2015. [13] A. Herzog, S. Schaal, and L. Righetti. Structured contact force optimization for kino-dynamic motion generation. 2016. [14] A. Ibanez, P. Bidaud, and V. Padois. Emergence of humanoid walking behaviors from mixed-integer model predictive control. In IROS, 2014. [15] F. Kajita, S.and Kanehiro, K. Kaneko, K. Fujiwara, K. Harada, K. Yokoi, and H. Hirukawa. Biped walking pattern generation by using preview control of zero-moment point. In ICRA, pages 1620– 1626. IEEE, 2003. [16] I. Mordatch, E. Todorov, and Z. Popovic. Discovery of complex behaviors through contact-invariant optimization. ACM Trans. Graph., 31(4):43:1–43:8, 2012. [17] A. Nemirovski. Interior Point Polynomial Time Methods in Convex Programming. 2004. [18] Wieber. P.B. Viability and predictive control for safe locomotion. In IROS, pages 1103–1108, 2008. [19] M. Posa and R. Tedrake. Direct trajectory optimization of rigid body dynamical systems through contact. In WAFR, 2012. [20] M. Salahi. Convex optimization approach to a single quadratically constrained quadratic minimization problem. Central European J of Operations Res, 18(2):181–187, 2010. [21] P. Sardain and G. Bessonnet. Forces acting on a biped robot. center of pressure-zero moment point. IEEE Trans. Systems, Man, and Cybernetics, Part A, 34(5):630–637, 2004. [22] A. Sherikov, D. Dimitrov, and P.B. Wieber. Whole body motion controller with long-term balance constraints. In Humanoids, pages 444–450, 2014. [23] Y. Tassa, T. Erez, and E. Todorov. Synthesis and stabilization of complex behaviors through online trajectory optimization. In IROS, 2012. [24] P.-B. Wieber. Holonomy and nonholonomy in the dynamics of articulated motion. Fast Motions in Biomechanics and Robotics, pages 411–425, 2006. [25] P.B. Wieber. Trajectory free linear model predictive control for stable walking in the presence of strong perturbations. In Humanoids, pages 137–142, 2006.