Online Feedback Control for Input-Saturated Robotic Systems on Lie Groups Taosha Fan and Todd Murphey Department of Mechanical Engineering, Northwestern University Email: taosha.fan@u.northwestern.edu, t-murphey@northwestern.edu Abstract—In this paper, we propose an approach to designing online feedback controllers for input-saturated robotic systems evolving on Lie groups by extending the recently developed Sequential Action Control (SAC). In contrast to existing feedback controllers, our approach poses the nonconvex constrained non- linear optimization problem as the tracking of a desired negative mode insertion gradient on the configuration space of a Lie group. This results in a closed-form feedback control law even with input saturation and thus is well suited for online application. In extending SAC to Lie groups, the associated mode insertion gradient is derived and the switching time optimization on Lie groups is studied. We demonstrate the efficacy and scalability of our approach in the 2D kinematic car on SE(2) and the 3D quadrotor on SE(3). We also implement iLQG on a quadrator model and compare to SAC, demonstrating that SAC is both faster to compute and has a larger basin of attraction. Index-Terms— sequential action control; Lie group; online feedback control; input saturation I. INTRODUCTION Dynamic robotic tasks require suitable feedback controllers to simultaneously manage dynamics, kinematics, input satu- ration etc.. In the last thirty years, a number of techniques to design stabilizing control laws have been developed in robotics and control community, such as feedback linearization and backstepping [19, 22], linearzing the dynamics around a nominal trajectory to implement the linear quadratic regulator (LQR) or linear model predicative control (LMPC) [2, 5], dynamic programming [7] and nonlinear model predicative control (NMPC) [1, 27, 32, 34], or feedback motion planning via semidefinite programming to precompute a library of control policies filling up the space of possible initial states [18, 24, 33]. However, feedback linearization and backstepping are generally inapplicable when actuation limits exist while LQR and LMPC only work in a small neighbourhood of the nominal trajectory. Dynamic programming and NMPC are usually computationally expensive and feedback motion planning via semidefinite programming may need lots of time and memory to compute and store the library of control policies if the state space is high-dimensional and noncompact. Generally speaking, it is preferable to model robot locomo- tion systems, e.g., vehicular systems, on Lie groups rather than in coordinates since there are no singularity problems. Despite that numerous feedback controllers for systems evolving on Lie groups have been developed [17, 21, 23, 31, 37], all of them remain in the category of aforementioned techniques. Furthermore, feedback controllers on Lie groups resulted from existing control techniques tend to be much more complicated than their coordinate counterparts and as a result, might not be available for time-critical tasks in the presence of actuation limits. Unfortunately, nearly all robot locomotion tasks are required to manage input saturation while being executed in real time. Hence the application of these techniques [17, 21, 23, 31, 37] is severely restricted in practice. In this paper, we address the shortcomings of existing control techniques by proposing a novel approach to designing feedback controllers for systems evolving on Lie groups. Our approach is inspired by the recently developed Sequential Action Control (SAC) [3]. In contrast to existing control techniques, SAC improves the long-time performance of the controller by tracking a desired negative mode insertion gradi- ent and the resulting control action has a closed-form solution whose calculation is almost as inexpensive as simulation. More importantly, control actions in SAC can be efficiently saturated and hence it is well suited for robotic tasks in real time. SAC has solved various benchmark problems, such as cart-pendulum, pendubot, acrobot, spring-loaded inverted pendulum etc. in [3, 4, 35, 36]. Nevertheless, up till now, the application of SAC remains limited to nonlinear systems in coordinates and the purpose of this paper is to extend SAC to Lie groups so that we may synthesize control techniques directly on the Lie groups that are commonly used in robotics. A. Relationship to Existing Control Techniques The major difference of SAC from other optimization-based nonliner control techniques [1, 2, 5, 7, 18, 24, 27, 32, 33] is that expensive iterative computations of gradients and Hessians are avoided. Control inputs in SAC are computed and saturated based on the mode insertion gradient (Eq. (3)), whose argument is the time duration λ instead of control inputs u(t), and which can be exactly evaluated (not approximated) by Eq. (4). Moreover, most gradient-based methods such as DDP or iLQG attempt to update control inputs over the whole control horizon, whereas SAC only updates the next control input at a specified future time τ for a duration λ. Although SAC performs well for various nonlinear systems both in coordinates and on Lie groups as these shown in [3, 4, 35, 36] and Section V, we are not intending to draw an conclusion that SAC is superior to existing control techniques [1, 2, 5, 7, 18, 19, 22, 24, 27, 33]. Instead we are more interested in how our approach and existing control techniques can complement each other that results in a win-win solution. arXiv:1709.00376v1 [math.OC] 31 Aug 2017 As far as we know, SAC augments existing control tech- niques in the following ways: 1) SAC is implemented in the beginning to drive the non- linear system to a region where feedback linearization, backstepping, LQR, LMPC etc. are qualified for the final stabilization even with input saturation. 2) Control actions computed by SAC are used to initialize optimization-based techniques, e.g., NMPC, for which a reasonable initialization is of great significance. 3) In combination with feedback motion algorithms via semidefnitie programming [18, 24, 33], which requires to precompute a library of control policies, SAC serves as an early-stage online feedback motion planner to drive the nonlinear system to the target set of initial states that are covered by the feedback motion algorithms via semidefinite programming with a library of much smaller size so that precomputational time and storage memory are saved. B. Contribution and Organization The main contribution of this paper is extending the concept of sequential action control from coordinates to Lie groups so that feedback controllers can be constructed accordingly for input-saturated robotics systems evolving on Lie groups. A great advantage of our approach over SAC in coordinates is that our approach is not required to choose coordinates so that the expensive chart-switching and the associated sudden jump are avoided. In addition to extending SAC to Lie groups, switching time optimization on Lie groups is studied and the first order derivative, i.e., the mode transition gradient, is derived. The remainder of this paper is organized as following: Sec- tion II reviews sequential action control in coordinates and its advantages. Section III studies the switching time optimization for systems evolving on Lie groups and derives the associated mode insertion and transition gradients. Section IV analyzes the logarithm-constructed quadratic functions on Lie groups and presents exact expressions of d exp(ξ) and d exp−1(ξ) for SO(3), SE(2) and SE(3). Section V demonstrates the efficacy and scalability of our approach on two classic robotic systems evolving on Lie groups and Section VI concludes the paper. II. SEQUENTIAL ACTION CONTROL Sequential Action Control is an online model-based ap- proach to high-quality trajectory planning and tracking for nonlinear systems[3]. Compared with general NMPC meth- ods which minimizes the objective iteratively over a finite horizon while enforcing system dynamics as constraints, SAC computes control actions by tracking a desired negative mode insertion gradient, which is a control methodology common in hybrid systems. Since control actions in SAC are assumed to be applied in an infinitesimal duration λ →0+, there exists a closed-form solution to control actions having actuation limits which renders SAC well suited for online application. In this section, we review SAC in coordinates and discuss how to extend SAC to Lie groups. A. An Overview of Sequential Action Control Consider the affine nonlinear system f  x(t), u(t); t  = g  x(t); t  + h  x(t); t  · u(t) (1) where the state x(t) ∈Rn, control action u(t) ∈Rm, g  x(t); t  ∈Rn and h  x(t); t  ∈Rn×m. Let J1 = Z TN T0 L  g(τ)  dτ + ϕ  g(TN)  be the objective and suppose the dynamics f  x(t), u(t); t  first switches from the nominal mode f1  x(t); t  = f  x(t), u1(t); t  to the new mode f2  x(t); t  = f  x(t), u2(t); t  at time τ0 and then back to f1  x(t), u1(t); t  again after a duration of λ, i.e., u(t) = ( u1(t) t > τ0 + λ or t ≤τ0, u2(t) τ0 ≤t < τ0 + λ. (2) If the duration λ →0+ is infinitesimal, the mode insertion gradient dJ1 dλ 0+ is defined to be dJ1 dλ 0+ = lim λ→0+ ∆J1 λ (3) where ∆J1 is the objective change yielded by this switch. The model insertion gradient dJ1 dλ 0+ can be calculated exactly with Eq. (4) [11] dJ1 dλ 0+ = ρ(τ0)T · h f2  x(τ0); τ0  −f1  x(τ0); τ0 i (4) where ρ(t) are costates satisfying ˙ρ(s)T = −ρ(s)T · Dxf1  x(s); s  −DxL  x(s)  (5) subject to: ρ(TN) = Dgϕ  g(TN)  . By continuity, the objective change ∆J1 can be approximated with ∆J1 ≈dJ1 dλ 0+ · λ if the duration λ is small albeit not infinitesimal. So for sufficiently small λ > 0, J1 is reduced in the switch of Eq. (2) as long as the mode insertion gradient dJ1 dλ 0+ is negative. The idea of SAC is to compute control actions u∗(t) reducing the objective J1 by tracking a desired negative mode Algorithm 1 Sequential Action Control 1: Simulate states and costates (x(t), ρ(t)) with nominal control dynamics f1(t) 2: Compute u∗ 2(t) from (x(t), ρ(t)) by u∗ 2(t) = u1 + (Λ + R)−1h  x(t); t T ρ(t)αd 3: Modify u∗ 2(t) to satisfy the control bounds umin and umax 4: Determine the mode insertion time τ0 to maximize the negative mode insertion gradient 5: Determine the duration λ to apply u∗ 2(t) insertion gradient dJ1 dλ 0+ = αd instead of optimizing J1 iteratively. In tracking the negative mode insertion gradient αd, SAC incorporates feedback and computes control actions that guarantee an optimal/near-optimal reduction of J1, which eventually results in a long-time improvement. The mode insertion gradient tracking can be formulated as an adjoint optimization problem u∗ 2(t) = arg min u2(t) Z TN T0 "  dJ1 dλ 0+−αd 2 + ∥u2(s)−u1(s)∥2 R # ds (6) where R ∈Rm×m is symmetric positive-definite. For affine system Eq. (1), it has been proved that Eq. (6) has a closed- form solution u∗ 2(t) = u1 + (Λ + R)−1h  x(t); t T ρ(t)αd (7) where Λ := h  x(t); t T ρ(t)ρ(t)T h  x(t); t  [3]. As shown in Algorithm 1, control actions can be saturated in SAC and this is discussed in detail in Section II-B. For these not discussed, such as determining mode insertion time τ0 and control duration λ, interested readers can refer to [3] for a complete introduction. B. Saturating Control Actions If actuation limits exist, the adjoint optimization problem Eq. (6) can be reformulated by quadratic programming with umin(t) ≤u∗ 2(t) ≤umax(t) imposed as constraints. This is not preferable considering the high computational expense. Instead, if R ∈Rm×m is diagonal and let u′∗ 2(t)=u1 + (Λ + R)−1h  x(t); t T ρ(t)αd, then u∗ 2(t) can be saturated elementwise by [u∗ 2(t)]i =      [umin]i [u′∗ 2(t)]i < [umin]i, [u′∗ 2(t)]i [umin]i ≤[u′∗ 2(t)]i ≤[umax]i, [umax]i [u′∗ 2(t)]i > [umax]i. (8) Despite that u∗ 2(t) determined by Eq. (8) may not be optimal to Eq. (7), it works very well in practice and can guarantee a reduction of J1 if umin < u1(t) < umax. Most importantly, the computational expense introduced by Eq. (8) is almost negligible in contrast to quadratic programming. C. Extension of Sequential Action Control to Lie Groups As we can see from the overview above, to extend sequential action control to Lie groups, the first problem is calculating the mode insertion gradient on Lie groups. Unfortunately, Eqs. (4) and (5) may no longer hold for nonabelian Lie groups, mean- ing that the mode insertion gradient needs to be rederived. We study this problem in Section III, which is the fundamental contribution of this paper. Another problem, though not so obvious, is choosing appropriate objective functions. Though quadratic functions can be constructed easily on Lie groups through the logarithm map, it is difficult to evaluate their derivatives if there are no exact expressions of the trivialized tangent inverse of the exponential map d exp−1(ξ), which is discussed in detail in Section IV. III. MODE INSERTION AND TRANSITION GRADIENTS FOR SYSTEMS EVOLVING ON LIE GROUPS In this section we study the switching time optimization for systems evolving on Lie groups and derive the associated mode insertion and transition gradients. Readers are assumed to have a basic knowledge of Lie group theory, which can be found in a variety of textbooks, e.g., [16]. A. Review of Lie Group Dynamics and Linearization To be fully-illustrated, we first define the following operator that are frequently used in this paper. Definition 1. Given a differentiable map f : G −→Rn on a Lie group G, then Dgf|g : g −→Rn is defined to be the linear map such that Dgf|g · η = d dsf  g · exp(s · η)  s=0 (9) for all η ∈g. If G is abelian, then Dgf|g is just the first-order derivative in multi-variable calculus. For brevity, the dependence |g is dropped and Dgf|g is simply written as Dgf. Let G be the Lie group and g the associated Lie algebra, the continuous system evolving on G is defined to be ˙g(t) = g(t)ξ(t). (10a) ξ(t) = f  g(t), u(t); t  (10b) where f : G×U×R+ −→g. In the remainder of the paper, we may sometimes write f  g(t), u(t); t  as f(t) for simplicity. If f  g(t), u(t); t  is C1, it is possible to linearize the corresponding system, which is as follows [29]. Proposition 1. If f : G×U ×R+ −→g is differentiable, then a linearization of continuous system Eq. (10) is ˙η(t) =  Dgf(t) −adf(t)  · η(t) + Duf(t) · ν(t) (11) where η(t) = g(t)−1δg(t) ∈g and ν(t) ∈U is the variation of inputs u(t). For more details about Lie group dynamics and lineariza- tion, interested readers can refer to [8, 13, 25, 26]. B. Mode Transition Gradient Definition 2. If the inputs u(t) ∈U in Eq. (10) are transition times T = T1 T2 · · · TN−1 T , such that f  g(t), T ; t  = h 1(t −T0) −1(t −T − 1 ) i f1  g(t); t  + h 1(t −T + 1 ) −1(t −T − 2 ) i f2  g(t); t  + · · · · · · + h 1(t −T + N−1) −1(t −T − N ) i fN  g(t); t  (12) where T0 ≤T1 ≤· · · ≤TN−1 ≤TN and 1(t) : R −→{0, 1} is the unit step function, then Eq. (10) is a switched system on Lie group G and fi  g(t); t  : G × R+ −→g are mode functions. The optimization problem to determine transition times for switched system is called switching time optimization. The first- and second-order derivatives of switching time optimization can be evaluated exactly for switched system in coordinates [9, 11, 12]. However, to our knowledge, no work has been done for switching time optimization on Lie groups. Though switching time optimization is not the main focus of this paper, the mode transition gradient, i.e. the first- order derivative of the switching time optimization, is closely related with the mode insertion gradient that SAC tracks in computing control actions u∗ 2(t). Here we derive the mode transition gradient for systems evolving on Lie groups that is used to derive the modal insertion gradient in Section III-C. Lemma 1. Suppose J(T ) is an objective function of the switched system evolving on Lie group G defined by Eqs. (10) and (12) J(T ) = Z TN T0 L  g(τ)  dτ + ϕ  g(TN)  . (13) Provided each fi(·; ·) in f is C1, then the mode transition gradient dJ dT T of J(T ) is dJ dT T =  dJ dT1 dJ dT2 · · · dJ dTN−1 T T such that dJ dTi T = ρ(Ti)T h fi  g(Ti); Ti  −fi+1  g(Ti), Ti i (14) where ˙ρ(s)T = −ρ(s)T · h Dgf  g(s), T ; s  − adf(g(s),T ;s) i −DgL  g(s)  (15) subject to: ρ(TN) = Dgϕ  g(TN)  . Proof. The derivative dJ dTi T of J(T ) w.r.t Ti is dJ dTi T = Z TN T0 DgL  g(τ)  · DTig(τ)dτ+ Dgϕ  g(TN)  · DTig(TN) (16) where DTig(t) = g(t)−1 dg(t) dt ∈g and, according to Eq. (11), satisfies ∂ ∂t  DTig(t)  =  Dgf  g(t), T ; t  −adf(g(t),T ;t)  · DTig(t)+ DTif  g(t), T ; t  . (17) Note that fi(·; ·) is C1, thus Dgf  g(t), T ; t  is well defined. Since that all initial states are fixed, we have DTig(0) = 0. Therefore, the solution to Eq. (17) is DTig(t) = Z TN T0 Φ(t, s) · DTif  g(s), T ; s  ds, (18) where Φ(t, s) is the state transition matrix of Eq. (17). Substitute Eq. (18) to Eq. (16) and switch the integration order, dJ dTi T becomes dJ dTi T = Z TN T0 ρ(s)T · DTif  g(s), T ; s  ds (19) and the costate ρ(s)T is ρ(s)T = Z TN s DgL  g(τ)  Φ(τ, s)dτ+ Dgϕ  g(TN)  Φ(TN, s). (20) Taking derivative on both sides of Eq. (20), we have ˙ρ(s)T = −ρ(s)T h Dgf  g(s), T ; s  − adf(g(s),T ;s) i −DgL  g(s)  , (21) subject to: ρ(TN) = Dgϕ  g(TN)  , In addition, note that DTif  g(t), T ; t  = δ(t −T − i )fi  g(t); t  − δ(t −T + i )fi+1  g(t); t  . (22) Integrate δ-functions in Eq. (19) and the result is Eq. (14), which completes the proof. C. Mode Insertion Gradient We have obtained the mode transition gradient for switching time optimization on Lie groups, with which the mode inser- tion gradient can be derived just as the following corollary indicates. Corollary 1. Given a system evolving on Lie group G with f : G × U × R+ −→g taking the form f  g(t), λ; t  =    f1  g(t); t  , t < τ0 or t ≥τ0 + λ, f2  g(t); t  , τ0 ≤t < τ0 + λ (23) and the duration λ ∈R+ as input and an objective function J(λ) = Z TN T0 L  g(τ)  dτ + ϕ  g(TN)  , (24) then the mode insertion gradient is dJ dλ λ=0+ = ρ(τ0)T · h f2  g(τ0); τ0  −f1  g(τ0); τ0 i (25) where ˙ρ(s)T = −ρ(s)T · h Dgf1  g(s); s  − adf1(g(s);s) i −DgL  g(s)  (26) subject to: ρ(TN) = Dgϕ  g(TN)  . Proof. Let τ1 = τ0 + λ and the dynamical system defined by Eq. (23) is a switched system with T = [τ1]. Apply Lemma 1 and note dJ dλ = dJ dτ1 , we have dJ dλ = ρ(τ0 + λ)T · h f2  g(τ0 + λ); τ0 + λ  − f1  g(τ0 + λ); τ0 + λ i . (27) Substituting λ = 0+ into Eq. (27), the resulting equation is Eq. (25), which completes the proof. It can be shown that Eq. (7) and Eq. (8) apply to SAC on Lie groups as long as the mode insertion gradient Eq. (25) is given which is crucial in extending SAC to nonlinear systems evolving on Lie groups. IV. THE QUADRATIC FUNCTION ON LIE GROUPS AND ITS DERIVATIVE The quadratic function in Rn f(x) = 1 2∥x −xd∥2 M (28) where ∥x∥M = √ xT Mx and M ∈Rn×n is symmetric positive definite defines a class of functions frequently used in control theory and application. Their essence as Lya- punov functions and easiness in numerical computation render quadratic objective functions quite popular for both SAC and optimization in Rn. This suggests it may be necessary to reasonably extend quadratic functions to Lie groups. A. Extending Quadratic Functions to Lie Groups An immediate and common way in differential geometry to extend the quadratic function Eq. (28) in Rn to Lie groups is f(g) = 1 2∥log(g−1 d g)∥2 M (29) whose derivative is Dgf = d exp−1  −log(g−1 d g) T M log(g−1 d g), (30) where d exp−1(ξ) is the trivialized tangent inverse of the exponential map. If Lie group G has a bi-invariant pesudo-Riemannian met- ric, Eq. (29) is a reasonable extension of quadratic functions to Lie groups in sense that log(g−1 d g) is the initial velocity of the geodesic γ : [0, 1] −→G connecting gd and g under the bi-invariant pesudo-Riemannian metric. Fortunately, there always exists such bi-invariant pesudo-Riemannian metrics for subgroups of SE(3) [38], some of which actually turn out to be positive-definite, i.e., Riemannian metrics. Though there are a number of other ways to extend quadratic functions from Rn to Lie groups, such as these in [20, 29], none of them have the same geometric meaning as Eq. (28) in Rn that captures the bi-invariant pesudo- Riemannian structure. Most importantly, according to numer- ical tests, when using Eq. (29) as the objective, SAC on Lie groups has a larger region of attraction and converges a lot faster than objectives of any other kinds. B. Expressions of d exp(ξ) and d exp−1(ξ) for Some Common Lie Groups The trivialized tangent of the exponential map and its inverse d exp(ξ) and d exp−1(ξ) are frequently used in the construction and linearization of Lie group integrators [15, 29]. Since ∥ξ∥is usually small in Lie group integrators, it is possible to approximate d exp(ξ) and d exp−1(ξ) by truncating the series expansions [15] d exp(ξ) = ∞ X j=0 1 (j + 1)!adj ξ, d exp−1(ξ) = ∞ X j=0 Bj j! adj ξ where Bj are Bernoulli numbers. However, when ∥ξ∥is no longer small, e.g., in an objective function Eq. (29), the approximation by truncating series expansions is invalid. Here we give exact expressions (not series expansions) of d exp(ξ) and d exp−1(ξ) for SO(3), SE(2) and SE(3), all of which are commonly used in robotics. 1) SO(3): Elements of Lie algebra ˆω ∈so(3) is usually associated with R3 through the hat operator ∧: R3 −→so(3)   ω1 ω2 ω3   ∧ =   0 −ω3 ω2 ω3 0 −ω1 −ω2 ω1 0  , then d exp(ˆω) and d exp−1(ˆω) are [10] d exp(ˆω) = I + 1 −cos ∥ω∥ ∥ω∥2 ˆω + ∥ω∥−sin ∥ω∥ ∥ω∥3 ˆω2, (31a) d exp−1(ˆω)= I −1 2 ˆω + 1 2∥ω∥sin ∥ω∥+ cos ∥ω∥−1 ∥ω∥2(cos ∥ω∥−1) ˆω2. (31b) 2) SE(2): We represent ξ ∈se(2) in terms of coordinate   ω u v  ∈R3 such that ξ =   0 −ω u ω 0 v 0 0 0  . The exact expres- sions of d exp(ˆω) and d exp−1(ˆω) are d exp(ξ) =   1 0 0 #1 sin ω ω −1 −cos ω ω #2 1 −cos ω ω sinω ω   (32a) d exp−1(ξ) =   1 0 0 #3 ω 2 sin ω 1 −cos ω ω 2 #4 −ω 2 ω 2 sin ω 1 −cos ω   (32b) where #1 = u(ω −sin ω) + v(1 −cos ω) ω2 , #2 = v(ω −sin ω) −u(1 −cos ω) ω2 , #3 = −v 2 + u 2 ω sin ω + 2 cos ω −2 ω cos ω −ω , #4 = u 2 + v 2 ω sin ω + 2 cos ω −2 ω cos ω −ω . 3) SE(3): It is general to identify ξ ∈se(3) with the body-fixed velocity ω v  ∈R3 through ξ = ˆω v 0 0  where ω, v ∈R3 are respectively the body-fixed angular and linear velocities, and we have d exp(ξ) =  #1 O3×3 #2 #1  , (33a) d exp−1(ξ) =  #3 O3×3 #4 #3  (33b) where #1 = I + 1 −cos ∥ω∥ ∥ω∥2 ˆω + ∥ω∥−sin ∥ω∥ ∥ω∥3 ˆω2, #2 = 2 −2 cos ∥ω∥−1 2∥ω∥sin ∥ω∥ ∥ω∥2 ˆv+ ∥ω∥−sin ∥ω∥ ∥ω∥3 (ˆωˆv + ˆvˆω)+ 1 −cos ∥ω∥−1 2∥ω∥sin ∥ω∥ ∥ω∥4 (ˆω2ˆv + ˆωˆvˆω + ˆvˆω2)+ ∥ω∥−3 2 sin ∥ω∥+ 1 2∥ω∥cos ∥ω∥ ∥ω∥5 (ˆω2ˆvˆω + ˆωˆvˆω2) #3 = I −1 2 ˆω + 1 2∥ω∥sin ∥ω∥+ cos ∥ω∥−1 ∥ω∥2(cos ∥ω∥−1) ˆω2, #4 = −1 2 ˆv + 1 2∥ω∥sin ∥ω∥+ cos ∥ω∥−1 ∥ω∥2(cos ∥ω∥−1) (ˆωˆv + ˆvˆω)+ 1 4∥ω∥2 + 1 4∥ω∥sin ∥ω∥+ cos ∥ω∥−1 ∥ω∥4(cos ∥ω∥−1) (ˆω2ˆvˆω + ˆωˆvˆω2). Eq. (33) is derived based on the fact that ˆω3 = −∥ω∥2ˆω, in which Eq. (33a) is equivalent to that presented in [30]. V. EXAMPLES In this section, we implement our approach on the 2D kinematic car and 3D quadrotor respectively evolving on SE(2) and SE(3). The results indicate that our approach has a large region of attraction with input saturation. Furthermore, the computation is much faster than real time meaning that our approach can be implemented online. We also compare the performance of SAC with iLQG for quadrotor in in terms of stability and computational time. All the tests are run in C++ on a 2.7GHz Intel Core i7 Thinkpad T440p laptop. A. Example 1: The 2D Kinematic Car The configuration space of a kinematic car evolves on g ∈ SE(2) and the dynamic equation can be described as ˙g = g   0 −ω v∥cos(φ) ω 0 0 0 0 0  , (34a) ω = v∥sin(φ), ˙v∥= u1, ˙φ = u2 (34b) where ω is the angular velocity, v∥is the forward velocity, φ is the steering angle and u1, u2 ∈R are the control inputs. Here we implement SAC for feedback motion planning of car parking. In our tests, the control bounds are umin = −4 −5T and umax = 4 5T , and to be more consis- tent with a real car, the steering angle φ is constrained to [−1 3π, 1 3π]. The reference control inputs u1(t) are unknown in motion planning and thus we assume u1(t) ≡0 in SAC. However, in parallel parking, it can be checked that h  g(t); t T ρ(t) ≡0 if the weighting matrix M in Eq. (29) is diagonal and the control action u∗ 2(t) ≡0 by Eq. (7). Therefore, the car is stuck at the initial state despite the fact that it remains far away from the target. To help the car jump out of the singularity, we let off-diagonal elements of M vary slightly by random at intervals instead of keeping M unchanged and then u∗ 2(t) no longer remains 0.1 As shown in Fig. 1, this works well for the parallel parking problem.2 We also test SAC with around 45000 initial states randomly sampled from θ0 ∈[−π, π] rad, x0, y0 ∈[−10, 10] m. In fact, as stated in Section I-A, rather than make SAC search feedback motion plans all by itself, we prefer to combine it with feedback motion planning algorithms via semidefinite programming. In all of our tests, SAC successfully drives the car from initial states aforementioned into a region of 1M remains positive definite since off-diagonal elements merely slightly vary around 0. 2The desired steering angle φd is not specified as is seldom considered in motion planning. (a) (b) Fig. 1: Feedback motion planning for parallel parking. The initial state is θ0 = 0 rad, x0 = 0 m, y0 = 4 m, φ0 = 0 rad, ω0 = 0 rad/s, v0 = 0 m/s and the desired state is θd = 0 rad, xd = 0 m, yd = 0 m, ωd = 0 rad/s, vd = 0 m/s. It takes 0.4 s for SAC to successfully plan a trajectory to the desired configuration. The feedback sampling rate is 100 Hz. |θT | ≤0.15 rad, ∥xT ∥≤0.35 m, |v∥ T | ≤0.02 m/s after a computational time of around 0.8 s, which can be covered with significantly fewer control policies by feedback motion planning algorithms via semidefnite programming [18, 24, 33]. B. Example 2: The 3D Quadrotor The quadrotor is an underactuated system evolving on h = (R, p) ∈SE(3) and the dynamics can be formulated as ˙h = h ˆω v 0 0  , (35a) J ˙ω = M + Jω × ω, (35b) ˙v = 1 mFe3 −ω × v −gRT e3 (35c) where ω, v ∈R3 are respectively the body-fixed angular and linear velocities, F and M are forces and torques exerted on the quadrotor associated with control inputs ui (i = 1, 2, 3, 4) by F = kt(u2 1 + u2 2 + u3 3 + u2 4), M =   ktl(u2 2 −u2 4) ktl(u2 3 −u2 1) km(u2 1 −u2 2 + u3 3 −u2 4)  , (36) and kt, km, l are model parameters. The dynamics (Eq. (35)) used herein are derived on SE(3) and it is different from that on SO(3) × R3 due to the distinct Lie group structure[23]. Since in Eq. (36) u2 i ≥0 always holds, values of M and F are no longer arbitrary. Though numbers of papers develop various quadrotor controllers, most of them are either NMPC- based or assume M and F can be any values, which may not be implemented online with input saturation. The fact that the quadrotor dynamics is an affine system with u2 i ∈[0, +∞) means that SAC on Lie groups can be implemented to saturate control actions. In this example, we use the combination of SAC and LQR on a quadrotor with m = 0.6 kg, J = diag{0.04, 0.0375, 0.0675} kg · m2, kt = 0.6, km = 0.15, l = 0.2 m and u2 i ∈[0, 6] for i = 1, 2, 3, 4. The LQR is employed only when the tracking error is below a threshold of ∥log(g−1 d g)∥2 +∥ω−ωd∥2 +∥x−xd∥2 ≤62 and the control inputs given by LQR satisfy the actuation limits, otherwise SAC is applied. The reference control inputs are obtained by differential flatness [28] and the flat outputs are x(t) = 12 sin( πt 6 ) m, y(t) = 8 sin( πt 3 ) m, z(t) = 7 sin( πt 6 ) m and α = 1 2 sin( t 2) rad. Fig. 2 is a result of trajectory tracking while Fig. 3 are the statistics of 24000 trials whose initial states are randomly sampled from ∥x −xd∥∈[0, 30] m, ∥log(RT d R)∥∈[0, π] and in all of which the combination of SAC and LQR stabilizes the quadrotor to the reference trajectory in a simulation duration of 72 s with saturated control inputs. (a) (b) Fig. 4: The performance of iLQG under different number of iterations. The stability is tested with 150 random initial states with ∥x −xd∥= 30 m and ∥log(g−1 d g)∥= 3 rad and the rate of success of iLQG is shown in (a). The computational time of iLQG with different numbers of iterations is in (b) where the black dotted line represents the average computational time of SAC. The computational time of SAC is around 3.8 ∼4.1 ms. We implement iLQG [32, 34] for trajectory tracking for (a) (b) (c) Fig. 2: Trajectory tracking of a quadrotor with large initial error. The reference initial state is α = 0, β = 0, γ = 0, xq = [0 0 0]T m, ω = [0.94 −0.18 0.25]T rad/s and vq = [6.28 8.38 3.67]T m/s while the tracking initial state is α = 1.45, β = −0.92, γ = −0.70 xq = [12.38 8.10 −2.44]T m, ω = [−0.56 0.90 3.80]T rad/s and vq = [10.39 4.17 4.85]T m/s, where α, β, γ are respectively yaw, pitch, roll angles. The resulting performance of position tracking is in (a), orientation tracking in (b) and trajectory tracking in (c). The LQR controller takes effect at t = 5.20 s. SAC is around 5.3 times faster than real time. The results are based on a feedback sampling rate of 50 Hz. (a) t = 18 s (b) t = 30 s (c) Fig. 3: Statistics of SAC for quadrotor control with 24000 random initial states. In (a) and (b) different colors denote rates of success converging to the region of attraction of LQR at simulation times of t = 18 s and 30 s for different initial errors where red ≥95%, orange ≥90%, yellow ≥80%, green ≥70%, blue ≥60% while (c) is the overall rate of success w.r.t. different simulation times. At t = 63.4 s, SAC successfully drives the quadrotor to the region of attraction of LQR for all of the 24000 initial states. Note herein the region of attraction of LQR refers to that in the presence of actuation limits. the purpose of comparison. The iLQG method is one of the most efficient NMPC methods. As is shown in Figs. 3 and 4, SAC outperforms iLQG in both basin of attraction and computational efficiency for cases with large initial errors. Though iLQG achieves a relatively satisfactory performance with 20 iterations, it takes 45 ms for computation and still has failures. As a comparison, SAC takes around 4 ms and stabilizes quadrator in all of the 24000 trials including these with large initial errors (Fig. 3). SAC iLQG SNOPT Projected-Newton CPU time 3.8 ∼4.1 ms 217 ∼236 ms 37 ∼51 ms TABLE I: Computation time of SAC and iLQG. In our tests, the number of iterations in iLQG is set to 20 to obtain an accepted performance, whereas no iterations are needed in SAC. Usually iLQG needs a quadratic optimizer to saturate con- trol inputs. We tested both SNOPT [14] and the Projected- Newton method [6, 32] for quadratic optimization in iLQG. As noted in Table I, SAC is much faster than iLQG using both optimizers. We can also find that the performance of iLQG is severely affected by the chosen optimizers while SAC saturates control inputs in closed form and no optimizers are needed. VI. CONCLUSION In this paper, we propose an approach to designing online feedback controllers with input saturation for nonlinear sys- tems evolving on Lie groups by extending sequential action control, which demonstrates a large region of attraction for the kinematic car and quadrotor model though formal proofs of global stability are still needed. In addition, the associated mode insertion and transition gradients are derived and exact expressions of d exp(ξ) and d exp−1(ξ) for some common Lie groups are given. REFERENCES [1] Frank Allg¨ower, Rolf Findeisen, and Zoltan K Nagy. Nonlinear model predictive control: From theory to ap- plication. J. Chin. Inst. Chem. Engrs, 35(3):299–315, 2004. [2] Brian Anderson and John Moore. Optimal Control: Linear Quadratic Methods. Courier Corporation, 2007. [3] Alex Ansari and Todd D Murphey. Sequential action con- trol: Closed-form optimal control for nonlinear systems. IEEE Transactions on Robotics, Submitted. [4] Alex Ansari, Kathrin Flaßkamp, and Todd D Murphey. Sequential action control for tracking of free invariant manifolds. In Conference on Analysis and Design of Hybrid Systems, 2015. [5] Alberto Bemporad, Francesco Borrelli, and Manfred Morari. Model predictive control based on linear pro- gramming – the explicit solution. IEEE Transactions on Automatic Control, 47(12):1974–1985, 2002. [6] Dimitri P Bertsekas. Projected newton methods for optimization problems with simple constraints. SIAM Journal on control and Optimization, 20(2):221–246, 1982. [7] Dimitri P Bertsekas. Nonlinear Programming. Athena Scientific, 1999. [8] Francesco Bullo and Andrew Lewis. Geometric control of mechanical systems, volume 49. Springer-Verlag, 2005. [9] Timothy M Caldwell and Todd D Murphey. Switching mode generation and optimal estimation with application to skid-steering. Automatica, 47(1):50–64, 2011. [10] Elena Celledoni and Brynjulf Owren. Lie group methods for rigid body dynamics and time integration on man- ifolds. Computer Methods in Applied Mechanics and Engineering, 192(3):421–438, 2003. [11] Magnus Egerstedt, Yorai Wardi, and Florent Delmotte. Optimal control of switching times in switched dynam- ical systems. In IEEE Conference on Decision and Control, 2003. [12] Magnus Egerstedt, Yorai Wardi, and Henrik Axelsson. Transition-time optimization for switched-mode dynam- ical systems. IEEE Transactions on Automatic Control, 51(1):110–115, 2006. [13] Taosha Fan and Todd D Murphey. Structured lineariza- tion of mechanical systems on Lie group: a synthesis of analysis and control. In IEEE Conference on Decision and Control (CDC), 2015. [14] Philip E Gill, Walter Murray, and Michael A Saunders. SNOPT: An SQP algorithm for large-scale constrained optimization. SIAM review, 47(1):99–131, 2005. [15] Ernst Hairer, Christian Lubich, and Gerhard Wanner. Ge- ometric numerical integration: structure-preserving al- gorithms for ordinary differential equations, volume 31. Springer Science & Business Media, 2006. [16] Brian C Hall. Lie groups, Lie algebras, and representa- tions, volume 222. Springer-Verlag, 2003. [17] Tiffany A Huang, Matanya B Horowitz, and Joel W Burdick. Convex model predictive control for vehicular systems. arXiv preprint arXiv:1410.2792, 2014. [18] Zachary Jarvis-Wloszek, Ryan Feeley, Weehong Tan, Kunpeng Sun, and Andrew Packard. Some controls applications of sum of squares programming. In IEEE Conference on Decision and Control (CDC), 2003. [19] Hassan K Khalil. Nonlinear Systems. Prentice Hall, 2002. ISBN 9780130673893. [20] Marin Kobilarov. Discrete optimal control on Lie groups and applications to robotic vehicles. In IEEE Interna- tional Conference on Robotics and Automation (ICRA), 2014. [21] Marin Kobilarov. Nonlinear trajectory control of multi- body aerial manipulators. Journal of Intelligent & Robotic Systems, 73(1-4):679–692, 2014. [22] Pelor V Kokolovi´e. The joy of feedback: nonlinear and adaptive. 1992. [23] Taeyoung Lee, Melvin Leoky, and N Harris McClam- roch. Geometric tracking control of a quadrotor UAV on SE (3). In IEEE Conference on Decision and Control (CDC), 2010. [24] Anirudha Majumdar, Ram Vasudevan, Mark M Tobenkin, and Russ Tedrake. Convex optimization of nonlinear feedback controllers via occupation measures. The In- ternational Journal of Robotics Research, pages 1209– 1230, 2014. [25] Jerrold E Marsden. Lectures on mechanics, volume 174. Cambridge University Press, 1992. [26] Jerrold E Marsden and Tudor S Ratiu. Introduction to mechanics and symmetry: a basic exposition of classical mechanical systems, volume 17. Springer-Verlag, 1999. [27] David Mayne. A second-order gradient method for determining optimal trajectories of non-linear discrete- time systems. International Journal of Control, 3(1): 85–95, 1966. [28] Daniel Mellinger and Vijay Kumar. Minimum snap trajectory generation and control for quadrotors. In IEEE International Conference on Robotics and Automation (ICRA), 2011. [29] Alessandro Saccon, John Hauser, and A Pedro Aguiar. Optimal control on Lie groups: The projection operator approach. IEEE Transactions on Automatic Control, 58 (9):2230–2245, 2013. [30] JM Selig. Lie groups and Lie algebras in robotics. In Computational Noncommutative Algebra and Applica- tions, pages 101–125. Springer, 2004. [31] Koushil Sreenath and Vijay Kumar. Dynamics, control and planning for cooperative manipulation of payloads suspended by cables from multiple quadrotor robots. In Robotics: Science and Systems (RSS), 2013. [32] Yuval Tassa, Nicolas Mansard, and Emo Todorov. Control-limited differential dynamic programming. In Robotics and Automation (ICRA), 2014 IEEE Interna- tional Conference on, pages 1168–1175. IEEE, 2014. [33] Russ Tedrake, Ian R Manchester, Mark Tobenkin, and John W Roberts. LQR-trees: Feedback motion planning via sums-of-squares verification. The International Jour- nal of Robotics Research, 2010. [34] Emanuel Todorov and Weiwei Li. A generalized itera- tive lqg method for locally-optimal feedback control of constrained nonlinear stochastic systems. In American Control Conference, 2005. Proceedings of the 2005, pages 300–306. IEEE, 2005. [35] Emmanouil Tzorakoleftherakis, Alexander Ansari, An- drew Wilson, Jarvis Schultz, and Todd Murphey. Model- based reactive control for hybrid and high-dimensional robotic systems. IEEE Robotics and Automation Letters, In Press. [36] Andrew D Wilson, Jarvis Schultz, Alex Ansari, and Todd D Murphey. Real-time trajectory synthesis for information maximization using sequential action control and least-squares estimation. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), 2015. [37] Guofan Wu and Koushil Sreenath. Safety-critical and constrained geometric control synthesis using control Lyapunov and control barrier functions for systems evolving on manifolds. In American Control Conference (ACC), 2015. [38] Milos Zefran, Vijay Kumar, and Christopher Croke. Choice of Riemannian metrics for rigid body kinematics. In ASME 24th Biennial Mechanisms Conference, 1996.