An Invariant Linear Quadratic Gaussian controller for a simplified car S ́ ebastien Diemer 1 and Silv` ere Bonnabel 1 1 MINES ParisTech, PSL - Research University, Center for robotics, 60 Bd St Michel 75006 Paris, France, [sebastien.diemer, silvere.bonnabel]@mines-paritech.fr October 20, 2018 Abstract In this paper, we consider the problem of tracking a reference trajectory for a simplified car model based on unicycle kine- matics, whose position only is measured, and where the con- trol input and the measurements are corrupted by indepen- dent Gaussian noises. To tackle this problem we devise a novel observer-controller: the invariant Linear Quadratic Gaussian controller (ILQG). It is based on the Linear Quadratic Gaus- sian controller, but the equations are slightly modified to ac- count for, and to exploit, the symmetries of the problem. The gain tuning exhibits a reduced dependency on the estimated trajectory, and is thus less sensitive to misestimates. Beyond the fact the invariant approach is sensible (there is no reason why the controller performance should depend on whether the reference trajectory is heading west or south), we show through simulations that the ILQG outperforms the conven- tional LQG controller in case of large noises or large initial uncertainties. We show that those robustness properties may also prove useful for motion planning applications. 1 Introduction The field of mobile robot control has been thoroughly stud- ied in the past. One class of problems of practical interest is the trajectory tracking problem, for which the robot’s goal is to follow a predefined time-parameterized path. The Lin- ear Quadratic Gaussian (LQG) controller is a standard tool of linear control that can handle process and measurement Gaussian noises and possesses optimality properties, see e.g. [17]. It consists of coupling a Kalman filter for the state esti- mation, and a Linear Quadratic (LQ) controller for the tra- jectory tracking. When the system is not linear, as this is the case for the unicycle model due to the cosines and sines terms, the LQG can be extended (at the price of a lineariza- tion of the model and output equations about the reference trajectory) but the (extended) LQG controller looses all its optimality properties. In particular, as soon as the true tra- jectory tends to deviate from the reference trajectory (due to a perturbation, that may originate from perceptual ambigu- ity for instance, or a large initial uncertainty, or merely large noises) there is no guarantee at all the LQG should be able to drive the robot back to the reference trajectory. In a deterministic setting, there has been numerous at- tempts to account for the symmetries of the problem in controller design (the geometric control literature being ex- tremely vast and dating back to [12]). Notably, the interested reader is referred to the more recent work [9] where nonlinear controllers are devised for a relevant class of simple mechan- ical systems. As concerns state estimation, there as been an increasing number of attempts to account for the symmetries in observer design over the last decade, the major body of literature on the subject having been motivated by attitude estimation (to cite a couple of papers see [13, 6]). The idea of combining invariant state estimation (that is, the art of building estimators that respect the symmetries of the prob- lem) and invariant control can be traced back to [11] which devises an observer-controller for the same problem as the one considered in the present paper, that is, trajectory tracking for a simplified car whose position only is measured. Finally, still in a deterministic setting, the work [8] discusses a sepa- ration principle for invariant observer-controllers. In a stochastic setting, where noises are to be explicitly taken into account, the invariant Extended Kalman Filter (IEKF) introduced in continuous time in [4, 7] and in discrete time in [1], is a novel methodology that aims at modifying the equations of the Extended Kalman Filter (EKF) a lit- tle so that they respect the symmetries of the problem (see also the more recent work of [2]). The IEKF possesses some convergence properties that the EKF lacks. The goal of the present paper is to combine an IEKF with a LQ controller that respects the symmetries of the system under rotation and translation, in order to have a simplified car track a pre- 1 arXiv:1406.1619v2 [cs.RO] 18 Jun 2014 defined trajectory in the presence of Gaussian noises. For this system, the invariant approach boils down to considering both the estimation error and the tracking error in the Fr ́ enet co- ordinates, that is, a moving frame attached to the car, and to devise a Kalman filter and a LQ controller for stabilizing the linearized errors. To this respect, the invariant LQ controller can be also related to the work [15] that advocates the use of Fr ́ enet coordinates for tracking in SE(2). The introduced invariant LQG controller possesses several properties. First of all, it is invariant to rotations and trans- lations, that is, it ensures that the behavior is the same whether the car is, for example, trying to park automatically along a North-oriented of West-oriented sidewalk. Surpris- ingly enough, this property does not generally hold when de- vising a standard LQG controller for this problem. Moreover, the Kalman gain is proved to be independent of the estimated trajectory, a property which reminds the linear case. Indeed, in extended Kalman filtering, the equations being linearized about the estimated trajectory, an erroneous estimate of the state can lead to a inappropriate gain that can in turn gen- erate an even more erroneous estimate. This kind of positive feedback can lead to divergence of the filter, and cannot oc- cur when dealing with the IEKF proposed herein. Without exploring the theory of IEKF stability, which goes beyond the scope of the present paper, we show through extensive simu- lations the robustness of the proposed invariant LQG versus conventional LQG. The remainder of this article is organized as follows. In section 2 we define formally the problem at stake, the robot kinematics and its environment, and the conventional LQG traditionally used for trajectory tracking. Then, we derive the equations of the invariant LQG in section 3 and review its basic properties. In section 4 we compare through simu- lations the performances of the proposed invariant LQG with those of the conventional LQG. Simulations show the invari- ant approach outperforms the conventional one in case of large noises or large uncertainty on the initial state. Finally, we adapt in section 5 an approach recently introduced in [3], and show how the linearized equations of the invariant observer and the invariant controller can be combined to determine in advance the probability distributions of the state of the robot along the reference trajectory. This information can be used, for instance, to evaluate the probability of success of a planned trajectory. In this respect, it can be used by a planner to explicitly account for sensors and control un- certainties as do the planners from e.g. [16, 18]. Simulations indicate that the computed probability distributions capture much more closely the true dispersion of the tracking error (obtained through Monte-Carlo simulations) when an invari- ant LQG is used rather than a conventional LQG. 2 Problem formulation and LQG controller 2.1 Problem formulation In this paper, we consider a non-holonomic unicycle robot (simple car model) moving in a two-dimensional world (see e.g. [10]. The robot is characterized by its state x = ( X, θ ) ∈ χ ⊂ R 3 where X = ( x, y ) is the robot’s position and θ its orientation. The dynamics governing the update of the state x t to x t +1 = f ( x t , u t , m ) writes: x t +1 = x t + τ ( u t + v ) cos( θ t ) y t +1 = y t + τ ( u t + v ) sin( θ t ) (1) θ t +1 = θ t + τ ( ω t + w ) where τ is the discretized time step, u = ( u, ω ) is the sys- tem inputs, and m = ( v, w ) is the model noise. The robot has access to its absolute pose in the environment through for instance a GPS or a video tracking system, yielding mea- surements z = H x = ( x + n x , y + n y ) . The orientation θ is supposed not to be measured. We suppose that both the motion and the measurement noises are white and Gaussian. m ∼ N (0 , M ) , n ∼ N (0 , N ) The noises m and n at all time steps t are assumed to be mutually independent. Finally, we suppose that n is isotropic (i.e. N = λI 2 ), a reasonable assumption for GPS measure- ments restricted to an horizontal plane. The environment contains a collection of obstacles χ obs that the robot must avoid colliding. We denote χ f ree = χ \ χ obs the free space and χ goal ⊂ χ f ree the goal region the robot must reach. We define the reference trajectory as a collec- tion of states x ∗ 0 , . . . , x ∗ n where x ∗ 0 = x start , x ∗ n ∈ χ goal , and ∀ t 0 ≤ t < n, x ∗ t +1 = f ( x ∗ t , u ∗ t , 0) that is to say, an evo- lution governed only by the dynamics and without motion noise. We denote by ̄ x = x − x ∗ and ̄ u = u − u ∗ the errors between the true and the reference trajectory. In order for the robot to stay near the reference trajectory despite of the uncertainties in the measurements and the con- trols, we design a linear-quadratic controller, which aims at minimizing, under the dynamics constraints of (1), the cost function J ( x , u ) : J ( x , u ) = E ( I ( x , u )) = E ( n ∑ t =0 ( ̄ x t C ̄ x T t + ̄ u t D ̄ u T t ) ) (2) with C and D definite positive matrices that penalize the devi- ations in the tracking and in the actuator’s command. J thus appears as the average over a great number of experiments of the overall deviation I associated to a single trajectory. 2 2.2 Conventional LQG One approach to attack the problem defined in Section 2.1, is to linearize f around the reference trajectory and use a conventional Linear Quadratic Gaussian (LQG) control. LQG combines a Kalman filter for state estimation and a Linear Quadratic controller for the control. It provides an optimal control, which minimizes the cost (2) in the case of linear dynamics (see e.g. [17]). The conventional LQG Algorithm is recalled in Algorithm 1. Algorithm 1 Conventional LQG Require: Reference trajectory (( x t , u t )) t =1 ...n Require: Initial covariance P 0 Require: Off-line calculations of the Riccati gains (12) u 0 ← u ∗ 0 ˆ x 0 ← x ∗ 0 for 0 < t ≤ n do • propagate estimation and covariance with (4) and (3) • acquire measurement z t • update the best estimate and the covariance using (7) and (6) • output new command u t computed by (9) end for The best estimate (in the sense of least squares over a great number of experiments and under the linear approximation), and its covariance updates are given by the conventional extended Kalman filter (EKF) equations: Process update (conventional EKF): P − t +1 = A t P t A T t + B t M B T t (3) ˆ x − t +1 = f (ˆ x t , u t , 0) (4) where here: A t = ∂f ∂ x ∣ ∣ ∣ ∣ ˆ x t , u t , 0 =   1 0 − τ u t sin(ˆ θ t ) 0 1 τ u t cos(ˆ θ t ) 0 0 1   , B t = ∂f ∂ m ∣ ∣ ∣ ∣ ˆ x t , u t , 0 = τ   cos(ˆ θ t ) 0 sin(ˆ θ t ) 0 0 1   Measurement update (conventional EKF): K t +1 = P − t +1 H T ( HP − t +1 H T + N t ) − 1 (5) P t +1 = ( I − K t +1 H ) P − t +1 (6) ˆ x t +1 = ˆ x − t +1 + K t +1 ( z t − H ˆ x − t +1 ) (7) Likewise, the LQ controller linearized equations read: ̄ x t +1 = A ∗ t ̄ x t + B ∗ t ̄ u t + B ∗ t m (8) where here: A ∗ t = ∂f ∂ x ∣ ∣ ∣ ∣ x ∗ t , u ∗ t , 0 =   1 0 − τ u t sin( θ ∗ t ) 0 1 τ u t cos( θ ∗ t ) 0 0 1   B ∗ t = ∂f ∂ u ∣ ∣ ∣ ∣ x ∗ t , u ∗ t , 0 = ∂f ∂ m ∣ ∣ ∣ ∣ x ∗ t , u ∗ t , 0 = τ   cos( θ ∗ t ) 0 sin( θ ∗ t ) 0 0 1   and the updated control law reads: u t = u ∗ t + L t (ˆ x t − x ∗ t ) (9) where the gains L t are computed through the following back- wards Riccati equation (12): S l = C (10) L t = − ( B ∗ t T S t B ∗ t + D ) − 1 B ∗ t T S t A ∗ t (11) S t = C + A ∗ t +1 T S t +1 A ∗ t +1 + A ∗ t +1 T S t +1 B ∗ t +1 L t +1 (12) One noticeable characteristic of the conventional LQG is that the linearized matrices depend on the trajectory through the estimated orientation ˆ θ t for the observer, and the reference orientation θ ∗ t for the controller. This feature is illustrated on the diagram of Figure 1. This is in sharp contrast with the case of linear systems, and might be a cause of divergence of the closed-loop system. In the next section, we design an invariant LQG observer-controller, for which the linearized matrices only depend on the inputs, and this will be shown to increase the robustness compared to the conventional ap- proach. Figure 1: The gain computation of the EKF depends directly on the last estimate. 3 3 The invariant LQG controller In this section we define a LQG observer-controller with in- variant properties. The design builds upon the following re- mark [11]: the dynamics (1) are invariant to rotations and translations, that is, they do not depend on the choice of frame. Proposition 1. The dynamics (1) are invariant to rotations and translations. Consequently the LQG equations for (1) do not depend on the choice of coordinates in the following sense: let ( x y θ ) = ( x 0 y 0 0 ) + ( X cos θ 0 − Y sin θ 0 Y sin θ 0 + X cos θ 0 Θ+ θ 0 ) be a change of coordi- nates from a reference system of coordinates ( X Y Θ ) to a rotated and translated system of coordinates ( x y θ ) , then the equations (1) write the same when written with the transformed vari- ables. Proof. If      X t +1 = X t + τ ( u t + v ) cos(Θ t ) Y t +1 = Y t + τ ( u t + v ) sin(Θ t ) Θ t +1 = Θ t + τ ( ω t + w ) Then      x t +1 , X 0 + X t +1 cos(Θ 0 ) − Y t +1 sin(Θ 0 ) y t +1 , Y 0 + X t +1 sin(Θ 0 ) + Y t +1 cos(Θ 0 ) θ t +1 , Θ 0 + Θ t +1 ⇔      x t +1 = X 0 + X t cos(Θ 0 ) − Y t sin(Θ 0 ) + τ ( u t + v ) cos(Θ t + Θ 0 ) y t +1 = Y 0 + X t sin(Θ 0 ) + Y t cos(Θ 0 ) + τ ( u t + v ) sin(Θ t + Θ 0 ) θ t +1 = Θ 0 + Θ t + τ ( ω t + w ) ⇔      x t +1 = x t + τ ( u t + v ) cos( θ t ) y t +1 = y t + τ ( u t + v ) sin( θ t ) θ t +1 = θ t + τ ( ω t + w ) It seems evident that the ability of an observer-controller to park along the pavement should not depend on whether the pavement is north-oriented or west-oriented. However, sur- prisingly, the conventional LQG controller (9) has a behaviour that does depend on orientation of the car ( θ ∗ t , ˆ θ t ). When de- signing both an observer and a controller, the problem can be remedied by deriving the observer and the controller equa- tions in the Fr ́ enet coordinates (see [15, 6]). In the remainder of the paper, we use the superscript  loc (loc stands for local) to identify the vectors expressed in the Fr ́ enet frame, that is, a frame attached to the car whose first axis coincides with the car’s heading direction θ t . 3.1 Invariant Extended Kalman Filter (IEKF) To estimate the state x of the robot, we use an invariant formulation of the extended Kalman filter as proposed in [4, 7]. Here it boils down to working in the Fr ́ enet frame as fol- lows. Let us define the estimation error ̃ x = ˆ x − x , the local estimation error ̃ x loc = Υ − θ ̃ x , and the local state deviation ̄ x loc = Υ − θ ∗ ̄ x , where Υ φ = ( R φ 0 0 1 ) . We can note that, by definition, Υ has the following properties 1 that will be used in the sequel: Υ 0 = I 3 , Υ φ + ψ = Υ φ Υ ψ and H Υ φ = R φ H . We search to estimate the state of the robot using a filter of the following form [6, 4]: ˆ x − t +1 = f (ˆ x t , u t , 0) (13) ˆ x t +1 = ˆ x − t +1 + Υ ˆ θ − t +1 K inv t R − ˆ θ − t +1 ( z t +1 − H ˆ x − t +1 ) (14) where R − ˆ θ − t +1 represents the 2D rotation of angle − ˆ θ − t +1 (the opposite of the third coordinate of ˆ x − t +1 ). The idea behind the proposed filter is merely to map the measurement error z t +1 − H ˆ x − t +1 into the Fr ́ enet frame of the estimated car, that is, applying a rotation of angle − ˆ θ , then apply the Kalman correction gain K , and finally map the obtained correction term back into the inertial frame through the operator Υ ˆ θ . The evolution of the local estimation error writes: ̃ x loc t +1 =Υ − θ t +1 (ˆ x t +1 − x t +1 ) =Υ − θ t +1 ( f (ˆ x t , u t , 0) − f ( x t , u t , m )+ Υ ˆ θ − t +1 K inv t +1 R − ˆ θ − t +1 ( z t +1 − Hf (ˆ x t , u t , 0)) ) ≈ Υ − θ t +1 ∂f ∂ x ∣ ∣ ∣ ∣ ˆ x t , u t , 0 ̃ x t − Υ − θ t +1 ∂f ∂ m ∣ ∣ ∣ ∣ ˆ x t , u t , 0 m + Υ ̃ θ t − τ w K inv t R − ˆ θ t − τ ω t ( z t +1 − Hf (ˆ x t , u t , 0)) For sufficiently small τ and considering noises as first order 1 Note that those properties can be related to the theory of symmetry groups [14]. Yet, in the present paper, we prefer to keep calculations at a basic level to remain close to the computer implementation. 4 terms, we have up to second order terms: Υ − θ t +1 ∂f ∂ x ∣ ∣ ∣ ∣ ˆ x t , u t , 0 = Υ − τ ( ω t + w ) Υ − θ t ∂f ∂ x ∣ ∣ ∣ ∣ ˆ x t , u t , 0 = Υ − τ ( ω t + w )   cos θ t sin θ t 0 − sin θ t cos θ t τ u t 0 0 1   =   1 τ ω t 0 − τ ω t 1 τ u t 0 0 1   Υ − θ t = A t Υ − θ t Υ − θ t +1 ∂f ∂ m ∣ ∣ ∣ ∣ ˆ x t , u t , 0 = Υ − τ ( ω t + w ) Υ − θ t ∂f ∂ m ∣ ∣ ∣ ∣ ˆ x t , u t , 0 = τ   1 0 0 0 0 1   = B Likewise, we have up to second order terms: Υ ̃ θ t − τ w K inv t R − ˆ θ t − τ ω t ( z t +1 − Hf (ˆ x t , u t , 0)) = K inv t H Υ − ˆ θ t − τ ω t ( − ∂f ∂ x ∣ ∣ ∣ ∣ ˆ x t , u t , 0 ̃ x t + ∂f ∂ m ∣ ∣ ∣ ∣ ˆ x t , u t , 0 m ) + K inv t n t +1 = K inv t H ( − A t Υ − θ t ̃ x t + B m ) + K inv t n t +1 Finally, up to second order terms, ̃ x loc t follows the linear evolution: ̃ x loc t +1 = A t ̃ x loc t − B m − K inv t H ( A t ̃ x loc t − B m ) + K inv t n t (15) with: A t =   1 τ ω t 0 − τ ω t 1 τ u t 0 0 1   , B = τ   1 0 0 0 0 1   and where with a slight abuse of notation we replaced R − θ t n t with n t due to the measurement noise isotropy. We thus proved, that the invariant linearized estimation error ̃ x loc t fol- lows a linear equation for which the optimal gain K inv t is given by the Kalman updates: Process update (invariant Kalman): P − t +1 = A t P t A T t + BM B T (16) Measurement update (invariant Kalman): K inv t +1 = P loc − t +1 H T ( HP loc − t +1 H T + N t ) − 1 (17) P loc t +1 = ( I − K inv t +1 H ) P loc − t +1 (18) Finally, the invariant Kalman estimate is given by equation (14) where K inv t is computed by the above formula (17). We can notice that, as a byproduct of making use of the sym- metries of the problem, we derived a linearized equation for the observer in the Fr ́ enet coordinates, in which B does not depend on the trajectory at all, whereas A t only depends on the control inputs: Proposition 2. The linearized equation of the invariant Kalman filter in the Fr ́ enet coordinates is: ̃ x loc t +1 = A t ̃ x loc t − B m − K inv t H ( A t ̃ x loc t − B m ) + K inv t n t where B and A t = A ( u t ) do not depend on the state ˆ x t . The property is illustrated by the diagram of Figure 2. We will show experimentally in Section 4 that this property will endow the invariant LQG with better robustness to high noises and erroneous initialization, compared to the conven- tional LQG. This can be easily understood as the gain output by the IEKF around any trajectory is the same as the one output about the true trajectory. Although it does not en- sure the gain is optimal, it prevents the type of divergences due to a positive feedback between a misestimate and an in- appropriate gain as explained in the Introduction. does not exist anymore Figure 2: Contrarily to the EKF, the gain computation does not directly depend on the last estimate of the state in the case of the IEKF. 3.2 Invariant LQ Likewise, we can rewrite the dynamics of the reference tra- jectory error ̄ x in the Fr ́ enet coordinates, to get a linearized invariant formulation of the LQ controller’s equations. We define the local error to the reference ̄ x loc = Υ − θ ∗ ̄ x and dif- 5 ferentiate, neglecting the second order terms: ̄ x loc t +1 =Υ − θ t +1 ( f ( x t , u t , m ) − f ( x ∗ t , u ∗ t , 0)) =Υ − θ t +1 ∂f ∂ x ∣ ∣ ∣ ∣ x ∗ t , u ∗ t , 0 ̄ x t + Υ − θ t +1 ∂f ∂ u ∣ ∣ ∣ ∣ x ∗ t , u ∗ t , 0 ̄ u t + Υ − θ t +1 ∂f ∂ m ∣ ∣ ∣ ∣ x ∗ t , u ∗ t , 0 m ≈ A ∗ t ̄ x loc t + B ̄ u t + B m (19) with: A ∗ t =   1 τ ω ∗ t 0 − τ ω ∗ t 1 τ u ∗ t 0 0 1   Once again, the linearized matrices of the controller do not depend on the reference path ( x ∗ t ) t =1 ...n , as A ∗ t depends only on the inputs. Proposition 3. The linearized equation of the invariant LQ controller in the Fr ́ enet coordinates is ̄ x loc t +1 = A ∗ t ̄ x loc t + B ̄ u t + B m where B is constant and A ∗ t only depends on the inputs. We can then apply a LQ control policy to this linearized system, which minimizes the quadratic cost function J ( x , u ) : J ( x , u ) = E ( n ∑ t =0 ( ̄ x loc t C ( ̄ x loc t ) T + ̄ u t D ̄ u T t ) ) (20) under the constraints (19). The minimal cost is obtained for ̄ u t = L inv t ̄ x loc t with L inv t given by the solution to the back- wards Riccati equation (23): S l = C (21) L inv t = − ( B T S t B + D ) − 1 B T S t A ∗ t (22) S t = C + A ∗ T t +1 S t +1 A ∗ t +1 + A ∗ T t +1 S t +1 BL inv t +1 (23) However, in presence of measurement uncertainty, the true state x is unknown. Consequently, the control policy applied in practice is: ̄ u t = L inv t Υ − θ ∗ t (ˆ x t − x ∗ t ) (24) 3.3 Invariant LQG Finally, the invariant Kalman filter and the invariant LQ can be combined, in order to compute on-line the (approximate) best input given the current estimation, covariance and the latest input and measurement. The algorithm steps are sum- marized in Algorithm 2. Algorithm 2 Invariant LQG Require: Reference trajectory (( x t , u t )) t =1 ...n Require: Initial covariance P 0 Require: Off-line calculations of the Riccati gains (23) u 0 ← u ∗ 0 ˆ x 0 ← x ∗ 0 for 0 < t ≤ n do • propagate estimation and covariance with (13) and (16) • acquire measurement z t • update the best estimate and its covariance using (14) and (18) • output new command u t computed by (24) end for 4 Illustration of the robustness property through extensive simu- lations Beyond the fact that it is natural to use a closed-loop con- trol that does not depend upon a non-trivial choice of frame orientation, the IEKF is known to have some convergence guaranteed properties (see [4, 7, 5]) about trajectories de- fined by constant inputs: indeed for fixed u ∗ t , ω ∗ t we see that the linearized observation and control systems (15) and (19) become time-invariant, leading to convergence of the gain ma- trices. However for arbitrary reference trajectories on the one hand, and large noises that potentially make the observer- controller step out of the region where the linearization is valid on the other hand, the robustness of the IEKF has never been proved. In this section we show through simulations how the invariant LQG can exhibit increased robustness to noise and initial uncertainties compared to the conventional LQG. The simulations were performed using a reference trajectory composed of straight lines and curves, displayed on Figure 4. We considered a reference initial covariance P 0 0 and refer- ence model and measurement noise covariances M 0 and N 0 . We compared the performances of the invariant LQG and the conventional LQG by performing several simulations with ini- tial covariance α 2 P 0 0 and noises covariances β 2 M 0 , β 2 N 0 , for various factors ( α 2 , β 2 ). For each simulation ( α 2 , β 2 ) is fixed and we draw 5,000 random initial positions ( x 0 i ) i =1 ... 5000 and 5,000 noise samples (( m 0 , . . . , m n ) i , ( n 0 , . . . , n n ) i ) i =1 ... 5000 , Each sample i is used to simulate one robot trajectory us- ing the invariant LQG observer-controller and one robot tra- jectory using the conventional LQG. In total we have 10,000 simulated trajectories, half of them using invariant LQG, and 6 the other half using conventional LQG. For each simulated trajectory we evaluate the cost I as defined in (2). The re- sults are displayed on Figure 3, on which we plot: 1. The mean costs for invariant and conventional LQG in function of the noise factors ( α 2 , β 2 ). For each noise fac- tors couple, we also indicate above the bars and between parentheses the percentage of draws for which using the invariant LQG leads to a lower trajectory cost. 2. The number of trajectories that we can consider as “lost”. A trajectory is considered “lost” when the Mahalanobis distance between its final state and the final estimate exceeds a given threshold. The retained criterion used to label a trajectory as “lost” is: ( x n − ˆ x n , y n − ˆ y n ) P − 1 n [1:2 , 1:2] ( x n − ˆ x n y n − ˆ y n ) > F − 1 2 (0 . 999) (25) This Mahalanobis distance under the linear and Gaus- sian assumption follows a χ 2 2 distribution with 2 degrees of freedom, and F 2 denotes its cumulative distribution function, so that the threshold should not be exceeded for 99 , 9% of the trajectories on average. In practice a trajectory is “lost” when the robot’s state went outside of the “tube” inside which the linearization is valid. In this case the LQG control cannot be trusted anymore. We can draw two main conclusions from these simulations. First the trajectory cost is on average lower when the invariant LQG is used. This becomes more significant as the initial covariance increases. When the initial covariance is very high ( α 2 ≥ 100), the mean cost of invariant LQG trajectories is about twice lower than the corresponding conventional LQG mean cost. The percentage of samples for which the invariant LQG trajectory has a lower cost than its conventional LQG counterpart also increases with α 2 . The influence of the noise levels β 2 is less significant. The second bar chart of Fig. 3 plots the number of “lost” trajectories for different initial covariance and noises covari- ances levels. The number of “lost” trajectories increases with both α 2 and β 2 . However, in case of high α 2 and β 2 , the in- variant LQG observer-controller is much less prone to losing the reference trajectory than the conventional LQG. These “lost” trajectories have a very high cost insofar as they are very “far” from the reference. Consequently, this explains, at least partly, the gap between the mean costs observed on the first bar chart for high α 2 and β 2 . Figure 4 shows an example draw for which the invariant LQG manages to fol- low the reference trajectory, whereas the conventional LQG is completely lost. This figure greatly illustrates the increased robustness to uncertain initial condition gained by using the invariant approach. 0 10000 20000 30000 40000 50000 60000 70000 J ( x , u ) (51.6%) (65.8%) (56.3%) (55.4%) (53.4%) invariant LQG conventional LQG (1, 1) (100, 1) (100, 100) (500, 100) (500, 200) ( α 2 ,β 2 ) 0 1 2 3 4 5 6 7 8 9 Lost trajectories (%) Figure 3: (above) The empirical cost J ( x , u ) obtained for the invariant LQG is up to twice better than the corresponding conventional LQG cost in case of high noises. The percentages between parentheses indicate the proportion of samples for which the invariant LQG leads to a lower cost. (below) The number of “lost” trajectories (according to criterion (25)) is greatly reduced when using the invariant LQG in the case of high noises. 5 Consistency of the computed co- variance and motion planning In this section we explore experimentally to what extent the covariance returned by the observer-controller realistically represents the covariance of the actual discrepancy between the trajectory followed by the robot and a planned trajec- tory. We prove experimentally that the invariant approach captures more closely the uncertainties than the conventional approach. The main application of those results deal the improvements brought by our methodology for the so-called LQG-MP (motion planning) approach recently introduced in [3], where the idea is to pick some sensible trajectories based on the uncertainties they convey. 5.1 Assessing uncertainty to a planned tra- jectory The idea of LQG-MP [3] is to be able to assess uncertainties to the ability of a closed-loop LQG system to follow various planned trajectories. Many candidate trajectories are gen- erated, and only the trajectories satisfying some criteria are retained (for instance the ones maximizing the probability to 7 start goal reference path conventional LQG path invariant LQG path Figure 4: Example of a random draw under strong noises, for which the conventional LQG does not manage to follow the reference trajectory, whereas the invariant LQG does. For this particular draw the initial orientation is opposite to the reference orientation. reach the goal, or minimizing the probability of collision). A typical example where assessing a level of uncertainty to a planned trajectory may prove useful is displayed on Figure 5. A B Figure 5: A robust motion planning algorithm will most likely choose the path that bypasses the obstacles (dashed) rather than the one passing in between (plain), in order to avoid any collision when the plan is executed. 5.2 A priori probability distributions for an invariant LQG In this section, we use the methodology of [3], to compute the a priori distributions of the state of the robot along a given reference path. Indeed, the linearization performed in Section 3 make possible to analyze in advance how the state of the robot will evolve during the execution of the trajectory if the robot uses an invariant LQG. The expressions (15) and (19) can be combined to determine, in the Fr ́ enet coordinates, the a priori distributions of the state ( x t ) t =1 ...n around the refer- ence trajectory ( x ∗ t ) t =1 ...n when applying the optimal control ( ̄ u t ) t =1 ...n = ( L inv t Υ − θ ∗ t (ˆ x t − x ∗ t )) t =1 ...n : ( ̄ x loc ̃ x loc ) t +1 = ( A ∗ t + BL inv t BL inv t 0 A t − K inv t HA t ) ( ̄ x loc ̃ x loc ) t + ( B 0 K inv t HB − B K inv t ) ( m n t ) (26) In the above equation, all the matrices but A t and K inv t can be computed in advance, before any actual simulation or mea- surement. In order to derive the a priori distributions of the state, we make the approximation A t ≈ A ∗ t and we compute the Kalman gains K inv t by replacing A t by A ∗ t in (15). ( ̄ x loc ̃ x loc ) t +1 = ( A ∗ t + BL inv t BL inv t 0 A ∗ t − K inv t HA ∗ t ) ( ̄ x loc ̃ x loc ) t + ( B 0 K inv t HB − B K inv t ) ( m n t ) = F t ( ̄ x loc ̃ x loc ) t + G t q t (27) q t ∼N (0 , Q t ) , Q t = ( M 0 0 N t ) If the initial state covariance E ( ̄ x loc ( ̄ x loc ) T ) is a known Gaus- sian of mean 0 and covariance P 0 , and assuming the noises independence, the above formula shows that at each time step t , the state’s distribution (of the tracking error system) is centered and normal. Knowing that E ( ̄ x loc ( ̄ x loc ) T ) = P 0 , E ( ̃ x ( ̃ x loc ) T ) = P 0 , and E ( ̃ x ( ̄ x loc ) T ) = − P 0 , we can recursively compute the covariance matrices Σ t = E (( ̄ x loc ̃ x loc ) t ( ̄ x loc ̃ x loc ) T t ) with the following formula: Σ 0 = ( P 0 − P 0 − P 0 P 0 ) Σ t +1 = F t Σ t F T t + G t Q t (28) The submatrix of Σ t restricted to the three first lines and columns, denoted by Σ [1:3 , 1:3] t , is the covariance matrix of the state along the reference path in the Fr ́ enet coordinates. To get the corresponding covariance in the fixed frame, the matrix shall be rotated: Υ θ t Σ [1:3 , 1:3] t Υ − θ t . Consequently, for- mula (28), provides an invariant formulation of the a priori probability distributions of the state about the reference tra- jectory. 8 5.3 A priori probability distributions for a conventional LQG The approach advocated in [3] consists of linearizing both the observer equations and the controller equation about the reference path. For our specific model (1), the equations of [3], become: ( x − x ∗ ˆ x − x ∗ ) t +1 = F ′ t ( x − x ∗ ˆ x − x ∗ ) t + G ′ t q t (29) with: q t ∼ N (0 , Q t ) , F ′ t = ( A ′ t B ′ t L ′ t K ′ t A ′ t + B ′ t L ′ t − K ′ t HA ′ t ) , G ′ t = ( B ′ t 0 K ′ t HB ′ t K ′ t W t ) A ′ t =   1 0 − τ u ∗ t sin( θ ∗ t ) 0 1 τ u ∗ t cos( θ ∗ t ) 0 0 1   , B ′ t =   τ cos( θ ∗ t ) 0 τ sin( θ ∗ t ) 0 0 1   and K ′ t and L ′ t are respectively the Kalman and LQ gains for the linearized system: ̄ x t +1 = A ′ t ̄ x t + B ′ t ̄ u t + B ′ t m (30) Finally the prediction for the conventional LQG can be ob- tained by the following recursion: Σ ′ 0 = ( P 0 0 0 0 ) Σ ′ t +1 = F ′ t Σ ′ t F ′ T t + G ′ t Q t G ′ T t (31) As before, the submatrices 3 × 3 of Σ ′ t with indexes inferior to 3 denoted by Σ ′ [1:3 , 1:3] t give the state covariance around the reference path. Contrarily to the invariant formulation, the matrices B ′ t depend on the time. Likewise, the matrices A ′ t depend explicitly on the trajectory whereas the corresponding invariant matrices A t only depend on the inputs. 5.4 Simulation results In order to compare the prediction performances of the in- variant approach (subsection 5.2) and the conventional ap- proach (subsection 5.3), we reuse the simulations of Section 3. For each initial covariance level α 2 and noises level β 2 we can compute, in advance, the predicted covariance matrices Σ t and Σ ′ t thanks to (28) and (31) respectively. To measure the “distance” between the predicted distributions and the actual distributions obtained by simulation, we use the sym- metric Kullback-Leibler divergence (KL-divergence) that is a (1, 1) (10, 1) (100, 1) (100, 10) (100, 100) (500, 100) (500, 200) ( α 2 ,β 2 ) 10 -3 10 -2 10 -1 10 0 10 1 10 2 Mean KL-divergence invariant LQG conventional LQG Figure 6: When using the invariant LQG, the predictions of the state distribution better match the simulations (lower KL-divergence). This is increasingly significant as the noises ( α, β ) grow. The KL-divergence is shown using a logarithmic scale. natural way to measure a discrepancy between probability distributions. For two Gaussians N ( m 0 , Σ 0 ) and N ( m 1 , Σ 1 ) of dimension n , it is given by: KL = 1 4 ( tr(Σ − 1 1 Σ 0 ) + ( m 1 − m 0 ) T Σ − 1 1 ( m 1 − m 0 ) − log det Σ 0 det Σ 1 − k ) + 1 4 ( tr(Σ − 1 0 Σ 1 ) + ( m 0 − m 1 ) T Σ − 0 0 ( m 0 − m 1 ) − log det Σ 1 det Σ 0 − k ) (32) The results are displayed on Figure 6. The prediction per- formance is equivalent for low α 2 and β 2 but the invariant prediction is by far more accurate when these noise factors in- crease (more than ten times). In fact, we can see on Figure 7 that the predicted covariance matrices Υ θ ∗ t Σ [1:3 , 1:3] t Υ − θ ∗ t and Σ ′ [1:3 , 1:3] t are very close even for large noises when compared in the same frame. This is no surprise: about the reference trajectory as long as the linear approximation is valid the frame in which the equations are derived should not matter that much. However, when moving away from the reference trajectory, as in actual experiments, the non-linearities may play an important role, and the nice non-linear structure of the invariant LQG saves the day: as shown in Section 3, the invariant LQG is much more robust to high noise factors, while a non negligible number of conventional LQG trajecto- ries get lost and their behavior becomes random. This results in very high divergences for the conventional prediction while 9 the invariant prediction is still accurate. 0 2 4 6 8 10 12 Σ x,x conventional LQG invariant LQG 1.2 1.0 0.8 0.6 0.4 0.2 0.0 0.2 Σ x,y 0 50 100 150 200 250 300 t 1.2 1.0 0.8 0.6 0.4 0.2 0.0 0.2 Σ y,x 0 50 100 150 200 250 300 t 0 2 4 6 8 10 12 Σ y,y Figure 7: The evolutions over time of the invariant and con- ventional entries of the predicted covariance matrices are very close when rotated and compared in a common frame. The plot represents the matrices entries in the fixed frame for ( α 2 , β 2 ) = (100 , 100) 6 Conclusion We introduced a new invariant Linear Quadratic Gaussian controller for the control of a unicycle robot along a reference trajectory. We showed through extensive simulations that, when noises are strong, the achieved cost reflecting the mag- nitude of the tracking error is greatly reduced in comparison to the one obtained when using a conventional LQG. In prac- tice, the invariant LQG showed increased robustness to high noises, suggesting that the linearized equation of both the ob- server and the controller have a much higher “validity zone” than in the conventional LQG case. The trajectory cost for small noises is comparable, yet slightly better, than the one obtained using a conventional LQG. Consequently, we recom- mend the use of invariant LQG over conventional LQG in any application where the initial uncertainty and the model and measurement noises might be high and where symmetries can be exploited. In the future we would like to illustrate the results through real experimentations, and would also like to explore the superiority of the invariant approach from a the- oretical viewpoint. References [1] S. Bonnabel A. Barrau. “Intrinsic filtering on SO(3) with discrete-time observations”. In: IEEE Conference on Decision and Control . 2013. [2] Martin Barczyk and Alan F Lynch. “Invariant Observer Design for a Helicopter UAV Aided Inertial Navigation System”. In: IEEE Transactions on Control Systems Technology 21.3 (2013), pp. 791–806. [3] J. Van Den Berg, P. Abbeel, and K. Goldberg. “LQG- MP: Optimized path planning for robots with motion uncertainty and imperfect state information”. In: The International Journal of Robotics Research 30.7 (2011), pp. 895 –913. [4] S. Bonnabel. “Invariant Extended Kalman Filter”. In: Conference on Decision and Control 2007 (CDC07) . 2007. [5] S. Bonnabel, Ph. Martin, and P. Rouchon. “Non- linear Symmetry-preserving Observers on Lie Groups”. In: IEEE Trans. on Automatic Control 54.7 (2009), pp. 1709 –1713. [6] S. Bonnabel, Ph. Martin, and P. Rouchon. “Symmetry- preserving Observers”. In: IEEE Trans. on Automatic Control 53.11 (2008), pp. 2514–2526. [7] S. Bonnabel, P. Martin, and E. Salaun. “Invariant Ex- tended Kalman Filter: Theory and application to a velocity-aided attitude estimation problem”. In: Deci- sion and Control, 2009 held jointly with the 2009 28th Chinese Control Conference. CDC/CCC 2009. Pro- ceedings of the 48th IEEE Conference on . Dec. 2009, pp. 1297 –1304. doi : 10.1109/CDC.2009.5400372 . [8] S. Bonnabel et al. “A Separation Principle on Lie Groups”. In: IFAC (available on Arxiv) . 2011. [9] F. Bullo and R.M. Murray. “Tracking for Fully Actuated Mechanical Systems: A Geometric Framework.” In: Au- tomatica 35.1 (1999), pp. 17–34. [10] G Campion, G Bastin, and B D’Andr ́ ea-Novel. “Struc- tural Properties and Classification of Kinematic and Dynamic Models of Wheeled Mobile Robots”. In: IEEE Trans. Robotics Automation 12 (1996), pp. 47–62. [11] D. Guillaume and P. Rouchon. “Observation and con- trol of a simplified car”. In: Workshop Motion Control, 1998. IFAC 1998. Proceedings 1998 IFAC International conference on . Citeseer, Sept. 1998, pp. 63 –67. [12] V. Jurdjevic and H. J. Sussmann. “Control systems on Lie groups”. In: J. Differential Equations (1972), pp. 313–329. 10 [13] R. Mahony, T. Hamel, and J-M Pflimlin. “Nonlin- ear complementary filters on the Special Orthogonal Group”. In: IEEE-Trans. on Automatic Control 53.5 (2008), pp. 1203–1218. [14] P. J. Olver. Equivalence, Invariants and Symmetry . Cambridge University Press, 1995. [15] P. Rouchon and J. Rudolph. “Invariant tracking and stabilization: problem formulation and examples”. In: Lecture Notes in Control and Information Sciences 246. Springer, 1999, pp. 261–273. [16] N. Roy et al. “Coastal navigation-mobile robot nav- igation with uncertainty in dynamic environments”. In: Robotics and Automation, 1999. Proceedings. 1999 IEEE International Conference on . Vol. 1. IEEE. 1999, pp. 35 –40. [17] R. F. Stengel. Optimal control and estimation . Courier Dover Publications, 2012. [18] R. Tedrake et al. “LQR-trees: Feedback motion plan- ning via sums-of-squares verification”. In: The Inter- national Journal of Robotics Research 29.8 (2010), pp. 1038 –1052. 11