Sparse Gaussian Processes for Continuous-Time Trajectory Estimation on Matrix Lie Groups Jing Dong, Byron Boots, and Frank Dellaert Abstract — Continuous-time trajectory representations are a powerful tool that can be used to address several issues in many practical simultaneous localization and mapping (SLAM) scenarios, like continuously collected measurements distorted by robot motion, or during with asynchronous sensor mea- surements. Sparse Gaussian processes (GP) allow for a prob- abilistic non-parametric trajectory representation that enables fast trajectory estimation by sparse GP regression. However, previous approaches are limited to dealing with vector space representations of state only. In this technical report we extend the work by Barfoot et al. [1] to general matrix Lie groups, by applying constant-velocity prior, and defining locally linear GP. This enables using sparse GP approach in a large space of practical SLAM settings. In this report we give the theory and leave the experimental evaluation in future publications. I. I NTRODUCTION Simultaneous localization and mapping (SLAM) is a fun- damental tool in robotics, enabling robots to autonomously operate in previously unseen environments. Currently, many researchers are focussing on nonlinear optimization tech- niques for SLAM. Compared to message-passing techniques like extended Kalman filtering and smoothing, nonlinear optimization is able to better contend with large scale en- vironments, and sometimes able to produce more consis- tent results [2]. For example, batch nonlinear least squares optimization has been widely used in both the computer vision [3] and SLAM communities [4], [5], [6] for years, and, recently, incremental approaches to smoothing and mapping [7], [8] demonstrate that batch solutions can be efficiently and incrementally updated as new measurements are collected. The majority of existing SLAM algorithms use discrete- time representations of robot trajectories. Although discrete- time approaches are sufficient for many problems, there are two important situations that discrete-time approaches have difficulty handling: (1) when sensors measure the environ- ment continuously, for example with spinning LIDAR or rolling-shutter cameras, measurements will be distorted by the robot’s motion; and (2) when sensor measurements arrive asynchrounously. In these cases, a continuous-time trajectory representation provides a solution. Unlike discretized trajectory representa- tions that are parameterized at fixed discrete time intervals, continuous-time representations can be queried to recover the robot state at any time of interest. Although continuous- time trajectory representations have successfully been used J. Dong, B. Boots, and F. Dellaert are with the College of Com- puting, Georgia Institute of Technology, USA. jdong@gatech.edu, { bboots,frank } @cc.gatech.edu in state estimation [9] for years, continuous-time localization and mapping is a recent tool in robotics. Several popular continuous-time trajectory representations include linear in- terpolation [10], [11], [12], splines [13], [14], [15], [16], [17], [18], and hierarchical wavelets [19], all of which have been used in both filtering and batch estimation approaches. This report focuses on an alternative probabilistic non- parametric representation for trajectories based on Gaus- sian processes (GPs). Tong et al. [20], [21] showed that simultaneous trajectory estimation and Mapping (STEAM), the continuous-time extension of SLAM, can be reduced to GP regression. By placing various GP priors on robot trajectories, this approach can solve different types of trajec- tory estimation problems. However, if standard kernels, such as the squared exponential kernel, are used, the method is expensive, with polynomial time and space complexity. Maintaining sparsity in SLAM problems has been well- studied [3], [6], [22], and it is the key to maintaining scalable optimization in many modern SLAM algorithms. Barfoot et al. [1] shows that by applying a linear time- varying stochastic differential equation (LTV-SDE) prior on trajectories, the inverse kernel matrices are exactly sparse, leading to efficient GP regression. This approach is further extended to GP priors driven by nonlinear time-varying stochastic differential equations (NTV-SDEs) [23], and an incremental GP regression framework [24]. The major drawback of all of these Gaussian process- based approaches is they require the system state to live in a vector space , which is not a valid assumption in many trajectory estimation problems. For example, typical vector- valued representations for 3D rigid-body rotations either exhibit singularities (Euler angles) or impose extra nonlinear constraints (quaternions). Sparse GP regression for STEAM [1] has been extended to the special Euclidean group SE(3) in Anderson et al. [25]. In this technical report we extend the sparse GP regression approach [1] to work with arbitrary matrix Lie groups [26], extending the approach to a much more general setting. For example, an attitude and heading reference system (AHRS) [27] can be treated as trajectory estimation on the special orthogonal group SO(3), and 3D trajectory estimation for a monocular camera without scale information can be treated as trajectory estimation on the similarity transforma- tion group Sim(3) [28]. In this report we only summarize the theory, and leave all the experimental evaluations in future publications. arXiv:1705.06020v1 [cs.RO] 17 May 2017 II. P RELIMINARIES A. Problem Definition We consider the problem of continuous-time trajectory estimation, in which a continuous-time system state x ( t ) is estimated from observations [20]. The system model is described as x ( t ) ∼ GP ( μ ( t ) , K ( t, t ′ )) (1) z i = h i ( x ( t i )) + n i , n i ∼ N ( 0 , Σ i ) , (2) where x ( t ) is represented by a Gaussian Process (GP) with mean μ ( t ) and covariance K ( t, t ′ ) . A measurement z i at each time t i is obtained by the (generally nonlinear) discrete- time measurement function h i in Eq. (2) and assumed to be corrupted by zero-mean Gaussian noise with covariance Σ i . B. Maximum a Posteriori Estimation The Maximum a Posteriori (MAP) estimate of the tra- jectory can be computed through Gaussian process Gauss- Newton (GPGN) [20]. We first write down the objective function, with assumption that there are M observations and the definitions of following terms x . =    x ( t 1 ) . . . x ( t M )    , μ . =    μ ( t 1 ) . . . μ ( t M )    , K . = [ K ( t i , t j )] ∣ ∣ ∣ ij, 1 ≤ i,j ≤ M , z . =    z 1 . . . z M    , h ( x ) . =    h 1 ( x 1 ) . . . h M ( x M )    , Σ . =    Σ 1 . . . Σ M    , the MAP estimation can be written as x ∗ = argmax x { 1 2 ‖ x − μ ‖ 2 K + 1 2 ‖ h ( x ) − z ‖ 2 Σ } , (3) where ‖‖ Σ is Mahalanobis distance defined as ‖ x ‖ 2 Σ . = x > Σ − 1 x . MAP estimation is therefore translated into a nonlinear least square optimization problem. We use a Gauss-Newton approach to solve the nonlinear least squares problem. By linearizing the measurement func- tion h i around a linearization point x i , we obtain h i ( x i + δ x i ) ≈ h i ( x i ) + H i δ x i , H i . = ∂ h i ∂ x ∣ ∣ ∣ x i , (4) in which H i is the Jacobian matrix of measurement func- tion (2) at linearization point x i . By defining H . = diag( H 1 , . . . , H M ) , we get a linearized least squares prob- lem around linearization point x δ x ∗ = argmax δ x { 1 2 ‖ x + δ x − μ ‖ 2 K + 1 2 ‖ h ( x )+ H δ x − z ‖ 2 Σ } . (5) The GPGN algorithm starts from some initial guess of x , then, at each iteration, the optimal perturbation δ x ∗ is found by solving linear system ( K − 1 + H > Σ − 1 H ) δ x ∗ = K − 1 ( μ − x ) + H > Σ − 1 ( z − h ) . (6) and updating the solution by x ← x + δ x ∗ until convergence. The information matrix K − 1 in Eq. (6) encodes the GP prior information, and H > Σ − 1 H represents information from the measurements. H > Σ − 1 H is block-wise sparse in most SLAM problems [6], but K − 1 is not usually sparse for most commonly used kernels. We define a GP prior with sparse structure and exploit this structure to efficiently solve the linear system. III. S PARSE GP P RIORS FOR T RAJECTORY E STIMATION A class of exactly sparse GP priors for trajectory estima- tion is proposed in Barfoot et al. [1]. Unfortunately, only vector-valued system states are correctly handled by this approach. In this section we first briefly revisit the approach in [1] for vector spaces, we then extend this approach to two types of Lie groups, the special orthogonal group SO(3) and the special Euclidean group SE(3). A. GP Priors for Vector Space Here we consider GP priors for vector-valued system states x ( t ) generated by linear time-varying stochastic differential equations (LTV-SDEs) [1] ̇ x ( t ) = A ( t ) x ( t ) + u ( t ) + F ( t ) w ( t ) , (7) where u ( t ) is the known system control input, w ( t ) is white process noise, and both A ( t ) and F ( t ) are time-varying system matrices. The white process noise is represented by w ( t ) ∼ GP ( 0 , Q C δ ( t − t ′ )) , (8) where Q C is the power-spectral density matrix, which is a hyperparameter [23], and δ ( t − t ′ ) is the Dirac delta function. The mean and covariance of LTV-SDE generated GP are μ ( t ) = Φ ( t, t 0 ) μ 0 + ∫ t t 0 Φ ( t, s ) u ( s ) ds (9) K ( t, t ′ ) = Φ ( t, t 0 ) K 0 Φ ( t ′ , t 0 ) > + ∫ min( t,t ′ ) t 0 Φ ( t, s ) F ( s ) Q C F ( s ) > Φ ( t ′ , s ) > ds (10) where μ 0 is the initial mean value of first state, K 0 is the covariance of first state, and Φ ( t, s ) is transition matrix. In [1] it is proved that if the system is generated by the LTV-SDE in Eq. (7), the inverse covariance matrix K − 1 is block-tridiagonal. The constant-velocity GP prior is generated by a LTV-SDE with white noise on the acceleration and has previously been used in trajectory estimation [20], [1], [25]. ̈ p ( t ) = w ( t ) , (11) where p ( t ) is the N -dimensional vector-valued position (or pose) variable of trajectory, if the system has N degrees of freedom. To convert this prior into the LTV-SDE form of Eq. (7), a Markov system state variable is declared x ( t ) . = [ p ( t ) ̇ p ( t ) ] , (12) The prior in Eq. (11) then can easily be converted into a LTV-SDE in Eq. (7) by defining A ( t ) = [ 0 1 0 0 ] , u ( t ) = 0 , F ( t ) = [ 0 1 ] . (13) B. GP Priors on SO(3) Before discussing sparse GP priors for general Lie groups, we discuss several specific examples. The Special Orthogo- nal Group SO(3), is a matrix Lie group and represents 3D rotation matrices R defined by { R ∈ R 3 × 3 : RR > = I , det( R ) = 1 } . The continuous-time trajectory is then represented by the function R ( t ) that maps time to rotation matrices. The relation between rotation and body-frame angular velocity is given by [29, p.52] ̇ R ( t ) = R ( t ) b ω ( t ) ∧ , (14) where b ω ( t ) is the body-frame angular velocity (the subscript b means the angular velocity is defined in body-frame), and ∧ operator constructs a 3 × 3 skew symmetric matrix from a vector in R 3 ω ∧ =   ω 1 ω 2 ω 3   ∧ =   0 − ω 3 ω 2 ω 3 0 − ω 1 − ω 2 ω 1 0   . (15) Assume that body-frame angular velocity is constant ( constant-velocity prior ), and the body-frame angular accel- eration is corrupted by white noise ̇ b ω ( t ) = w ( t ) , (16) then we can write down the nonlinear SDE that represents this prior ̇ x ( t ) = d dt { R ( t ) b ω ( t ) } = { R ( t ) b ω ( t ) ∧ w ( t ) } , (17) where x ( t ) . = { R ( t ) , b ω ( t ) } are the Markov system states. The SDE is nonlinear, so we cannot leverage the approach in [1] to get exactly sparse linear system. However, similar to approach in [23], [25], it is possible to linearize the system around the current point estimate x ( t ) , and achieve a locally linear SDE, which can utilize the exactly sparse GP prior proposed in [1]. To define a locally linear GP prior, we first look at how to handle uncertainty and define a locally linear GP on SO(3). Various approaches have been proposed to handle uncertainty for SO(3), or even general Lie groups, include [26], [30], [31], [32], [33]. Here we adopt the approach in [33] and define the Gaussian distribution on SO(3) as a Gaussian distribution on the tangent space that is then mapped back to SO(3) by an exponential map ̃ R = R exp(  ∧ ) ,  ∼ N ( 0 , Σ) , (18) where ̃ R is noisy rotation, R is noise-free rotation, and  ∈ R 3 is a small perturbation which is normally distributed with zero mean and Σ covariance. By adopting this definition of a Gaussian distribution on SO(3), the GP on SO(3) can be defined locally. Considering a rotation R i at time t i close to R ( t ) at t (the time interval between t and t i is small), the GP model in Eq. (1) gives R ( t ) = R i exp( ξ i ( t ) ∧ ) , ξ i ( t ) ∼ N ( 0 , K ( t i , t )) . (19) A local variable ξ i ( t ) ∈ R 3 around R i is defined as ξ i ( t ) . = log( R − 1 i R ( t )) ∨ , (20) where ∨ is the inverse operator of ∧ , and log( · ) is the logarithm map of SO(3), which is inverse function of the exponential map. The time derivative of ξ i ( t ) has [26, p.26] R ( t ) − 1 ̇ R ( t ) = ( J r ( ξ i ( t )) ̇ ξ i ( t ) ) ∧ , (21) where J r is the right Jacobian of SO(3) [26, p.40]. With Eq. (14) we have ̇ ξ i ( t ) = J r ( ξ i ( t )) − 1 b ω ( t ) . (22) If the small time interval assumption is satisfied, since J r is identity at zero, and ξ i ( t ) is close to zero, we have a good approximation of ̇ ξ i ( t ) ̇ ξ i ( t ) ≈ b ω ( t ) . (23) Here ̇ ξ i ( t ) has an explicit meaning: it is the body-frame angular velocity. Considering the case specified by Eq. (16) where white noise is injected into time derivative of ̇ ξ i ( t ) by ̈ ξ ( t ) = w ( t ) , (24) the local constant-velocity LTV-SDE of SO(3) is finally written ̇ γ i ( t ) = d dt [ ξ i ( t ) ̇ ξ i ( t ) ] = [ ̇ ξ i ( t ) w ( t ) ] , (25) where γ i ( t ) . = [ ξ i ( t ) , ̇ ξ i ( t )] > is the local Markov system states around R i . The SDE in Eq. (25) is linear, so we can apply the approach in Section III-A. C. GP Priors on SE(3) We can define locally linear constant-velocity GP priors for the Special Euclidean Group SE(3) in similar manner to GP priors on SO(3) in Section III-B. The Special Euclidean Group SE(3) represents rigid motion in 3D, which is defined by transformation matrices T = [ R t 0 1 ] , (26) and where t is the translation term of motion. Similar to SO(3), the body-frame velocity b v ( t ) has the following relation with the time derivative of transformation matrix ̇ T ( t ) [29, p.55] ̇ T ( t ) = T ( t ) b v ( t ) ∧ , (27) where ∧ operator constructs a matrix from a velocity v ∈ R 6 v ∧ = [ ω v ] ∧ = [ ω ∧ v 0 0 ] , (28) where v = [ v 1 v 2 v 3 ] > is the body-frame translational velocity. We define the local variable ξ i ( t ) ξ i ( t ) . = log( T − 1 i T ( t )) ∨ , (29) and similar to SO(3), we get the time derivative of local variable ξ i ( t ) from Eq. (27) ̇ ξ i ( t ) ≈ b v ( t ) . With local variable ξ i ( t ) and γ i ( t ) defined, SE(3) can have locally linear SDE formulated similarly to Eq. (25), and an be used to generate a GP prior in the same way. IV. S PARSE GP P RIORS ON L IE G ROUPS Discussions in Section III-B and III-C will be expanded, and a locally linear GP prior on general real matrix Lie groups will be formally defined and discussed in this section. We begin by providing notation and several definitions. Every N -dimensional matrix Lie group G has an associated Lie algebra g [26, p.16]. The Lie algebra g coincides with the local tangent space to the manifold of G . Example Lie algebras of SO(3) and SE(3) are defined by skew symmetric matrices in Eq. (15) and Eq. (28) respectively. The exponential map exp : g → G and logarithm map log : G → g define the mapping between the Lie group and Lie algebra respectively [26, p.18]. G also has an associating hat operator ∧ : R N → g and vee operator ∨ : g → R N that convert elements in local coordinates R N to the Lie algebra g and vice versa [26, p.20]. A. Constant-Velocity GP Priors on Lie Groups We use T ∈ G to represent an object in G , so the continuous-time trajectory is written as T ( t ) , and trajectory states to be estimated at times t 1 , . . . , t M are T 1 , . . . , T M . To perform trajectory estimation on G , we first define the Markov system states x ( t ) . = { T ( t ) , $ ( t ) } , (30) where $ ( t ) is the ‘body-frame velocity’ variable defined by $ ( t ) . = ( T ( t ) − 1 ̇ T ( t )) ∨ . (31) Since ∀ T ∈ G, T − 1 ̇ T ∈ g [26, p.20], we can apply the ∨ operator on T ( t ) − 1 ̇ T ( t ) . In SO(3) and SE(3), $ ( t ) is the body-frame velocity (see Eq. (14) and (27)). So we call $ ( t ) the ‘body-frame velocity’ in general Lie groups. The constant ‘body-frame velocity’ prior is defined as ̇ $ ( t ) = w ( t ) , w ( t ) ∼ GP ( 0 , Q C δ ( t − t ′ )) , (32) but this is a nonlinear SDE, which does not match the LTV- SDE defined by Eq. (7). B. Locally Linear Constant-Velocity GP Priors To define a LTV-SDE which can leverage the constant- velocity GP prior, we linearize the Lie group manifold around each T i , and define both a local GP and LTV-SDE on the linear tangent space. We first define a local GP for any time t on trajectory which meets t i ≤ t ≤ t i +1 , T ( t ) = T i exp( ξ i ( t ) ∧ ) , ξ i ( t ) ∼ N ( 0 , K ( t i , t )) . (33) the local pose variable ξ i ( t ) ∈ R N around T i is defined by ξ i ( t ) . = log( T − 1 i T ( t )) ∨ . (34) The local LTV-SDE that represents constant-velocity infor- mation is ̈ ξ i ( t ) = w ( t ) , w ( t ) ∼ GP ( 0 , Q C δ ( t − t ′ )) . (35) If we define the local Markov system state γ i ( t ) . = [ ξ i ( i ) ̇ ξ i ( t ) ] , (36) the local LTV-SDE is rewritten as ̇ γ i ( t ) = d dt [ ξ i ( t ) ̇ ξ i ( t ) ] = [ ̇ ξ i ( t ) w ( t ) ] . (37) To prove the equivalence between the nonlinear SDE in Eq. (32) and the local LTV-SDE in Eq. (35), we first look at [26, p.26] T ( t ) − 1 ̇ T ( t ) = ( J r ( ξ i ( t )) ̇ ξ i ( t ) ) ∧ , (38) where J r is the right Jacobian of G . With Eq. (31) we have ̇ ξ i ( t ) = J r ( ξ i ( t )) − 1 $ ( t ) . (39) If the small time interval assumption between any t i and t i +1 is satisfied, we have a good approximation of ̇ ξ i ( t ) ̇ ξ i ( t ) ≈ $ ( t ) . (40) So we have proved that the LTV-SDE in Eq. (35) and (37) is a good approximation of constant ‘body-frame velocity’ prior defined by Eq. (32).Note that Section III-B is a specialization of the above discussion to SO(3). Both the local GP and LTV-SDE are defined on the tangent space, so they are only valid around current linearization points T i . But if all time stamps have a small enough interval, the GP and LTV-SDE can be defined in a piecewise manner, and every point on the trajectory can be converted to local variable ξ i ( t ) based on its nearby estimated state T i . C. A Factor Graph Perspective Once the local GP and constant-velocity LTV-SDE are defined, we can write down the cost function J gp used to incorporate information about the GP prior into the nonlinear least squares optimization in Eq. (3). As discussed, the GP prior cost function has the generic form J gp = 1 2 ‖ μ − x ‖ 2 K , (41) but if the trajectory is generated by a constant-velocity LTV- SDE in Eq. (37), the GP prior cost can be specified as as [1] J gp = ∑ i 1 2 e > i Q − 1 i e i , (42) e i = Φ ( t i +1 , t i ) γ i ( t i ) − γ i ( t i +1 ) , (43) where Q i is covariance matrix by [1] Φ ( t, s ) = [ 1 ( t − s ) 1 0 1 ] , Q i = [ 1 3 ∆ t 3 i Q C 1 2 ∆ t 2 i Q C 1 2 ∆ t 2 i Q C ∆ t i Q C ] , (44) where ∆ t i = t i +1 − t i . Since the GP prior cost J gp has been written as a sum of squared cost terms, and each cost term is only related to nearby (local) Markov states, we can represent the least square problem by factor graph models. In factor graphs the system states are represented by variable factors, and the cost terms are represented by cost factors. An example factor graph is shown in Fig. 1. By converting nonlinear least squares problems into factor graphs we can take advantage of factor graph inference tools to solve the problems efficiently. Additional information about the relationship between factor graphs and sparse GP and SLAM problems can be found in [6], [1], [23], [24] J 0 = 1 2 e > 0 K − 1 0 e 0 , e 0 = x 0 − μ 0 J i = 1 2 e > i Q − 1 i e i , e i = Φ ( t i +1 , t i ) γ i ( t i ) − γ i ( t i +1 ) x 0 x 1 x i x i +1 Fig. 1: An example factor graph, showing states (triangles) and factors (black boxes). GP prior factors connect consec- utive states, and define the prior information on first state. D. Querying the Trajectory One of the advantages of representing the continuous-time trajectory as a GP is that we have the ability to query the state of the robot at any time along the trajectory. For constant- velocity GP priors, the system state x ( τ ) , t i ≤ τ ≤ t i +1 can be estimated by two nearby states x ( t i ) and x ( t i +1 ) [1], which allows efficient O (1) interpolation. We first calculate the mean value of local state ˆ γ i ( τ ) ˆ γ i ( τ ) = Λ ( τ )ˆ γ i ( t i ) + Ψ ( τ )ˆ γ i ( t i +1 ) , (45) where Λ ( τ ) = Φ ( τ, t i ) − Q τ Φ ( τ, t i ) > Q − 1 i +1 Φ ( t i +1 , t i ) , (46) Ψ ( τ ) = Q τ Φ ( τ, t i ) > Q − 1 i +1 . (47) Once we have the mean value of local state ˆ γ i ( τ ) , the mean value of the full state ˆ x ( τ ) = { ˆ T ( τ ) , ˆ $ ( τ ) } is ˆ T ( τ ) = ˆ T i exp (( Λ 1 ( τ )ˆ γ i ( t i ) + Ψ 1 ( τ )ˆ γ i ( t i +1 ) ) ∧ ) , (48) ˆ $ ( τ ) = J r ( ˆ ξ i ( τ ) ) − 1 ( Λ 2 ( τ )ˆ γ i ( t i ) + Ψ 2 ( τ )ˆ γ i ( t i +1 ) ) , (49) where Λ ( τ ) = [ Λ 1 ( τ ) Λ 2 ( τ ) ] , Ψ ( τ ) = [ Ψ 1 ( τ ) Ψ 2 ( τ ) ] , ˆ γ i ( t i ) = [ 0 ˆ $ ( t i ) ] , ˆ γ i ( t i +1 ) = [ ˆ ξ i ( t i +1 ) J r ( ˆ ξ i ( t i +1 )) − 1 ˆ $ ( t i +1 ) ] , ˆ ξ i ( τ ) = log( ˆ T − 1 i ˆ T ( τ )) ∨ , ˆ ξ i ( t i +1 ) = log( ˆ T − 1 i ˆ T i +1 ) ∨ , E. Fusion of Asynchronous Measurements Continuous-time trajectory interpolation affords GP-based trajectory estimation methods several advantages over discrete-time localization algorithms. In addition to providing a method for querying the trajectory at any time of interest, GP interpolation can be used to reduce the number of states needed to represent the robot’s trajectory, and elegantly handle asynchronous measurements. Assume there is a measurement z τ of state x ( τ ) available at an arbitrary time τ, t i ≤ τ ≤ t i +1 , with measurement func- tion h τ and corresponding covariance Σ τ . The measurement cost in the factor graph can be written as J τ ( x ( τ )) = 1 2 ‖ z τ − h τ ( x ( τ )) ‖ 2 Σ τ . (50) Since system state x ( τ ) is not explicitly available during optimization, we perform trajectory interpolation between x i and x i +1 by Eq. (48) – (49), and rewrite the cost in terms of the interpolated mean value ˆ x ( τ ) J τ ( x i , x i +1 ) = 1 2 ‖ z τ − h τ ( ˆ x ( τ )) ‖ 2 Σ τ . (51) x ( t i +1 ) x ( t i ) x ( τ ) (a) Meausurement x ( t i +1 ) x ( t i ) x ( τ ) (b) Interpolated Factor Fig. 2: (a) Measurement at time τ , dashed line indicates it’s not an actual factor. (b) The interpolated factor encodes measurement at time τ . GP Prior Factor: J i = 1 2 e > i Q − 1 i e i , e i = Φ ( t i +1 , t i ) γ i ( t i ) − γ i ( t i +1 ) Landmark Factor: J i = 1 2 e > i Σ − 1 i e i , e i = h ( x i , l j ) − z i Fig. 3: A factor graph of an example STEAM problem containing GP prior factors and landmark measurements factors. Landmarks are illustrated with open circles. Because the measurement cost is represented by x i and x i +1 , a binary factor can be added to the factor graph and optimized without explicitly adding an additional state. An example is illustrated in Fig. 2. F. Simultaneous Trajectory Estimation and Mapping The proposed approach could be extended from trajectory estimation to Simultaneous Trajectory Estimation and Map- ping (STEAM) by combining landmarks l with trajectory states x , resulting a combined state z . = [ x , l ] > , l . = [ l 1 , . . . , l L ] > , where L is number of landmarks [1]. The resulting linear system has the form [ W xx W > lx W lx W ll ] ︸ ︷︷ ︸ W [ δ x ∗ δ l ∗ ] ︸ ︷︷ ︸ δ z ∗ = [ b x b l ] ︸ ︷︷ ︸ b , (52) where W xx is block-tridiagonal due to the GP prior, W ll is block-diagonal, and W lx depends on landmark observations, but is generally sparse [6]. Block-wise sparse Cholesky decomposition can be applied to solve the linear system [ V xx 0 V lx V ll ] ︸ ︷︷ ︸ V [ V > xx V > lx 0 V > ll ] ︸ ︷︷ ︸ V > = [ W xx W > lx W lx W ll ] ︸ ︷︷ ︸ W . (53) A factor graph representing an example STEAM problem is shown in Fig. 3. V. C ONCLUSION We extend the sparse GP regression approach of Barfoot et al. [1] to general matrix Lie groups. As a result, we can solve a broader class of trajectory estimation problems, such as attitude and trajectory estimation in 3D space. Additionally, the continuous-time trajectory representation gives us the ability to fuse asynchronous sensor measurements during estimation, which is useful in practice. Although the proposed method is defined and evaluated in batch settings, it can be immediately extended to an in- cremental estimation framework by combining the proposed sparse GP priors with the incremental GP regression frame- work of Yan et al. [24]. In fact, the proposed GP prior is not limited to just STEAM problems: any technique that benefits from continuous-time trajectories represented by sparse GPs can use our priors. For example, by using our approach in motion planning [34], [35] one could immediately extend vector-space motion planning algorithms to motion planning algorithms on general matrix Lie groups. R EFERENCES [1] T. Barfoot, C. H. Tong, and S. Sarkka, “Batch continuous-time trajectory estimation as exactly sparse gaussian process regression,” Robotics: Science and Systems (RSS) , 2014. [2] S. Thrun and J. J. Leonard, “Simultaneous localization and mapping,” in Springer handbook of robotics , pp. 871–889, Springer, 2008. [3] B. Triggs, P. McLauchlan, R. Hartley, and A. Fitzgibbon, “Bundle adjustment – a modern synthesis,” in Vision Algorithms: Theory and Practice (W. Triggs, A. Zisserman, and R. Szeliski, eds.), vol. 1883 of LNCS , pp. 298–372, Springer Verlag, 2000. [4] F. Lu and E. Milios, “Globally consistent range scan alignment for environment mapping,” Autonomous Robots , vol. 4, no. 4, pp. 333– 349, 1997. [5] S. Thrun and M. Montemerlo, “The graph SLAM algorithm with applications to large-scale mapping of urban structures,” Intl. J. of Robotics Research , vol. 25, no. 5-6, pp. 403–429, 2006. [6] F. Dellaert and M. Kaess, “Square Root SAM: Simultaneous localiza- tion and mapping via square root information smoothing,” Intl. J. of Robotics Research , vol. 25, pp. 1181–1203, Dec 2006. [7] M. Kaess, A. Ranganathan, and F. Dellaert, “iSAM: Incremental smoothing and mapping,” IEEE Trans. Robotics , vol. 24, pp. 1365– 1378, Dec 2008. [8] M. Kaess, H. Johannsson, R. Roberts, V. Ila, J. Leonard, and F. Del- laert, “iSAM2: Incremental smoothing and mapping using the Bayes tree,” Intl. J. of Robotics Research , vol. 31, pp. 217–236, Feb 2012. [9] R. E. Kalman and R. S. Bucy, “New results in linear filtering and prediction theory,” Journal of basic engineering , vol. 83, no. 1, pp. 95– 108, 1961. [10] M. Bosse and R. Zlot, “Continuous 3D scan-matching with a spinning 2D laser,” in IEEE Intl. Conf. on Robotics and Automation (ICRA) , pp. 4312–4319, IEEE, 2009. [11] M. Li, B. H. Kim, and A. I. Mourikis, “Real-time motion tracking on a cellphone using inertial sensing and a rolling-shutter camera,” in IEEE Intl. Conf. on Robotics and Automation (ICRA) , IEEE, 2013. [12] H. Dong and T. D. Barfoot, “Lighting-invariant visual odometry using lidar intensity imagery and pose interpolation,” in Field and Service Robotics (FSR) , pp. 327–342, Springer, 2014. [13] C. Bibby and I. Reid, “A hybrid SLAM representation for dynamic marine environments,” in IEEE Intl. Conf. on Robotics and Automation (ICRA) , pp. 257–264, IEEE, 2010. [14] S. Anderson and T. D. Barfoot, “Towards relative continuous-time SLAM,” in IEEE Intl. Conf. on Robotics and Automation (ICRA) , IEEE, 2013. [15] P. Furgale, J. Rehder, and R. Siegwart, “Unified temporal and spatial calibration for multi-sensor systems,” in IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS) , IEEE, 2013. [16] S. Leutenegger, S. Lynen, M. Bosse, R. Siegwart, and P. Furgale, “Keyframe-based visual–inertial odometry using nonlinear optimiza- tion,” Intl. J. of Robotics Research , vol. 34, no. 3, pp. 314–334, 2015. [17] A. Patron-Perez, S. Lovegrove, and G. Sibley, “A spline-based tra- jectory representation for sensor fusion and rolling shutter cameras,” International Journal of Computer Vision , vol. 113, no. 3, pp. 208– 219, 2015. [18] P. Furgale, C. H. Tong, T. D. Barfoot, and G. Sibley, “Continuous-time batch trajectory estimation using temporal basis functions,” Intl. J. of Robotics Research , vol. 34, no. 14, pp. 1688–1710, 2015. [19] S. Anderson, F. Dellaert, and T. Barfoot, “A hierarchical wavelet decomposition for continuous-time SLAM,” in IEEE Intl. Conf. on Robotics and Automation (ICRA) , 2014. [20] C. H. Tong, P. Furgale, and T. D. Barfoot, “Gaussian process gauss– newton for non-parametric simultaneous localization and mapping,” Intl. J. of Robotics Research , vol. 32, no. 5, pp. 507–525, 2013. [21] C. H. Tong, S. Anderson, H. Dong, and T. D Barfoot, “Pose interpo- lation for laser-based visual odometry,” J. of Field Robotics , vol. 31, no. 5, pp. 731–757, 2014. [22] R. Eustice, H. Singh, and J. Leonard, “Exactly sparse delayed- state filters for view-based SLAM,” IEEE Trans. Robotics , vol. 22, pp. 1100–1114, Dec 2006. [23] S. Anderson, T. D. Barfoot, C. H. Tong, and S. S ̈ arkk ̈ a, “Batch non- linear continuous-time trajectory estimation as exactly sparse gaussian process regression,” Autonomous Robots , vol. 39, no. 3, pp. 221–238, 2015. [24] X. Yan, V. Indelman, and B. Boots, “Incremental sparse GP regression for continuous-time trajectory estimation & mapping,” Proc. of the Intl. Symp. of Robotics Research (ISRR) , 2015. [25] S. Anderson and T. D. Barfoot, “Full STEAM ahead: Exactly sparse gaussian process regression for batch continuous-time trajectory es- timation on SE (3),” IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS) , 2015. [26] G. S. Chirikjian, Stochastic Models, Information Theory, and Lie Groups, Volume 2: Analytic Methods and Modern Applications , vol. 2. Springer Science & Business Media, 2011. [27] J. Farrell, Aided Navigation: GPS with High Rate Sensors . McGraw- Hill, 2008. [28] J. Engel, T. Sch ̈ ops, and D. Cremers, “LSD-SLAM: Large-scale direct monocular SLAM,” in European Conf. on Computer Vision (ECCV) , Springer, 2014. [29] R. Murray, Z. Li, and S. Sastry, A Mathematical Introduction to Robotic Manipulation . CRC Press, 1994. [30] A. W. Long, K. C. Wolfe, M. Mashner, and G. S. Chirikjian, “The banana distribution is gaussian: A localization study with exponential coordinates.,” in Robotics: Science and Systems (RSS) , 2012. [31] C. Hertzberg, R. Wagner, U. Frese, and L. Schr ̈ oder, “Integrating generic sensor fusion algorithms with sound state representations through encapsulation of manifolds,” Information Fusion , vol. 14, no. 1, pp. 57–77, 2013. [32] T. D. Barfoot and P. T. Furgale, “Associating uncertainty with three- dimensional poses for use in estimation problems,” IEEE Trans. Robotics , vol. 30, no. 3, pp. 679–693, 2014. [33] C. Forster, L. Carlone, F. Dellaert, and D. Scaramuzza, “IMU preinte- gration on manifold for efficient visual-inertial maximum-a-posteriori estimation,” Robotics: Science and Systems (RSS) , 2015. [34] M. Mukadam, X. Yan, and B. Boots, “Gaussian process motion planning,” in IEEE Intl. Conf. on Robotics and Automation (ICRA) , 2016. [35] J. Dong, M. Mukadam, F. Dellaert, and B. Boots, “Motion planning as probabilistic inference using gaussian processes and factor graphs,” Robotics: Science and Systems (RSS) , 2016.