arXiv:1509.08155v2 [cs.RO] 29 Aug 2016

Information-based Active SLAM via Topological Feature Graphs

Beipeng Mu¹, Matthew Giamou¹, Liam Paull², Ali-akbar Agha-mohammadi³, John Leonard², Jonathan How¹

¹ Laboratory for Information and Decision Systems, MIT, 77 Mass Ave, Cambridge, MA, USA, {mubp, mgiamou, jhow}@mit.edu
² Computer Science and Artificial Intelligence Laboratory, MIT, 77 Mass Ave, Cambridge, MA, USA, {lpaull, jleonard}@mit.edu
³ Qualcomm Research, 5775 Morehouse Drive, San Diego, CA, USA, aliagha@qualcomm.com

Abstract — Active SLAM is the task of actively planning robot paths while simultaneously building a map and localizing within it. Existing work has focused on planning paths with occupancy grid maps, which do not scale well and suffer from long-term drift. This work proposes a Topological Feature Graph (TFG) representation that scales well, and develops an active SLAM algorithm with it. The TFG uses a graphical model that exploits the independence relations between variables, and it enables a unified quantification of exploration and exploitation gains with a single entropy metric. Hence, it facilitates a natural and principled balance between map exploration and refinement. A probabilistic roadmap path planner is used to generate robot paths in real time. Experimental results demonstrate that the proposed approach achieves better accuracy than a standard grid-map-based approach while requiring orders of magnitude less computation and memory.

I. INTRODUCTION

The exploration of an unknown space is a fundamental capability for a mobile robot, with diverse applications such as disaster relief, planetary exploration, and surveillance. In the absence of a global position reference (e.g., GPS), the robot must simultaneously map the space and localize itself within that map, a problem referred to as SLAM. If a mobile robot is able to successfully recognize parts of the map when it returns to them, referred to as loop closure, then it can significantly reduce its mapping and localization error. The problem of active SLAM focuses on designing robot trajectories that actively explore an environment and minimize the map error.

Previous work has designed trajectories to reduce robot pose uncertainty when the map is known or a global position reference exists [1]–[5]. There is also work that maximizes the myopic information gain of the next action with partially known maps [6], [7]. In this work, however, the goal is to build a map of an unknown environment, so the robot needs to plan its path and perform SLAM at the same time (active SLAM), with a focus on global map quality. Active SLAM is non-trivial because the robot must trade off the benefits of exploring new areas against exploiting visited areas to close loops [8].

Previous work has relied heavily on the occupancy grid (OG) map, a grid of independent binary random variables denoting occupancy, to compute the information gain of map exploration and to check path feasibility.

Fig. 1: Simultaneous planning, localization and mapping problem. Purple polygons represent obstacles; green circles represent features, with their size denoting the uncertainty in their estimates. The problem is to find milestones (gray circles) of robot poses and plan a trajectory (red line) that minimizes the feature uncertainties.

For example, the seminal work of Bourgault et al.
[9] formulates the problem as a trade-off between information gain about the map and entropy reduction over the robot pose:

$$ u^{*} = \arg\max_{u} \; w_1 I_{\mathrm{SLAM}}(x, u) + w_2 I_{\mathrm{OG}}(x, u) \qquad (1) $$

where I_OG is the information gained over the occupancy grid map and I_SLAM is the information gained over the robot poses (dependent Gaussian random variables). Similarly, Stachniss et al. [10] use a Rao-Blackwellized particle filter (RBPF) to represent the robot poses and the map, and then rank actions by the expected resulting information gain. Other information metrics within a similar framework, such as the Cauchy-Schwarz quadratic mutual information [11], the D-optimality criterion [12], and the Kullback-Leibler divergence [13], have also been proposed recently. These two information gains are computed separately, and maintaining the balance between them often requires careful tuning of the weights w_1 and w_2.

Recently, graph-based optimization approaches to the SLAM problem have become very popular due to their ability to exploit the naturally sparse connectivity between robot poses and features in the map [14]. These approaches have proven to scale better than RBPF approaches, which ultimately suffer from particle depletion as the size of the environment grows. Within the graph-based approaches there are two main flavors: pose graphs and feature-based graphs. In pose-graph approaches, sensor data is used to generate relative transformation constraints directly between robot poses, and an underlying OG map is often required to represent the environment. For example, [15], [16] optimize the robot trajectory by iteratively computing transformations between laser scans, but still maintain an underlying OG map and plan paths using sampling-based methods such as the probabilistic roadmap or the RRT* algorithm. [17] optimizes the robot trajectory with features in a structured environment but also maintains an OG map for collision checking. Information quantification over the OG representation inherits its known shortcomings in scalability and robustness [18]. The grid map is also an approximation, because the conditional dependencies between grid cells are discarded. For example, if there is significant drift in the robot's pose estimate, this uncertainty is not reflected explicitly in the OG map: a straight corridor will appear curved, yet the two maps' entropies will be equivalent. In addition, OG maps have large memory footprints.

In a feature-based representation, features are explicitly maintained in the graph and give the layout of the map. However, active SLAM on feature-based graphs is hard because features alone do not provide the obstacle information that is crucial for checking path feasibility. This paper proposes the first, to our knowledge, active SLAM approach that plans robot paths to directly optimize a global feature-based representation without any underlying OG representation. Rather than formulating the problem as area coverage over an OG map [19], we set it up as entropy reduction over the map features subject to a budget constraint. Since the feature estimates and the pose trajectory are necessarily correlated, we can remove the pose uncertainty from the traditional objective function (1) and directly optimize over the map quality.
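As a concrete illustration of (1) (not code from [9] or from this paper), the sketch below scores candidate actions by a weighted sum of two separately computed gains; the helper names `pose_info_gain` and `map_info_gain` are hypothetical placeholders. The balance it strikes depends entirely on the hand-tuned weights w_1 and w_2, which is precisely the tuning burden that the unified metric developed below avoids.

```python
# Hypothetical sketch of the weighted objective in Eq. (1); the gain
# functions are placeholders standing in for I_SLAM and I_OG.

def select_action(actions, pose_info_gain, map_info_gain, w1=1.0, w2=1.0):
    """Pick the action maximizing w1 * I_SLAM(u) + w2 * I_OG(u)."""
    return max(actions, key=lambda u: w1 * pose_info_gain(u) + w2 * map_info_gain(u))
```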
When features and robot poses are modeled as jointly Gaussian variables, we can directly quantify the information gain of new data on all variables with the same entropy metric. When the robot moves to a frontier of the mapped area, it can potentially observe new features. Already-observed features and new features are measured with a unified information metric, so we can balance exploitation and exploration automatically. The feature-based graph model is sparser than a grid map, so it scales much better and enables real-time robot state estimation and path planning. Furthermore, it models the environment with dependent features and robot poses rather than with i.i.d. binary cells. Therefore, when the robot returns to a visited place and closes a loop, it can correct long-term drift and propagate the correction to all existing features and poses through the dependencies between variables.

Figure 1 shows an example scenario. The locations of features are marked by green circles, and the size of each circle represents its uncertainty. Gray circles represent samples of robot poses, and purple polygons represent obstacles. The planning problem is to quantify the information gain at the samples and find a trajectory connecting the samples that minimizes the feature uncertainties.

In summary, there are four primary contributions:
1) A feature-based topological graph that represents the map of features as well as obstacles in an efficient way.
2) A feature-focused information metric that quantifies uncertainty in the map, in both visited and unvisited places.
3) A path planning algorithm over the feature-based topological graph that enables the robot to actively explore with the objective of directly reducing map uncertainty.
4) Evaluation of the proposed approach in a Gazebo simulated environment, as well as in a real-world environment with a TurtleBot.

II. PROBLEM STATEMENT

Assume that there exists a library of static features that can be uniquely identified as landmarks to localize the robot in the environment, denoted as L = {L_1, L_2, ···, L_M}. Note that the number of features actually present in the environment could be less than M and is not known a priori. The exact locations of the present features are not known a priori either and must be estimated by the robot. When moving in the environment, the robot's trajectory is a sequence of poses X_T = {X_0, X_1, ···, X_T}, where X_0 gives the initial distribution of the robot pose, typically set as the origin with low uncertainty. The robot obtains two kinds of observations. The odometry o_t measures the change between two consecutive poses, with probability model p(o_t | X_t, X_{t−1}). A feature measurement z_t^k is a measurement between the current pose X_t and the feature L_{y_t^k}, where y_t^k indexes the observed feature; its probability model is p(z_t^k | X_t, L_{y_t^k}). Denote z_t = {z_t^1, ···, z_t^{K_t}}.

A factor graph is a sparse representation of these variables. Each node represents either a feature L_i or a robot pose X_t. Let p(L) = ∏_{i=1}^{M} p(L_i) denote the prior over features. Each factor is a feature prior p(L_i), an odometry o_t, or a feature measurement z_t^k. The joint posterior of X and L is then the product of the priors and the likelihoods of the observations o = {o_1, ···, o_T} and z = {z_1, ···, z_T}:

$$ p(\mathbf{X}, \mathbf{L} \mid \mathbf{o}, \mathbf{z}) \;\propto\; p(\mathbf{L}) \prod_{t=1}^{T} p(o_t \mid X_t, X_{t-1}) \prod_{k=1}^{K_t} p(z_t^k \mid X_t, L_{y_t^k}) \qquad (2) $$
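For illustration only (this is not the paper's implementation), the following sketch evaluates the negative logarithm of the unnormalized posterior in (2) for a simplified planar setting in which poses and landmarks are 2D points, so that the relative measurement reduces to vector subtraction; the X_0 prior is omitted and all names are hypothetical.

```python
import numpy as np

# Illustrative sketch of the negative log of the unnormalized posterior in
# Eq. (2), assuming 2D point poses/landmarks and Gaussian noise models.

def neg_log_posterior(X, L, odom, meas, Q, R, L_prior_mean, L_prior_cov):
    """X: (T+1, 2) poses; L: (M, 2) landmarks; odom: list of o_t vectors;
    meas: list of (t, k, z) with z modeled as L[k] - X[t] + noise;
    Q, R: odometry / measurement noise covariances;
    L_prior_mean, L_prior_cov: per-landmark Gaussian priors (X_0 prior omitted)."""
    nll = 0.0
    # feature priors p(L)
    for i in range(L.shape[0]):
        d = L[i] - L_prior_mean[i]
        nll += 0.5 * d @ np.linalg.solve(L_prior_cov[i], d)
    # odometry factors p(o_t | X_t, X_{t-1})
    for t, o in enumerate(odom, start=1):
        r = o - (X[t] - X[t - 1])
        nll += 0.5 * r @ np.linalg.solve(Q, r)
    # feature measurement factors p(z_t^k | X_t, L_{y_t^k})
    for t, k, z in meas:
        r = z - (L[k] - X[t])
        nll += 0.5 * r @ np.linalg.solve(R, r)
    return nll
```

Minimizing this quantity over X and L is exactly the MAP problem stated next in (3).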
The SLAM problem of jointly inferring the maximum a posteriori (MAP) feature positions and robot poses can be defined as:

$$ (\mathbf{X}^{*}, \mathbf{L}^{*}) = \arg\max_{\mathbf{X}, \mathbf{L}} \; p(\mathbf{X}, \mathbf{L} \mid \mathbf{o}, \mathbf{z}) \qquad (3) $$

With the factor graph representation, (3) can be solved by readily available graph-SLAM algorithms and packages such as g2o, iSAM or GTSAM [20], [21].

The problem has traditionally been solved by manually operating the robot in the environment to gather a dataset first and then optimizing the map in a batch update. In this work, the robot actively plans its own trajectory to incrementally learn the map. Considering that robots are typically constrained in computation and memory, the trajectory should be planned so that resources are spent on gathering information that is directly related to the robot's goal. The focus of this paper is to incrementally build a map of the environment, so information gain is defined as entropy reduction only on the variables representing features.

Fig. 2: Active Focused Planning. The robot uses landmark measurements z_t and odometry to design a control policy u_{1:T} that maximizes the information gain over the environmental features L.

Shannon entropy [22] is a measure of the uncertainty in a random variable x and is therefore widely used as an information metric. Let p(x) denote the probability distribution of x; the entropy H(x) is defined as H(x) = −∑_x p(x) log p(x) for discrete variables and H(x) = −∫ p(x) log p(x) dx for continuous variables. Denote the control command at time t as u_t, and let u_T = {u_1, ···, u_T}. The active focused planning problem is summarized as follows.

Problem 1. Active SLAM: Design control commands u_T = {u_1, u_2, ···, u_T} such that the robot follows a trajectory along which the obtained odometry o = {o_1, ···, o_T} and feature measurements z = {z_1, ···, z_T} minimize the entropy H(·) over the belief of the map features L:

$$
\begin{aligned}
\min_{\mathbf{u}_T = \{u_1, \cdots, u_T\}} \quad & H(\mathbf{L} \mid \mathbf{o}, \mathbf{z}) \\
\text{s.t.} \quad & q(\mathbf{u}_T) \le c \\
& X_t = g(X_{t-1}, u_t) \\
& o_t = X_t \ominus X_{t-1} + v, \quad v \sim \mathcal{N}(0, Q) \\
& z_t^k = L_{y_t^k} \ominus X_t + w, \quad w \sim \mathcal{N}(0, R) \\
& t = 1, \cdots, T
\end{aligned} \qquad (4)
$$

where q(·) is a measure of the control cost; in the case of a finite time horizon, q(u_T) = T. The function X_t = g(X_{t−1}, u_t) describes the robot dynamics, o_t = X_t ⊖ X_{t−1} + v is the odometry measurement model, and z_t^k = L_{y_t^k} ⊖ X_t + w is the feature measurement model. Fig. 2 presents a graphical model of this problem: X_t represents the robot poses and L the environment features. The goal is to design a control policy u_T that maximizes the information gain over the feature belief L.
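A minimal sketch of the entropy metric used as the objective in Problem 1, assuming nothing beyond the standard definitions: the discrete form above and, for the jointly Gaussian beliefs used in Section III, the closed-form differential entropy in terms of the covariance. The function names are illustrative.

```python
import numpy as np

# Sketch of the information metric: Shannon entropy for discrete beliefs and
# differential entropy for Gaussian beliefs (used for landmarks in Sec. III).

def discrete_entropy(p):
    """H(x) = -sum p log p for a discrete distribution (zero entries ignored)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def gaussian_entropy(cov):
    """Differential entropy of N(mu, cov): 0.5 * log((2*pi*e)^n * |cov|)."""
    n = cov.shape[0]
    return 0.5 * (n * np.log(2 * np.pi * np.e) + np.linalg.slogdet(cov)[1])
```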
III. METHOD

A. Topological Feature Graph

One important reason that the grid-map representation has been a popular choice for active SLAM is that a grid-based map contains all the information necessary for path planning. A feature-based representation, although much sparser, lacks information about free/occupied space and the topology of the environment. Consequently, planning paths over a traditional feature-based representation is ill-posed. To overcome this, we propose to store additional information with each feature that allows us to generate a full, yet sparse, representation of the environment over which we can then plan paths. In this paper, we assume the robot is a ground robot that operates in 2D space¹. Relying on the fact that features usually lie on the surfaces or corners of obstacles, we propose the Topological Feature Graph (TFG) representation.

A TFG is a graph G = {L, E} whose vertices represent features and whose edges represent obstacles. More specifically, if two features are connected by an edge, then they belong to the same flat obstacle surface and the edge is not traversable². These edges can be learned from a depth image, a laser scan, or even a sequence of images [23]. The robot first segments the depth map or laser scan into several components representing different obstacle surfaces, then checks whether two detected features belong to the same component. If so, the robot creates an edge between these two features. This idea is illustrated in Figure 3a.

¹ Extension to 3D scenarios can be achieved by triangulating obstacle surfaces and is left to future work.
² Features can be extended to objects that have size, in which case obstacles would be represented both by objects (vertices) and by surfaces (edges).

Compared to the grid-map representation, the TFG offers several advantages in structured environments. First, it requires many fewer variables to represent the environment and thus provides significant memory savings. Second, the map complexity easily adapts to the complexity of the environment: instead of using equal-sized cells everywhere, a TFG can model more features in cluttered or narrow spaces and fewer features in wide, simple spaces. Third, if new loop closures are detected and the drift of some subgraphs is corrected, the obstacles are corrected together with the feature positions: the robot does not have to relearn the occupancy of the associated space. Finally, this representation admits a closed-form collision check for robot path planning, rather than relying on sampling-based checks, which leads to significant computation savings in path planning.
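A minimal sketch, under our simplifying assumptions (2D point features, line-segment obstacle surfaces), of such a TFG and of the closed-form clearance query mentioned above; the class and method names are illustrative and are not the implementation used in the experiments.

```python
import math

# Illustrative TFG sketch: vertices are feature position estimates, edges mark
# feature pairs lying on the same obstacle surface, and point-to-segment
# distance gives the closed-form collision/clearance query used for planning.

class TopologicalFeatureGraph:
    def __init__(self):
        self.features = {}   # feature id -> (x, y) estimate
        self.edges = set()   # frozenset({i, j}) -> non-traversable surface

    def add_feature(self, fid, xy):
        self.features[fid] = xy

    def add_surface_edge(self, i, j):
        """Connect two features observed on the same segmented obstacle surface."""
        self.edges.add(frozenset((i, j)))

    def clearance(self, p):
        """Distance from point p to the nearest obstacle edge (inf if no edges)."""
        return min((self._point_segment_dist(p, self.features[i], self.features[j])
                    for i, j in map(tuple, self.edges)), default=math.inf)

    @staticmethod
    def _point_segment_dist(p, a, b):
        ax, ay = a; bx, by = b; px, py = p
        dx, dy = bx - ax, by - ay
        seg2 = dx * dx + dy * dy
        t = 0.0 if seg2 == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg2))
        cx, cy = ax + t * dx, ay + t * dy
        return math.hypot(px - cx, py - cy)
```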
B. Sequential Planning

Recall that our goal is to plan robot controls that gain maximal information from the environment, as formulated in Problem 1. Solving Problem 1 in batch is hard in general because, at any time t, observations beyond t are not yet available, so planning the controls u_t, ···, u_T would require modeling future observations and taking all possible outcomes into account, which is typically intractable. To solve the problem tractably, a widely used technique is to split Problem 1 into T stages and optimize a goal point at each stage [24], [25]. For each stage, a separate path planner can then be used to generate controls.

Let p(L) denote a prior over the landmarks. At stage t, the observation history o_{1:t} and z_{1:t} can be summarized in a posterior distribution over L and X at time t. Denote the MAP values of X and L as X*_t and L*_t; they can be obtained by standard SLAM solvers:

$$
\mathbf{X}^{*}_t, \mathbf{L}^{*}_t = \arg\max_{\mathbf{X},\mathbf{L}} \; p(\mathbf{X}, \mathbf{L} \mid \mathbf{o}_{1:t}, \mathbf{z}_{1:t}) = \arg\max_{\mathbf{X},\mathbf{L}} \; p(\mathbf{L}) \prod_{\tau=1}^{t} p(o_\tau \mid X_\tau, X_{\tau-1})\, p(z_\tau \mid X_\tau, \mathbf{L}) \qquad (5)
$$

We then use the standard Laplace approximation of X and L: a Gaussian distribution whose mean is the MAP value (X*_t, L*_t) and whose information matrix Λ is the negative Hessian of the log posterior at that point:

$$ \mathbf{X}, \mathbf{L} \mid \mathbf{o}_{1:t}, \mathbf{z}_{1:t} \sim \mathcal{N}\big((\mathbf{X}^{*}_t, \mathbf{L}^{*}_t),\; \Lambda^{-1}\big) \qquad (6) $$

$$
\Lambda = -\frac{\partial^2 \log p(\mathbf{L})}{\partial(\mathbf{X},\mathbf{L})^2} - \sum_{\tau=1}^{t} \frac{\partial^2 \log p(o_\tau \mid X_\tau, X_{\tau-1})}{\partial(\mathbf{X},\mathbf{L})^2} - \sum_{\tau=1}^{t} \frac{\partial^2 \log p(z_\tau \mid X_\tau, \mathbf{L})}{\partial(\mathbf{X},\mathbf{L})^2} = \begin{bmatrix} \Lambda_f & \Lambda_{fr} \\ \Lambda_{rf} & \Lambda_r \end{bmatrix} \qquad (7)
$$

where Λ_f corresponds to the landmarks and Λ_r corresponds to the robot poses. The Laplace approximation gives closed-form expressions for the entropy. The marginal information matrix for the landmarks is Λ_L = Λ_f − Λ_fr Λ_r^{−1} Λ_rf, and the entropy is

$$ H(\mathbf{L} \mid \mathbf{o}_{1:t}, \mathbf{z}_{1:t}) = -\tfrac{1}{2} \log |\Lambda_L| + \text{constant} \qquad (8) $$

Together with the associated connectivity edges between landmarks, this yields the TFG at time t, denoted TFG_t, which summarizes the information the robot has about the environment up until time t.
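For concreteness, a sketch (illustrative, not the authors' solver) of the computation behind (8): the robot poses are marginalized out of the joint information matrix by a Schur complement, and the landmark entropy follows up to an additive constant. Symmetry of the information matrix (Λ_rf = Λ_fr^T) is assumed.

```python
import numpy as np

# Sketch of Eq. (8): marginal landmark information via Schur complement and
# the corresponding Gaussian entropy (additive constant omitted).

def landmark_entropy(Lam_f, Lam_fr, Lam_r):
    """Given the block information matrix [[Lam_f, Lam_fr], [Lam_fr.T, Lam_r]],
    return (Lam_L, H) with Lam_L = Lam_f - Lam_fr Lam_r^{-1} Lam_fr^T and
    H = -0.5 * log|Lam_L|."""
    Lam_L = Lam_f - Lam_fr @ np.linalg.solve(Lam_r, Lam_fr.T)
    H = -0.5 * np.linalg.slogdet(Lam_L)[1]
    return Lam_L, H
```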
The Laplace approximation simplifies the information quantification, but directly optimizing over the controls u_t is still very difficult. The control inputs u_t affect the robot path through the robot dynamics, and optimizing under both the dynamics constraints and the obstacle constraints would be computationally prohibitive. The problem is therefore further simplified here by first planning a trajectory for the robot and then using a separate path-following controller to drive the robot along the planned trajectory. In this way, controller design is decoupled from path planning.

Problem 2. Path Planning for Active SLAM: At stage t, given the prior topological feature graph TFG_t, find a path X̂_t, ···, X̂_τ, ···, X̂_{t+1} such that the posterior entropy of the landmarks L is minimized:

$$
\begin{aligned}
\min_{\hat X_t, \cdots, \hat X_{t+1}} \quad & H(\mathbf{L} \mid TFG_t, \hat{\mathbf{o}}_{t+1}, \hat{\mathbf{z}}_{t+1}) \\
\text{s.t.} \quad & \hat X_\tau = g(\hat X_{\tau-1}, u_\tau) \\
& \hat o_\tau = \hat X_\tau \ominus \hat X_{\tau-1} + v, \quad v \sim \mathcal{N}(0, Q) \\
& \hat z_\tau^k = L_{y_\tau^k} \ominus \hat X_\tau + w, \quad w \sim \mathcal{N}(0, R) \\
& \tau = t, \cdots, t+1
\end{aligned} \qquad (9)
$$

where ô_{t+1} = {ô_t, ···, ô_τ, ···, ô_{t+1}} denotes the odometry obtained along the trajectory and ẑ_{t+1} = {ẑ_t, ···, ẑ_τ, ···, ẑ_{t+1}} denotes the landmark measurements obtained along the trajectory. Given the path X̂_t, ···, X̂_{t+1} and the partial TFG at time t, a separate path-following controller can be used to drive the robot along the trajectory. In this way, path planning and control are decoupled from the active SLAM problem, which yields gains in computation and speed.

C. Expected Information Gain

Quantifying the exact information gain from X̂_t to X̂_{t+1} in Problem 2 is challenging, because it requires discretizing the trajectory from X̂_t to X̂_{t+1} into a sequence of robot poses X̂_τ and then computing the information gain of the measurements at each pose. The information gain of measurements at later poses depends on the earlier poses along the path, so the complexity grows exponentially with the path length. To solve the information quantification problem in real time, we only plan a goal point for the robot, have the robot stabilize its pose at the goal point and rotate in place to obtain accurate observations of the local environment, and compute the information gain only over the landmarks that are locally observable at the goal point. The observation is then some layout of a subset of the local landmarks. As shown in Figure 3b, gray balls denote observation points, and the blue regions indicate the sets of landmarks observable from those points.

Problem 3. Goal Planning for Active SLAM: At stage t, given the prior topological feature graph TFG_t, find the next goal point X̂_{t+1} such that the entropy of the landmarks L is minimized:

$$
\begin{aligned}
\min_{\hat X_{t+1}} \quad & H(\mathbf{L} \mid TFG_t, \hat{\mathbf{z}}_{t+1}) \\
\text{s.t.} \quad & \hat{\mathbf{z}}_{t+1} = h(\hat X_{t+1}, \mathbf{L})
\end{aligned} \qquad (10)
$$

Goal points also provide a way to segment the overall map into local maps and to sparsify the underlying SLAM factor graph: the robot maps the environment accurately at the goal points, so measurements taken between two goal points contain less information than those taken at the goal points. Therefore, landmark measurements along the path are only used to localize the robot and are not used to update the landmark estimates. This may cause some loss of information. However, with this simplification we can marginalize out the robot poses between two observation points, and the SLAM factor graph becomes a joint graph of the partial graphs at the goal points. In this way, the complexity of the SLAM factor graph scales only with the number of observation points and not with the number of robot poses.

Furthermore, paths are generated with respect to the current estimate of the landmark locations. If measurements along a path were used to update the landmark estimates, new loop closures could shift the landmark locations; the old path might become invalid and the robot might run into obstacles. Leaving out measurements along the path also avoids this potential failure.

Fig. 3: Topological Feature Graph (TFG) and goal points. Vertices (stars) represent features, edges (black lines) represent obstacle surfaces, blue stars represent features at a frontier, and gray balls represent goal points. Blue regions illustrate locally observable features. (a) Topological Feature Graph (TFG). (b) Goal points.

Notice that X̂_{t+1} lies in the continuous space R², and different choices of X̂_{t+1} give different combinations of observable landmarks, so solving Problem 3 exactly is hard. Instead, we use a random sampling approach. Given TFG_t, a location is reachable if it can be observed from some previous goal location. The planner samples locations in the robot's reachable space, computes the entropy reduction for each candidate goal point, and then selects the next goal point as the one that gives the maximal entropy reduction. The maximal entropy reduction problem can be stated as follows:

$$
\hat X_{t+1} = \arg\max_{X_{t+1}} \Delta H(\mathbf{L} \mid X_{t+1}, TFG_t) = \arg\max_{X_{t+1}} \big[ H(\mathbf{L} \mid TFG_t) - H(\mathbf{L} \mid X_{t+1}, TFG_t) \big] \qquad (11)
$$
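A hypothetical sketch of the sampling-based goal selection in (11); `sample_reachable_points` and `entropy_reduction` are placeholder helpers, the latter standing in for the entropy-reduction evaluation made concrete by Theorem 1 below.

```python
# Illustrative goal-selection loop for Eq. (11) (hypothetical helper names).

def plan_next_goal(tfg, sample_reachable_points, entropy_reduction, n_samples=100):
    """Sample reachable candidate goals and return the one with maximal Delta H."""
    candidates = sample_reachable_points(tfg, n_samples)
    return max(candidates, key=lambda x_goal: entropy_reduction(tfg, x_goal))
```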
Theorem 1. Set the prior covariance of the unknown landmarks to be much larger than the covariance of the observed landmarks. Given the topological feature graph TFG_t, the entropy reduction at a goal location X_{t+1} can be approximated by the sum of the entropy reduction ∆H_o on the locally observable landmarks and the entropy reduction ∆H_u on new landmarks:

$$ \Delta H(\mathbf{L} \mid X_{t+1}, TFG_t) \approx \Delta H_o + \Delta H_u \qquad (12) $$

where ∆H_u = n_x log|I + σ_u a_u|, n_x is the number of new landmarks, and log|I + σ_u a_u| is the expected information gain for a single unknown landmark.

Proof. For simplicity, the subscript t is dropped in the following, but it should be noted that this analysis is based on TFG_t. The information matrix can be written in blocks corresponding to the landmarks, Λ_f, and the robot poses, Λ_r:

$$ \Lambda = \begin{bmatrix} \Lambda_f & \Lambda_{fr} \\ \Lambda_{rf} & \Lambda_r \end{bmatrix} $$

The robot's task here is to map the landmarks, so we only consider the marginal information matrix of the landmarks:

$$ \Lambda^{t}_{L} = \Lambda_f - \Lambda_{fr}\Lambda_r^{-1}\Lambda_{rf} = \begin{bmatrix} \Lambda_o & 0 \\ 0 & \Lambda_u \end{bmatrix} $$

where Λ_o corresponds to landmarks observed at least once and Λ_u corresponds to landmarks that have not been observed yet. At the goal point X̂_{t+1}, let ẑ_{t+1} denote the expected new landmark measurements; the new joint likelihood becomes

$$ p(TFG_t, \hat{\mathbf{z}}_{t+1}; \hat X_{t+1}, \mathbf{X}, \mathbf{L}) \;\propto\; p(TFG_t; \mathbf{X}, \mathbf{L})\, p(\hat{\mathbf{z}}_{t+1} \mid \hat X_{t+1}, TFG_t) \qquad (13) $$

The corresponding factor graph is the factor graph at time t plus the new landmark measurement factors p(ẑ_{t+1} | X̂_{t+1}, TFG_t). Using the same MAP values X*_t, L*_t as in (5), the new joint information matrix over the landmarks and the goal pose X̂_{t+1} is the original information matrix Λ^t_L augmented with new terms coming from the factors p(ẑ_{t+1} | X̂_{t+1}, TFG_t):

$$ \Lambda^{t+1} = \begin{bmatrix} \Lambda_o & 0 & 0 \\ 0 & \Lambda_u & 0 \\ 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} A_o & 0 & H_o \\ 0 & A_u & H_u \\ H_o^{T} & H_u^{T} & B \end{bmatrix} \qquad (14) $$

where B = B_o + B_u and

$$ \begin{bmatrix} A_o & H_o \\ H_o^{T} & B_o \end{bmatrix} = -\frac{\partial^2 \log p(\hat{\mathbf{z}}_{t+1} \mid \hat X_{t+1}, TFG_t)}{\partial(\mathbf{L}_o, \hat X_{t+1})^2}, \qquad \begin{bmatrix} A_u & H_u \\ H_u^{T} & B_u \end{bmatrix} = -\frac{\partial^2 \log p(\hat{\mathbf{z}}_{t+1} \mid \hat X_{t+1}, TFG_t)}{\partial(\mathbf{L}_u, \hat X_{t+1})^2} \qquad (15) $$

The marginal information matrix of the landmarks is then obtained from the Schur complement:

$$ \Lambda^{t+1}_{L} = \begin{bmatrix} \Lambda_o + A_o & 0 \\ 0 & \Lambda_u + A_u \end{bmatrix} - H B^{-1} H^{T}, \qquad H = \begin{bmatrix} H_o \\ H_u \end{bmatrix} \qquad (16) $$

Note that the entries of A and H are zero for any landmark that is not observable from the observation point X̂_{t+1}. The incremental change in the information objective H(·) is

$$ \Delta H = -\log|\Lambda^{t}_{L}| + \log|\Lambda^{t+1}_{L}| = \log \left| \begin{bmatrix} \Lambda_o + A_o & 0 \\ 0 & \Lambda_u + A_u \end{bmatrix} - H B^{-1} H^{T} \right| - \log \left| \begin{bmatrix} \Lambda_o & 0 \\ 0 & \Lambda_u \end{bmatrix} \right| \qquad (17) $$

Taking the inverse of the matrix in the second term, combining it with the first term, and using Λ_o^{−1} = Σ_o and Λ_u^{−1} = Σ_u gives

$$ \Delta H = \log \left| \begin{bmatrix} I + \Sigma_o A_o & 0 \\ 0 & I + \Sigma_u A_u \end{bmatrix} - \begin{bmatrix} \Sigma_o & 0 \\ 0 & \Sigma_u \end{bmatrix} H B^{-1} H^{T} \right| $$

Extracting the first term,

$$ \Delta H = \log \left| I + \begin{bmatrix} \Sigma_o A_o & 0 \\ 0 & \Sigma_u A_u \end{bmatrix} \right| + \log \left| I - \begin{bmatrix} I + \Sigma_o A_o & 0 \\ 0 & I + \Sigma_u A_u \end{bmatrix}^{-1} H B^{-1} H^{T} \right| $$

and applying |I − BA| = |I − AB| to the second term,

$$
\Delta H = \log \left| I + \begin{bmatrix} \Sigma_o A_o & 0 \\ 0 & \Sigma_u A_u \end{bmatrix} \right| + \log \left| I - B^{-1} H^{T} \begin{bmatrix} (I + \Sigma_o A_o)^{-1} & 0 \\ 0 & (I + \Sigma_u A_u)^{-1} \end{bmatrix} H \right|
$$
$$
= \log|I + \Sigma_o A_o| + \log|I + \Sigma_u A_u| + \log\left| I - B^{-1} H_o^{T} (I + \Sigma_o A_o)^{-1} H_o - B^{-1} H_u^{T} (I + \Sigma_u A_u)^{-1} H_u \right| \qquad (18)
$$

When a landmark has not been previously observed, its prior covariance Σ_u is typically large, so H_u^{T}(I + Σ_u A_u)^{−1} H_u is small compared to H_o^{T}(I + Σ_o A_o)^{−1} H_o. Furthermore, when the prior Σ_u and the information increment A_u are block diagonal, with each block corresponding to one landmark, log|I + Σ_u A_u| = n_x log|I + σ_u a_u|, and we obtain the approximation

$$
\Delta H \approx \log|I + \Sigma_o A_o| + \log\left| I - B^{-1} H_o^{T} (I + \Sigma_o A_o)^{-1} H_o \right| + n_x \log|I + \sigma_u a_u| = \log\left| I + \Sigma_o A_o - H_o B^{-1} H_o^{T} \right| + n_x \log|I + \sigma_u a_u| = \Delta H_o + \Delta H_u \qquad (19)
$$

where ∆H_o = log|I + Σ_o A_o − H_o B^{−1} H_o^{T}| is the information gained by taking new measurements of already-observed landmarks, ∆H_u = n_x log|I + σ_u a_u| is the information gained by taking measurements of previously unobserved landmarks, n_x is the number of new landmarks observed, and σ_u and a_u are the prior variance and information gain of a single new landmark. ∎

Theorem 1 indicates that the information gain at a goal point can be split into two parts: the first part, ∆H_o, is the information gained by re-observing and improving known landmarks, and ∆H_u is the information gained from exploring new landmarks. In our experiments, n_x is computed as a predefined landmark density in the environment multiplied by the size of the frontier at the observation point X_{t+1}.
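An illustrative sketch of evaluating the approximation in (19) for one candidate goal, assuming the expected-measurement blocks A_o, H_o, B have already been formed and treating σ_u and a_u as scalars; it is not the paper's implementation.

```python
import numpy as np

# Sketch of Eq. (19): entropy reduction splits into a gain on already-observed
# landmarks and a per-landmark gain on expected new landmarks.

def expected_entropy_reduction(Sigma_o, A_o, H_o, B, n_x, sigma_u, a_u):
    """Sigma_o: prior covariance of locally observable, observed landmarks.
    A_o, H_o, B: blocks of the expected measurement information at the goal.
    n_x: expected number of new landmarks (frontier size times feature density).
    sigma_u, a_u: prior variance / expected information of one new landmark."""
    I = np.eye(Sigma_o.shape[0])
    dH_o = np.linalg.slogdet(I + Sigma_o @ A_o - H_o @ np.linalg.solve(B, H_o.T))[1]
    dH_u = n_x * np.log(1.0 + sigma_u * a_u)  # n_x * log|I + sigma_u * a_u| for scalars
    return dH_o + dH_u
```

In the planner, this quantity would be evaluated for each sampled goal point, with n_x derived from the frontier size described next.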
D. Frontier Detection

In order to detect frontiers, we track how each landmark is connected to its neighbors. A landmark borders a frontier if at least one side of it is not connected to any neighbor. As shown in Figure 3a, the blue stars represent landmarks at frontiers. At each sample location, the size of the frontier is computed as follows:
1) Compute the landmarks the robot expects to observe.
2) Sort these landmarks according to their orientation relative to the robot.
3) If two consecutive landmarks are not connected to any neighbors, they delimit a frontier.

With this frontier detection approach, we have a uniform information metric for both observed landmarks and unobserved landmarks at frontiers. Our approach therefore gives a natural balance between exploration and exploitation: if there are large frontiers offering the potential to discover many new landmarks, the robot will pick observation points that explore the frontiers. If there are only small frontiers, or none at all, the robot will instead go to visited places and improve the estimates of existing landmarks.
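One possible implementation sketch of this procedure, reusing the illustrative TFG structure sketched in Section III-A: landmarks visible from the sample are sorted by bearing, and the angular gap between consecutive landmarks that are not joined by a surface edge is counted toward the frontier size. The helper names and the angular-extent proxy are assumptions, not the paper's exact computation.

```python
import math

# Illustrative frontier-size computation at a candidate observation point.

def frontier_size(sample_xy, observable_ids, tfg):
    """Return the total angular extent (radians) of frontiers seen from sample_xy."""
    bearings = sorted(
        (math.atan2(tfg.features[i][1] - sample_xy[1],
                    tfg.features[i][0] - sample_xy[0]), i)
        for i in observable_ids)
    total = 0.0
    n = len(bearings)
    for k in range(n):
        (th_a, i), (th_b, j) = bearings[k], bearings[(k + 1) % n]
        gap = (th_b - th_a) % (2 * math.pi)
        if frozenset((i, j)) not in tfg.edges:  # consecutive landmarks unconnected
            total += gap                         # -> this angular gap is a frontier
    return total
```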
E. Path Planning

In Section III-B we obtained a set of collision-free samples, so the path planner only needs to compute the connectivity and cost between these samples to form a probabilistic roadmap (PRM). The trajectory to the next best observation point is generated by computing a minimum-cost path on the PRM. The cost of an edge between two sample points involves two factors:
• the length of the link, which reflects the distance that needs to be traveled and thus the control cost;
• a collision penalty.

Computing the exact collision probability of a given path is computationally expensive. In this work, assuming Gaussian localization uncertainty, we rely on efficient approximate methods to compute a measure of risk instead of the exact collision probability. Let x_o denote the obstacle point closest to a sample x; then ‖x − x_o‖² is the squared distance to the closest obstacle and reflects the chance of collision, and we use it as an additive penalty in the edge cost during path planning. With the TFG representation, computing ‖x − x_o‖² reduces to point-line and line-line distance computations, which can be carried out trivially.

Indeed, one of the key benefits of relying on the TFG for path planning is the analytic computation of collisions: since the TFG is composed of a set of line segments, one can analytically verify whether a given point lies in the obstacle region by checking the TFG edges around the robot. Such a fast collision check also enables accurate methods such as Monte Carlo sampling to evaluate the collision probability along a path. We rely on a chance-constraint formulation (similar to [3], [26], [27]) to compute paths that satisfy Pr(path ∈ Obstacle) < δ, which is relaxed to the per-point constraints

$$ \Pr(x_k \in \text{Obstacle}) < \delta, \quad \Pr(x_k \in \text{TFG edge}) < \delta, \quad \forall k, \; x_k \sim \mathcal{N}(\hat x_k, P_k) \qquad (20) $$

where x_k is the k-th point on the trajectory.

IV. EXPERIMENTS

A. Information Measures

We first illustrate how the proposed framework balances exploration and exploitation. Figure 4 shows an example scenario: black lines represent obstacles, stars represent features, with blue stars bordering frontiers. Circles are samples in the free space, with their color representing information gain: red is high gain and blue is low. Figure 4a displays the information gain on observed features: the total information gain is largest at samples that can potentially observe the greatest number of features. Figure 4b shows the information gain on new features: samples closer to frontiers have a chance to observe new features, so they have higher exploration gain than samples farther from frontiers. Assuming a fixed density of new features, larger frontiers offer the potential to observe more new features, and nearby samples therefore have greater exploration information gain. Figure 4c shows the combined exploration and exploitation information gain.

Fig. 4: Information gain. Black lines (obstacles) and stars (features) comprise the TFG. Blue stars indicate frontier features. Circle color represents the information gain of each sample. (a) Exploitation. (b) Exploration. (c) Total.

Summing the exploitation and exploration information, Figure 5 displays the total information gain under high, medium, and low prior variance on new features. When the prior variance on new features is high, observing a new feature gives a large information gain, so the exploration term dominates the exploitation term and the robot prefers sample points at frontiers. On the other hand, if the prior variance is set low, observing new features does not add much information, and the robot prefers to revisit places with observed features and improve the estimates of their positions.

Fig. 5: Total information gain with varying unseen-feature density. When the robot expects to see many features beyond the frontiers, the information gain at frontiers is high. Otherwise, the robot prefers spots that can observe the most features in visited places. (a) High. (b) Medium. (c) Low.
B. Simulation

We compared our framework with a nearest-frontier exploration algorithm [8] using the Gazebo simulator. The frontier exploration baseline used the popular GMapping [28] system for localization and mapping and wavefront frontier detection [29] to identify frontiers. The simulated TurtleBot receives noisy odometry, laser scans, and feature measurements. Table I contains the simulation parameters. Figure 6a displays a screenshot of the simulated environment (simulated april tags are spaced roughly one meter apart along the walls).

TABLE I: Simulation parameters
  size of environment: 46 m × 22 m
  No. of landmarks: 274
  sensor range: 10 m
  field of view: 124 degrees
  particles for GMapping: 100
  rate of GMapping updates: 0.33 Hz
  rate of landmark measurements: 10 Hz

Figures 6b and 6c display the maps generated by frontier exploration and TFG active SLAM, respectively, over one run. In the frontier exploration map there is obvious distortion along the hallways, and the boundaries of some obstacles in the center are blurred as well. TFG active SLAM, on the other hand, was able to close loops on features and thus maintain the shape of the building in its map.

Fig. 6: SLAM result comparison. When the odometry drifts, frontier exploration with an occupancy grid map produces distorted maps, while active SLAM using the TFG is able to close loops on features and produces much more accurate maps. (a) Gazebo simulation environment. (b) Frontier exploration with grid map. (c) Active SLAM via TFG.

Figure 8 compares the robot pose error of nearest-frontier exploration and TFG active SLAM over 4 runs. The solid lines represent the error mean and the shaded regions represent the error range. TFG active SLAM consistently has significantly less error in its robot pose estimates, especially in position.

Fig. 8: Robot pose error over multiple runs. Solid lines represent the mean and shaded regions represent the range. TFG has consistently smaller errors.

Figure 7 compares the map coverage with the time spent exploring. In TFG active SLAM the robot balances exploration with loop closing and is thus slightly slower in covering the whole space than greedy frontier exploration.

Fig. 7: Map coverage vs. distance travelled. TFG builds an accurate map while exploring and is thus slightly slower than greedy nearest-frontier exploration.

Table II compares algorithm performance. Although TFG active SLAM takes slightly longer to explore the environment, it uses orders of magnitude fewer variables to represent the world, which leads to large memory savings. Frontier exploration also updates its particles and grid map continuously, whereas TFG exploration only updates its map at goal points, requiring only light computation throughout most of its operation.

TABLE II: Simulation performance comparison
                           TFG Active SLAM    Grid-map frontier
  No. of variables         274                800000
  CPU idle time            75%                0%
  running time (s)         2433 ± 546         2293 ± 375
  position error (m)       0.147 ± 0.115      5.26 ± 3.53
  orientation error (rad)  0.0217 ± 0.016     0.0213 ± 0.0165

C. Hardware

The new framework is tested in an indoor space with the TurtleBot platform, using a computer with the specifications listed in Table III. These computational resources are readily available in many modern on-board systems.

TABLE III: Hardware specification
  Robot: TurtleBot (Kobuki base)
  Processor: Intel Core i3 dual-core @ 2.3 GHz
  RAM: 4 GB
  Operating System: Ubuntu 14.04

The focus of this paper is not on feature detection or data association, so april tags [30] are used as features in the indoor space. Figure 10 gives some example views of the environment. Figure 9 shows how the robot's mapping progressed throughout the experiment. It started with a partial map, then gradually picked up the frontiers and expanded the map to cover the space. The black lines are obstacles and the black dots are features. The red dot is the robot's current position and the red lines are its planned trajectories.

Figure 11 shows the maps generated by three different methods. The proposed TFG-based active SLAM method (Figure 11a) effectively recovers the space even with a disturbance. The map generated from human-operated data (Figure 11b) captures the basic structure, but misses some obstacles on the wall and some surfaces of the obstacles in the middle of the room. This is mainly because human operators do not have a metric model of the environment and cannot tell whether some place is well explored. Finally, there is significant distortion in the grid map (Figure 11c). After the disturbance, the map completely drifted. Laser scans do not use any features in the environment as landmarks, so once the odometry has drifted it is very hard to correct, even if the robot comes back to the original place.

V. CONCLUSION

This paper contributed to the problem of active simultaneous localization and mapping (SLAM) in three ways:
1) a topological feature graph (TFG) that extends the point estimates in SLAM to a geometric representation of the space;
2) an information objective that directly quantifies the uncertainty of a TFG; it captures the correlations between robot poses and features in a unified framework, so new feature observations can help close loops and reduce the uncertainty of observed features, and the balance between exploration and exploitation emerges naturally from the framework for a given feature density;
3) an efficient sampling-based path planning procedure within the TFG, which enables active SLAM.
Future work includes extending the algorithm to visual feature/object detection and association, and extending the TFG to 3D active SLAM.

ACKNOWLEDGMENTS

This research is supported in part by ARO MURI grant W911NF-11-1-0391, ONR grant N00014-11-1-0688, and NSF Award IIS-1318392.

Fig. 9: Robot path and TFG in the hardware experiment. Black stars represent april tags, and black lines represent obstacles. The red circle represents the robot's current location, and the red line represents the robot's planned trajectory. The robot started with a partial map, then gradually picked up the frontiers and expanded the map to cover the space.

Fig. 10: Views of the space. A GPS-denied indoor environment with april tags as features.

Fig. 11: Comparison of different policies. The TFG-based active SLAM method effectively recovers the space even with a disturbance. The map built from human-operated data captures the basic structure, but misses some obstacles on the wall and some surfaces of the obstacles in the middle of the room. The grid map has significant distortion and completely drifted after the disturbance. (a) Active SLAM. (b) Expert operated. (c) Grid map.

REFERENCES

[1] Samuel Prentice and Nicholas Roy. The belief roadmap: Efficient planning in belief space by factoring the covariance. The International Journal of Robotics Research, 2009.
[2] Yifeng Huang and K. Gupta. Collision-probability constrained PRM for a manipulator with base pose uncertainty. In Intelligent Robots and Systems (IROS), 2009 IEEE/RSJ International Conference on, pages 1426–1432, Oct. 2009.
[3] L. Blackmore, M. Ono, and B.C. Williams. Chance-constrained optimal path planning with obstacles. Robotics, IEEE Transactions on, 27(6):1080–1094, Dec. 2011.
[4] Aliakbar Agha-mohammadi, Suman Chakravorty, and Nancy Amato. FIRM: Sampling-based feedback motion planning under motion uncertainty and imperfect measurements. International Journal of Robotics Research (IJRR), 33(2):268–304, 2014.
[5] Jur van den Berg, Pieter Abbeel, and Kenneth Y. Goldberg. LQG-MP: Optimized path planning for robots with motion uncertainty and imperfect state information. I. J. Robotic Res., 30(7):895–913, 2011.
[6] T. A. Vidal-Calleja, A. Sanfeliu, and J. Andrade-Cetto. Action selection for single-camera SLAM. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 40(6):1567–1581, Dec. 2010.
[7] Ruben Martinez-Cantin, Nando de Freitas, Eric Brochu, José Castellanos, and Arnaud Doucet. A Bayesian exploration-exploitation approach for optimal online sensing and planning with a visually guided mobile robot. Autonomous Robots, 27(2):93–103, 2009.
[8] B. Yamauchi. A frontier-based approach for autonomous exploration. In Proceedings of the 1997 IEEE International Symposium on Computational Intelligence in Robotics and Automation, CIRA '97, pages 146–, Washington, DC, USA, 1997. IEEE Computer Society.
[9] F. Bourgault, A.A. Makarenko, S.B. Williams, B. Grocholsky, and H.F. Durrant-Whyte. Information based adaptive robotic exploration. In Intelligent Robots and Systems, 2002. IEEE/RSJ International Conference on, volume 1, pages 540–545, 2002.
[10] C. Stachniss, G. Grisetti, and W. Burgard. Information gain-based exploration using Rao-Blackwellized particle filters. In Proc. of Robotics: Science and Systems (RSS), Cambridge, MA, USA, 2005.
[11] Benjamin Charrow, Gregory Kahn, Sachin Patil, Sikang Liu, Ken Goldberg, Pieter Abbeel, Nathan Michael, and Vijay Kumar.
Information-theoretic planning with trajectory optimization for dense 3d mapping. In Proceedings of Robotics: Science and Systems , Rome, Italy, July 2015. [12] H. Carrillo, I. Reid, and J.A. Castellanos. On the comparison of uncertainty criteria for active SLAM. In Robotics and Automation (ICRA), 2012 IEEE International Conference on , pages 2080–2087, May 2012. [13] L. Carlone, Jingjing Du, M.K. Ng, B. Bona, and M. Indri. An applica- tion of Kullback-Leibler divergence to active SLAM and exploration with particle filters. In Intelligent Robots and Systems (IROS), 2010 IEEE/RSJ International Conference on , pages 287–293, Oct 2010. [14] Frank Dellaert and Michael Kaess. Square root SAM: Simultaneous location and mapping via square root information smoothing. Int. J. of Robotics Research , 25(12):1181–1203, 2006. [15] J. Vallv and J. Andrade-Cetto. Active Pose SLAM with RRT*. In Robotics and Automation (ICRA), 2015 IEEE International Conference on , May 2015. [16] R. Valencia, J.V. Miro, G. Dissanayake, and J. Andrade-Cetto. Active pose SLAM. In Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on , pages 1885–1891, Oct 2012. [17] C. Leung, Shoudong Huang, and G. Dissanayake. Active slam in structured environments. In Robotics and Automation, 2008. ICRA 2008. IEEE International Conference on , pages 1898–1903, May 2008. [18] G. Grisetti, R. Kummerle, C. Stachniss, and W. Burgard. A Tutorial on Graph-Based SLAM. Intelligent Transportation Systems Magazine, IEEE , 2(4):31–43, winter 2010. [19] L. Carlone and D. Lyons. Uncertainty-constrained robot exploration: A mixed-integer linear programming approach. In Robotics and Automation (ICRA), 2014 IEEE International Conference on , pages 1140–1147, May 2014. [20] g2o: A general framework for graph optimization. https://openslam.org/g2o.html . [21] D.M. Rosen, M. Kaess, and J.J. Leonard. An incremental trust- region method for robust online sparse least-squares estimation. In Robotics and Automation (ICRA), 2012 IEEE International Conference on , pages 1262–1269, May 2012. [22] C. E. Shannon. A mathematical theory of communication. Bell Systems Technical Journal , 27:379–423, 1948. [23] Sudeep Pillai and John Leonard. Monocular SLAM Supported Object Recognition. In Proceedings of Robotics: Science and Systems , Rome, Italy, July 2015. [24] Jason L. Williams, John Iii, and Alan S. Willsky. Performance guarantees for information theoretic active inference. In Marina Meila and Xiaotong Shen, editors, Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics (AISTATS-07) , volume 2, pages 620–627. Journal of Machine Learning Research - Proceedings Track, 2007. [25] G.E. Newstadt, B. Mu, D. Wei, J.P. How, and A.O. Hero. Importance- weighted adaptive search for multi-class targets. Signal Processing, IEEE Transactions on , 63(23):6299–6314, Dec 2015. [26] M. Pavone, K. Savla, and E. Frazzoli. Sharing the load. Robotics Automation Magazine, IEEE , 16(2):52–61, June 2009. [27] B. Luders. Robust Sampling-based Motion Planning for Autonomous Vehicles in Uncertain Environments . PhD thesis, Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, May 2014. [28] Giorgio Grisetti, Cyrill Stachniss, and Wolfram Burgard. Improved techniques for grid mapping with rao-blackwellized particle filters. Robotics, IEEE Transactions on , 23(1):34–46, 2007. [29] Matan Keidar, Eran Sadeh-Or, and Gal A Kaminka. Fast frontier detection for robot exploration. 
In Advanced Agent Technology , pages 281–294. Springer, 2011. [30] Fast odometry from vision. http://april.eecs.umich.edu/wiki/index.ph