arXiv:1709.06675v4 [cs.RO] 10 Mar 2018 Talk Resource-Efficiently to Me: Optimal Communication Planning for Distributed Loop Closure Detection Matthew Giamou† Kasra Khosoussi† Jonathan P. How Abstract— Due to the distributed nature of cooperative simul- taneous localization and mapping (CSLAM), detecting inter- robot loop closures necessitates sharing sensory data with other robots. A na¨ıve approach to data sharing can easily lead to a waste of mission-critical resources. This paper investigates the logistical aspects of CSLAM. Particularly, we present a general resource-efficient communication planning framework that takes into account both the total amount of exchanged data and the induced division of labor between the participating robots. Compared to other state-of-the-art approaches, our framework is able to verify the same set of potential inter- robot loop closures while exchanging considerably less data and influencing the induced workloads. We develop a fast algorithm for finding globally optimal communication policies, and present theoretical analysis to characterize the necessary and sufficient conditions under which simpler strategies are optimal. The proposed framework is extensively evaluated with data from the KITTI odometry benchmark datasets. I. INTRODUCTION Multi-robot, or cooperative, simultaneous localization and mapping (CSLAM) is an active area of research with a wide spectrum of applications that span from robotic search and rescue in challenging environments to navigating fleets of autonomous cars; see [1]–[3] for recent surveys. Com- munication is a crucial aspect of the approach, without which CSLAM would simply reduce to decoupled copies of conventional SLAM. In applications without pre-existing infrastructure, ad-hoc wireless communication is subject to many shortcomings, including energy constraints, bandwidth, and range limitations; see, e.g., [2], [4]. Overlooking these challenges could lead to impractical solutions. Ensuring that agents are able to effectively and resource-efficiently communicate with one another is one of the most challenging problems facing distributed CSLAM architectures [2]. Communication is an essential prerequisite for establishing loop closures between different robots’ trajectories and maps. To search for inter-robot loop closures, robots need to compare and match the data acquired throughout each of their individual trajectories. However, each robot initially has access only to the data collected by its own onboard sensors. As a result, robots need to frequently share data among themselves. State-of-the-art techniques either employ a centralized architecture, or simply require each robot to broadcast a down-sampled history of its sensory readings; see, e.g., [5] and [6]–[9], respectively. A na¨ıve approach to the data sharing problem can easily lead to a waste of †Equal contribution. Authors are with the Laboratory for Information and Decision Systems (LIDS), MIT, Cambridge, MA — {mgiamou,kasra,jhow}@mit.edu mission-critical resources including battery, wireless band- width, and CPU time. We present a general communication planning framework for resource-efficient data exchange in the search for inter- robot loop closures in distributed CSLAM front-ends. Our framework has several appealing features: ⋄A guarantee to be lossless in the sense that, for any given set of candidate matches, the proposed framework allows for a complete search of all inter-robot loop closures that exist within that set. ⋄Efficient algorithms for finding optimal exchange policies with respect to the total amount of data transmission with minimal computational overhead. ⋄Providing a mechanism through which one can retain com- munication efficiency while influencing the final induced division of labor between the robots. This allows the team to balance the resulting induced workloads based on the distribution of computational resources among the robots. ⋄Applicability to systems that use measurements and maps composed of any data type, including dense 3D laser scans and local image features, e.g., BRIEF [10]. A. Related Work In a series of papers, Indelman et al. and Dong et al. [6]– [8] develop a pose-graph CSLAM framework based on Expectation-Maximization. From the perspective of data ex- change efficiency, in [6]–[8] robots broadcast a downsampled subset of their (informative) raw measurements (e.g., laser scans) with each other. Our work can be employed alongside these and similar systems to provide an alternative resource- efficient communication plan. Cieslewski and Scaramuzza [11] investigate the scalability of decentralized visual place recognition—in terms of the amount of exchanged data per place recognition query— in large teams of robots. In particular, they propose a decentralized approach whose scalability is comparable to that of centralized architectures and significantly better than the existing decentralized approaches. In [11] it is empirically shown that their heuristic approximation only suffers a mild reduction in place recognition recall. The core idea in [11] is to send partial queries to every other robot, assess the returned image similarity scores, and send the full query only to the robot with the most likely candidate match. Unlike the online (frame-query) flavour of the problem addressed in [11], our work considers a batch formulation that arises in occasional, but larger, data exchanges. The batch setting is especially well suited to applications in which multiple ð ð Data M (a) ð ð T Data Data π⋆ M1 π⋆ M2 (b) ð ð ð T Data Data π⋆ M1 π⋆ M2 (c) ð T ð Data Data π⋆ M (d) Fig. 1: An overview of distributed sensory data exchange approaches in CSLAM. Figure 1a illustrates a monolog (unidirectional policy), in which one robot sends all of its scans to the other. In some of the state-of-the-art techniques, this process happens also in the opposite direction (both robots share all of their scans with each other); see, e.g., [6]. In addition to the sensory data, robots also need to transmit a smaller amount of information (“metadata” M) to help identify potential loop closures (e.g., compact bag-of-words (BoW) vectors for visual place recognition or sparse trajectories). Figures 1b, 1c, and 1d illustrate the proposed approach. Contrary to Figure 1a, here robots engage in a dialog, and each shares a subset of its sensory data with the other robot. We demonstrate that this process can significantly reduce the total amount of exchanged data. In our approach, robots still need to exchange metadata. The broker (T) then solves the optimal data exchange problem and sends the optimal exchange policy π⋆for execution to the robots. robots are distributed to cover a large space, and communi- cation is only possible during rendezvous. In such settings, rendezvous are seen as short-lived valuable opportunities that can be leveraged to better achieve the mission objective. Furthermore, while [11] concerns fleet-wide communication efficiency in detecting potential inter-robot loop closures, here we focus on local (pairwise) efficiency in exchanging sensory data. Despite these differences, an extension of our framework to n-way data exchanges can use the idea behind [11] to improve the communication efficiency of its metadata exchange phase (see Section II). Sharing compressed beliefs and graphs constitutes another type of information exchange that arises in CSLAM. State- of-the-art techniques often marginalize out unnecessary in- termediate poses from the belief to reduce the amount of exchanged data; see, e.g., [4], [12]. The resulting information matrix, however, is generally dense. This has led to the study of approximate sparisfication techniques to “compress” the reduced beliefs. Paull et al. [4] investigate CSLAM with acoustic communication in the context of autonomous un- derwater vehicles. They propose a consistent (conservative) sparsification scheme based on Kullback-Leibler divergence. Lazaro et al. [9] propose to transmit a reduced representation of robots’ graphs (“condensed graphs”), as well as the most recent laser scans. Sharing only the most recent laser scans comes at the cost of losing potential loop closures in the regions that robots had explored separately prior to the encounter (i.e., before establishing a communication link). Cunningham et al. [12] propose a fast RANSAC-based data association scheme for CSLAM. The communication module in [12] shares the reduced beliefs (“condensed maps”) with a bounded number of robots within communication range. In contrast to our work, [12] considers a feature-based formulation with purely geometric (point) features. In that setting, each landmark measurement consists of a pair of range and bearing values, which is typically too lightweight to necessitate a communication planning framework. Forster et al. [5] propose a centralized framework, in which the base station aggregates all visual information and establishes inter-robot and intra-robot loop closures; see [2] for more centralized CSLAM approaches. Centralized approaches have limited applications and, compared to our work, leave no room for communication efficiency. Montijano et al. [13] and Leonardos et al. [14] explore el- egant formulations and algorithms for solving the distributed data association problem with an emphasis on maintaining association consistency across the communication graph. Unlike [13], [14], our paper takes a step back and investigates the logistics of distributed data association through exchang- ing data between pairs of robots. Our approach is orthogonal to such techniques and can be employed alongside distributed solvers. In summary, our framework neither tells the agents what to say to each other—a question that is partly a system- dependant design choice, partly addressed by belief compres- sion methods, see e.g., [4], [9], [15], and partly addressed by measurement selection schemes, e.g., see [16], [17]—nor does it tell them what to do with the exchanged data—i.e., how to solve the data association or the resulting inference problem, which is addressed by works such as [7], [12]–[14] among others; it rather advises them on how to communicate more effectively and efficiently. B. Contribution This paper addresses the data exchange problem, a key prerequisite for realizing resource-efficient distributed inter- robot loop closure detection and place recognition. We for- malize the problem, provide a theoretical analysis, and shed light on its connection to the weighed minimum bipartite vertex cover problem. These insights ultimately lead to a fast algorithm for finding globally optimal communication plans based on linear programming. Additionally, we ex- perimentally validate the proposed framework based on real benchmark datasets. Notation Bold lower-case and upper-case letters are reserved for vectors and matrices, respectively. 1 and 0 denote, respec- tively, the column vectors of all ones and all zeros. Sets are shown by upper-case letters. |A| denotes the cardinality of set A. The disjoint set union operator is denoted by ⊎such that A ⊎B = A ∪B and implies that A ∩B = ∅. For any two vertices u and v in a given graph, u ∼v means that there is an edge connecting u to v. Finally, for any set of vertices S, N(S) is the neighbourhood of S in the graph. II. OPTIMAL DATA EXCHANGE This section proposes a resource-efficient framework to facilitate the search for inter-robot loop closures in CSLAM via exchanging collected sensory data (collectively called “scans” in this paper). Each exchange operation is mod- erated by an exchange broker, which can be a trusted hardware/software component located at one of the two participating robots (Figure 1d), or a trusted third party (e.g., another robot or a base station)—see Figures 1b and 1c. The broker has the duty of initiating, planning, and executing the operation. Unlike the servers in centralized approaches, the exchange brokers are not meant to aggregate scans, but rather to advise robots on the “optimal” exchange policy. Although only one broker is needed per exchange process, the total number of (potential) brokers in a team may vary between 1 (a central broker) and the number of robots (each robot can act as a broker if necessary), as long as the broker is able to communicate with the two participating robots during an exchange. A. Initiating an Exchange The exchange process can be initiated between two robots when they are within communication range. First, the broker has to form the exchange graph G. Definition 1 (Exchange Graph). An exchange graph is an undirected bipartite graph G = (V1 ⊎V2, L) whose vertices correspond to the two robots’ poses involved in the data exchange problem, and V1 ∋u ∼v ∈V2 iff there is a “po- tential” inter-robot loop closure between their corresponding poses. Without loss of generality and for convenience, we assume the degree of each vertex in the exchange graph is at least one. L is a set of plausible inter-robot loop closure candidates, determined based on geometry (e.g., trajectory estimates and sensor characteristics such as field of view and range) and/or appearance (e.g., visual place recognition sys- tems). Visual place recognition techniques like the DBoW2 system of [18] can be used to form elements of L using only information local to individual measurements. In the case of DBoW2, this information is vocabulary labels of BRIEF [10] features extracted from query images. In both cases, G is populated without sharing the entirety of the robots’ measurement data. Instead, L is formed using a compact representation of the sensory data (hereafter, “metadata”), e.g., a collection of bag-of-words (BoW) vectors. Robots cooperate with the broker (by, e.g., providing information about their beliefs over their trajectories or BoW vectors) to form L. In practice, a considerable number of potential edges are not plausible given the available information. This often makes G far less dense than the complete bipartite Algorithm 1 Execute a DEP 1: for v ∈V do 2: if π(v) = 1 then 3: Share Sv (corresponding sensory data). graph. The structure of G ultimately depends on a variety of factors, including the particular sensors and perception models utilized, the level of uncertainty in the robots’ beliefs, and perceptual aliasing. B. Optimal Data Exchange The optimal data exchange problem is now formally defined. Definition 2 (Data Exchange Policy). A data exchange policy (DEP) is a vertex labeling that specifies which “scans” should be exchanged between a pair of robots. Formally, we call π : V →{0,1} a DEP over V ≜V1 ⊎V2 in which π(v) = 1 (resp., π(v) = 0) indicates that the scan collected at vertex v should (resp., should not) be sent to the other robot. Based on the above definition, π can be executed simply by scanning the labels and transmitting the scans marked with “1” (i.e., to be sent); see Algorithm 1. Definition 3 (Admissible Policy). A DEP is called admissi- ble iff it allows for a complete search; i.e., finding all possible loop closures in L. This can be achieved iff, for each edge in the exchange graph, at least one robot shares its associated scan with the other robot. Formally, π is admissible iff for all u ∼v, π(u) + π(v) ≥1. Definition 4 (Monolog). Let Vsource ∈{V1,V2}. An exchange policy π : V →{0,1} is called a monolog if π : v 7→ ( 1 v ∈Vsource, 0 otherwise. (1) Lemma 1. Every monolog is admissible. The broker can guarantee the completeness of search by proposing an admissible policy—but which one of them? Two primitive objectives are considered in this work: 1) Communication: The first objective quantifies the commu- nication cost incurred during the execution of an exchange policy—mainly due to bandwidth and energy consumption. The communication cost is measured by the total amount of exchanged data. From this perspective, π is preferred over π′ iff it can conduct a complete search by exchanging less data between the two robots. More precisely, let ws : V →R≥0 be a weight function defined over V such that ws(v) quantifies the “size” of scan Sv collected at the corresponding pose. Then, the communication cost incurred as a result of executing policy π can be modelled as fO(π) ≜ X v∈V ws(v) π(v). (2) In the special case of uniform weights, fO reflects the number of exchanges made by π (up to a constant). ð1 ð2 (a) An exchange graph G ð1 ð2 (b) An admissible policy π ð1 ð2 (c) The division of labor induced by π Fig. 2: A simple data exchange problem between two robots. Each vertex corresponds to a robot pose, and each edge represents a potential loop closure between the corresponding robot poses. There is a “scan” associated with each robot pose. To verify a potential inter-robot loop closure between two connected vertices, at least one robot needs to share its scan with the other robot. (a) A simple exchange graph. (b) An admissible exchange policy in which each robot shares the sensory data collected at its red vertex with the other robot. The orientation of each edge signifies the direction of exchange (i.e., vertex label). (c) The workload induced by π: ð2 is responsible for searching for loop closures among the green candidates, ð1 will search among the blue candidates. Note that the thick candidate edge will be screened by both robots. 2) Induced Division of Labor: Upon executing an exchange policy, each robot has to perform sensor registration on a subset of L. The exchange policy implicitly determines the distribution of the workload between the robots. The second objective captures this induced workload. To quantify this workload, first note that any admissible policy π divides the initial candidate set into L = Lπ 1 ∪Lπ 2 in which Lπ 1 (resp., Lπ 2) is the set of edges incident to V2 (resp., V1) at a vertex v such that π(v) = 1. These sets can be empty (monolog) and are not necessarily disjoint: Lπ 12 ≜Lπ 1 ∩Lπ 2 is the set of edges like {u,v} ∈L such that π(u) = π(v) = 1 (see Figure 2b). Lπ 1 \Lπ 2 (resp., Lπ 2 \Lπ 1) can only be searched by the first (resp., second) robot. On the contrary, in principle both robots can screen the candidates in Lπ 12. We can either divide the burden of searching in Lπ 12 between the robots, or simply let each robot screen it on its own. The latter is preferred due to the following advantages. First, from a robustness perspective, verifying Lπ 12 separately on each robot creates a desirable redundancy in case robots are unable to exchange their newly discovered loop closures due to problems like communication failure. Furthermore, the cost of post-exchange communication will be slightly reduced since we do not need to exchange the loop closures found in Lπ 12 (Section II-D). Finally, as we will see shortly, this choice leads to tractable optimization problems. Suppose the computational cost of verifying candidate inter-robot loop closure {u,v} is quantified by cuv ≥0. The total computational cost due to sensor registration induced by exchange policy π on robot i ∈{1,2} is given by ℓπ i = X v∈V\Vi X u∼v cuvπ(v). (3) Note that under uniform {cuv}u∼v, ℓπ i is proportional to the number of potential loop closures that must be verified by robot i as a result of exchange policy π. Let α1 and α2 be non-negative parameters that control the induced workload balance between the two robots, such that, e.g., increasing αi will shift the balance in favor of robot i. For example, in a heterogeneous data exchange between a typical robot and a tactical supercomputer, one may seek to choose an admissible policy such that most of the induced workload is redirected toward the tactical supercomputer. This narrative results in f¤(π; α1,α2) ≜α1ℓπ 1 + α2ℓπ 2 (4) = X v∈V wℓ(v) π(v), (5) in which wℓ: v 7→ ( α2 P u∼v cuv v ∈V1, α1 P u∼v cuv v ∈V2. (6) Problem 1 (Optimal Data Exchange Problems (ODEP)). minimize π f¤(π; α1,α2) subject to π is admissible. (P1) minimize π fO(π) subject to π is admissible. (P2) minimize π f•(π; α1,α2,ω) subject to π is admissible. (P3) f•(π; α1,α2,ω) ≜fO(π) + ωf¤(π; α1,α2) (7) = X v∈V w•(v) π(v), (8) in which w• : v 7→ws(v) + ω wℓ(v). C. Solving the Optimal Data Exchange Problem It is easy to see that P1:3 are all instances of the weighted minimum bipartite vertex cover problem.1 To see this, first note that the admissibility constraint needed for guaranteeing the completeness of search is identical to the constraint in vertex cover. Translating an instance of one of these narratives to an equivalent instance of the other (i.e., mapping a lossless exchange policy to an equivalent vertex cover π 7→Π and vice versa) is trivial: Π ≜  v ∈V : π(v) = 1 and π : v 7→1Π(v) where 1Π(v) ≜ ( 1 if v ∈Π, 0 if v ∈V \ Π. (9) 1Finding a subset of vertices in a vertex-weighted bipartite graph with the minimum sum of vertex weights such that it covers every edge. Finally, note that the cost of π (in ODEP) is equal to that of Π in the weighted minimum bipartite vertex cover, and vice versa. Consequently P1:3 can all be solved using the same machinery. Furthermore, this result characterizes the communication cost incurred in the search for inter-robot loop closures and the induced workload balance in terms of the graph topology and vertex/edge weights through a well- understood graph invariant. Algorithm Although the weighted minimum vertex cover problem is NP-hard in general, it can be solved efficiently in bipartite graphs; see, e.g., [19]. Therefore, by virtue of the above- mentioned observation, we can solve any ODEP efficiently by casting it as a weighted minimum bipartite vertex cover problem. Moreover, Algorithm 1 can be slightly restructured to execute the vertex cover translation of an optimal policy— see Algorithm 2. It remains to describe an algorithm based on linear programming (LP) for efficiently solving ODEP. Let w ∈{wℓ, ws, w•}. The corresponding ODEP can then be formulated as the following integer linear program (ILP): minimize π X v∈V w(v) π(v) subject to π(u) + π(v) ≥1 u ∼v, π(u) ∈{0,1} u ∈V. (PILP) The admissibility constraint in PILP can be compactly written as A⊤π ≥1, in which A is the unoriented incidence matrix of the exchange graph, and π is the stacked vector of values π(u) for u ∈V. Let w be the stacked vector of vertex weights. PILP admits a natural LP relaxation by expanding its feasible set FILP into FLP ≜{π : A⊤π ≥1, π ≥0} ⊃FILP: minimize π w⊤π subject to π ∈FLP. (PLP) It is well known that A is totally unimodular, and therefore FLP is integral; i.e., PLP has an integral solution that can be found using the simplex algorithm (see, e.g., [19, Ch. 18]). Any integral solution corresponds to an optimal exchange policy for Problem 1. In the special case of uniform weights, we can construct the optimal policy directly from the maxi- mum bipartite matching in G; see K¨onig’s theorem [19]. Optimality Conditions for Monologs ODEP is built on the presumption that exploiting bidi- rectional communication can lead to more resource-efficient strategies. While this is generally true, in some special cases, monologs may perform optimally as well. The following theorem offers the necessary and sufficient condition for the most general form of P1:3 under which a monolog is optimal. Theorem 1. Consider a vertex-weighted exchange graph G with non-negative weights assigned by w : V →R≥0. Let V◦∈{V1,V2}. The monolog π defined as π◦: v 7→ ( 1 v ∈V◦, 0 otherwise, (10) minimizes the cost function f(π) ≜ X v∈V w(v) π(v) (11) among all admissible policies if and only if G satisfies what we call the generalized Hall’s condition (GHC): ∀S ⊆V◦: X v∈S w(v) ≤ X v∈N(S) w(v). (GHC) Proof. [⇒] We show the contrapositive. Suppose there exists a S ⊆V◦that violates GHC. Consider, π∗: v 7→ ( 1 v ∈(V◦\ S) ⊎N(S) 0 otherwise. (12) π∗is admissible since the vertices in V◦\ S cover the edges that are not incident to S, while those in N(S) cover every edge incident to S. Now since S violates GHC we have, f(π∗) = X v∈V◦\S w(v) + X v∈N(S) w(v) (13) < X v∈V◦\S w(v) + X v∈S w(v) (14) = X v∈V◦ w(v) (15) = f(π◦). (16) [⇐] Now we show GHC is sufficient. Suppose GHC holds and let π⋆be the optimal admissible policy. For simplicity and without loss of generality let us assume V◦= V1. Define Π⋆≜{v ∈V : π⋆(v) = 1} and Π⋆ i ≜Π⋆∩Vi (i = 1,2). If Π⋆ 2 is empty, π⋆= π◦. Furthermore, based on GHC, Π⋆ 1 cannot be empty unless V1 and V2 have equal costs, which also implies that π⋆= π◦. Thus we can assume both are non-empty. Since π⋆is admissible, there must be no edges between V1 \ Π⋆ 1 and V2 \ Π⋆ 2. Therefore, N(V1 \ Π⋆ 1) ⊆Π⋆ 2. From GHC and the fact that vertex weights are non-negative we have, X v∈V1\Π⋆ 1 w(v) ≤ X v∈N(V1\Π⋆ 1) w(v) (17) ≤ X v∈Π⋆ 2 w(v). (18) Consequently, f(π⋆) = X v∈Π⋆ 1 w(v) + X v∈Π⋆ 2 w(v) (19) ≥ X v∈Π⋆ 1 w(v) + X V1\Π⋆ 1 w(v) (20) = X v∈V◦ w(v) (21) = f(π◦). (22) This concludes the proof. Remark 1. Theorem 1 states that π◦is optimal iff, for any subset of vertices in V◦, the amount of data that needs to Algorithm 2 Execute a DEP via Vertex Cover Π 1: for v ∈Π do 2: Send Sv to the other robot. be transmitted from V◦to the other robot is not greater than the amount of data needs to be transmitted in the opposite direction. Although this result is intuitive, the fact that the GHC is both necessary and sufficient is non-trivial. Corollary 1. Let Vmax ∈{V1,V2} be the vertex set with the larger αi. The monolog π1 defined as π1 : v 7→ ( 1 v ∈Vmax, 0 otherwise. (23) is optimal with respect to P1. Corollary 1 implies that P1 always has a trivial optimal monolog solution. Nonetheless note that P3 still allows us influence the induced division of labor based while retaining communication efficiency. Moreover, Corollary 1 also im- plies that the two objective functions f¤ and fO blended together in P3 are competing with each other to shift the structure of the optimal policy towards monologs (ideal workload balance) and dialogs (communication efficiency), respectively. Corollary 2. Let Vmin ∈{V1,V2} be the vertex set with smaller cardinality. The monolog π2 defined as π2 : v 7→ ( 1 v ∈Vmin, 0 otherwise, (24) is optimal with respect to P2 under uniform weights iff G satisfies Hall’s condition (HC): ∀S ⊆Vmin : |S| ≤|N(S)|. Corollary 2 states the necessary and sufficient condition under which the monolog π2 is optimal. This result also follows directly from Hall’s marriage theorem and K¨onig’s theorem [19]. As an example, consider the case of k-regular bipartite graphs.1 A well-known application of Hall’s mar- riage theorem implies that k-regular bipartite graphs satisfy HC [19]. Similarly, it is easy to check that HC holds in the complete bipartite graph. Corollary 3 follows from this result and Corollary 2. Corollary 3. The monolog π2 is optimal with respect to P2 under uniform weights in k-regular, and in complete bipartite graphs. D. Post-Exchange Protocol After executing the optimal policy π⋆, each robot has to verify the potential loop closures in a subset of L (Lπ 1 and Lπ 2; see Section II-B) via sensor registration. Examining the candidates will lead to a set of inter-robot loop closures L⊞⊆L. Because of the admissibility constraint, we know that L⊞ 1 ∪L⊞ 2 = L⊞in which L⊞ i is the set of loop closures discovered by robot i ∈{1,2} after executing an admissible exchange policy (i.e., the search is guaranteed to 1A graph is called k-regular if all of its vertices have degree k ≥1. Algorithm 3 Optimal Data Exchange 1: Robots: Send the essential metadata to the broker 2: Broker: Form G (w/ dynamic pricing) 3: Broker: Form and solve ODEP via LP relaxation 4: Robots: Execute π⋆— exchange scans 5: Robots: Search for loop closures in Lπ 1 and Lπ 2 6: Robots: Exchange the discovered loop closures: L⊞ i \L⊞ 12 be complete). At this point, each robot is aware of its own set of newly discovered inter-robot loop closures; these sets will have a non-empty overlap iff Lπ⋆ 12 ∩L⊞is non-empty. If the communication channel is still available, robots can immediately share their newly discovered positive matches with each other by transmitting L⊞ i \ L⊞ 12 (i = 1,2). The exchange process ends here. At this stage, robots are able to closely examine every potential candidate, perform geometric verification, solve the sensor registration and data association problems, and establish relative measurements; see, e.g., [8], [13], [14], [20]. E. Exchange Inertia and Dynamic Pricing In Problem 1, vertex weights quantify quantities such as the size of a scan, computational cost of sensor registration for the corresponding potential loop closures, and the desired workload balance. From a broader perspective, the weights can be interpreted as the exchange inertia, such that a smaller weight signifies more desire to share the associated scan with other robots, and vice versa. This broader interpretation allows us to incorporate a wider spectrum of objectives and constraints using the same underlying framework. In partic- ular, robots and/or the broker may utilize a dynamic pricing strategy driven by various internal/external incentives. For example, these dynamic pricing schemes may depend on the specific role of a robot in the team, its capabilities, clearance level, privacy restrictions, and the available mission-critical resources. III. EXPERIMENTS Algorithm 3 summarises the entire ODEP process. This section presents results obtained using the KITTI dataset [21] to formulate realistic ODEP instances. KITTI was chosen for its long, data-rich trajectories, and accurate ground truth. ODEP instances are solved with the Gurobi LP solver.2 Solving ODEP takes about 0.41 seconds in one of the largest exchange graphs encountered in our datasets (with more than 2 × 103 vertices and 96 × 103 edges) on an Intel Core i7- 6820HQ CPU operating at 2.70 GHz. The runtime in realistic settings and using DBoW2 with α = 0.3 (see Section III-B) is about 0.03 seconds. Due to space limitation, in this section we focus mainly on P2. A. Trajectory Geometry Experiments In order to create instances of ODEP with the KITTI dataset, we chose sequences of the odometry benchmark that contained considerable amounts of self-intersection and re-tracing in their ground truth trajectory. Each sequence is 2http://www.gurobi.com divided into two parts corresponding to two distinct robots. For each pose in the trajectory, Oriented FAST and Rotated BRIEF (ORB) features [22] exceeding a variable FAST detection threshold are extracted from the associated color camera image. Since this set of features can be used to detect and compute loop closures between poses as part of a SLAM system [20], the number of extracted features determines the vertex weight ws(v) for the pose at vertex v. In regions with greater environmental detail, a greater number of ORB features are extracted. The KITTI dataset’s odometry ground truth is then used to form edges between nearby poses associated with each robot. This process results in an exchange graph G with weights ws(v) that depends on a number of parameters: 1) FAST threshold kF used to detect ORB features, 2) Data rate or measurement frequency f (KITTI data is provided at 10 Hz), 3) Maximum distance dmax between poses that are candidate matches (i.e. u ∼v), 4) Minimum fraction η of range limited camera field of view (FOV) between poses that are candidate matches. Varying these parameters leads to different structures in G and variable communication savings when using ODEP. In practice, different sensors and varying confidence in robot trajectory estimates would permit empirical modelling of exchange graph formation. In this paper, we analyze ranges of the above parameters to capture a variety of problem instances. For example, large values of dmax correspond to scenarios where each robot’s trajectory estimate is highly uncertain and, therefore, a greater range of nearby poses need to be considered loop-closure candidates. Figure 3 displays edges of L in green for a particular set of parameters on KITTI odometry sequence 0 and sequence 6. Figure 4a and 4d display the communication savings of the optimal policy relative to monolog policies for sequences 0 and 6 when L is formed between poses within a variable dmax. Figures 4b and 4e report similar results when L is formed with a variable minimum FOV overlap η. The abrupt jumps in cost seen in Figures 4d and 4e are caused by sequence 6’s particular trajectory. Figure 3b displays the simple elongated loop that sequence 6 follows, along with some candidate edges formed by the field of view threshold of η = 0.4. These settings lead to candidate edges across the thin loop which vanish for shorter values of dmax and higher values of η, reducing the required communication cost. Figure 4 demonstrates that solving ODEP enables the robots to reduce the amount of data to be exchanged by up to 5 MB over some monologs. Note that in some of the state-of-the-art systems, full bidirectional communication of measurements is utilized by default, resulting in at least the sum of the communication costs of both monologs (red and green curves) in Figures 4a-4f [6]. For a typical 11 Mb/s ad hoc WiFi network tested in our laboratory, 5 MB corresponds to approximately 5 seconds of transmission time. Thus, in addition to reducing use of network bandwidth and battery usage, communication reduction could potentially help to 0 100 200 300 400 −300 −200 −100 0 100 200 300 Robot 1 Robot 2 Loop Closure Candidates (a) Sequence 0 −100 0 100 200 300 −20 −100 10 20 30 (b) Sequence 6 Fig. 3: Ground truth for KITTI odometry dataset sequences 0 and 6 with parameters f = 2 Hz, η = 0.4, dmax = 30m between robot 1 (blue) and robot 2’s (red) for edges (green). The edges and weights formed with ORB feature counts produce the exchange graph G. significantly shorten robot rendezvous periods in time-critical missions. B. Place Recognition Experiments An appearance-based place recognition system like DBoW2 [18] can also be used for inter-robot loop closure detection and to generate the candidate edge set L. In a situation where robots do not have an accurate estimate of the transformation between their trajectories’ frames of reference, place recognition must be leveraged instead of viewpoint proximity to find potential loop closures. To facil- itate place recognition, DBoW2 only needs the “word” in the bag-of-words vocabulary describing each ORB feature [18]. This word can typically be described in 3 or fewer bytes, which is less than one tenth of the size the standard 32 byte BRIEF descriptor used in ORB. Thus, an inexpensive exchange of vocabulary vectors (i.e., metadata) allows robots to search for promising candidates and form an exchange graph G for ODEP. In our experiments, we trained a DBoW2 vocabulary with parameters kw = 10 and Lw = 6 on ORB features from 5 KITTI odometry benchmark sequences. The two candidate edges with the highest normalized score exceeding threshold α [18] were used to form candidate edges in L. Communication savings from ODEP instances produced with KITTI odometry sequences 0 and 6 are displayed in Figures 4c and 4f. Although the structure of exchange graphs resulting from appearance based methods were very different from the geometric methods of the previous section, the cost of communication can still be significantly reduced using our method. For low α thresholds, the optimal policy affords significant savings of around 5-10 MB over the monolog policies in sequence 0 (Figure 4c). In the smaller sequence 6, the net communication savings are smaller because there are fewer candidates, but the optimal policy is still almost 10% more efficient than the best monolog policies at α thresholds in the range of 0.2 to 0.4. It is also worth noting 5 10 15 20 25 30 27 28 29 30 31 32 33 34 35 36 37 (a) Sequence 0, dmax threshold 0.2 0.3 0.4 0.5 0.6 0.7 0.8 27 28 29 30 31 32 33 34 35 36 (b) Sequence 0, η threshold 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 15 20 25 30 35 40 45 50 55 60 65 (c) Sequence 0, α threshold 5 10 15 20 25 30 5 6 7 8 9 10 11 12 13 14 (d) Sequence 6, dmax threshold 0.2 0.3 0.4 0.5 0.6 0.7 0.8 5 6 7 8 9 10 11 12 13 (e) Sequence 6, η threshold 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 4 5 6 7 8 9 10 11 12 13 (f) Sequence 6, α threshold Fig. 4: Communication cost for KITTI odometry sequences 0 and 6 with kF = 100, f = 10 Hz, and varying dmax, η, or α (x-axes). Figures 4a and 4d form the candidate edge graph L using maximum Euclidean distance dmax between poses from KITTI groundtruth data, whereas Figures 4b and 4e use the fraction of overlapping fields of view η to determine candidates. Figures 4c and 4f form the candidate graph using the candidates with the 2 best DBoW2 scores greater than threshold α. that relying on a single communication direction (i.e. using only one fixed monolog policy) throughout a mission is a poor communication heuristic that could produce arbitrarily bad and inconsistent results. This is illustrated in Figures 4f and 4b where depending on the value of α or η, the better choice of monolog and its performance penalty relative to the optimal policy changes drastically. IV. CONCLUSION AND FUTURE WORK Given the limitations of onboard resources, it is essential that robots communicate wisely. State-of-the-art techniques often have to sacrifice content by down-sampling the ex- changed data, e.g., [6]–[8], [11]. This comes at the risk of losing potential valuable inter-robot loop closures which are the essence of cooperative localization and mapping. This paper addressed this challenge by investigating the logistical aspect of sensory data sharing in distributed CSLAM front- ends. First, we formalized the optimal data exchange prob- lem that encompasses a wide range of sensing modalities (e.g., vision, 2D and 3D lasers). This led to a resource- efficient and provably lossless (i.e., ensuring a complete search) communication planning framework. The proposed framework takes into account both the quantity of exchanged data, and the resulting division of labor induced by the executed exchange policy. This allows us to design efficient communication plans while distributing the induced work- load based on, for example, the distribution of computational resources among robots. Additionally, ODEP can seamlessly incorporate privacy and security constraints through the con- cept of exchange inertia and dynamic pricing schemes. Our approach benefits greatly from several fundamental results in graph theory and combinatorial optimization. In particular, these results lead to a fast and provably tight LP relaxation scheme to find the globally optimal exchange polices. In addition, our theoretical analysis characterized the necessary and sufficient conditions under which simpler unidirectional exchange policies are optimal. Finally, we experimentally validated geometric and appearance-based realizations of the proposed framework using the KITTI odometry benchmark datasets. In retrospect, several crucial insights played major roles in the success of our approach. First and foremost, iden- tifying plausible inter-robot loop-closure candidates before transmitting the bulk of sensory data is what makes commu- nication planning possible. Forming the exchange graph and exploiting its unique structure (topology and the vertex/edge weights) allowed us to identify more efficient, yet lossless, exchange policies—often emerging as natural dialogs. Al- though this requires exchanging “metadata”, the incurred cost is often not comparable to the that of the actual data exchange. For example, visual place recognition systems like DBoW2 form loop closure candidates with sparse feature vectors that use an order of magnitude less data than the full descriptors used for subsequent loop closure verification. In our experiments where robots found candidate edges by exchanging pose graphs, poses are described by SE(2) or SE(3) objects that are much smaller than hundreds of visual descriptors. Furthermore, we exploited the sparsity pattern of the graph in our implementation to solve the resulting LP even faster. This paper provides a solid foundation for optimal com- munication planning in distributed CSLAM front-ends. Our approach is able to find the optimal exchange policy between a pair of robots during pairwise encounters. n-way (n > 2) scan exchange problems naturally arise in robotic networks with denser communication graphs. Although the proposed approach can still be used in these cases, it may not necessar- ily lead to the optimal strategy. Addressing the sensory data exchange between more than two robots requires exploring new mechanisms such as data caching and routing. The optimal n-way data exchange problem is our next challenge. ACKNOWLEDGEMENT This work was supported in part by the NASA Convergent Aeronautics Solutions project Design Environment for Novel Vertical Lift Vehicles (DELIVER) and by the Northrop Grumman Corporation. The authors would like to thank Sergii Iglin for the Graph Theory Toolbox1, the team behind the KITTI dataset [21], and Noam Buckman of LIDS for assistance with experiments. REFERENCES [1] C. Cadena, L. Carlone, H. Carrillo, Y. Latif, D. Scaramuzza, J. Neira, I. Reid, and J. J. Leonard, “Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age,” IEEE Transactions on Robotics, vol. 32, no. 6, pp. 1309–1332, 2016. [2] S. Saeedi, M. Trentini, M. Seto, and H. Li, “Multiple-robot simulta- neous localization and mapping: A review,” Journal of Field Robotics, vol. 33, no. 1, pp. 3–46, 2016. [3] V. Indelman, “Distributed perception and estimation: a short survey,” Principles of Multi-Robot Systems, workshop in conjunction with Robotics Science and Systems, 2015. [4] L. Paull, G. Huang, M. Seto, and J. J. Leonard, “Communication- constrained multi-auv cooperative SLAM,” in Robotics and Automa- tion (ICRA), 2015 IEEE International Conference on. IEEE, 2015, pp. 509–516. [5] C. Forster, S. Lynen, L. Kneip, and D. Scaramuzza, “Collaborative monocular SLAM with multiple micro aerial vehicles,” in Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on. IEEE, 2013, pp. 3962–3970. [6] J. Dong, E. Nelson, V. Indelman, N. Michael, and F. Dellaert, “Distributed real-time cooperative localization and mapping using an uncertainty-aware expectation maximization approach,” in Robotics and Automation (ICRA), 2015 IEEE International Conference on. IEEE, 2015, pp. 5807–5814. [7] V. Indelman, E. Nelson, J. Dong, N. Michael, and F. Dellaert, “Incremental distributed inference from arbitrary poses and unknown data association: Using collaborating robots to establish a common reference,” IEEE Control Systems, vol. 36, no. 2, pp. 41–74, 2016. [8] V. Indelman, E. Nelson, N. Michael, and F. Dellaert, “Multi-robot pose graph localization and data association from unknown initial relative poses via expectation maximization,” in Robotics and Automation (ICRA), 2014 IEEE International Conference on. IEEE, 2014, pp. 593–600. [9] M. T. Lazaro, L. M. Paz, P. Pinies, J. A. Castellanos, and G. Grisetti, “Multi-robot SLAM using condensed measurements,” in Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on. IEEE, 2013, pp. 1069–1076. [10] M. Calonder, V. Lepetit, C. Strecha, and P. Fua, “Brief: Binary robust independent elementary features,” Computer Vision–ECCV 2010, pp. 778–792, 2010. 1https://goo.gl/XJDwUf [11] T. Cieslewski and D. Scaramuzza, “Efficient decentralized visual place recognition using a distributed inverted index,” IEEE Robotics and Automation Letters, vol. 2, no. 2, pp. 640–647, 2017. [12] A. Cunningham, K. M. Wurm, W. Burgard, and F. Dellaert, “Fully distributed scalable smoothing and mapping with robust multi-robot data association,” in Robotics and Automation (ICRA), 2012 IEEE International Conference on. IEEE, 2012, pp. 1093–1100. [13] E. Montijano, R. Aragues, and C. Sagues, “Distributed data association in robotic networks with cameras and limited communications,” IEEE Transactions on Robotics, vol. 29, no. 6, pp. 1408–1423, 2013. [14] S. Leonardos, X. Zhou, and K. Daniilidis, “Distributed consistent data association via permutation synchronization,” in Robotics and Automation (ICRA), 2017 IEEE International Conference on. IEEE, 2017, pp. 2645–2652. [15] L. Paull, G. Huang, and J. J. Leonard, “A unified resource-constrained framework for graph SLAM,” in Robotics and Automation (ICRA), 2016 IEEE International Conference on. IEEE, 2016, pp. 1346– 1353. [16] K. Khosoussi, G. S. Sukhatme, S. Huang, and G. Dissanayake, “Designing sparse reliable pose-graph SLAM: A graph-theoretic ap- proach,” International Workshop on the Algorithmic Foundations of Robotics, 2016. [17] V. Ila, J. M. Porta, and J. Andrade-Cetto, “Information-based compact pose SLAM,” Robotics, IEEE Transactions on, vol. 26, no. 1, pp. 78–93, 2010. [18] D. G´alvez-L´opez and J. D. Tard´os, “Bags of binary words for fast place recognition in image sequences,” IEEE Transactions on Robotics, vol. 28, no. 5, pp. 1188–1197, October 2012. [19] A. Schrijver, Combinatorial optimization: polyhedra and efficiency. Springer Science & Business Media, 2003, vol. 24. [20] R. Mur-Artal, J. M. M. Montiel, and J. D. Tardos, “ORB-SLAM: a versatile and accurate monocular SLAM system,” IEEE Transactions on Robotics, vol. 31, no. 5, pp. 1147–1163, 2015. [21] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets robotics: The KITTI dataset,” International Journal of Robotics Research (IJRR), 2013. [22] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, “ORB: An efficient alternative to SIFT or SURF,” in Computer Vision (ICCV), 2011 IEEE international conference on. IEEE, 2011, pp. 2564–2571.