Clustering in Discrete Path Planning for Approximating Minimum Length Paths Frank Imeson Stephen L. Smith Abstract— In this paper we consider discrete robot path planning problems on metric graphs. We propose a clustering method, Γ-Clustering for the planning graph that significantly reduces the number of feasible solutions, yet retains a solution within a constant factor of the optimal. By increasing the input parameter Γ, the constant factor can be decreased, but with less reduction in the search space. We provide a simple polynomial- time algorithm for finding optimal Γ-Clusters, and show that for a given Γ, this optimal is unique. We demonstrate the effectiveness of the clustering method on traveling salesman instances, showing that for many instances we obtain significant reductions in computation time with little to no reduction in solution quality. I. INTRODUCTION Discrete path planning is at the root of many robotic applications, from surveillance and monitoring for security, to pickup and delivery problems in automated warehouses. In such problems the environment is described as a graph, and the goal is to find a path in the graph that completes the task and minimizes the cost function. For example, in monitoring, a common problem is to compute a tour of the graph that visits all vertices and has minimum length [1]. These discrete planning problems are typically NP-hard [1], [2], and thus there is a fundamental trade off between solution quality and computation time. In this paper we propose a graph clustering method, called Γ-Clustering, that can be used to reduce the space of feasible solutions considered during the optimization. The parameter Γ serves to trade-off the feasible solution space reduction (and typically computation time) with the quality of the resulting solution. The idea behind Γ-Clustering is to group vertices together that are in close proximity to each other but are also far from all other vertices. Figure 1 shows an example of Γ- Clustering in an office environment. Given this clustering, we solve the path planning problem by restricting the path to visit vertices within each cluster consecutively (i.e., no path can visit any cluster more than once). This restriction reduces the number of possible solutions exponentially and thus reduces the amount of computational time needed to find good solutions. Unlike other clustering methods, Γ-Clustering does not accept as input a desired number of clusters. This means that some instances will have no clusters, while others will have many. In this way, Γ-Clusters only explore natural structures within the problem instances instead of imposing structures onto the instance. Additionally, when the graph is metric, This research is partially supported by the Natural Sciences and Engi- neering Research Council of Canada (NSERC). The authors are with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo ON, N2L 3G1 Canada (fcimeson@uwaterloo.ca; stephen.smith@uwaterloo.ca) Fig. 1: The results of Γ-Clustering used on an office environment. The red dots represent locations of interest and the red rectangles show clusters of size two or greater. we establish that for a given Γ-Clustering, the optimal path of the clustered planning problem is within a constant factor (dependent on Γ) of the true optimal solution. Related work: There are a number of clustering methods for Euclidean [3], [4] and discrete [5], [6], [7] environments. Typically the objective of these algorithms is to find a set of equal (or roughly equal) non-overlapping clusters that are grouped by similarity (close in proximity, little to no outgoing edges, etc ...), where each location in the graph is assigned to one cluster. For these methods, the desired number of clusters is given as an input parameter. In contrast, in Γ-Clustering, the idea is to simply find a specific form of clustering within the environment, if it exists. These Γ- Clusters may be nested within one another. There are other clustering methods that also look for specific structures within the graph such as community struc- tures [8], which is based on a metric that captures the density of links within communities to that between communities. In contrast, our clustering method is specifically designed to find structures that yield desirable properties for path planning on road maps. The use of clustering to save on processing/time is done in a variety of different fields such as data mining [9], parallel computer processing [10], image processing [11], and con- trol [12] for path planning. In environments that have regions with a high degree of connectivity, such as electronic circuits, clustering is commonly used to identify these regions and then plan (nearly) independently in each cluster [13]. For path planning problems with repetitive tasks, one can cluster a set of popular robot action sequences into macros [14], [15], allowing the solver to quickly discover solutions that benefit from these action sequences. In applications such as sensor sweeping for coverage problems [16] or in the routing arXiv:1702.08410v2 [cs.RO] 1 Mar 2017 of multiple agents [17], clustering has been used to partition the environment into regions that can again be treated in a nearly independent manner, reducing computation time. There is some prior work on partitioning in discrete path planning. Multilevel refinement [18] is the process of recursively coarsening the graph by aggregating the vertices together to create smaller and smaller instances, for which a plan can be found more easily. The plan is then recursively refined to obtain a solution to the original problem. The idea in coarsening a graph is that the new coarse edges should approximate the transition costs in the original graph. This differs from Γ-Clustering, which preserves the edges within the graph. There are a number of clustering approaches that aim to reduced the complexity of Euclidean and or planar TSP problems [19], [20]. Γ-Clustering is more general, in that it works on any graph, while the solution quality guarantees only hold for metric graphs. Contributions: The main contribution of this paper is the introduction of Γ-Clustering, a clustering method for a class of discrete path planning problems. We establish that the solution to the corresponding clustered problem provides a min 2, 1 + 3 2Γ  -factor approximation to the optimal solu- tion. We give some insight into the reduction of the search space as a function of the amount of clustering, and we provide an efficient algorithm for computing the optimal Γ- Clustering. We then use an integer programming formulation of the TSP problem to demonstrate that for many problem instances the clustering method reduces the computation time while still finding near-optimal solutions. II. PRELIMINARIES IN DISCRETE PATH PLANNING In this section we define the class of problems considered in this paper, review some semantics of clusters, review the traveling salesman problem (TSP) [21] and define its clustered variant, the general clustered traveling salesman problem (CTSP∗). A. Discrete Path Planning Given a graph G = ⟨V, E, w⟩, we define a path as a non- repeating sequence of vertices in V , connected by edges in E. A cycle is a path in which the first and last vertex are equal, and for simplicity we will also refer to cycles as paths. Let P represent the set of all possible paths in G. Then, abstractly, a path planning constraint defines a subset P1 ⊆P of feasible paths. Given a set of constraints P1, P2, . . . , Pm, the set of all feasible paths is P1∩P2∩· · ·∩Pm. In this paper we restrict our attention to the following class of constraints and planning problems. Definition II.1 (Order-Free Constraints). A constraint P1 is order-free if, given any p ∈P1, then all paths obtained by permuting the vertices of p are also in P1. Problem II.2 (Discrete Path Planning Problem). Given a complete weighted graph G = ⟨V, E, w⟩and a set of order- free constraints {P1, P2, . . . , Pm}, find the minimum length feasible path. Many discrete path planning problems for single and multiple robots fall into this class, so long as they do not restrict the ordering of vertex visits (i.e., no constraints of the form “visit A before B”). Some examples include single and multi-robot traveling salesman problems, point-to-point planning, and patrolling. As a specific example the GTSP is a problem where a robot is required to visit exactly one location in each non-overlapping set of locations [22]. This is naturally expressed in the above framework by having one constraint for each set: for each cluster Vi we have a constraint stating that exactly one vertex in Vi must be visited in the path. A metric discrete path planning problem is one where the edge weights in the graph G satisfy the triangle inequality: for va, vb, vc ∈V , we have w(va, vc) ≤w(va, vb) + w(vb, vc). To describe the number of feasible paths for a given planning problem, we use the phrase search space size. For example, a problem where we must choose an ordering of n locations has a search space size of n!, since there are n! combinations that a path may take. Note that as more constraints are added to the problem, the search space size can only be reduced, since a feasible path must lie in the intersection of all constraints. B. Clusters A cluster is a subset of the graph’s vertices, Vi ⊂V . Given the clusters V1 and V2, we say V1 is nested in V2 if V1 ⊆V2. The clusters V1 and V2 are overlapping if V1 ∩V2 ̸= ∅, V1 ⊈ V2, and V2 ⊈V1. A set of clusters (or clustering) is denoted by C = {V1, . . . , Vm}. A clustering C = {V1, V2, . . . , Vm} is a nested if there exists some Vi ⊆Vj for Vi, Vj ∈C. In this paper we seek to add clustering constraints to a discrete planning problem that reduce its search space size, but also retains low-cost feasible paths. The clustering constraints we consider are of the following form. Definition II.3 (Consecutive Visit Constraint). Given a graph G = ⟨V, E, w⟩and a cluster Vi ⊆V , a feasible path p must visit the vertices within the cluster Vi consecutively. Formally, the vertices visited by p, V [p] are visited consecu- tively if there exists a path segment p′ of p that visits every vertex in Vi ∩V [p] and is of length |Vi ∩V [p]|. Note that in the above definition, it is not necessary for all of the vertices in Vi to be visited. It is just necessary to visit the vertices consecutively in Vi that are visited. C. Traveling Salesman Problems The traveling salesman problem (TSP) is defined as fol- lows: given a set of cities and distances between each pair of cities, find the shortest path that the salesman can take to visit each city exactly once and return to the first city (i.e., the shortest tour). A tour in a graph that visits each vertex exactly once is called a Hamiltonian cycle (regardless of path cost). The general clustered version of TSP is the extension that requires the solution to visit the vertices within the clusters consecutively. The definition of these problems is as follows: Problem II.4 (Traveling Salesman Problem (TSP)). Given a complete graph G = ⟨V, E, w⟩with edge weights w : E → R≥0, find a Hamiltonian cycle of G with minimum cost. Problem II.5 (General-CTSP). Given a complete and weighted graph G = ⟨V, E, w⟩along with a clustering C = {V1, . . . , Vm}, find a Hamiltonian cycle of G with minimum cost such that the vertices within each cluster Vi are visited consecutively. The traditional version of the CTSP restricts the clusters to be non-overlapping (and non-nested). For this paper we use the syntax CTSP∗to emphasize when we are solving a General-CTSP problem CTSP to refer to the traditional problem. III. Γ-CLUSTERING In this section, we define Γ-Clustering and show that the Γ-Clustered path planning problem provides a min 2, 1 + 3 2Γ  approximation of the original path plan- ning problem. We then describe an algorithm for finding the optimal Γ-Clustering, and characterize the search space reductions. A. Definition of Γ-Clustering Below we define the notion of Γ-Clusters, Γ-Clusterings, and the clustered discrete path planning problem. Then we pose the clustering problem as one of maximizing the search space reduction. Definition III.1 (Γ-Metric of a cluster). Given a graph G = ⟨V, E, w⟩and a cluster Vi ⊂V , we define the following quantities for Vi relative to G: αi ≡ min va∈Vi,vb∈V \Vi(w(va, vb), w(vb, va)) βi ≡ max va,vb∈Vi,va̸=vb w(va, vb) Γi ≡αi βi , where αi represents the minimum weight edge entering or exiting the cluster Vi, and βi represents the maximum weight edge within Vi. The ratio Γi is a measure of how separated the vertices in Vi are from the remaining vertices in G. Definition III.2 (Γ-Clustering). Given an input parameter Γ ≥0 and a graph G = ⟨V, E, w⟩, a clustering C = {V1, V2, . . . , Vm} is said to be a Γ-Clustering if and only if V is covered by V1∪V2∪· · ·∪Vm; each Vi ∈C has a separation Γi ≥Γ; and the clusters are either nested (Vi ⊆Vj or Vj ⊆Vi for all Vi, Vj ∈C) or non-overlapping (Vi∩Vj = ∅). The search space reduction for path planning problems comes from restricting paths to visit the clusters consecu- tively and our goal is to maximize that reduction. Thus we are interested in the following two problems. Definition III.3 (The Clustered Path Planning Problem). Given a discrete path planning problem P and a clustering C, the clustered version of the problem P ′ has the constraint that the path must visit the vertices within each cluster consecutively, in addition to all the constraints of P. Definition III.4 (The Clustering Problem). Given a graph G = ⟨V, E, w⟩and a parameter Γ > 0, find a Γ-Clustering C∗such that the search space reduction is maximized. Remark III.5 (Overlap). Note that in Definition III.2 over- lapping clusters are not permitted. This is necessary for the problem in Definition III.3 to be well defined. In addition, we will see in the following section that clusters that have a separation of Γi > 1 cannot overlap. B. Solution Quality Bounds In this section, we show that when the graph G is metric and Γ > 1, then the solution to the Γ-clustered path planning problem provides a min 2, 1 + 3 2Γ  -factor approximation to the optimal. Theorem III.6 (Approximation Factor). Given a metric discrete path planning problem P with optimal solution p∗ and cost c∗, a Γ-Clustering C = {V1, V2, . . . , Vm} where Γ > 1, then the optimal solution (p′)∗to the clustered problem P ′ over the same set of vertices is a solution to P with cost (c′)∗≤min 2, 1 + 3 2Γ  c∗. To prove the main result, we begin by proving the first half of the bound (c′)∗≤2c∗. We do this by using an minimum spanning tree (MST) to construct a feasible solution for P ′. Lemma III.7 (Minimum Spanning Tree (MST)). Given a metric graph G and a Γ-Clustering C, with Γ > 1, every MST will have exactly one inter-set edge for each cluster Vi ∈C. Proof. We prove the above result by contradiction. Suppose the MST has at least two inter-set edges connected to Vi. Thus, there are at least two sets of vertices in Vi that are not connected to each other using intra-set edges. We can then lower the cost of the MST by removing one of these inter-set edges of weight ≥αi and replace it with an intra-set edge of weight ≤βi < αi. This highlights the contradiction and thus every MST will have exactly one inter-set edge for each cluster Vi ∈C. Lemma III.8 (Two-Factor Approximation). Consider a metric discrete path planning problem P with an optimal solution path p∗and cost c∗. Then given a Γ-Clustering C = {V1, V2, . . . , Vm} with Γ > 1, the optimal solution (p′)∗ for the clustered problem P ′ over the same set of vertices V [p∗] is a solution to P with cost (c′)∗≤2c∗. Proof. To prove the above result, we will use the MST approach described below and in [23] to construct a path p′ over the set of vertices in V [p∗]. This approach yields a solution p′ for P ′ that has our desired cost bound, c′ ≤ 2c∗[23]. The MST procedure is described below. 1) Find a minimum spanning tree for the vertices V [p∗]. 2) Duplicate each edge in the tree to create a Eulerian graph. 3) Find an Eulerian tour of the Eulerian graph. 4) Convert the tour to a TSP: if a vertex is visited more than once, after the first visit, create a shortcut from the vertex before to the one after, i.e., create a tour that visits the vertices in the order they first appeared in the tour. What remains is to prove that the above tour is a feasible solution for P ′. First we note that the above approach yields a single tour of all the vertices in V [p∗], i.e., there are no disconnected tours. Next we note that Lemma III.7 states that every MST uses exactly one inter-set edge for each cluster Vi ∈C. Thus when the edges are duplicated and a Eulerian tour is found, there are only two inter-set edges used for each Vi ∈C. Furthermore short-cutting the path does not change the number of inter-set edges used by the tour, thus the final solution p′ only has one incoming and one outgoing edge for each cluster Vi ∈C and so it is a clustered solution for P ′ that satisfies the bound since the MST approach also yields a solution with cost c′ ≤2c∗. Next we prove the second half of the bound (c′)∗≤ 1 + 3 2Γ  c∗by using Algorithm 1 to construct a feasible solution p′ for P ′. Additionally we use a modified graph ˆG defined in Definition III.12 to show that the cost of this solution satisfies our desired bound. Algorithm 1: DEFORM(p, Vi) 1 if p has two or less inter-set edges for Vi then 2 return p 3 k ←1 4 p′ ←[ ] /* empty array */ 5 while p[k] ̸∈Vi do 6 p′.append(p[k]) 7 k ←k + 1 8 for l ∈[k, k + 1, . . . , |p|] do 9 if p[l] ∈Vi then 10 p′.append(p[l]) 11 for l ∈[k, k + 1, . . . , |p|] do 12 if p[l] ̸∈Vi then 13 p′.append(p[l]) 14 return p′ Lemma III.9 (Correctness of Algorithm 1). Given a feasible path p for P and a cluster Vi ∈C that is not visited consecutively, then pi ← DEFORM(p, Vi) does visit Vi consecutively and any subsequent deformed paths pi+n+1 ← DEFORM(pi+n, Vi+n+1) also visit Vi consecutively, for n ∈ Z>0. Proof. By construction Vi is visited consecutively in pi ← DEFORM(p, Vi). The remaining claim that Vi continues to be visited consecutively is proven by showing that the number of inter-set edges for Vi in the subsequent paths remains the same. We prove this result by contradiction. Suppose there is a sequence ⟨va, vb⟩∈pj−1 for some j > i such that va, vb ∈Vi and va is no longer connected to vb in pj ← DEFORM(pj−1, Vj); that is, ⟨va, vb⟩̸∈pj. This would mean that one or more vertices were inserted in between va and vb, thus creating a new inter-set edge. In Algorithm 1, this cannot happen in lines 5-7 since the path for the 1st vertex to the kth is unchanged. This also cannot happen in lines 8-9 since this part of the algorithm is only connecting vertices within the cluster Vj together. Finally, this cannot happen in lines 11-13, since the path is not changing the order of the appearance of va and vb (no insertions, just deletions). Thus there are no additional inter-set edges created by Algorithm 1 for cluster Vi, which highlights our contra- diction. Therefore all subsequent paths must also visit Vi consecutively. Remark III.10 (Uniqueness). When Algorithm 1 is applied to p∗iteratively for each Vi ∈C to generate the solution p′, the solution is unique despite the order that DEFORM(p, Vi) was called for all Vi ∈C. Furthermore the order of the clusters is determined by their first appearance in p. This follows from the method in which Algorithm 1 reorders the vertices within the tour. Specifically, the order of vertices within each cluster Vi is preserved as DEFORM(p, Vi) is called as is the ordering of the remaining vertices. Lemma III.11 (Deformation cost in G). Consider a feasible path p for P that has 2(n + 1) ≥4 inter-set edges for Γ- Cluster Vi such that Γi > 1 and n ∈Z≥0. Then the cost to deform p into p′ ←DEFORM(p, Vi) is c′ −c ≤(2n + 1)βi. Proof. In this proof we analyze the cost to deform p into p′, which is a result of calling p′ ←DEFORM(p, Vi) (Algo- rithm 1). There are three types of deformations that result from the algorithm: 1) there are short cuts created within the cluster; 2) there are short cuts created outside of the cluster; and 3) there is a new outgoing edge for the cluster. These deformations are illustrated in Figure 2 and a classification of the edges in the figure are as follows: 1) edges ⟨v3, v4⟩and ⟨v6, v7⟩are short cuts within the cluster; 2) edges ⟨v10, v11⟩ and ⟨v12, v13⟩are short cuts outside of the cluster; and 3) edge ⟨v8, v9⟩is the new outgoing the edge for the cluster. We start by examining the incurred cost to short cut paths within the cluster. Consider a path segment ⟨va, vb, vc, . . . , vx, vy, vz⟩of p such that va is directly con- nected to vz in p′ with the edge ⟨va, vz⟩and va, vz ∈Vi. The incurred cost of each of these edges is ≤βi due to the fact that the cost of any intra-set edge has weight ≤βi. There are n such shortcuts of this nature incurred from performing DEFORM(p, Vi) (n captures the number of extra visits to the cluster), and so the total incurred cost for this type of shortcut is ≤nβi. Next we examine the incurred cost to short cut paths outside of the cluster. Consider a path segment ⟨va, vb, vc, . . . , vx, vy, vz⟩of p such that va is directly con- nected to vz in p′ with the edge ⟨va, vz⟩, va, vz ̸∈Vi and vb, vc, . . . , vy ∈Vi. The incurred cost for each of these short cuts is again ≤βi. This is due to the metric property of G: The cost of the direct path from va to vz is less than or equal to any path from va to vz, specifically c(va, vz) ≤ c(va, vb) + c(vb, vy) + c(vy, vz) ≤c(va, vb) + βi + c(vy, vz). Thus the incurred cost ∆, of this shortcut is bounded by the difference between the cost of the new edges in p′ and the removed edges in p∗, namely: ∆= c(va, vz) −c(va, vb) −c(vy, vz) ≤c(va, vb) + βi + c(vy, vz) −c(va, vb) −c(vy, vz) ≤βi There are n such shortcuts of this nature incurred by DE- FORM(p, Vi), and so the total incurred cost for this type of shortcut is also ≤nβi. Lastly we examine the incurred cost of the new outgoing edge. Consider the path ⟨va, vb, vc, . . . , vx, vy, vz⟩of p such that ⟨vb, vc⟩is the first outgoing edge of Vi and ⟨vx, vy⟩ is the last outgoing edge of Vi, thus ⟨vx, vc⟩is the new outgoing edge. Then due to the metric property we know that c(vx, vc) ≤c(vx, vb) + c(vb, vc) ≤βi + c(vb, vc). The incurred cost of this deformation is the difference between the cost of the new edge ⟨vx, vc⟩and the removed edge ⟨vb, vc⟩(this edge has not been considered in any previous incurred cost calculation): ∆= c(vx, vc) −c(vb, vc) ≤βi + c(vb, vc) −c(vb, vc) ≤βi This accounts for all of the incurred costs, and so the total cost to deform p into p′ via DEFORM(p, Vi) is c′ −c ≤ (2n + 1)βi. We introduce the modified graph ˆG in the following definition to aid with our ongoing proof of the bound. Definition III.12 (The modified graph). Given a graph G and a Γ-Clustering C, the modified graph ˆG is a copy of G with the following modifications: if ⟨va, vb⟩is an inter-set edge with va ∈Vi and vb ∈Vj then ˆc(va, vb) = c(va, vb) + 3 2 max(βi, βj), otherwise ˆc(va, vb) = c(va, vb), where βi and βj are as defined in Definition III.1. Lemma III.13 (Deformation cost in ˆG). Consider a feasible path p for P and a Γ-Cluster Vi such that Γi > 1. Then the cost to deform p into p′ ←DEFORM(p, Vi) in ˆG is ˆc′−ˆc ≤0. Proof. In this proof we analyze the cost to deform p into p′in ˆG, which is a result of single call p′ ←DEFORM(p, Vi) (Algorithm 1). From Lemma III.11 we see that the cost to deform p into p′ with respect to G is c′ −c ≤(2n + 1)βi, where there are 2(n + 1) ≥4 inter-set edges for Vi in p. The cost of p in ˆG is ˆc ≥c + 2(n + 1) 3 2  βi = c + 3(n + 1)βi and the cost of p′ in ˆG is ˆc′ ≤c′ + 2 3 2βi  ≤c + (2n + 1)βi + 3βi. Thus ˆc′ −ˆc ≤(2n + 1)βi + 3βi −3(n + 1)βi = 2nβi + βi + 3βi −3nβi −3βi = βi −nβi and since n ≥1 (otherwise we did not need to deform the path), then ˆc′ −ˆc ≤0. Lemma III.14 (Approximation Factor). Given a metric discrete path planning problem P with optimal solution cost c∗, Γ > 1 and a Γ-Clustering C = {V1, V2, . . . , Vm}, then the optimal solution (p′)∗to the clustered problem P ′ over the same set of vertices is a solution to P with cost (c′)∗≤(1 + 3 2Γ)c∗. Proof. To prove the above result we will work with the modified graph ˆG (defined in Definition III.12) and use the result from Lemma III.13 to show that there exists a clustered solution in ˆG that has the same cost or lower than ˆc(p∗). Then we will relate (c′)∗to c∗. First we show that there exists a clustered solution p′ that satisfies the following: ˆc(p′) ≤ˆc(p∗) To find a solution p′ that satisfies the above we will use Algo- rithm 1 to deform p∗into p′. The deform algorithm is called for each Vi ∈C in any order as pi+1 = DEFORM(pi, Vi) to form a solution for P ′ (see Lemma III.9). For each call of pi+1 = DEFORM(pi, Vi) the incurred cost ˆc(pi) −ˆc(pi+1) ≤ 0 (see Lemma III.13). Thus after the series of calls, we have a clustered solution p′ satisfying ˆc(p′) ≤ˆc(p∗). Next we relate c(p∗) to ˆc(p∗) by observing the following for an inter-set edge ⟨va, vb⟩∈p∗: ˆc(va, vb) = c(va, vb) + 3 2 max(βi, βj) = c(va, vb) + 3 2 max αi Γi , αj Γj  ≤  1 + 3 2 max  1 Γi , 1 Γj  c(va, vb) ≤  1 + 3 2Γ  c(va, vb) The above inequality for ˆc(p∗) is true for each edge ⟨va, vb⟩∈p∗, inter-set edge or not. Therefore ˆc(p∗) ≤  1 + 3 2Γ  c(p∗). Due to the construction of ˆG and how ˆc(p′) ≤ˆc(p∗) we deduce that (c′)∗≤ˆc(p∗), since (c′)∗≤c(p′) ≤ˆc(p′) ≤ˆc(p∗). Therefore we have (c′)∗≤  1 + 3 2Γ  c∗. The proof of the main result in Theorem III.6 directly follows from Lemma III.8 and Lemma III.14. Fig. 2: Path deformation example. Remark III.15 (Tightness of Bound). We have not fully characterized the tightness of the bounds from Lemma III.8 and Lemma III.14, but we we have a lower bound, which is illustrated in Figure 3. In this example, the graph is scale- able (we can add vertices three at a time). The clustered and non-clustered solutions for this example have a cost relation of lim |V |→∞(c′)∗=  1 + 2 2Γ + 1  c∗. We also provide the graph in Figure 4 to show how the gap changes as Γ varies. The bound for this graph is obtained as follows. Let n = |V | 3 −1 for |V | 3 ∈Z>0. Then the optimal non-clustered solution cost is (recall that Γ = α β ): p∗= ⟨v1, v2, v3, v4, v6, v5, . . .⟩ c∗= α + β + α + n (2α + β) = (n + 1) (2α + β) = (n + 1)  2 + 1 Γ  α The optimal clustered solution is: (p′)∗= ⟨v1, v2, v3, v5, v6, . . . , v4, v7, . . .⟩ (c′)∗= α + β + 2nβ + α + n(2α + β) = 2α + 3β −2β + n(2α + 3β) = (n + 1)(2α + 3β) −2β =  (n + 1)(2 + 3 Γ) −2 Γ  α As the instance grows (|V | →∞, which implies n →∞), we have the following: lim n→∞(c′)∗= (n + 1) 2 + 3 Γ  −2 Γ (n + 1) 2 + 1 Γ  c∗ = 2 + 3 Γ 2 + 1 Γ  c∗ =  1 + 2 2Γ + 1  c∗ Optimal Clustered Solution: Optimal Solution: Fig. 3: Metric example instance (α > β). Vertices in the top row (v2, v3, v5, v6, . . .) are in the cluster Vi and the vertices in the bottom row are not. Edge weights connecting vertices within Vi are β, edge weights connecting vertices not in Vi are 2α + β, edge weights connecting vertices not in Vi to vertices in Vi are α + β unless shown differently in diagram. Optimal solutions were verified with our solvers. 1 1.2 1.4 1.6 1.8 2 2.2 1 2 3 4 5 6 7 8 9 10 Normalized Cost Bound Gamma Upper Bound Lower Bound Fig. 4: A plot showing the tightness of the approximation’s upper bound. The gap between the two curves shows where the tightest upper bound can lie. C. Finding Γ-Clusters Before we describe our method for finding optimal Γ- Clustering (s), we describe a few properties of Γ-Clusterings. 1) Overlap: A special property of Γ-Clustering is that when Γ > 1, there are no overlapping clusters. Specifically there are no two clusters Vi and Vj with Γi > 1 and Γj > 1 that have a non-zero intersection unless one cluster is a subset of the other. Lemma III.16. Given a graph G, clusters Vi and Vj with separations Γi > 1 and Γj > 1 (defined in Definition III.1), then Vi ∩Vj is non-empty if and only if Vi ⊆Vj or Vj ⊆Vi. Proof. We prove the above result by contradiction. Before we begin, recall that edges cut by the cluster Vi have edge weights ≥αi and edges within the cluster have edge weights ≤βi. Let us assume that Vi and Vj overlap and are not nested. Without loss of generality let βj ≤βi. Then there exists an edge ⟨va, vb⟩with weight w(va, vb) ≤βj for va ∈ Vi∩Vj and vb ∈Vj \(Vi∩Vj). Since this edge does not exist in the cluster Vi but it is cut by Vi, then it must be the case that w(va, vb) ≥αi. However, this edge does exist in the cluster Vj and so αi ≤βj, since w(va, vb) ≤βj. This result highlights the contradiction: Γi = αi βi ≤βj βi ≤βi βi = 1. 2) Uniqueness: The non-overlapping property for Γ- Clusterings with Γ > 1 implies that there exists a unique maximal clustering set C∗(more clusters equals more re- ductions in the search space size). This result follows from the simple property that if a cluster Vi exists and does not exist in our clustering C∗, then we must be able to add it to C∗to get additional search space reductions. Proposition III.17. Given graph G and a parameter Γ > 1, the problem of finding a Γ-Clustering C∗that maximizes the search space reduction has a unique solution C∗. Further- more C∗contains all clusters Vi with separation Γi ≥Γ. Proof. We use contradiction to prove that C∗is unique. Suppose there exists two different Γ-Clusterings C1 and C2 that maximize the search space reductions for a given Γ > 1. Then, there is a cluster V1 that is in C1 and not in C2 (or vice versa). This implies that either V1 is somehow incompatible with C2, which we know is not the case due to Lemma III.16 (i.e., clusters do not overlap unless one is a subset of the other) or this cluster can be added to C2. This is a contradiction, which proves the first result. The second result directly follows since adding a cluster can only further reduce the search space size of the problem. Corollary III.18. Given a graph G and two clustering parameters Γi > Γj > 1, the optimal clustering C∗ j for Γj is a superset of any clustering for Γi. Proof. We prove this result by contradiction. Suppose we have a clustering Ci for Γi and the optimal clustering C∗ j for Γj, for which there exists a cluster V1 in Ci that is not in C∗ j . Then by the definition of Γ-Clusters V1 must satisfy Γ1 ≥Γi and since Γi > Γj then Γ1 ≥Γj, which makes Vi a cluster that should be added to C∗ j . This contradicts Proposition III.17 and thus proves the result. Corollary III.19. There exists a minimum Γ∗> 1 and its optimal Γ-Clustering C∗that is a superset of all other Γ- Clusters for Γ > 1. Proof. This result follows directly from Corollary III.18, when we consider Γ∗to be the smallest ratio of edge weights in the graph G (find the two edges that give us the smallest ratio). Then for every cluster with a separation parameters Γi > 1, it must also be greater than or equal to Γ∗since Γi itself is a ratio of existing edge weights in the graph G. Thus by Corollary III.18, C∗must be superset of any other Γ-Clustering for some Γ > 1. 3) An MST Approach For Finding Γ-Clusters: Given an in- put Γ > 1, Algorithm 2 computes the optimal Γ-Clusterings, i.e., the Γ-Clustering with maximum search space reduction. Informally the algorithm deletes edges in the graph from largest to smallest (line 6-7) to look for Γ-Clusters. It uses a minimum spanning tree (MST) to keep track of when the graph becomes disconnected, and when it does the disconnected components are tested to see if they qualify as Γ-Clusters (line 9-11). Regardless, any non-trivial sized disconnected component (Γ-Cluster or not) is added back to the queue (line 13-14), so that it can be broken and tested again to find of all nested Γ-Clusters. Algorithm 2: Γ-CLUSTERING(G, Γ) 1 assert(Γ > 1) 2 C ←{} 3 M ←{MST(G)} 4 while |M| > 0 do 5 m ←M.pop() 6 α ←largest edge cost in m 7 M ′ ←disconnected trees after removing edge(s) of cost α from m 8 for m′ ∈M ′ do 9 G′ ←graph induced by V [m′] 10 β ←max edge cost of G′ 11 if G′ is a clique and Γ′ ≡α/β ≥Γ then 12 C ←C ∪{V [m′]} 13 if |m′| > 1 then 14 M ←M ∪m′ 15 return C Theorem III.20. Given G and a Γ > 1, Algorithm 2 finds the optimal Γ-Clustering C∗in O(|V |3) time. Proof. The proof is two fold, first we show that Algorithm 2 runs in polynomial time and second we show that it finds the optimal Γ-Clustering. For this proof let n = |V |. Minimum spanning trees can be found O(n2) time [24] (line 3). The rest of the algorithm modifies the MST from line 3, which originally has n edges and so the while loop for the rest of the algorithm can only run at most n times (we can’t remove more than n edges). Finding the largest edge(s) and removing them in line 6 and 7 takes O(n) time. Creating the induced subgraph and finding the maximum edge cost in lines 9 and 10 takes O(n2) time. Testing if the subgraph is a clique (line 11) takes O(n2) time. Thus lines 1 to 3 run in O(n2) time and lines 4 to 14 run in O(n·n2), which means the entire algorithm runs in O(n3) time or O(|V |3). Next we show that Algorithm 2 finds the optimal Γ- Clustering. To do so, we will show that V [m′] does indeed represent a Γ-Clustering with separation Γ′ ≡α β (line 11) as defined by Definition III.1 and then we finish by showing that the algorithm does not omit any candidate clusters, thus proving the theorem by leveraging Proposition III.17. Let us start by understanding how to find Γ-Clusters and how we use MSTs in the algorithm. A Γ-Cluster with separation Γi > 1 is a subgraph of G that is connected with intra-edge weights less than the inter-edge weights connected to the rest of the graph. Thus one method of searching for Γ-Clusters is to delete all edges in the graph of weight ≥α and if there is a disconnected subgraph in G then and only then is it a possible Γ-Cluster. We can use MSTs to more efficiently keep track of these deleted edges. By definition, an MST is tree that connects vertices within the graph with minimum edge weight. If we remove edges of size ≥α in the graph to search for disconnected sub-graphs, then the graph is disconnected if and only if the MST is disconnected. This follows by considering the cut needed to disconnect these two sub-graphs (the minimum weight edge cut, shares the same weight cut by the MST). Thus we can instead search for Γ-Clusters by disconnecting the MST. Furthermore if we disconnected the MST by incrementally removing the largest edge(s) of weight α from the tree then we know that induced sub-graphs of the newly disconnected trees have at least one inter-set edge of size α. Line 9 in the Algorithm creates the induced subgraph G′ of a disconnected tree (m′ ∈M ′), line 10 measures its β, and if G′ is a clique and meets the separation criterion (Γ′ ≥Γ) then V [m′] is indeed a Γ-Cluster (by definition), thus it is added to the clustering C in line 12. Next we show that the Algorithm did not omit any candi- date clusters. We have already argued that only disconnected MST trees need to be considered for Γ-Clusters, thus what is left to show is that the algorithm tests every possible value of α to disconnect the MST tree. This is true since it considers every edge in the original MST. This is because every disconnected tree of size two or more is added back to M in line 14 until every edge originally in the MST is removed in line 7. Therefore this algorithm finds all of the candidate Γ- Clusters and Proposition III.17 tells us that it is the unique optimal Γ-Clustering for the given Γ > 1. D. Search Space Reduction The last remaining question is to determine how much the clustering approach reduces the search space. In general, this is difficult to answer since it depends on the particular constraints of the path planning problem. However, to get an understanding of the search space reduction, consider the example of TSP with a non-nested clustering (nesting would result in further search space reductions). Let r be the ratio of the non-clustered search space size N0, to the clustered search space size N1, for a graph with vertices V and a clustering C = {V1, V2, . . . , Vm}. Then, the ratio (derived from counting the number of solutions) is as follows: r ≡N0 N1 = |V |! m! Qm i |Vi|! To further simplify the ratio consider the case where all clusters are equally sized (for all Vi, Vj ∈C, |Vi| = |Vj|): Proposition III.21 (Search Space Reduction). Given a graph G and a clustering C = {V1, V2, . . . , Vm} such that |Vi| = |Vj| for all i, j ∈[1, m] then the ratio r of the search space size for the original TSP problem to the cluster TSP problem is 1 r = Ω (m!)x−1 , where x = |V |/m. 1The big-Ωnotation states that for large enough |V | the ratio r is at least k˙(m!)x−1 for some constant k. Proof. The number of solutions for the (directed) TSP prob- lem is N0 = |V |! and the number of solutions for the CTSP∗ problem is: N1 = m!  |V | m ! m (these results come from counting the number of possible solutions). First let us bound |V |! = (mx)!: |V |! = 1 Y i=m x−1 Y j=0 (ix −j) = 1 Y i=m x−1 Y j=0 (i)  x −j i  ≥ 1 Y i=m x−1 Y j=0 (i) (x −j) = 1 Y i=m ix !   x−1 Y j=0 (x −j)   m = (m!)x(x!)m We now use the fact that |V |! ≥(m!)x(x!)m to prove the main result: r ≡N0 N1 = |V |! m! (x!)m ≥(m!)x (x!)m m! (x!)m = (m!)x−1 Thus r = Ω (m!)x−1 . To get an idea of the magnitude of r, consider an instance of size |V | = 100 divided into four equal clusters. The clustered problem has a feasible solution space of size N1 ≈ 1.49×10−56N0, where N0 is the feasible solution space size of the non-clustered problem. However, N1 is still extremely large at about 1.39 × 10102. IV. Γ-CLUSTERING EXPERIMENTS In this section we present experimental results that demon- strate the effectiveness of Γ-Clustering for solving discrete path planning problems. We focus on metric TSP instances drawn from the established TSP library TSPLIB [25], which have a variety of TSP problem types (the first portion of the instance name indicates the type and the number indicates the size). The tests were conducted with Γ = 1.000001, for which Theorem III.6 implies that the solution to the clustered problem gives a min 2, 1 + 3 2Γ  -factor approximation to the TSP instance. However, we will see that the observed gap in performance is considerably smaller. To test the effectiveness of the clustering method, we perform Γ-Clustering on each TSP instance and recorded both the runtime and the number of clusters found. This gives us an idea of whether or not instances from TSPLIB have a structure that can be exploited by Γ-Clustering. Then, we use standard integer programming formulations for both the original TSP instance and the general clustered version of the instance (CTSP∗). We solve each instance three times using the solver Gurobi [26] and recorded the average solver time and solution quality. All instances were given a time budget of 900 seconds, after which they were terminated and the best solution found in that run was outputted. The TSP instances reported in this paper are the 37 instances that were solved to within 50% of optimal and have Γ-Clusters. The remaining instances are not reported as they would have required more than 900 seconds to provide a meaningful comparison. Additionally we demonstrate Γ-Clustering on an office environment to gain insight into the structure of Γ-Clusters. A. Integer Programming Formulation The clustering algorithm was implemented in Python and run on an Intel Core i7-6700, 3.40GHz with 16GB of RAM. The integer programming (IP) expression of the TSP problem and clustered problem is solved on the same computer with Gurobi, also accessed through python. The results of both of these approaches is found in Table I and summarized in Figure 6. The following is the IP expression used for the clustered and non-clustered path planning problems, where each vari- able ea,b ∈{0, 1}: minimize |V | X a=1 |V | X b=1 ea,b w(va, vb) (1) subject to |V | X b=1 ea,b = 1, for each a ∈{1, 2, . . . , |V |} (2) |V | X a=1 ea,b = 1, for each b ∈{1, 2, . . . , |V |} (3) X ∀va∈Vi,vb̸∈Vi ea,b = 1, for each i ∈{1, 2, . . . , m} (4) X ∀va̸∈Vi,vb∈Vi ea,b = 1, for each i ∈{1, 2, . . . , m} (5) X ea,b∈E′ ea,b ≤|E′| −1, for each subtour E′ (6) The formulation was adapted from a TSP IP formulation found in [27], where the Boolean variables ea,b represent the inclusion/exclusion of the edge ⟨va, vb⟩from the solution. Constraints 2 and 3 restricts the incoming and outgo- ing degree of each vertex to be exactly one (the vertex is visited exactly once). Similarly constraints 4 and 5 restrict the incoming and outgoing degree of each cluster to be exactly one (these constraints are only present in the clustered version of the problem). Constraint 6 is the subtour elimination constraint, which is lazily added to the formulation as conflicts occur due to the exponential number of these constraints. For each instance, we seed the solver with a random initial feasible solution. B. Results Figure 5 shows the ratio of time spent finding clusters with respect to total solver time. In all instances this time is less than 6% and most is less than 1%. Additionally as the total solver time grows, the ratio gets smaller (Time02 is for instances that use the full 900 seconds). This approach is able to find Γ-Clusters on 63 out of 70 TSP instances. In total it found 3700 non-trivial Γ-Clusters (i.e., clusters with |Vi| ≥2), which is promising since the TSPLIB library contains a variety of different TSP applications. Figure 6 and Table I, shows that for instances that do not time out the solution path costs found by the clustering approach are close to optimal (Cost01 is close to 1 in the figure and the instances from burma14 to pr107 are almost all within 1% error as shown in the table). Furthermore when the solver starts to time out (exceeds its 900 second time budget) the solution quality of the Γ-Clustering approach starts to surpass the solution quality of the non-clustered approach (as shown by Cost02 in the figure). We attribute this trend to the fact that the clustered approach needs less time to search its feasible solution space and thus is able to find better quality solutions faster than its counterpart. In instances gr202, kroB200, pr107, tsp225, and gr229 the clustering approach does very well compared to the non-clustering approach, enabling the solver to find solutions within 1% of optimal for the first three instances and solutions within 6% of optimal for the latter two instances, while the non- clustered approach exceeds 8% of optimal in the first three instances and 28% of optimal for the latter two instances. On average the Γ-Clustering approach is more efficient than the non-clustered approach, which is highlighted in Figure 6 and Table I. For the results that do not time out (Time01 in the figure) we often save more than 50% of the computational time, while maintaining a near optimal solution quality (Cost01 in the figure). For instances that require most or all of the 900 seconds (harder instances) the time savings can be quite large. This is particularly clear from the table when we compare the easy instances burma14 up to st70 that have an average time savings of around 60% to the harder instances kroA100, gr96, bier127, and ch130, which all have a time savings of more than 95%. For the instances that both time out (Time02 in the figure), the time savings is not present since both solvers use the full 900 seconds. From these results we can see that when the Γ-Clustering approach does not time out, we usually save time and when the approach does time out we often find better quality solutions as compared to the non-clustered approach. Remark IV.1 (Discussion on using Γ-Clustering for TSP). It is worth emphasizing that we are not necessarily recommend- ing solving TSP instances in this manner. We are simply using TSP as an illustrative example to show how Γ-Clustering can be used to reduce computation time in a given solver. Many discrete path planning problems are solved with IP solvers and as such we hope our results help provide some insight as to how Γ-Clustering would work on other path planning problems. In general unless the solver approach (IP or not) takes advantage of the clustering, there is no guarantee that a computation saving will be achieved. -0.01 0 0.01 0.02 0.03 0.04 0.05 0.06 Time01 Time02 Clustering Time / Total Time Fig. 5: Box plot of the clustering time ratio with respect to the Γ-Clustering approach. The data is categorized by instances that did not time out (Time01) and instances that did time out (Time02). 0 0.5 1 1.5 2 2.5 Cost01 Time01 Cost02 Time02 Ratio Fig. 6: Box plot of the solver cost and time ratios for the Γ-Clustering approach with respect to the TSP approach. The data is categorized by instances that did not time out (Cost01 and Time01) and instances that did time out (Cost02 and Time02). C. Clustering Real-World Environments As shown in Figure 1 we have also performed Γ- Clustering on real-world environments. The figure shows the floor-plan for a portion of one floor of the Engineering 5 building at the University of Waterloo. Red dots denote the locations of desks within the environment. We encoded the environment with a graph, where there is a vertex for each red dot and edge weights between vertices are given by the shortest axis-aligned obstacle free path between the locations (obstacle-free Manhattan distances). The figure shows the results of clustering for Γ = 1.000001. We see that locations that are close together are formed into clusters unless there are other vertices in close proximity. A path planning problem on this environment could be robotic mail delivery, where a subset of locations must be visited each day. The clusters could then be visited together (or visited using the same robot). V. CONCLUSION In this paper we presented a new clustering approach called Γ-Clustering. We have shown how it can be used to approximate the discrete path planning problems to within a constant factor of min 2, 1 + 3 2Γ  , more efficiently than solving the original problem. We verify these findings with a set of experiments that show on average a time savings and a solution quality that is closer to the optimal solution than it is to the bound. For future directions we will be investigating TSP CTSP∗ |C| % Error Time (t) % Error Time (t) burma14 5 0.00 <1 0.39 <1 ulysses16 6 0.00 <1 0.73 <1 berlin52 17 0.00 <1 0.06 <1 swiss42 16 0.00 <1 0.94 <1 eil51 11 0.00 1 0.00 <1 eil76 13 0.00 3 0.00 1 rat99 28 0.00 6 0.83 13 eil101 16 0.00 8 0.00 4 ulysses22 10 0.00 8 0.00 <1 pr76 27 0.00 40 1.40 11 st70 23 0.00 45 0.44 13 lin105 42 0.00 225 0.00 8 kroE100 43 0.00 414 0.39 20 kroC100 46 0.00 445 0.00 12 kroA100 44 0.00 602 1.11 23 gr96 32 0.00 845 0.05 38 bier127 37 0.00 900 0.23 24 gr137 44 0.00 900 0.00 154 kroD100 42 0.00 900 0.57 471 kroB100 42 0.01 900 0.96 900 ch130 59 0.20 900 0.90 29 ch150 53 0.22 900 0.31 900 kroB150 65 0.35 900 0.35 395 kroA150 58 1.37 900 0.15 492 rat195 57 1.75 900 0.89 900 kroA200 82 5.59 900 1.38 900 pr124 18 5.78 900 4.24 900 gr202 74 8.56 900 0.64 703 pr136 48 9.78 900 4.88 900 kroB200 80 10.16 900 0.67 900 pr107 6 13.01 900 0.40 900 a280 11 20.11 900 20.46 900 pr144 40 24.40 900 20.03 900 tsp225 66 28.50 900 5.76 900 gr229 73 28.63 900 2.24 900 pr152 44 40.64 900 41.15 900 gil262 98 44.81 900 21.00 900 TABLE I: Experimental results for TSPLIB instances: |C| reports the number of Γ-Clusters, % error reports the average error from optimal and time column reports the average solver time. The instances where the CTSP∗ approach outperforms the TSP approach are highlighted in bold. Results are sorted from least to most difficult for the non-clustered approach. other path planning applications, which includes online path planning with dynamic environments. REFERENCES [1] S. L. Smith, J. T˚umov´a, C. Belta, and D. Rus, “Optimal path planning for surveillance with temporal-logic constraints,” The International Journal of Robotics Research, vol. 30, no. 14, pp. 1695–1708, 2011. [2] L. Gouveia and M. Ruthmair, “Load-dependent and precedence-based models for pickup and delivery problems,” Computers & Operations Research, vol. 63, pp. 56–71, 2015. [3] A. K. Jain and R. C. Dubes, Algorithms for clustering data. Prentice- Hall, Inc., 1988. [4] G. Karypis, E.-H. Han, and V. Kumar, “Chameleon: Hierarchical clustering using dynamic modeling,” Computer, vol. 32, no. 8, pp. 68–75, 1999. [5] B. W. Kernighan and S. Lin, “An efficient heuristic procedure for partitioning graphs,” Bell system technical journal, vol. 49, no. 2, pp. 291–307, 1970. [6] S. Guha, R. Rastogi, and K. Shim, “Rock: A robust clustering algorithm for categorical attributes,” Information systems, vol. 25, no. 5, pp. 345–366, 2000. [7] D. Katselis and C. L. Beck, “Clustering fully and partially observable graphs via nonconvex optimization,” in American Control Conference, 2016, pp. 4930–4935. [8] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast unfolding of communities in large networks,” Journal of statistical mechanics: theory and experiment, vol. 2008, no. 10, p. P10008, 2008. [9] P. Berkhin, “A survey of clustering data mining techniques,” in Grouping multidimensional data. Springer, 2006, pp. 25–71. [10] H. Meyerhenke, B. Monien, and T. Sauerwald, “A new diffusion- based multilevel algorithm for computing graph partitions,” Journal of Parallel and Distributed Computing, vol. 69, no. 9, pp. 750–761, 2009. [11] S. Gao, I. W.-H. Tsang, and L.-T. Chia, “Kernel sparse representation for image classification and face recognition,” in European Conference on Computer Vision, 2010, pp. 1–14. [12] X. Jin, S. Gupta, J. M. Luff, and A. Ray, “Multi-resolution navigation of mobile robots with complete coverage of unknown and complex environments,” in American Control Conference, 2012, pp. 4867– 4872. [13] G. Karypis and V. Kumar, “A fast and high quality multilevel scheme for partitioning irregular graphs,” SIAM Journal on Scientific Comput- ing, vol. 20, no. 1, pp. 359–392, 1998. [14] C. B¨ackstr¨om, A. Jonsson, and P. Jonsson, “Automaton plans,” Journal of Artificial Intelligence Research, vol. 51, no. 1, pp. 255–291, 2014. [15] M. Levihn, L. P. Kaelbling, T. Lozano-Perez, and M. Stilman, “Fore- sight and reconsideration in hierarchical planning and execution,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013, pp. 224–231. [16] A. Das, M. Diu, N. Mathew, C. Scharfenberger, J. Servos, A. Wong, J. S. Zelek, D. A. Clausi, and S. L. Waslander, “Mapping, planning, and sample detection strategies for autonomous exploration,” Journal of Field Robotics, vol. 31, no. 1, pp. 75–106, 2014. [17] M. R. K. Ryan, “Exploiting subgraph structure in multi-robot path planning,” Journal of Artificial Intelligence Research, pp. 497–542, 2008. [18] C. Chevalier and I. Safro, “Comparison of coarsening schemes for multilevel graph partitioning,” in International Conference on Learn- ing and Intelligent Optimization, 2009, pp. 191–205. [19] R. M. Karp, “Probabilistic analysis of partitioning algorithms for the traveling-salesman problem in the plane,” Mathematics of operations research, vol. 2, no. 3, pp. 209–224, 1977. [20] Y. Haxhimusa, W. G. Kropatsch, Z. Pizlo, and A. Ion, “Approximative graph pyramid solution of the E-TSP,” Image and Vision Computing, vol. 27, no. 7, pp. 887–896, 2009. [21] D. L. Applegate, R. E. Bixby, V. Chvatal, and W. J. Cook, The Traveling Salesman Problem: A Computational Study. Princeton University Press, 2006. [22] C. E. Noon and J. C. Bean, “An efficient transformation of the generalized traveling salesman problem,” Information Systems and Operational Research (INFOR), vol. 31, no. 1, pp. 39–44, 1993. [23] B. Korte, J. Vygen, B. Korte, and J. Vygen, Combinatorial optimiza- tion. Springer, 2012, vol. 2. [24] R. C. Prim, “Shortest connection networks and some generalizations,” Bell system technical journal, vol. 36, no. 6, pp. 1389–1401, 1957. [25] G. Reinelt, “TSPLIB–a traveling salesman problem library,” ORSA Journal on computing, vol. 3, no. 4, pp. 376–384, 1991. [26] G. Optimization et al., “Gurobi optimizer reference manual,” 2012. [Online]. Available: http://www.gurobi.com [27] L. Gouveia and J. M. Pires, “The asymmetric travelling salesman problem and a reformulation of the miller–tucker–zemlin constraints,” European Journal of Operational Research, vol. 112, no. 1, pp. 134– 146, 1999.