Finding Coordinated Paths for Multiple Holonomic Agents in 2-d Polygonal Environment

Pavel Janovský* (Agent Technology Center, FEL, CTU in Prague, Karlovo namesti 13, 121 35, Prague, Czech Republic, pavel.janovsky@agents.fel.cvut.cz), Michal Čáp* (Agent Technology Center, FEL, CTU in Prague, Karlovo namesti 13, 121 35, Prague, Czech Republic, michal.cap@agents.fel.cvut.cz), Jiří Vokřínek (Agent Technology Center, FEL, CTU in Prague, Karlovo namesti 13, 121 35, Prague, Czech Republic, jiri.vokrinek@agents.fel.cvut.cz)

*Equal contribution.

Appears in: Proceedings of the 13th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2014), Lomuscio, Scerri, Bazzan, Huhns (eds.), May 5–9, 2014, Paris, France. Copyright © 2014, International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.

ABSTRACT
Avoiding collisions is one of the vital tasks for systems of autonomous mobile agents. We focus on the problem of finding continuous coordinated paths for multiple mobile disc agents in a 2-d environment with polygonal obstacles. The problem is PSPACE-hard, with the state space growing exponentially in the number of agents. Therefore, the state-of-the-art methods are mainly reactive techniques and sampling-based iterative algorithms.

We compare the performance of a widely used reactive method, ORCA, with three variants of a popular planning algorithm, RRT*, applied to multi-agent path planning, and find that an algorithm combining reactive collision avoidance and RRT* planning, which we call ORCA-RRT*, can be used to solve instances that are out of reach of either technique alone. We experimentally show that: 1) the reactive part of the algorithm can efficiently solve many multi-agent path finding problems involving a large number of agents, for which the RRT* algorithm is often unable to find a solution in limited time, and 2) the planning component of the algorithm is able to solve many instances containing local minima, where reactive techniques typically fail.

Categories and Subject Descriptors
I.2.11 [ARTIFICIAL INTELLIGENCE]: Distributed Artificial Intelligence—Intelligent agents, Multiagent systems

General Terms
Algorithms; Measurement; Performance; Experimentation

Keywords
Path finding problem; multi-agent solver; planning; reactive technique

1. INTRODUCTION
One of the fundamental tasks for a team of mobile agents is planning trajectories for the agents so as to avoid collisions between them. The problem arises in a variety of application domains, such as teams of autonomous robots or air traffic management systems.

The problem of finding coordinated paths for a group of agents has been studied since the 1980s. It is well known that finding even non-optimal collision-free paths from given start positions to goal positions for a group of objects in a restricted 2-d space is PSPACE-hard [3], and thus (unless P=PSPACE) there exist instances that cannot be solved in polynomial time.

Recent advancements in the field of multi-agent path planning include methods that solve a restricted variant of the cooperative path finding problem, in which point-like agents move using synchronous discrete steps on a graph. For this variant of the problem there exist complete polynomial algorithms such as BIBOX [7] or Push & Rotate [1], non-polynomial (but often effective) optimal algorithms [6], and anytime algorithms [9].
However, the graph abstraction used in this problem formulation is not suitable for systems in which the size of the agents cannot be neglected.

Research in trajectory planning for multiple agents in continuous domains spans decoupled and centralized approaches. Decoupled approaches often offer better efficiency, but lose completeness. A frequently used decoupled technique is prioritized planning [2]. Here, priorities are assigned to the agents, specifying the order in which their single-agent planning takes place. Each agent treats the plans of all agents with higher priority as moving obstacles. This method can be very efficient, especially in uncluttered environments, but it is intrinsically incomplete. The method requires that the individual re-planning takes place in space-time. It is relatively straightforward to perform such planning with forward search algorithms such as A*, but it remains unclear how such planning should be done efficiently in continuous space, e.g., using sampling-based algorithms.

Little attention has been devoted so far to the problem of finding coordinated trajectories for agents in continuous environments with polygonal obstacles. A straightforward solution is to construct the joint state space of all agents and search this space using one of the sampling-based methods for continuous path planning. Alternatively, one can resort to reactive collision avoidance techniques, which have proved to be very efficient in practice. The most prominent representatives of the two approaches are RRT* and ORCA.

In 2011, Karaman and Frazzoli [4] published the RRT* algorithm, an anytime extension of RRT (rapidly exploring random tree [5]) that is probabilistically complete and asymptotically optimal. The algorithm is designed for continuous state spaces, in which it can efficiently find a path from a given start state to a given goal region by incrementally building a tree that is rooted at the start state and spans towards randomly sampled states from the given state space. Once the tree first reaches the goal region, we can follow its edges backwards to obtain the first feasible path from the start state to the goal region. As more and more samples are added to the tree, the solution incrementally improves.

Optimal reciprocal collision avoidance (ORCA) is one of the many methods based on the velocity obstacle paradigm. It was proposed in 2011 by van den Berg et al. [8] and demonstrated to be capable of generating collision-free motion for a large number of agents in a cluttered workspace with local guarantees on optimality and safety. Even though ORCA is a powerful tool for multi-agent collision avoidance, there are many instances of the problem that it is not able to solve due to the lack of cooperation and planning. An example of such an instance is any scenario where an agent has to make a long trip away from a narrow corridor to clear the way for another agent.

The goal of this paper is to analyze the problem of planning collision-free continuous paths for multiple holonomic agents and to evaluate the performance of the two approaches in 2-d polygonal environments. We study the applicability of the RRT* algorithm to multi-agent path planning and compare it with ORCA, one of the most widely used techniques for the problem, which has been successfully used in several software and hardware multi-agent implementations.
Moreover, we propose a novel variant of the RRT* algorithm that is specifically suited to multi-agent path planning. The new ORCA-RRT* combines RRT* planning with reactive collision avoidance. Our experiments show that the combined algorithm is able to solve instances that were not solved by either of the techniques alone. Further, on many other instances the new algorithm provides a higher-quality solution than the other algorithms.

The rest of the paper is structured as follows. In Section 2, we state the multi-agent path finding problem. Then, in Section 3, we describe the multi-agent RRT* and introduce three variants of the algorithm that are applicable to multi-agent path planning in continuous environments. The paper is concluded by a large-scale experimental analysis comparing the performance of the individual algorithms.

2. PROBLEM STATEMENT
We define the multi-agent path finding problem as follows. Consider $n$ agents operating in 2-dimensional Euclidean space with polygonal obstacles. The starting positions of the agents are given as an $n$-tuple $(s_1, \ldots, s_n)$, where $s_i$ is the starting position of the $i$-th agent. The $n$-tuple $(d_1, \ldots, d_n)$ gives the agents' destinations. We assume that the agents have disc-shaped bodies, where the radius of agent $i$ is denoted $r_i$. The resulting trajectory of the $i$-th agent is a mapping $\pi_i : \mathbb{R} \to \mathbb{R}^2$ from time $t$ to the agent's position, where $\pi_i(0) = s_i$. The time $t^d_i$ is the minimal time after which the $i$-th agent remains at its destination,

$t^d_i = \min \{ t_i \mid \forall t \in [t_i, \infty) : \pi_i(t) = d_i \}.$

Let $O \subset \mathbb{R}^2$ denote the region of space occupied by the obstacles. Then the collision-free property of a set of multi-agent trajectories can be defined as

$CF(\{\pi_1, \ldots, \pi_n\}, O) = \mathrm{true}$ iff $\forall i, j, t,\ i \neq j : dist(\pi_i(t), \pi_j(t)) > r_i + r_j$ and $\forall i, t : D(\pi_i(t), r_i) \cap O = \emptyset$,  (1)

where $dist(x, y)$ is the Euclidean distance in 2-dimensional space and $D(\pi_i(t), r_i)$ denotes a disc of radius $r_i$ centered at $\pi_i(t)$.

The task is to find an $n$-tuple $(\pi_1, \ldots, \pi_n)$ such that the agents never collide and the sum of the times the agents spend on the way to their destinations is minimal:

Find $(\pi_1, \ldots, \pi_n)$ s.t. $CF(\{\pi_1, \ldots, \pi_n\}, O) = \mathrm{true}$ and $\sum_{i=1}^{n} t^d_i$ is minimal.

3. RRT* FOR MULTI-AGENT PATH FINDING
The RRT* algorithm was originally designed for single-agent motion planning; however, it can be adapted to solve multi-agent path planning problems. In this section we discuss how the RRT* algorithm can be leveraged in a multi-agent setting.

We let the RRT* algorithm operate in the joint state space $J = C_1 \times \ldots \times C_n$, where $C_i \subseteq \mathbb{R}^2$ is the state space of the $i$-th agent. The initial state in the joint state space is $x_{init} = (s_1, \ldots, s_n)$, the goal state is $x_{goal} = (d_1, \ldots, d_n)$, and any other sampled state is likewise an $n$-tuple containing the positions of all agents. The algorithm searches for a path $p : [0, 1] \to J$ from $x_{init}$ to $x_{goal}$, which can then be decomposed into a set of trajectories $\{\pi_i\}$, one for each agent $i$. The pseudocode of the RRT* algorithm for multi-agent path planning is given in Algorithm 1. The distance metric used in the algorithm is the sum of the Euclidean distances over all $n$ agents:

$dist(x, y) = \sum_{i=1}^{n} dist_E(x_i, y_i),$

where $x, y \in J$ and $x_i, y_i \in C_i$.
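To make the metric concrete, the following is a minimal Python sketch of the joint-space distance, assuming a joint state is represented as a tuple of per-agent $(x, y)$ positions (a representation we choose for illustration; the paper does not prescribe one):

```python
import math

def dist(x, y):
    # Joint-space distance: sum of per-agent Euclidean distances
    # between two joint states x, y in J (tuples of 2-d positions).
    return sum(math.hypot(xi[0] - yi[0], xi[1] - yi[1])
               for xi, yi in zip(x, y))

# Example: two agents, each displaced by (3, 4) -> total distance 10.
assert dist(((0, 0), (5, 5)), ((3, 4), (8, 9))) == 10.0
```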
Now we provide definitions of the RRT* primitive procedures:

Nearest Neighbor: Given a graph $T = (V, E)$ and a state $x \in J$, the function $Nearest(T, x)$ returns a vertex $v \in V$ that is closest to the state $x$ in terms of the distance metric $dist(\cdot, \cdot)$.

Near Vertices: Given a graph $T = (V, E)$, a state $x \in J$ and a number $n \in \mathbb{N}$, the function $Near(T, x, n)$ returns the set $\{y : y \in V \wedge dist(x, y) < r_n\}$, where $r_n = \gamma (\log n / n)^{1/d}$, $\gamma$ is a constant and $d = 2n$ is the dimension of the space $J$.

Local Steering Procedure: Given two states $x$ and $y$, a domain-specific local steering procedure $Steer(x, y)$ returns true if it is able to connect the state $x$ to the state $y$. In the context of multi-agent path finding, the steering procedure seeks a set of paths $\{p_i\}$ such that $p_i(0) = x_i$ and $p_i(1) = y_i$ for each agent $i$, and all paths are mutually collision-free. We consider three different methods for local steering, discussed in detail in Section 3.1.

Find Best Parent: Given a graph $T = (V, E)$, a set $X \subseteq V$ and a state $x \in J$, the function $FindBestParent(T, X, x)$ returns the best parent vertex $v \in X$ for the state $x$, i.e. the vertex that yields the lowest cost of the path $x_{init} \to x$.

Rewire: Given a graph $T = (V, E)$, a set $X \subseteq V$ and a state $x \in J$, the function $Rewire(T, X, x)$ examines all vertices from $X$ to see whether their cost can be improved by going through the new vertex $x$. If there are any such vertices, the tree is rewired so that these vertices become children of $x$, which improves the cost of their path from the initial state $x_{init}$.

The main loop of the multi-agent RRT* algorithm (see Algorithm 1) works as follows. Until interrupted, the algorithm samples states from the joint state space $J$ (line 4). Each such random sample is used in an attempt to extend the tree. First, the vertex of the tree $T$ nearest to the random sample $s$ is determined (line 5). Then the algorithm attempts to connect the vertex $x$ to the random sample $s$ using the local steering procedure (line 6). If the two states can be connected, the algorithm proceeds to its optimizing part; otherwise it tries again with another sample. Next, the set of vertices $X_{near}$ that lie within a specified distance of $s$ is determined (line 7). The set $X_{near}$ serves two purposes: first, it is used to find the best parent for the new vertex (line 8); second, after the new vertex is added to the tree, the vertices from $X_{near}$ are rewired if doing so improves their cost (line 10).

Once the goal state¹ is successfully added to the tree, one can follow the links backwards to obtain a valid path from the state $x_{init}$ to the state $x_{goal}$. However, even after the first solution has been returned, the algorithm does not stop iterating. Instead, it continuously extends and rewires the tree, which leads to the incremental discovery of further, higher-quality paths.

¹Here we assume that the point $x_{goal}$ is sampled with a certain non-zero probability (e.g. 1%), to overcome the need to define the goal as a region instead of a point.

Algorithm 1 Multi-agent RRT*
1: V ← {x_init}; E ← ∅;
2: while not interrupted do
3:   T = (V, E);
4:   s ← Sample();
5:   x ← Nearest(T, s);
6:   if Steer(x, s) then
7:     X_near ← Near(T, s, |V|);
8:     x_p ← FindBestParent(T, X_near, s);
9:     V ← V ∪ {s}; E ← E ∪ {(x_p, s)};
10:    Rewire(T, X_near, s);
11:  end if
12: end while
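The following Python sketch mirrors Algorithm 1. The primitive procedures are injected as callables with the semantics defined above; the helper names and the fixed iteration budget (standing in for "until interrupted") are our assumptions, not part of the original pseudocode:

```python
def multi_agent_rrt_star(x_init, sample, nearest, steer, near,
                         find_best_parent, rewire, max_iters=10000):
    # Sketch of Algorithm 1; the RRT* primitives are injected callables
    # whose semantics follow the definitions above (names hypothetical).
    V = {x_init}                       # vertices: joint states (tuples)
    E = set()                          # edges: (parent, child) pairs
    for _ in range(max_iters):         # stands in for "until interrupted"
        T = (V, E)
        s = sample()                   # random joint state from J
        x = nearest(T, s)              # tree vertex closest to the sample
        if steer(x, s):                # can the local planner connect them?
            X_near = near(T, s, len(V))
            x_p = find_best_parent(T, X_near, s)
            V.add(s)                   # add the sample as a new vertex
            E.add((x_p, s))
            rewire(T, X_near, s)
    return V, E
```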
3.1 Multi-agent Extensions
The crucial component influencing the performance of the RRT* algorithm is the local steering procedure. We consider a multi-agent steering function of the following form:

$Steer(x, y) = \begin{cases} \mathrm{true} & \text{if } \exists \{\pi_i\} \text{ s.t. } \{\pi_i\} = E(x, y, O) \text{ and } CF(\{\pi_i\}, O) \text{ and } \exists t\ \forall i : \pi_i(t) = y_i \\ \mathrm{false} & \text{otherwise,} \end{cases}$

where $x, y \in J$. The function $E(x, y, O) : J \times J \times \mathcal{P}(\mathbb{R}^2) \to \Pi_1 \times \ldots \times \Pi_n$ is an extension function that returns a set of paths $\{\pi_i\}$, where $\pi_i$ is a path for agent $i$ such that $\pi_i(0) = x_i$ and $\exists t^d$ s.t. $\forall t \geq t^d : \pi_i(t) = y_i$. Note that the steering procedure returns true only if the trajectories generated by the extension function 1) avoid all static obstacles and 2) are mutually collision-free.

We study three methods that can serve as a valid extension function for the multi-agent steering procedure: 1) the Line extension connects the individual agents using straight lines, 2) the VisibilityGraph extension connects the individual agents using optimal single-agent paths, and 3) the ORCA extension generates a joint path by simulating the ORCA algorithm, in which each agent tries to follow its single-agent optimal path.

3.1.1 Line Extension
The first considered extension method is a relatively straightforward extension of the classical single-agent RRT* planner as presented in [4], which connects two states using straight-line paths. In the multi-agent setting, we connect each of the agents by a straight path:

$E_{Line}(x, y, O) = \{line(x_i, y_i)\},$

where $line(x_i, y_i) : \mathbb{R}^2 \times \mathbb{R}^2 \to \Pi$ is a function defined as

$line(x_i, y_i) = \pi_i(t) = \begin{cases} x_i + v_i \, t \, \frac{y_i - x_i}{|y_i - x_i|} & \text{for } t < \frac{|y_i - x_i|}{v_i} \\ y_i & \text{otherwise,} \end{cases}$

where $v_i$ denotes the maximal speed of the $i$-th agent. Observe that such a trajectory prescribes that agent $i$ move to the point $y_i$ along a straight line at maximum speed; when the point $y_i$ is reached, the agent stays there. Note that a solution returned by this method is rejected by the steering procedure if a) any of the trajectories intersects one of the obstacles, or b) some of the trajectories are in mutual collision.

3.1.2 Visibility Graph Extension
Another considered method uses the optimal single-agent paths (i.e., paths that optimally avoid obstacles but ignore interactions among the agents) between the points $x_i$ and $y_i$. In 2-d polygonal environments, one can efficiently find such a path by constructing a visibility graph and subsequently finding the shortest path in this graph. Let the shortest path in the visibility graph of the $i$-th agent be represented as a sequence of edges $(e_{i1}, \ldots, e_{im_i})$. The resulting trajectory of agent $i$ follows this path. The Visibility Graph extension function can be formally defined as

$E_{VG}(x, y, O) = \{segments(x_i, y_i)\},$

where $segments(x_i, y_i) : \mathbb{R}^2 \times \mathbb{R}^2 \to \Pi$ is a function defined as

$segments(x_i, y_i) = \pi_i(t) = \begin{cases} source(e_{ij}) + v_i \left( t - \sum_{k=1}^{j-1} \frac{|e_{ik}|}{v_i} \right) \frac{e_{ij}}{|e_{ij}|} & \text{for } t \in \left( \sum_{k=1}^{j-1} \frac{|e_{ik}|}{v_i}, \sum_{k=1}^{j} \frac{|e_{ik}|}{v_i} \right),\ j \in \{1, \ldots, m_i\} \\ y_i & \text{otherwise,} \end{cases}$

where the function $source(e)$ returns the source vertex of the edge $e$. Note that the trajectories returned by this method are guaranteed to avoid obstacles; therefore, the steering procedure may reject them only if the resulting trajectories are in mutual collision. A sketch of both trajectory functions follows.
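Below is a minimal Python sketch of the two trajectory functions, assuming points are $(x, y)$ tuples and the visibility-graph path is given as a waypoint list (a representation chosen for illustration):

```python
import math

def line_trajectory(x_i, y_i, v_i):
    # Line extension for one agent: move from x_i straight to y_i at
    # maximum speed v_i, then remain at y_i forever.
    dx, dy = y_i[0] - x_i[0], y_i[1] - x_i[1]
    length = math.hypot(dx, dy)
    def pi(t):
        if length == 0 or v_i * t >= length:
            return y_i
        f = v_i * t / length           # fraction of the segment covered
        return (x_i[0] + f * dx, x_i[1] + f * dy)
    return pi

def segments_trajectory(waypoints, v_i):
    # VisibilityGraph extension: chain straight-line segments along the
    # shortest path's waypoints, then remain at the final point.
    def pi(t):
        for a, b in zip(waypoints, waypoints[1:]):
            t_seg = math.hypot(b[0] - a[0], b[1] - a[1]) / v_i
            if t < t_seg:
                return line_trajectory(a, b, v_i)(t)
            t -= t_seg                 # enter the next segment
        return waypoints[-1]
    return pi
```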
3.1.3 ORCA Extension
The last considered extension method generates trajectories by following optimal single-agent paths while using the Optimal Reciprocal Collision Avoidance (ORCA) technique for reactive collision avoidance between the agents. ORCA is a decentralized reactive collision avoidance method based on the velocity-obstacle paradigm that performs optimization in the space of velocity vectors of each agent. During ORCA, each agent creates a velocity obstacle for every other agent based on that agent's currently observed position and velocity. Each such velocity obstacle cuts out a half-plane in the space of possible velocities of the agent. Given the agent's desired velocity and the constraints induced by the other agents, a linear program with $n - 1$ constraints is constructed and solved to obtain the optimal velocity vector the agent should follow in the next time step [8].

To provide the desired velocity vector in each time step, we use the same visibility-graph-based navigation as in the Visibility Graph extension method: at each timestep, the agent finds the optimal path from its current position to its destination on the visibility graph and sets the desired velocity vector to point in this direction. The ORCA extension function is defined as

$E_{ORCA}(x, y, O) = \{\pi_i\},$

where $\pi_i$ is the trajectory of the $i$-th agent obtained by simulating the ORCA method with $x$ as the start positions and $y$ as the goal positions.

The trajectories returned by ORCA are guaranteed to avoid obstacles. They are also guaranteed to be mutually collision-free. However, due to the reactive nature of the algorithm, the method is not guaranteed to find a solution if one exists. Since some of the executions may end up in infinite loops or deadlocks, it is not uncommon to see the agents get stuck at one point and never reach their designated target positions. Therefore, one has to bound the maximum number of timesteps each ORCA simulation is allowed to perform.

Our hypothesis is that there is a significant number of problem instances that cannot be solved by ORCA alone, but that can be efficiently solved by multi-agent RRT* with ORCA extensions. An example of an artificial instance that was not solved by ORCA is shown in Figure 1. In such a scenario, the reactive technique tries to resolve the conflict by letting both agents back up in the corridor, but at some point each agent decides to return to its desired direction, resulting in a deadlock.

Figure 1: Examples of instances solved by ORCA-RRT* but not by ORCA (lines show the trajectories $\pi_i(t)$): (a) two disc-shaped agents exchange their positions; (b) two teams of disc-shaped agents exchange their positions.

4. EXPERIMENTAL ANALYSIS
In this section we experimentally evaluate the proposed extensions of the multi-agent RRT* algorithm. We compare three RRT*-based algorithms (Line-RRT*, Visibility Graph-RRT* and ORCA-RRT*) with reactive ORCA. The experiments have been performed on four types of 2-d testing environments: empty, door, cross and maze (see Figure 2). All environments are bounded by a fixed 1000x1000 boundary. The comparison focuses on the success rate of the algorithms and on the quality of the solutions for various environment settings. The measured parameters are:

• idealistic solution cost: the cost of a solution for which the $CF$ function in Equation 1 is relaxed so that its first constraint is omitted, i.e., collisions between agents are permitted. The idealistic (i.e., lower-bound) solution cost is defined as $t_{ideal} = \sum_{i=1}^{n} t^d_i$, where the goal arrival time $t^d_i$ is obtained by computing a single-agent optimal path for each agent using the visibility graph.

• suboptimality: an indicative quality ratio comparing the given solution to the lower bound provided by the idealistic solution cost. It shows how many times the given solution is worse than the idealistic one and is defined as $\text{suboptimality} = \frac{\sum_{i=1}^{n} t^d_i}{t_{ideal}}$ (a sketch of both metrics follows this list).

• success rate: the percentage of scenarios solved by the algorithms. The success rate depends on suboptimality in that a solution counts as successful only if its suboptimality is lower than a defined threshold. If the algorithm does not provide any solution within the time frame defined by the experiment setting, it is automatically considered unsuccessful.
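A small Python sketch of the two quality measures under the definitions above (function and argument names are ours):

```python
def suboptimality(actual_arrival_times, ideal_arrival_times):
    # Ratio of the achieved cost to the idealistic lower bound t_ideal,
    # the sum of single-agent optimal goal-arrival times.
    t_ideal = sum(ideal_arrival_times)
    return sum(actual_arrival_times) / t_ideal

def successful(actual_arrival_times, ideal_arrival_times, threshold):
    # A run counts as successful only if it produced a solution at all
    # and its suboptimality stays below the given threshold.
    return (actual_arrival_times is not None and
            suboptimality(actual_arrival_times, ideal_arrival_times) < threshold)

# Example: ideal arrivals (10, 20), achieved arrivals (15, 30) -> 1.5.
assert suboptimality((15, 30), (10, 20)) == 1.5
```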
Figure 2: Problem environments: (a) empty, (b) door, (c) cross, (d) maze.

4.1 Benchmark set
The experiment has been performed on a benchmark set composed of the four environments depicted in Figure 2, with the number of agents varying from 2 to 10 and the agent body radii ranging from 50 to 100. For each combination of environment, number of agents and agent radius, the benchmark set contains 10 different settings of the agents' start and goal positions, altogether 2160 benchmark scenarios. Since RRT* is a stochastic algorithm, we ran the algorithm five times with different random seeds for each problem instance. Altogether, the presented experimental results are based on 34560 runs.

The benchmark set was created by an algorithm that guarantees that each problem instance contains exactly one collision cluster, i.e., the path finding problem cannot be divided into mutually non-colliding groups of agents, which would be easier to solve. This is guaranteed by iteratively adding an agent's random start and goal positions only when a collision occurs between that agent's shortest path from start to goal and some other agent's shortest path. The procedure that creates a problem instance is given in Algorithm 2 (a Python sketch follows below).

Algorithm 2 Create problem instance
1: for i in 1:numberOfAgents do
2:   colliding = false
3:   while not colliding do
4:     start = randomSample
5:     goal = randomSample
6:     path = findShortestPath(start, goal, obstacles)
7:     if path exists then
8:       colliding = allPaths is empty or findCollision(path, allPaths)
9:     end if
10:  end while
11:  allPaths.add(path)
12:  starts.add(start)
13:  goals.add(goal)
14: end for

An example of one such generated problem instance is depicted in Figure 3. It shows the maze environment with 10 agents of the minimal radius $r = 50$ and the maximal radius $r = 100$, together with the corresponding optimal single-agent trajectories obtained by running the A* algorithm on a visibility graph.

Figure 3: Example of the maze environment with 10 agents of radii (a) 50 and (b) 100, shown with the corresponding single-agent optimal paths.
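The following Python sketch mirrors Algorithm 2, with the sampling, shortest-path and collision-test helpers injected as callables (all hypothetical); the first agent is accepted unconditionally, matching the convention that the collision test against an empty path set succeeds:

```python
def create_problem_instance(n_agents, random_sample, shortest_path, collides):
    # Sketch of Algorithm 2: random_sample() draws a position,
    # shortest_path(s, g) returns a path or None if none exists, and
    # collides(path, paths) tests the path against all previous ones.
    all_paths, starts, goals = [], [], []
    for i in range(n_agents):
        while True:
            start, goal = random_sample(), random_sample()
            path = shortest_path(start, goal)
            # Keep the placement only if it joins the single collision
            # cluster (the first agent has nothing to collide with yet).
            if path is not None and (i == 0 or collides(path, all_paths)):
                break
        all_paths.append(path)
        starts.append(start)
        goals.append(goal)
    return starts, goals
```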
In the next sections we discuss the results of the experimental evaluation from the perspective of the instance-set coverage (success rate) of the algorithms and the solution quality (suboptimality).

4.2 Success rate
We compare the success rates of the four implemented algorithms with respect to different numbers of agents and different agent radii. The runtime limit of the tested algorithms has been set to 5 seconds. To illustrate the distribution of solution quality, we plot the success rates for different suboptimality thresholds:

• Figure 4 uses no suboptimality threshold and contains all instances.
• In Figure 5, we consider all solutions having suboptimality over 5 as unsuccessful.
• In Figure 6, we consider all solutions having suboptimality over 2.5 as unsuccessful.

Figure 4: Success rate of the tested algorithms (Line-RRT*, Visibility Graph-RRT*, ORCA-RRT*, ORCA; agent radius vs. number of agents, one panel per environment), no threshold.
Figure 5: Success rate of the tested algorithms, threshold 5.
Figure 6: Success rate of the tested algorithms, threshold 2.5.

We can observe three significant phenomena. First, the success rate of the RRT*-based algorithms Line-RRT* and Visibility Graph-RRT* drops fast with an increasing number of agents. This behavior was expected, because the planning takes place in a state space exponential in the number of agents. These algorithms are therefore able to solve the defined problems for up to approximately 5 agents in the studied scenarios. Note that RRT* is provably probabilistically complete [4], but for some instances the algorithm was not able to find even the first feasible solution within the limited runtime. On the other hand, the success rate of both RRT* algorithms is stable under a changing threshold.

Second, the success rate of the reactive ORCA technique drops with increasing agent radii. This can be partly explained by the existence of corridors in the test scenarios that become "narrow" as the agent radii increase and are therefore hard to solve locally.
If the agents are small, they can swap positions without leaving the corridor, which is easy to achieve using local collision avoidance. If the agents are large, there will be corridors that can accommodate only one agent at a time; resolving these requires longer-term planning, where one agent keeps the way clear for the other agent to pass, which is only achievable with planning approaches. Further, we can observe that the solutions generated by ORCA are often of low quality (notice the difference between Figures 4(d) and 5(d)). This typically happens in crowded environments, where the agents are likely to get stuck in slowly evolving deadlock situations.

Third, the success rate of the ORCA-RRT* algorithm is close to one for both a high number of agents and a high agent radius. It drops only for the combination of high extremes of both parameters. This behavior is achieved by the combination of planning and reactive approaches: the planning component is able to solve the instances that require planning, while the reactive component is able to solve the instances with higher numbers of agents.

Moreover, we can see that the success rate of ORCA-RRT* deteriorates much less rapidly with a decreasing threshold than that of pure ORCA, which implies that within the given runtime limit ORCA-RRT* in general finds higher-quality solutions than pure ORCA alone. Since the first extension in ORCA-RRT* is in fact identical to running pure ORCA, these results confirm that the algorithm exhibits incremental behavior, i.e., it improves the quality of the generated solution over time.

An important observation is that there are instances that neither of the algorithms was able to solve on its own, but that were solved when the two algorithms were combined. For many other instances, the combination of RRT* and ORCA provided a higher-quality result within the given runtime limit than either of the algorithms alone.

4.3 Suboptimality
Figures 7 and 8 show the histograms of the ranks assigned to the algorithms for run-time limits of 5 and 1 seconds.

Figure 7: Rank histograms of the algorithms (Line-RRT*, ORCA, ORCA-RRT*, Vis. Graph-RRT*; one panel per environment) for a running time of 5 seconds.
Figure 8: Rank histograms for a running time of 1 second.

A rank from 1 to 4 is assigned to each algorithm for each experiment according to its solution suboptimality compared to the other algorithms. If two algorithms achieve the same suboptimality, the ranks are assigned to them randomly. If an algorithm was not able to find any solution, its rank is 4. Rank 1 means that the algorithm achieved the lowest suboptimality for the particular problem instance. A sketch of this ranking scheme follows.
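A Python sketch of this ranking scheme (function and argument names are ours):

```python
import random

def assign_ranks(subopt_by_algorithm):
    # Rank 1..k by suboptimality; ties are broken randomly by shuffling
    # before the stable sort; unsolved instances (None) always get rank 4.
    items = list(subopt_by_algorithm.items())
    random.shuffle(items)
    solved = sorted((kv for kv in items if kv[1] is not None),
                    key=lambda kv: kv[1])
    ranks = {alg: r + 1 for r, (alg, _) in enumerate(solved)}
    ranks.update({alg: 4 for alg, v in items if v is None})
    return ranks

# Example: an unsolved run ranks last regardless of the others.
print(assign_ranks({"ORCA-RRT*": 1.2, "ORCA": 1.9,
                    "Line-RRT*": None, "Vis. Graph-RRT*": 2.4}))
```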
The difference between Figures 7 and 8 shows how the ranks of the algorithms (i.e., the solution quality in terms of suboptimality) depend on the running-time limit. We observe that the Visibility Graph-RRT* algorithm achieved the worst ranks. Line-RRT* is slightly better due to its ability to sample the state space very fast. The ORCA algorithm achieved the second rank, and ORCA-RRT* achieved the first rank in the majority of problem instances. The difference between Figures 7 and 8 also shows that while ORCA finds its best solution early, it does not benefit from added runtime. The variants of RRT* incrementally improve the solution, and thus their performance depends more on the given runtime limit.

The results of the experiments confirm the ability of ORCA-RRT* to find higher-quality solutions than both RRT* variants. In addition to the best problem-instance-set coverage, ORCA-RRT* is also dominant in terms of the quality of the returned solutions.

5. CONCLUSIONS
In this paper we studied the problem of finding coordinated paths for holonomic agents in 2-d polygonal environments. This problem is challenging due to its prohibitive complexity. We studied several RRT*-based algorithms for multi-agent coordinated path finding and a reactive approach, ORCA. We found that while both approaches have limited coverage of the problem instance space, an approach combining planning and a reactive technique benefits from both of its parts, providing better problem-instance-set coverage and higher solution quality.

We call the new algorithm ORCA-RRT*. While RRT*-based algorithms often suffer from the exponential growth of the state space and thus are unable to solve instances with a high number of agents, the reactive part of ORCA-RRT* is able to overcome this problem. On the other hand, reactive techniques are often unable to solve problems containing local minima. Thanks to its RRT* planning part, the ORCA-RRT* algorithm can avoid such local minima by random sampling of the state space.

ORCA-RRT* is an anytime algorithm, which can iteratively improve the provided solution. We experimented with several running-time limits and examined the differences in the provided solutions. In future work we plan to deploy ORCA-RRT* on hardware agents and to investigate a possible extension towards non-holonomic agents.

Acknowledgements. The presented work was supported by the Czech Republic Ministry of Education, Youth and Sports, grants no. 7H11102 (D3CoS) and no. LD12044, and by the ARTEMIS Joint Undertaking under number 269336 (www.d3cos.eu).

6. REFERENCES
[1] B. de Wilde, A. W. ter Mors, and C. Witteveen. Push and rotate: cooperative multi-agent path planning. In AAMAS, 2013.
[2] M. Erdmann and T. Lozano-Perez. On multiple moving objects. Algorithmica, 2:477–521, 1987.
[3] J. Hopcroft, J. Schwartz, and M. Sharir. On the complexity of motion planning for multiple independent objects; PSPACE-hardness of the warehouseman's problem. IJRR, 3(4):76–88, 1984.
[4] S. Karaman and E. Frazzoli. Sampling-based algorithms for optimal motion planning. I. J. Robotic Res., 30(7):846–894, 2011.
[5] S. M. LaValle and J. J. Kuffner. Randomized kinodynamic planning. I. J. Robotic Res., 20(5):378–400, 2001.
[6] T. S. Standley. Finding optimal solutions to cooperative pathfinding problems. In AAAI, 2010.
[7] P. Surynek. A novel approach to path planning for multiple robots in bi-connected graphs. In Proceedings of ICRA 2009, pages 3613–3619, 2009.
[8] J. van den Berg, S. J. Guy, M. Lin, and D. Manocha.
Reciprocal n-body collision avoidance. In Robotics Research, pages 3–19. Springer, 2011.
[9] M. Čáp, P. Novák, J. Vokřínek, and M. Pěchouček. Multi-agent RRT*: Sampling-based cooperative pathfinding (extended abstract). In Proceedings of AAMAS '13, 2013.
For this variant of the problem there exist complete polynomial al- gorithms such as BIBOX [7] or Push & Rotate [1], non- polynomial (but often effective) optimal algorithms [6] and anytime algorithms [9]. However, the graph abstraction used in this problem formulation is not suitable for systems where the size of the agents cannot be neglected. Research in trajectory planning for multiple agents has been made in the field of decoupled and centralized contin- uous planning domains. Decoupled approaches often offer better efficiency, but lose completeness. Frequently used de- coupled technique is prioritized planning [2]. Here priorities are assigned to agents which specify the order in which their single-agent planning will take place. Each agent takes into account the plans of all agents with higher priority as mov- ing obstacles. This method can be very efficient especially in uncluttered environments, but it is intrinsically incomplete. The method requires that the individual re-planning takes place in the space-time. It is relatively straightforward to perform such planning with forward search algorithms such as A*, but it remains unclear how should be such a planning efficiently done in continuous space e.g. using some of the sampling-based algorithms. Little attention has been devoted so far to the problem of finding coordinated trajectories for agents in continuous environments with polygonal obstacles. A straightforward solution is to construct the joint state space of all agents and search such a space using one of the sampling-based methods for continuous path planning. Alternatively, one can resort to a reactive collision avoidance techniques that proved to be very efficient in practice. The most prominent representatives of the two mentioned approaches are RRT* and ORCA. In 2011 Karaman and Frazzoli [4] published the RRT* arXiv:1402.3613v1 [cs.AI] 14 Feb 2014 algorithm, an anytime extension of RRT (rapidly explor- ing random tree [5]) that is probabilistically complete and asymptotically optimal. The algorithm is designed for con- tinuous state spaces in which it can efficiently find a path from a given start state to a given goal region by incremen- tally building a tree that is rooted at the start state and spans towards randomly sampled states from some given state space. Once such a tree first reaches the goal region, we can follow its edges backwards to obtain the first feasible path from start state to the target region. With more and more samples being added to the tree, the solution incre- mentally improves. Optimal reciprocal collision avoidance (ORCA) is one of the many methods based on velocity obstacle paradigm. It was proposed in 2011 by Jur van den Berg et al. [8] and demonstrated to be capable of generating collision-free mo- tion for a large number of agents in a cluttered workspace with local guarantees on optimality and safety. Even though ORCA is a powerful tool for solving multi-agent collision avoidance, there are many instances of the problem that it is not able to solve due to the lack of cooperation and plan- ning. An example of such an instance is any scenario where an agent has to make a long trip away from a narrow corridor to clear the way to another agent. The goal of this paper is to analyze the problem of plan- ning collision-free continuous paths for multiple holonomic agents and evaluate the performance of the two approaches in 2-d polygonal environments. 
We study the applicability of the RRT* algorithm for multi-agent path planning and compare it with ORCA – one of the most widely used techniques for the problem, which has been successfully used in several software and hardware multi-agent implementations. Moreover, we pro- pose a novel variant of RRT* algorithm that is specifically suited for multi-agent path planning. The new ORCA-RRT* combines RRT* planning with reactive collision avoidance. Our experiments show that the combined algorithm is able to solve instances that were not solved by either of the tech- niques alone. Further, on many other instances the new algorithm provides higher quality solution that the other al- gorithms. The rest of the paper is structured as follows. In Section 2, we state the multi-agent path finding problem. Then, in Sec- tion 3 we describe the multi-agent RRT* and introduce three variants of the algorithm that are applicable for multi-agent path planning in continous environments. The paper is con- cluded by a large-scale experimental analysis comparing the performance of the individual algorithms. 2. PROBLEM STATEMENT We define the multi-agent path finding problem as fol- lows. Consider n agents operating in 2-dimensional Eu- clidean space with polygonal obstacles. The starting posi- tions of agents are given as n-tuple (s1, . . . , sn), where si is the starting position of i-th agent. The n-tuple (d1, . . . , dn) gives the agents’ destinations. We assume that the agents have disc-shaped bodies, where the radius of agent i is de- noted as ri. The final trajectory of i-th agent πi(t) is a map- ping R →R2 of time t to 2 dimensional Euclidean space of the agent, where πi(0) = si. The time td i is the minimal time after which the i-th agent remains at its destination, td i = min(ti|∀t ∈⟨ti, ∞) : πi(t) = di). Let O ⊂R2 denote the regions of the space occupied by the obstacles. Then, the collision-free property of the set of multi-agent trajectories can be defined as CF({π1, . . . , πn}, O) = true iff ∀i, j, t, i ̸= j : dist(πi(t), πj(t)) > ri + rj and ∀i, t : D(πi(t), ri) ∩O = ∅, where dist(x, y) is Euclidean distance in 2 dimensional space and D(πi(t), ri) denotes a disc of radius ri centered at πi(t). The task is to find n-tuple (π1, . . . , πn) such that the agents never collide and the sum of times agents spend on the path to their final destinations is minimal. The problem state- ment is defined as follows: Find (π1, . . . , πn) s. t. CF({π1, . . . , πn}, O) = true n X i=1 td i is minimal. 3. RRT* FOR MULTI-AGENT PATH FIND- ING The RRT* algorithm was originally designed for single- agent motion planning, however, it can be adapted to solve multi-agent path planning problems. In this section we will discuss how can be the RRT* algorithm leveraged in a multi- agent setting. We let the RRT* algorithm operate in the joint state space J = C1 × . . . × Cn, where Ci ⊆R2 is the state space of the i-th agent. The initial state in the joint state space is xinit = (s1, . . . , sn), goal state is xgoal = (d1, . . . , dn) and any other sampled state is also n-tuple containing positions of all agents. The algorithm searches for a path p : [0, 1] →J from xinit to xgoal, which can be then decomposed into a set trajectories {πi} for each individual agent i. The pseu- docode of RRT* algorithm for multi-agent path planning is exposed in Algorithm 1. The distance metric used in the algorithm is the sum of Euclidean distances of all n agents: dist(x, y) = n X i=1 distE(xi, yi), where x, y ∈J and xi, yi ∈Ci. 
Now, we provide definitions of the following RRT* primi- tive procedures: Nearest Neighbor: Given a graph T = (V, E) and a state x ∈J, the function Nearest(T, x) returns a vertex v ∈V that is the closest to state x in terms of the distance metric dist(·, ·). Near Vertices: Given a graph T = (V, E), a state x ∈ J and the numbers n, m ∈N, the function Near(T, x, n) returns a set {y : y ∈V ∧dist(x, y) < rn}, where rn = γ(log n/n)1/d, γ is a constant and d = 2n is the dimension of the space J. Local Steering Procedure: Given two states x and y, a domain-specific local steering procedure Steer(x, y) returns true if the steering procedure is able to connect the state x to the state y. In the context of multi-agent path finding, the steering procedure seeks for a set of paths {pi} such that pi(0) = xi and pi(1) = yi for each agent i and all paths are mutually collision-free. We consider three different methods for local steering which are discussed in detail in Section 3.1. Find best parent: Given a graph T = (V, E), a set X ⊆V and a state x ∈J, the function FindBestParent(T, X, x) returns the best parent vertex v ∈X for state x, i.e. the vertex that yields the lowest cost of path xinit →x. Rewire: Given a graph T = (V, E), a set X ⊆V and a state x ∈J the function Rewire(T, X, x) examines all vertices from X to see whether their cost can be improved by going through the new vertex x. If there are any such vertices, the tree is rewired so that these vertices become children of x, which allows them to improve the cost of their path from the initial state xinit. The main loop of the multi-agent RRT* algorithm (see Algorithm 1) works as follows: Until interrupted, the algo- rithm samples states from the joint state space J (line 4). Each such a random sample is used in an attempt to extend the tree. First, the nearest vertex from the tree T to the ran- dom sample s is determined (line 5). Then, the algorithm attempts to connect the vertex x to the random sample s using the local steering procedure (line 6). If the two points can be connected using the local steering procedure the algo- rithm proceeds to its optimizing part, otherwise it tries again with another sample. Then a set of vertices Xnear that are within a specified distance to the x is determined (line 7). The Xnear set serves two purposes: Firstly it is used to find the best parent for the new vertex x (line 8). Secondly, af- ter the new vertex x is added to the tree, the vertices from Xnear are rewired if it improves their cost (line 10). Once the goal state1 is successfuly added to the tree, one can follow the links backwards to obtain a valid path from the state xinit to the state xgoal. However, even after the first solution has been returned, the algorithm does not stop iterating. Instead, the algorithm continously extends and rewires the tree, which leads to incremental discovery of fur- ther higher-quality paths. Algorithm 1 Multi-agent RRT* 1: V ←{xinit}; E ←∅; 2: while not interrupted do 3: T = (V, E); 4: s ←Sample(); 5: x ←Nearest(T, s); 6: if Steer(x, s) then 7: Xnear ←Near(T, s, |V |); 8: xp ←FindBestParent(T, Xnear, s); 9: V ←V ∪{x}; E ←E ∪{(xp, x)}; 10: Rewire(T, Xnear, s); 11: end if 12: end while 3.1 Multi-agent Extensions The crucial component influencing the performance of the RRT* algorithm is the local steering procedure. We consider a multi-agent steering function of the following form: 1Here we assume that the point xgoal is sampled with a certain non-zero probability (e.g. 
1%) to overcome the need to define the goal as a region instead of a point. Steer(x, y) =        true if ∃{πi} s.t. {πi} = E(x, y, O) and CF({πi}, O) and ∃t ∀i πi(t) = yi false otherwise , where x, y ∈J. The function E(x, y, O) : J × J × P(R2) → Π1 × . . . × Πn is an extension function that returns a set of paths {πi}, where πi is a path for agent i such that πi(0) = xi and ∃td s.t. ∀t ≥td : πi(t) = yi. Note that the steering procedure returns true only if the trajectories generated by the extension function are 1) avoiding all static obstacles and 2) the trajectories are mutually collision-free. We study three methods that can serve as a valid exten- sion function for the multi-agent steering procedure: 1) Line extension connects individual agents using straight lines, 2) VisibilityGraph extension connects individual agents using the optimal single-agent path, and 3) ORCA extension gen- erates joint path by simulating ORCA algorithm, where each agent tries to follow its single-agent optimal path. 3.1.1 Line Extension The first considered extension method is a relatively straight- forward extension of classical single-agent RRT* planner as exposed in [4], which connects two states using straight line paths. In a multi-agent setting, we connect each of the agents by a straight path: ELine(x, y, O) = {line(xi, yi)}, where the line(x, y) : R2 × R2 →Π is a function defined as line(xi, yi) = πi(t) =  xi + vi · t yi−xi |yi−xi| for t < |yi−xi| vi yi otherwise , where vi denotes the maximal speed of the i-th agent. Ob- serve that such a trajectory prescribes that the agent i should move to the point yi along the straight line at maximum speed; when the point yi is reached, the agent stays there. Note that it can happen that a solution returned by this method is rejected by the steering procedure if a) any of the trajectories intersects some of the obstacles or b) some of the trajectories are in mutual collision. 3.1.2 Visibility Graph Extension Another considered method uses the single-agent optimal paths (i.e. optimally avoids obstacles, but ignores interac- tions among the agents) between the points xi and yi. In 2-d polygonal environments, one can efficiently find such a path by constructing a visibility graph and subsequently by find- ing the shortest path in such a graph. Let the shortest path in the visibility graph of i-th agent be represented as a se- quence of edges (ei1, . . . , eimi). The resulting trajectory of agent i will follow this path. The Visibility Graph extension function can be formally defined as EV G(x, y, O) = {segments(xi, yi)}, where segments(xi, yi) : R2 × R2 →Π is a function defined as segments(xi, yi) = πi(t) =        source(eij) + vi · t eij |eij| for t ∈( j−1 P k=1 |eik| vi , j P k=1 |eik| vi ), j ∈⟨1, mi⟩ yi otherwise , where source(e) function returns the source vertex of the edge e. Note that the trajectories returned by this method are guaranteed to be obstacle avoiding, therefore, the steer- ing procedure may only reject them if the resulting trajec- tories are in mutual collision. 3.1.3 ORCA Extension The last considered extension method generates trajec- tories by following optimal single-agent paths together with Optimal Reciprocal Collision Avoidance (ORCA) technique, which is used for reactive collision avoidance between the agents. ORCA is a decentralized reactive collision avoid- ance method based on the velocity-obstacle paradigm that performs optimization in the space of velocity vectors for each agent. 
During ORCA, each agent creates a velocity obstacle for every other agent based on their currently ob- served position and velocity. Each such a velocity obstacle cuts out a half-plane in the space of possible velocities for the agent. Given the agent’s desired velocity and the con- straints induced by the other agents, a linear program with n −1 constraints is constructed and solved to obtain the optimal velocity vector the agent should follow in the next time step [8]. To provide the desired velocity vector in each time step, we use the same visibility graph based navigation as in the visibility graph extension method. At each timestep, the agent finds the optimal path from its current position to its destination on the visibility graph and sets the desired velocity vector to point at this direction. The ORCA extension function is defined as EORCA(x, y, O) = {πi} where πi is a trajectory of i-th agent obtained by simulating the ORCA method with x as start positions and y as goal positions. The trajectories returned by ORCA are guaranteed to be obstacle avoiding. They are also guaranteed to be mutually collision-free. However, due to the reactive nature of the algorithm, the method is not guaranteed to find a solution if a solution exists. Since some of the executions may end up in infinite loops or deadlocks, it is not uncommon to see the agents being stuck at one point and never reach their designated target position. Therefore one has to bound the maximum number of timestep each ORCA simulation is allowed to perform. Our hypothesis is that there is a significant number of problem instances that cannot be solved by ORCA alone, but that could be efficiently solved by multi-agent RRT* with ORCA extensions. An example of an artificial instance that was not solved by ORCA is in Figure 1. In such a scenario, the reactive technique tryies to resolve the conflict by letting both agents to go back in the corridor, but at some point it decides to return back to its desired directon, resulting in a deadlock situation. (a) two disk shaped agents exchange their positions (b) two teams of disk shaped agents exchange their positions Figure 1: Example of instances solved by ORCA- RRT* that was not solved by ORCA (lines show trajectories πi(t)) 4. EXPERIMENTAL ANALYSIS In this section we experimentally evaluate features of pro- posed extension of multi-agent RRT* algorithm. We com- pare three RRT* based algorithms - Line RRT*, Visibility Graph RRT* and ORCA RRT* with reactive ORCA. The experiments have been performed on four types of 2-d test- ing environments – empty, door, cross and maze environ- ments (see Figure 2). All environments are constrained by a fixed boundary having 1000x1000 dimension. The met- rics for comparison has been focused on success rate of the algorithms and the quality of the solution for various envi- ronment settings. The measured parameters are: • idealistic solution cost is the cost of a solution for which the CF function in Equation 1 is relaxed in a way that it omits its first constraint i.e. permits collisions between agents. The idealistic (i.e. lower bound) solution cost is defined as tideal = nP i=1 td i . The goal arrival time td i is obtained by computing a single-agent optimal path for each agent using the visibility graph. • suboptimality measure gives indicative quality ratio by comparison of the given solution to the lower bound of the solution provided by idealistic solution cost. It shows how many times is the given solution worse than the idealistic solution. 
It is defined as suboptimality = n P i=1 td i tideal . • success rate shows the percentage of scenarios solved by the algorithms. The success rate depends on sub- optimality in a way that a given solution is successful only if its suboptimality is lower than a defined thresh- old. If the algorithm does not provide any solution within time frame defined by experiment setting, it is automatically considered unsuccessful. (a) Empty environment (b) Door environment (c) Cross environment (d) Maze environment Figure 2: Problem environments 4.1 Benchmark set The experiment has been performed on a benchmark set composed of four environments as depicted in Figure 2 with the number of agents varying from 2 to 10 and the agent body radii ranging from 50 to 100. For each combination of environment, number of agents and agent radius, the bench- mark set contains 10 different settings of agents’ start and goal positions, altogether 2160 benchmark scenarios. Since RRT* is a stochastic algorithm, we run the algorithm five times with different random seeds for each problem instance. Altogether, the presented experimental results are based on 34560 runs. The benchmark set was created by an algorithm that guar- antees that for each problem instance there is exactly one collision cluster i.e. the path finding problem cannot be di- vided into mutually non-colliding groups of agents, which would be easier to solve. This is guaranteed by adding agent’s random start and goal positions iteratively only when a collision occurs between agent’s shortest path from start to goal and any other agent’s shortest path. The procedure that creates a problem instance is given in Algorithm 2. Algorithm 2 Create problem instance 1: for i in 1:numberOfAgents do 2: colliding = false 3: while not colliding do 4: start = randomSample 5: goal = randomSample 6: path = findShortestPath(start, goal, obstacles) 7: if path exists then 8: colliding = findCollision(path, allPaths) 9: end if 10: end while 11: allPaths.add(path) 12: starts.add(start) 13: goals.add(goal) 14: end for An example of one such generated problem instance is depicted in Figure 3. It shows the maze environment with 10 agents of minimal radus r = 50 and maximal radius r = 100 and the corresponding optimal single-agent trajectories obtained by running the A* algorithm on a visibility graph. In the next sections we will discuss the results of exper- imental evaluation from the perspective of the instance set coverage/success rate of the algorithms and the solution quality/suboptimality. (a) (b) Figure 3: Example of maze environment – 10 agents with radii a) 50, b) 100 with corresponding single agent optimal paths. 4.2 Success rate We compare the success rate of four implemented algo- rithms with respect to different numbers of agents and dif- ferent radii of agents. The limit on runtime of the tested algorithms has been set to 5 seconds. To illustrate the distri- bution of solution quality, we plot these graphs for different suboptimality thresholds: • Figure 4 uses no suboptimality threshold and contains all instances. • In Figure 5 we consider all solutions having subopti- mality over 5 as unsuccessful. • In Figure 6 we consider all solutions having subopti- mality over 2.5 as unsuccessful. We can observe three significant phenomena. Firstly, the success rate of RRT* based algorithms Line-RRT* and Vis- ibility Graph-RRT* drops fast with increasing number of agents. 
(a) Empty environment (b) Door environment (c) Cross environment (d) Maze environment
Figure 2: Problem environments

4.1 Benchmark set
The experiment has been performed on a benchmark set composed of the four environments depicted in Figure 2, with the number of agents varying from 2 to 10 and the agent body radii ranging from 50 to 100. For each combination of environment, number of agents and agent radius, the benchmark set contains 10 different settings of the agents' start and goal positions, altogether 2160 benchmark scenarios. Since RRT* is a stochastic algorithm, we run each of the RRT*-based algorithms five times with different random seeds for each problem instance, while the deterministic ORCA is run once. Altogether, the presented experimental results are based on 34560 runs.

The benchmark set was created by an algorithm that guarantees that each problem instance forms exactly one collision cluster, i.e. the path finding problem cannot be divided into mutually non-colliding groups of agents, which would be easier to solve. This is guaranteed by accepting an agent's randomly sampled start and goal positions only when the agent's shortest path from start to goal collides with some other agent's shortest path (the first agent's path is accepted unconditionally, since there is no earlier path to collide with). The procedure that creates a problem instance is given in Algorithm 2.

Algorithm 2 Create problem instance
for i in 1:numberOfAgents do
    colliding = false
    while not colliding do
        start = randomSample()
        goal = randomSample()
        path = findShortestPath(start, goal, obstacles)
        if path exists then
            colliding = (i = 1) or findCollision(path, allPaths)
        end if
    end while
    allPaths.add(path)
    starts.add(start)
    goals.add(goal)
end for
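A runnable rendering of Algorithm 2 might look as follows. This is a sketch: find_shortest_path and find_collision stand in for the visibility-graph A* query and the path collision check, and the rectangular-workspace sampling is an assumption, not the paper's implementation:

import random

def random_sample(bounds):
    # Uniformly sample a point in the rectangular workspace.
    (xmin, ymin), (xmax, ymax) = bounds
    return (random.uniform(xmin, xmax), random.uniform(ymin, ymax))

def create_problem_instance(number_of_agents, obstacles, bounds,
                            find_shortest_path, find_collision):
    # Sample start/goal pairs so that every agent's shortest path collides
    # with a previously placed agent's path, which keeps the whole
    # instance in a single collision cluster.
    all_paths, starts, goals = [], [], []
    for i in range(number_of_agents):
        colliding = False
        while not colliding:
            start = random_sample(bounds)
            goal = random_sample(bounds)
            path = find_shortest_path(start, goal, obstacles)
            if path is not None:
                # The first agent is accepted unconditionally.
                colliding = (i == 0) or find_collision(path, all_paths)
        all_paths.append(path)
        starts.append(start)
        goals.append(goal)
    return starts, goals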
An example of one such generated problem instance is depicted in Figure 3. It shows the maze environment with 10 agents, once with the minimal radius r = 50 and once with the maximal radius r = 100, together with the corresponding optimal single-agent trajectories obtained by running the A* algorithm on the visibility graph.

Figure 3: Example of the maze environment – 10 agents with radii a) 50 and b) 100, with the corresponding single-agent optimal paths.

In the next sections we discuss the results of the experimental evaluation from the perspective of the instance set coverage (success rate) of the algorithms and the solution quality (suboptimality).

4.2 Success rate
We compare the success rate of the four implemented algorithms with respect to different numbers of agents and different agent radii. The runtime limit of the tested algorithms has been set to 5 seconds. To illustrate the distribution of solution quality, we plot the success rate for different suboptimality thresholds (an aggregation sketch follows this list):

• Figure 4 uses no suboptimality threshold and contains all instances.
• In Figure 5 we consider all solutions having suboptimality over 5 as unsuccessful.
• In Figure 6 we consider all solutions having suboptimality over 2.5 as unsuccessful.
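The heat maps in Figures 4–6 correspond to the following aggregation (a sketch; the run record fields are assumptions about the experiment logs, not the paper's data format):

from collections import defaultdict

def success_rates(runs, threshold=float("inf")):
    # Group runs by (environment, number of agents, radius, algorithm) and
    # compute the fraction of runs whose suboptimality is below the
    # threshold; runs without a solution never count as successful.
    solved = defaultdict(int)
    total = defaultdict(int)
    for run in runs:
        key = (run["env"], run["n_agents"], run["radius"], run["algorithm"])
        total[key] += 1
        s = run["suboptimality"]  # None if no solution was found
        if s is not None and s < threshold:
            solved[key] += 1
    return {key: solved[key] / total[key] for key in total}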
Figure 4: Success rate of the tested algorithms on the test instances, no threshold. (Panels (a)–(d) show the empty, door, cross and maze environments; each panel plots, for Line RRT*, Visibility Graph RRT*, ORCA RRT* and ORCA, the agent radius (50–100) against the number of agents (2–10), with the success rate from 0.0 to 1.0 color-coded.)
Figure 5: Success rate of the tested algorithms on the test instances, threshold 5 (same layout as Figure 4).
Figure 6: Success rate of the tested algorithms on the test instances, threshold 2.5 (same layout as Figure 4).

We can observe three significant phenomena. First, the success rate of the RRT*-based algorithms Line-RRT* and Visibility Graph-RRT* drops quickly with an increasing number of agents. This behavior was expected, because the planning takes place in a state space that grows exponentially with the number of agents. In the studied scenarios, these algorithms were able to solve the defined problems for up to approximately 5 agents. Note that RRT* is provably probabilistically complete [4], but for some instances the algorithm was not able to find even the first feasible solution within the limited runtime. On the other hand, the success rate of both RRT* algorithms is stable under a changing threshold.

Second, the success rate of the reactive ORCA technique drops with increasing agent radii. This can be partly explained by the existence of corridors in the test scenarios that become "narrow" as the radii of the agents increase and are therefore hard to solve locally. If the agents are small, they can swap positions without leaving the corridor, which is easy to achieve using local collision avoidance. If the agents are large, there will be corridors that can accommodate only one agent at a time; resolving these requires longer-term coordination, in which one agent keeps the way clear for the other agent to pass, and that is only achievable with planning approaches. Further, we can observe that the solutions generated by ORCA are often of low quality (notice the difference between Figures 4(d) and 5(d)). This typically happens in crowded environments, where the agents are likely to get stuck in slowly evolving deadlock situations.

Third, the success rate of the ORCA-RRT* algorithm is close to one both for high numbers of agents and for large agent radii; it drops only when both parameters are at their extremes simultaneously. This behavior is achieved by the combination of the planning and reactive approaches: the planning component is able to solve the instances that require planning, while the reactive component is able to solve instances with higher numbers of agents.

Moreover, the success rate of ORCA-RRT* deteriorates much less rapidly with a decreasing threshold than that of pure ORCA, which implies that within the given runtime limit ORCA-RRT* in general finds higher-quality solutions than pure ORCA alone. Since the first extension in ORCA-RRT* is in fact identical to running pure ORCA, these results confirm that the algorithm exhibits incremental behavior, i.e. it improves the quality of the generated solution over time.

An important observation is that there are instances that neither of the algorithms was able to solve on its own, but that were solved when the two algorithms were combined. For many other instances, the combination of RRT* and ORCA provided a higher-quality result within the given runtime limit than either of the algorithms alone.

4.3 Suboptimality
Figure 7: Rank histograms for the running time limit of 5 seconds. (Panels (a)–(d) show the empty, door, cross and maze environments; each shows, for ranks 1–4, the ratio of instances in which Line RRT*, ORCA, ORCA RRT* and Vis. Graph RRT* achieved that rank.)
Figure 8: Rank histograms for the running time limit of 1 second (same layout as Figure 7).

Figures 7 and 8 show the histograms of the ranks assigned to the algorithms for runtime limits of 5 and 1 seconds. A rank from 1 to 4 is assigned to each algorithm in each experiment according to its solution suboptimality compared to the other algorithms; rank 1 means that the algorithm achieved the lowest suboptimality for the particular problem instance. If two algorithms achieve the same suboptimality, their ranks are assigned randomly. If an algorithm was not able to find any solution, its rank is 4.
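These assignment rules can be reproduced with the following sketch (suboptimality values of None mark runs without a solution; shuffling before a stable sort implements the random tie-breaking):

import random

def assign_ranks(suboptimalities):
    # Rank the algorithms 1..4 by suboptimality for one problem instance.
    # `suboptimalities` maps algorithm name -> suboptimality, or None if
    # the algorithm found no solution; unsolved algorithms get rank 4.
    items = list(suboptimalities.items())
    random.shuffle(items)  # random tie-breaking among equal values
    items.sort(key=lambda kv: float("inf") if kv[1] is None else kv[1])
    return {algorithm: (4 if value is None else position)
            for position, (algorithm, value) in enumerate(items, start=1)}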
The difference between Figures 7 and 8 shows how the ranks of the algorithms (i.e. the solution quality measured as suboptimality) depend on the runtime limit. We observe that the Visibility Graph-RRT* algorithm achieved the worst ranks. Line-RRT* is slightly better thanks to its ability to sample the state space very quickly. The ORCA algorithm typically achieved the second rank, and ORCA-RRT* achieved the first rank in the majority of problem instances. Comparing the two figures further shows that while ORCA finds its best solution early, it does not benefit from added runtime; the RRT* variants incrementally improve the solution, and thus their performance depends more strongly on the given runtime limit.

The results of the experiments confirm the ability of ORCA-RRT* to find higher-quality solutions than both RRT* variants. In addition to providing the best problem instance set coverage, ORCA-RRT* is also dominant in terms of the quality of the returned solutions.

5. CONCLUSIONS
In this paper we studied the problem of finding coordinated paths for holonomic agents in 2-d polygonal environments, a problem that is challenging due to its prohibitive complexity. We studied several RRT*-based algorithms for multi-agent coordinated path finding and the reactive approach ORCA. We found that while both approaches cover only a limited part of the problem instance space, an approach combining the planning and reactive techniques benefits from both of its parts, providing better problem instance set coverage and higher solution quality.

We call the new algorithm ORCA-RRT*. While RRT*-based algorithms often suffer from the exponential growth of the state space and are thus unable to solve instances with high numbers of agents, the reactive part of ORCA-RRT* is able to overcome this problem. On the other hand, reactive techniques are often unable to solve problems containing local minima; thanks to its RRT* planning part, the ORCA-RRT* algorithm can escape such local minima by random sampling of the state space.

ORCA-RRT* is an anytime algorithm that can iteratively improve the provided solution. We experimented with several runtime limits and examined the differences in the provided solutions. In future work we plan to deploy ORCA-RRT* on hardware agents and to investigate a possible extension towards non-holonomic agents.

Acknowledgements. The presented work was supported by the Czech Republic Ministry of Education, Youth and Sports, grants no. 7H11102 (D3CoS) and no. LD12044, and by the ARTEMIS Joint Undertaking under grant no. 269336 (www.d3cos.eu).

6. REFERENCES
[1] B. de Wilde, A. W. ter Mors, and C. Witteveen. Push and rotate: Cooperative multi-agent path planning. In Proceedings of AAMAS, 2013.
[2] M. Erdmann and T. Lozano-Perez. On multiple moving objects. Algorithmica, 2:477–521, 1987.
[3] J. Hopcroft, J. Schwartz, and M. Sharir. On the complexity of motion planning for multiple independent objects; PSPACE-hardness of the warehouseman's problem. IJRR, 3(4):76–88, 1984.
[4] S. Karaman and E. Frazzoli. Sampling-based algorithms for optimal motion planning. IJRR, 30(7):846–894, 2011.
[5] S. M. LaValle and J. J. Kuffner. Randomized kinodynamic planning. IJRR, 20(5):378–400, 2001.
[6] T. S. Standley. Finding optimal solutions to cooperative pathfinding problems. In Proceedings of AAAI, 2010.
[7] P. Surynek. A novel approach to path planning for multiple robots in bi-connected graphs. In Proceedings of ICRA, pages 3613–3619, 2009.
[8] J. van den Berg, S. J. Guy, M. Lin, and D. Manocha. Reciprocal n-body collision avoidance. In Robotics Research, pages 3–19. Springer, 2011.
[9] M. Čáp, P. Novák, J. Vokřínek, and M. Pěchouček. Multi-agent RRT*: Sampling-based cooperative pathfinding (extended abstract). In Proceedings of AAMAS, 2013.