arXiv:1607.00562v1 [cs.RO] 2 Jul 2016 Integrated Task and Motion Planning for Multiple Robots under Path and Communication Uncertainties Bradley Woosley and Prithviraj Dasgupta Abstract—We consider a problem called task ordering with path uncertainty (TOP-U) where multiple robots are provided with a set of task locations to visit in a bounded environment, but the length of the path between a pair of task locations is initially known only coarsely by the robots. The objective of the robots is to find the order of tasks that reduces the path length (or, energy expended) to visit the task locations in such a scenario. To solve this problem, we propose an abstraction called a task reachability graph (TRG) that integrates the task ordering with the path planning by the robots. The TRG is updated dynamically based on inter-task path costs calculated using a sampling-based motion planner, and, a Hidden Markov Model (HMM)-based technique that calculates the belief in the current path costs based on the environment perceived by the robot’s sensors and task completion information received from other robots. We then describe a Markov Decision Process (MDP)-based algorithm that can select the paths that reduce the overall path length to visit the task locations and a coordination algorithm that resolves path conflicts between robots. We have shown analytically that our task selection algorithm finds the lowest cost path returned by the motion planner, and, that our proposed coordination algorithm is deadlock free. We have also evaluated our algorithm on simulated Corobot robots within different environments while varying the number of task locations, obstacle geometries and number of robots, as well as on physical Corobot robots. Our results show that the TRG-based approach can perform considerably better in planning and locomotion times, and number of re-plans, while traveling almost-similar distances as compared to a closest first, no uncertainty (CFNU) task selection algorithm. I. INTRODUCTION Multi-robot task planning and path planning are impor- tant problems in multi-robot systems when robots have to perform tasks at different locations within an environment. The problem is encountered in many applications of multi- robot systems such as automated surveillance [1], robotic demining [2], and automated inspection of engineering struc- tures [3]. As a motivating example, we consider a scenario for performing standoff detection of explosives or landmines using autonomous robots where multiple robots are provided with a coarse map containing locations of objects of interest. The robots are required to autonomously plan their paths to get in proximity of each object of interest so that they can analyse the object with their detection sensors. For realizing this, the main computational problem is to calculate a suitable task plan or ordering among the tasks for each robot so that a performance metric, such as the energy expended or the time required by robots to perform the tasks gets reduced. Researchers have *This work was partially supported as part of the COMRADES project supported by the U.S. Office of Naval Research and by a GRACA grant from the University of Nebraska Omaha B. Woosley is a graduate student and P. Dasgupta is a Professor with the Computer Science Department, University of Nebraska, Omaha, USA. {bwoosley, pdasgupta}@unomaha.edu Fig. 1. (a) Scenario showing two robots and six tasks, each task needs 1 robot to get completed, (b) Tasks selected by robots using CFNU algorithm; red-marked edges are unnavigable, blue dashed edges show re-calculated, collision-free paths, (c) Our proposed algorithm calculates a different task schedule for each robot that results in lower path costs by including uncertainty in path cost and availability in real-time into the task schedule calculation. Dotted edges show the task reachability graph (TRG) edges not followed by each robot due to higher expected costs; dashed edges are collision-free paths calculcated by the motion planner. proposed Multi-robot Task Allocation (MRTA) techniques [4] as well as multi-robot path planning techniques [5] to address this problem. However, on one hand, most MRTA techniques assume that the costs or distances between the task locations are fixed and known to all the robots as soon as they become aware of the task. This criterion might not be valid if the robots have a coarse map of the environment and the path cost between tasks can change dynamically as the robots discover obstacles in the environment, or if due to communication constraints, the delivery of a task completed message is delayed. On the other hand, path planning techniques account for dynamically discovered obstacles but they focus mainly on finding collision and conflict-free paths for robots and do not adjust the ordering between the waypoints or task locations that are being visited by the robots. Keeping a fixed order between tasks might result in unnecessary longer paths to complete the task schedule, especially when a dynamically updated path around an obstacle could induce a shorter task schedule. To address the above problem, it would make sense to investigate techniques that have a closer integration between the task and motion planning operations of robots. Researchers have proposed the Simultaneous Task and Motion Planning (STAMP) problem to investigate this problem in the context of robotic manipulation [6]. Our work advances this direction of research by proposing a framework called Task Ordering under Path Uncertainty (TOP-U) to address the STAMP problem in the context of a search and exploration scenario using wheeled, mobile robots. An example scenario shown in Figure 1 il- lustrates an example of the reduced time taken and distance traveled when robots use our proposed algorithm, in contrast with Closest First No Uncertainty (CFNU) algorithm used for task selection, that does not consider uncertainty in path costs and availabilities in its calculations. The main contributions of our paper are the following: We present a formalization of the problem called task ordering with path uncertainty (TOP-U), where multiple robots have to visits tasks whose locations are uncertain due to the presence of obstacles in the environment, while reducing the distances traveled be- tween the tasks. We propose a data structure called a task reachability graph (TRG) that is used to model the problem and a Markov Decision Process (MDP)-based algorithm that each robot uses to dynamically calculate its task schedule in real-time using the TRG. We also propose a distributed coordination algorithm for resolving deadlock scenarios due to path conflicts between multiple robots using our algorithm. We have proved analytically that our task scheduling algorithm is optimal and the coordination algorithm is deadlock free. We also provide extensive experimental results on simulated and physical Coroware Corbot robots, with different number of robots and tasks within environments with different obstacles geometries and task distributions. Our results show that our proposed TRG-based approach could perform up to 51% better in planning and locomotion times with 20% fewer replans, while traveling similar distances as compared to a closest first, no uncertainty (CFNU) selection algorithm. The rest of our paper is structured as follows: in the next section, we discuss existing research on MRTA and motion planning techniques. In Section III, we formalize the TOP-U problem and describe the robot task scheduling and multi-robot coordination algorithms. Sections IV and V describe our analytical and experimental results and finally we conclude. II. RELATED WORK Motion and task planning have been important problems in robotics. Several approaches for solving them have been proposed in literature over the past two decades, although these problems have largely been treated separately. In motion planning the objective is to find a collision free path for a robot so that it can navigate within its environment [7]. Sampling- based motion planners like probabilistic roadmap (PRM) [8] and Rapidly-exploring Random Trees (RRT) [9] have been used widely for motion planning. Recently, researchers have proposed extensions to these techniques by using methods to reduce the time required to calculate the paths and address the problem of moving through narrow passages [10], [11], [12], and handling uncertainty in obstacle locations [13], [14]. Researchers have also investigated the problem of coordinating the paths between multiple robots [15], [5] where robots exchange their individual paths with each other and mutually exclusive paths are calculated for each robot in the robots’ joint configuration space. To reduce the complexity of planning in the joint configuration space, a lightweight protocol was proposed in [16] where robots iteratively make way for one robot at a time to reach its goal until a consistent set of maneuvers have been determined for all robots to reach their goal. We use a complementary approach in this paper that allows robots to mostly calculate path plans individually in their local configuration space but if a set of robots get within close proximity of each other they use a conflict resolution algorithm to find collision free paths. Path planning in dynamic environments where the cost between the source and goal locations can change abruptly was addressed in [17], [18] using Markov Decision Processes (MDPs). In our proposed approach, path costs are also dynamically updated using the robots’ sensor information, and the updated path costs are used immediately to recalculate the task schedule using an MDP to allow for switching between tasks to reduce the cost of the total path length to visit all the tasks. The problem of finding a suitable ordering of operations or tasks to perform by multiple robots has been researched as the Multi-Robot Task Allocation (MRTA) problem [19], excellent reviews of MRTA are available in [20], [4]. Most of the approaches focus mainly on finding a suitable ordering of tasks while assuming appropriate robot motion planning techniques. Recently, researchers have addressed more tightly coupled task and motion planning under the simultaneous task and motion planning (STAMP) problem. The proposed solu- tion techniques combine symbolic task planning with control based techniques [21], [6], [22] for a mobile manipulation problem where task interdependencies form a critical aspect and reasoning using symbolic task planning is critical to determine the task precedence. In contrast, for our setting, reducing the cost of the task schedule is more critical than the order of tasks, and our algorithm uses probabilistic methods to quickly incorporate robots’ perceptions about the environment into its plan. III. TASK ORDERING WITH PATH UNCERTAINTY We consider a set of wheeled, mobile robots, R, deployed within an environment. Robots are capable of localizing themselves within the environment and can also communicate wirelessly with each other. The environment contains a set of tasks, T . Robots have to visit the locations of tasks to perform operations required to complete the tasks. Each task can require visits by one of more robots to get completed; the information about how many robots are required to complete a task is provided a priori to the robots. We consider tasks that are loosely coupled and all robots required to complete a task do not necessarily need to visit the task’s location at the same time. Each robot is initially aware of the locations of the tasks, but does not know the exact paths between the tasks1 or the obstacles along those paths. To represent this path 1In the rest of the paper, we have referred to task locations as tasks for legibility. uncertainty, each robot uses a task reachability graph (TRG), a fully connected graph with task locations as its vertices. Formally, a TRG is denoted by T RG = (V, E, C, P, t) where: • V (t) = {v(t) i ∪vcurr} is the vertex set and vcurr is the robot’s current location. Each v(t) i corresponds to a task location the robot is aware of at time t • E(t) = {e(t) ij : e(t) ij = (v(t) i , v(t) j )} is the edge set connecting the vertices in the TRG • C(t) = {c(t) ij } is the expected distance or cost expended by a robot to traverse the path underlying edge eij. ccurr,i denotes the expected cost from the robot’s current location (vcurr) to vi • P (t) = {p(t) ij } is the probability that edge e(t) ij is not available Owing to path uncertainties between task locations (TRG vertices), C(t) and P (t) are estimated from perceived sensor data and they get updated by the robot as it discovers obstacles and task availabilities while navigating between tasks. Let S : V →V denote a function that returns an ordering over the set of tasks. Each robot maintains its own TRG and plans its path using its TRG. The problem facing each robot to find a suitable order for visiting the tasks is specified by the Task Ordering under Path Uncertainty (TOP-U) problem below: TOP-U Problem. Given T RG = (V, E, C, P, t) represent- ing the set of tasks, inter-task costs and task availabilities at time t, determine a schedule S∗(V )(t) that induces an ordering (v1, v2, v3...) over the tasks, given by: S∗(V )(t) = arg min S(V )(t) X (vi,vj) ∈S(V )(t)(1 −p(t) ij )c(t) ij subject to: 0 ≤p(t) ij ≤1 X (vi,vj)∈S(V )(t) (1 −p(t) ij )c(t) ij ≤B(t) (1) where B(t) is the battery available to a robot at current time t. Note that S∗(V )(t) represents the path through the TRG with the minimum expected cost, weighted with availability. The second constraint above ensures that the robot is able to complete this path with its currently available battery. Note that {V } can change dynamically for a robot as tasks can get completed by other robots. An instance of the TOP-U problem corresponds to the well-known traveling salesman problem (TSP) [23]. However, solving the conventional TSP might not guarantee an optimal path as edge costs (cij) could change dynamically as robots discover previously unknown obstacles while traveling between tasks, while edge availabilities (pij) could change dynamically because some tasks got completed by other robots. To address the dynamic nature of the problem, we propose a Hidden Markov Model (HMM)-based method to update the edge availabilities, and then use the updated information within an MDP to find the desired ordering of the TRG vertices to solve the TOP-U problem. In the rest of the paper, for legibility, we have omitted the time notation from the TRG parameters, assuming it to be understood from context. A. Dynamically Updating Edge Cost and Availability Edge Cost Update. TRG edge costs correspond to the distance that the robot requires to travel to reach from one TRG vertex to another. Each edge cost is initialized to the Euclidean distance between the pair of TRG vertices forming the edge. However, if there exist previously unknown obstacles in the path between a pair of TRG vertices, then the distance the robot travels might exceed the Euclidean distance between the vertices. To accommodate the path distance uncertainty, the robot uses a probabilistic roadmap(PRM)-based path plan- ner [14] to dynamically update the expected edge cost. The PRM planner works by first generating a set of sampled points R from the robot’s configuration space. It then uses the available information about obstacles perceived by the robot from its current location vcurr to determine path segments that are close to obstacles and might result in collision with high probability; such segments are associated with a high penalty value. Following [14], the TRG edge cost calculations are given by the following steps. 1) Calculate the cost of each path segment that connects any two sampled points (ρ1, ρ2) ∈Qfree ⊆ℜ2 (Qfree is the free space in the environment) as: costρ1, ρ2 = pcoll ρ1,ρ2penalty + (1 −pcoll ρ1,ρ2)dist(ρ1, ρ2), where pcoll ρ1,ρ2 the probability of collision of segment (ρ1, ρ2) based on its distance to perceived obstacles, penalty is an arbitrary large number used to discourage paths that have a high probability of collision and dist(ρ1, ρ2) is the Euclidean distance between ρ1 and ρ2 2) Calculate the physical path ρij corresponding to TRG edge eij = (vi, vj) as a sequence of path segments ρ = (ρ1, ρ2)start...(ρ1, ρ2)end, given by: ρij = arg min ρ X (ρ1,ρ2)∈ρ costρ1, ρ2 s. t.: ρ1start = vi, ρ2end = vj 3) Calculate the expected cost cij for TRG edge eij as the sum of the costs of path segments in path ρij calculated in step 2 above, as: cij = X (ρ1,ρ2)∈ρij costρ1, ρ2 (2) Edge Availability Update. In our scenario, tasks are completed in a distributed manner by different robots and a task (or, TRG vertex for a robot) might get completed by other robots before the robot reaches it. When a task is completed, the last robot visiting the task broadcasts a task completion message to all other robots. Each robot then needs to remove the TRG vertex for the task from its TRG. Because message communication in unstructured environments might be unreliable, task completion message might be lost due to noisy communication, or, the robot broadcasting the message might be outside the communication range of some robots. To handle this uncertainty, it would be useful if the robot could infer whether the task was still available or not, from information related to the task’s availability that it can directly Slice (t+1) MO SO TNA PLL MO SO TNA PLL MO SO TNA P(PLL) P(PLL) 0 0 0 0 0 1 0 1 0 0 1 1 1 0 0 1 0 1 1 1 0 1 1 1 1.0 0 0.663 0.337 0.296 0.704 0.196 0.804 0.788 0.212 0.522 0.478 0.233 0.767 0.155 0.845 P(PLL|MO, SO, TNA) = 0.296 P(PLL|MO, SO, TNA) = 0.788 P(PLL|MO, SO, TNA) = 0.663 Sample inhibition probabilties CPT for PLL MOt P(MOt+1) T 0.3 F 0.001 CPT for MO SOt P(SOt+1) T 0.8 F 0.2 CPT for SO TNAt P(TNAt+1) T 0.98 F 0.02 CPT for TNA to slice (t+2) from slice (t-1) Slice 't' Fig. 2. Temporal Bayesian network used in the HMM for determining the suitability of path length to a task. observe. For our problem, we assume that this observable information related to task availability is the distance or path length remaining to reach the task - if the path length is very large, it could be due to the task becoming unavailable2. One caveat to using the path length as an indicator of task availability is that it is also affected by obstacles along the path; it changes dynamically as the robot encounters obstacles while going towards the task. The problem facing the robot then is to observe the path length values over the recent past and infer from it whether the task is still available. To model this inference problem, each robot uses a Hidden Markov Model (HMM) [24]; one HMM, HMMij, is used to update the availability of each TRG edge eij. The crucial HMM variable is Path Length Long (PLL) that evolves temporally as the robot encounters static obstacles (SO) and mobile obstacles (MO) on its range sensor, or, receives a task completion message called task not available (T NA), as shown in Figure 2. Variables SO, MO and T NA are binary- valued and they too evolve with time as the robot moves towards the task and encounters obstacles, or receives task completion message. The temporal transition model is given in Figure 2 via the arrows moving between dashed boxes, the sample temporal probabilities of these variables are also provided. The dependencies between these variables affecting PLL are captured in each slice of the HMM as shown inside the dashed boxes in Figure 2. Because each of the variables affecting PLL - static obstacles, mobile obstacles and task not available - do not affect each other and can be considered as independent of each other, their probabilistic effect on PLL can be combined relatively easily from the individual inhibition probabilities for these variables using a noisy-OR model. An example noisy-OR based probability calculation for PLL is shown alongside Figure 2. We assume that the environment has a 98% communication success rate, or 2% communication failure rate, leading to the probability value 2Note that when a TRG edge is removed, it can be looked upon as the edge length becoming infinitely large. given in Figure 2. To solve the problem of calculating the probability of a task being still available from PLL values, the robot first calculates the observed value of PLL variable for the current time step. For HMMij, the observed value of variable PLLij for current time step t is determined by assuming that PLL is very large when it is Γ times more than the minimum cost of any edge in the current TRG, as given by the equation below: PLL(t) ij = ( FALSE if c(t) ij ≤Γ min({c(t) ik : ∀k ∈V }) TRUE otherwise (3) where Γ is a user defined constant that is based on system and environment factors such as battery remaining, terrain and navigation conditions. The sequence of values for PLL(1...t) ij is recorded, and used to estimate the probability of the state variable T NA(t) ij , given by p(T NA(t) ij |PLL{1:t} ij ), using the Forward-Backward algorithm [24]. The forward stage is given by the equation: P(X(t+k+1)|PLL{1:t} ij ) = X T NA(t+k) ij  P(X{t+k+1}|T NA(t+k) ij ) P(T NA(t+k) ij |PLL{1:t} ij )  and the backwards stage is given by: P(PLL{k+1:t} ij |X(k)) = X T NA(k+1) ij  P(PLL(k+1) ij )|T NA(k+1) ij ) P(PLL{k+2:t} ij |T NA(k+1) ij )P(T NA(k+1) ij |X(k))  where X(t) is the combination of the set of state variables, MO, SO and T NA at time t, PLL{1:t} ij is set of evidence (PLL observations) from time 1 through t, and T NA(t) ij is the value of the variable T NAij at time t. Finally, to integrate the calculated value of T NAij with the TRG edge eij, we model the task availability as probabilistic availability pij of TRG edge eij. pi,j gets a value 1 when the robot is certain that the task is available and there exists a finite distance path to reach it, and, 0 when the path to reach the task is infinitely large meaning the task is not available; intermediate probabilities represent the uncertainty of the task not being completed by other robots and still remaining available to the robot. pij is calculated by normalizing Equation 3, given by: pij = p(T NAij|PLL{1:t}) P j p(T NAij|PLL{1:t}). (4) The normalization ensures that the robot has a probability 1 of leaving TRG vertex vi through at least one of its incident edges. B. TOP-U Solution using Markov Decision Process Following the update of the edge costs and availabilities, the robot has to select the TRG edge with the minimum Start robot, init TRG Robot stop Update TRG (Alg. 2) Calculate MDP policy (Alg. 2) Plan path (PRM) to task; Navigate towards task Multi-robot collision OR insufficient battery Path cost task OR reached task not available Next task as per MDP policy No collision with other robots Broadcast current location reached T not ask changed OR (Alg. 1) avoidance (Alg. 3) No more tasks Fig. 3. State diagram showing the operation of a robot using the different algorithms proposed. 1 TRGTaskSelect(T RG =< V, E, P, C >) Input: T RG: task reachability graph 2 Build initial PRM roadmap 3 Initialize MDP with current TRG information 4 Determine paths in robots configuration space using PRM planner between all TRG edges eij = (vi, vj) ∈E 5 vcurr ←current position of robot 6 while V is not empty AND battery available for next vertex do 7 v′ ←Next task as per MDP policy 8 path ←PRM path between vcurr and v′ 9 while v′ not reached do 10 Update vcurr using localization system 11 Broadcast vcurr to other robots 12 //avoid collisions w. other robots (Alg. 3) 13 collision ←coordinatePath(T RG, v′) 14 if collision = FALSE then 15 if (taskCompleted message recd. from anothe robot) OR (new obstacle detected in robot’s path) then 16 (v′, path) ←updateTRG(T RG, v′) (Alg. 2) 17 end 18 Move along current segment of path 19 end 20 end 21 Remove v′ from V //reached v′ 22 Communicate completion of task to all other robots 23 end Algorithm 1: Algorithm to select a task in the TRG using an MDP-based policy. 1 updateTRG(T RG =< V, E, P, C >, v′) Input: T RG: task reachability graph; v′: destination TRG vertex Output: v: destination TRG vertex, path: path to destination TRG vertex 2 update V removing completed tasks, if any 3 for (vi, vj) ∈E do 4 path′ ←replan path from vi to vj (PRM) 5 cij ←pathLength(path′) 6 end 7 path ←replan path from vcurr to v′ using PRM-planner 8 Generate observations PLLij for every eij in TRG 9 Update pij using HMM in Eqn. 4 for every eij 10 Update MDP, TRG with new values of V , pij, cij for every eij 11 vnew ←Next task as per updated MDP policy (Egn. 5) 12 if vnew = {∅} then 13 return null; // No more tasks 14 end 15 if vnew ̸= v′ then 16 v′ ←vnew; // Switch tasks 17 path ←PRM path between vcurr and v′ 18 end 19 return v′, path Algorithm 2: Algorithm to update TRG and path when TRG vertices are removed (task completed) or a new obstacle is detected that triggers a path re-calculation. expected cost, weighted with availability to solve the TOP- U problem (Equation 1). Because of the uncertainties in edge costs and availabilities, a Markov Decision Process(MDP) is used to do this. An MDP [24] consists of a set of states, a set of actions to transition between states, along with a probability distribution and reward for each action at each state. The output of an MDP is a policy that prescribes an action at each state, which maximizes the cumulative, expected reward to the robot to reach a desired or goal state from its current state. A more thorough discussion on MDPs and solution techniques is given in [24]. For our TOP-U problem, the TRG’s vertices, V , represent the MDP’s states, the set of actions at each state (TRG vertex) of the MDP correspond to the edges from that TRG vertex, TRG edge availabilities give the transition probabilities between MDP states, while the inverse of the TRG edge costs correspond to the reward for reaching each state in the MDP (lower edge costs corresponds to larger rewards). The policy calculated by the MDP gives the maximum expected reward (or minimum expected cost, weighted by availability) for the robot to visit the TRG vertices. The MDP is solved using the value iteration algorithm, that solves the following equation: U(vi) = c−1 curr,i + γ max eij∈E X vk P(vk|vi, eij)U(vk) (5) where c−1 curr,i is the inverse expected cost from the robots current location to vi, γ is a user-defined, reward-discount factor and P(vk|vi, eij) is the probability that the robot will reach task vk when starting at task vi and attempting to follow the edge eij towards task vj which may or may not be the same task as vk. vk ̸= vj happens if the robot was to attempt going to task vj, but due to obstacles, communications, etc. it determines that it is better to instead go to task vk. The equation for P(vk|vi, eij) is given below, which if vk = vj, the probability is the edge availability, and if vk ̸= vj, then it is the probability of the edge not being available distributed evenly to the remaining tasks. P(vk|vi, eij) = ( 1 −pij if j = k pij |V |−1 otherwise (6) C. Robot Navigation and Multi-robot Path Coordination Al- gorithms The main algorithm used by a robot for selecting tasks to visit is shown in Algorithm 1. The main idea of the algorithm is to select the task, v′, determined by the MDP policy, and plan a path to reach it. If the path results in potential collisions with other robots’ paths, path conflicts are resolved (line 11). Every time the path cost to a task changes due to obstacles, or a task completed message from another robot is received, a TRG update is triggered (line 14). This might result in switching the task the robot is headed to. The robot continues to move towards its currently selected task until it is reached and upon reaching the task its removes its vertex from the TRG and broadcasts task complete message to other robots (lines 16 − 20). The algorithm used to update the TRG is shown in Al- gorithm 2. When a robot’s TRG vertex set or path costs on the TRG change, it calculates a new navigation path to its destination vertex v′ (lines 2 −7) and new edge availability values using its current perception in the HMM (line 8 −9). These updated values are incorporated into the MDP and the MDP’s policy is recalculated to yield the new destination vertex (line 10 −11). If the recalculated policy prescribes a new target vertex, vnewthen the robot performs a task switch and its destination vertex is changed from v′ to vnew (line 15 −17). The algorithm also handles the case where all tasks in a robot’s TRG might get completed by other robots before it reached those tasks; in that case the algorithm returns a null vertex and empty path(lines 12 −13) so that the robot stops. D. Coordinating paths between robots to avoid collisions If robots determine their paths individually using Algo- rithm 1, it could lead to robot collisions when the planned paths of two or more robots intersect with each other. To avoid this scenario, we have used a collision avoidance algorithm shown in Algorithm 3. Each robot uses the locations broadcast by other robots to check if there are other robots within a radius of rcoll, called the collision circle, of itself (lines 2). When a set of robots are within the collision circle of each other, all the robots stop and the robots exchange their identifiers, representing their priorities, with each other. A leader election algorithm called the bully algorithm [25] is then used to select the robot with the highest priority as the winner. The winner robot holds the winner token, which gives 1 coordinatePath Input: v′: destination TRG vertex; T RG: task reachability graph Output: Is the robot currently in a collision 2 if another robot within Oi then 3 stop 4 build/update collision shape 5 if previous winner token released then 6 prio ←robot id 7 end 8 //priority is either robot id or ∞ 9 send/receive priority to/from other robots within Oi 10 select robot with lowest prio in Oi as winner 11 if I was winner, but lost this round then 12 transfer winner token to winning robot 13 end 14 if I am winner then 15 (v′, path) ←updateTRG(T RG, v′) 16 //other robots considered as static obstacles in PRM 17 if v′ = null then 18 //No more valid paths available to the robot 19 prio ←∞ 20 end 21 else 22 Move along current segment of path 23 end 24 if outside Oi then 25 release winner token 26 end 27 end 28 else 29 if All priorities in collision shape are ∞then 30 path ←performJointPlanning(TRG) 31 if path = null then 32 exit FAILURE 33 end 34 Set all robots prio to robot id 35 return TRUE 36 end 37 end 38 return FALSE 39 end 40 return FALSE Algorithm 3: Modified Algorithm to avoid collisions between robots in close proximity of each other. it the right to move (lines 3 −7). All other robots in the collision circle, which do not hold the winner token, remain stationary (line 24 −26). The winner robot uses the PRM planner in conjunction with updating the TRG using Algorithm 2 to find a path to its destination vertex v′. The path returned by the PRM planner is executed and the moving robot releases the winner token once it is outside its collision circle (lines 9, 14−22). If the PRM planner is not able to find a path to the goal, e.g., if the goal is unreachable because there is another robot within the collision circle that is stopped right at the goal 1 2 3 1 2 3 4 5 6 7 (a) (b) Fig. 4. (a) Example collision circle. The boxes represent the robots, the numbers are the robot’s ids, and the circles represnt the circle of size rcoll around the robot. Collision circle for robot 1 is O1 = {1, 2, 3}, (b) Two example collision shapes. The boxes represent the robots, the numbers are the robot’s ids, and the circles represnt the circle of size rcoll around the robot. Collision shape for robot 1 is C1 = {1, 2, 3, 4, 5}, Likewise collision shape for robot 6 is C6 = {6, 7}. location, the moving robot relinquishes its right to move by setting its priority to a high value (∞) (lines 11−12). Another robot from within the set of stopped robots gets a chance to run the bully algorithm and attempts to move. This protocol ensures that at least one robot exits the collision circle with each execution of the bully algorithm, and finally there is only one robot left inside the collision circle. This robot then reverts to using Algorithm 1 to plan its path. IV. THEORETICAL RESULTS In this section, we prove some properties related to our proposed algorithms. A. Task Selection Algorithm Properties Theorem IV.1. Algorithm TRGTaskSelect finds a solution that is admissible, that is, it never overestimates the expected cost to a task calculated using the TRG. Proof: (By Contradiction.) Let us suppose there is another algorithm A′ that selects TRG edges with lower expected cost than Algorithm TRGTaskSelect. Let vj and vj′ (vj ̸= vj′) respectively denote the task (vertex) selected by Algorithm TRGTaskSelect and A′, when the robot is at vertex vi. Using Equations 5 and 6, the MDP in Algorithm TRGTaskSelect (Alg. 2, line 11) calculates the expected cost to reach vj from vi using TRG edge eij as (1 −pij)cij. Algorithm A′, which does not follow the vertex recommendation made by Algorithm TRGTaskSelect, selects a different TRG edge eij′, which has cost cij′ with probability (1 −pij) and TRG edge eij with probability pij. The robot’s expected cost to reach vj′ using Algorithm A′ is then (1 −pij)cij′ + pijcij. Since, by Equation 5, the MDP selects the edge with the minimum cost, it follows that, cij < cij′; let cij′ = cij + ∆, ∆> 0. By our assumption that the expected cost calculated by A′ is lower than that of Algorithm TRGTaskSelect, we get: (1 −pij)(cij + ∆) + pijcij < (1 −pij)cij (7) or, (1 −pij)∆+ pijcij < 0 or, pij > ∆ ∆−cij or, pij > 1 1 −cij ∆ Because, by definition, cij and ∆are both > 0, we get pij > 1, which is not valid as pij ∈[0, 1]. Therefore, the expected cost returned by A′ cannot be lower. Hence proved. Lemma IV.2. The solution found by Algorithm TRGTaskSe- lect remains admissible when there are errors in the edge cost and availability estimates. Proof: Let cij,est, ǫc and pij,est, ǫp denote the estimated values and errors in edge costs and availabilities returned by the PRM and HMM respectively,(using Algorithm 2, lines 7 and 9), where ǫc ∈R, ǫc ≪cij and pij,est ∈[0, 1] and 0 ≤pij + ǫp ≤1. If the true cost and availability of edge eij is denoted by cij and pij respectively, cij,est = cij + ǫc and pij,est = pij + ǫp. For Algorithm TRGTaskSelect to not overestimate edge costs with these estimated values Theorem IV.1 should hold when they are used in Equation 8. That is, (1 −pij,est)(cij,est + ∆) + pij,estcij,est > (1 −pij,est)cij,est. Substituting above values of cij,est and pij,est we get: (pij + ǫp)(cij + ǫc) + (1 −(pij + ǫp))∆> 0 (8) As pij has limited domain of [0, 1], and the above equation is linear in pij, we can check the two boundary points of the domain; if the inequality is satisfied for both boundaries, then it is satisfied for all points inside the domain. Note that then when pij = 0, ǫp ∈[0, 1] and when pij = 1, ǫp ∈ [−1, 0]. Substituting pij = 0 in Equation 8, we get ǫp(cij + ǫc + (1 −ǫp)∆> 0. Since ǫp ∈[0, 1] and cij, ∆> 0, each of the terms on l.h.s. of the last inequality is > 0 and the inequality holds. Similarly, substituting pij = 1 in Equation 8, we get (ǫp + 1)(cij + ǫc) −ǫp∆> 0. Since ǫp ∈[−1, 0] when pij = 0, we substitute the boundary values of ǫp in the last equation; when ǫp = −1, we get ∆> 0 which is valid from the definition of ∆; when ǫp = 0, we get (cij + ǫc) > 0, which is valid, because, by definition ǫ ≪cij. Theorem IV.3. The solution found by Algorithm TRGTaskS- elect is consistent. Proof: Suppose the robot is at vertex vi and vj is the next vertex selected by Algorithm TRGTaskSelect. For consistency property, we need to show that cij < cik + ckj for any vk ̸= vi, vj. We prove by contradiction - suppose cij > cik + ckj. Note that the TRG is a complete graph and vertices vi, vj and vk form a triangle. Consequently, ||eij|| < ||eik||+||ekj||, where ||eij|| is the Euclidean distance between vi and vj. From Section III-A, the path ρij for navigating the robot along TRG edge eij has cost cij; it is found by the path planner and from Equation 2, it is guaranteed to be the minimum cost, collision-free path connecting vi and vj. In other words ∄q ∈Qfree, satisfying cij > ciq + cqj. Because a task location has to be in the free space in the environment, vk ∈Qfree and, so, the last inequality is valid for q = vk. Therefore, cij ≯cik + ckj. Therefore, our assumption was invalid. Hence proved. Theorem IV.4. The solution found by Algorithm TRGTaskS- elect is optimal. Proof: From Theorems IV.1 and IV.3 it follows that the solution found by Algorithm TRGTaskSelect is both admissible and consistent. Therefore, the solution is optimal. Similar to Lemma IV.2, it can be shown that the solution found by Algorithm TRGTaskSelect is consistent and hence optimal, even with errors in path cost estimates. B. Completeness of Coordination Algorithm Next, we analyse the synchronization properties of our proposed multi-robot coordination algorithm to show that it does not give rise to deadlock or livelock conditions between robots, resulting in their inability to move and reach tasks. To facilitate this analysis, we consider the movement of the robots between sets, corresponding to robot states defined by the algorithm. Let dist(ai, aj) denote the Euclidean distance between robots ai, aj ∈R. We define the collision circle for robot ai as Oi = {aj : aj ∈R, dist(ai, aj)≤rcoll}, where rcoll is the distance away from robot ai that we consider robots to be in immediate risk of collision. An example of a collision circle can be seen in Figure 4(a), where robot 1 has robots 2 and 3 in its collision circle. This means that O1 = {1, 2, 3}. We next define the concept of a collision shape. A collision shape is the group of all robots that are either in each others collision circle, or, through sharing collision circles with other robots, can reach the collision circle of another robot without exiting any overlapping collision circles. As shown in Figure 4(b), robot 3 is not in the collision circle of robot 1, however it is in the same collision shape as robot 1, because by traveling through the collision circle of robot 2, robot 1 can reach the collision circle of robot 3. On the other hand, robot 6 is not in the collision shape of robot 1 because there is no way to move into the collision circle of robot 6 without leaving collision circles. To help define the collision shape, we first define a recursive helper set H[n] i , corresponding to robot ai ∈R at recursive step n, as H[n+1] i = H[n] i S j∈H[n] i Oj and H[0] i = Oi. As n →∞, H[n] i becomes the set of all robots in the current collision shape. We can now define a collision shape as: Definition 1. Let H[n] i denote a helper set of robot ai, as defined above. Then collision shape Ci = limn→∞H[n] i The collision circle and collision shape of a robot get updated as the algorithm proceeds. We use O(t) i and C(t) i to denote the collision circle and collision shape of robot i at round t respectively. We now define three subsets of C(t) i used in our analysis. • W (t) i ⊆C(t) i : The set of all robots in the collision shape of robot ai waiting for their turn to move • L(t) i ⊆C(t) i : The set of all robots in the collision shape of robot ai that are allowed to move • S(t) i ⊆C(t) i : The set of all robots in the collision shape of robot ai that have surrendered their right to move We also define a set X(t) i as the set of all robots that were originially in the collision shape of robot ai, but have since left. Formally, X(t) i = St j=0 C(j) i \C(t) i . The elements of the above sets are mutually exclusive and the subsets fully partition the set C(t) i , in other words the following properties hold: W (t) i ∩L(t) i = {∅}, W (t) i ∩S(t) i = {∅}, L(t) i ∩S(t) i = {∅}, W (t) i ∪L(t) i ∪S(t) i = C(t) i , and X(t) i ∩C(t) i = {∅} Lemma IV.5. If during round t, there are robots currently waiting to be allowed to move, then in the next round, there has to be at least one robot that is selected for movement. That is, if W (t) i ̸= {∅} then L(t+1) i ̸= {∅}. Proof: Line 10 selects one element aj from each O(t) i where aj ∈Ci for membership in L(t) i . The robot selected can be a member of one of two sets, aj ∈L(t) i or aj ∈W (t) i . If aj ∈L(t) i , then L(t+1) i ̸= {∅}, because aj will remain in L(t) i . And if aj ∈W (t) i , then because aj ̸= ∅in the next time step, L(t+1) i = L(t) i ∪{aj}, which means L(t+1) i ̸= {∅}. Therefore, we can conclude that if W (t) i ̸= {∅}, then in the next time step L(t+1) i ̸= {∅}. Lemma IV.6. coordinatePath deadlocks when no robots are allowed to move and there are no robots available to be selected for movement. This occurs when L(t) i = W (t) i = W (t−1) i = {∅}, and C(t) i ̸= {∅}. Proof: As, C(t) i = {∅} is the termination case, C(t) i ̸= {∅} is a necessary condition. When the robots deadlock, that means that for the rest of time, no robots are allowed to move. Or, mathematically, ∃t′ such that L(t) i = {∅} ∀t ≥t′. By Lemma IV.5, this implies that W (t) i = {∅} ∀t ≥t′ −1. This implies that for each round t that is in deadlock L(t) i = W (t) i = W (t−1) i = {∅}. Theorem IV.7. Algorithm coordinatePath does not deadlock. We analyse the operation of the collision avoidance algo- rithm as a method to move robots between two sets, Ci and Xi. Initially, all robots in Ci are placed in Wi, and from Wi, movements through Si and Li are possible until the movement into Xi is possible. Based on the above descriptions, a deadlock can only occur when no robot is allowed to move, meaning that no winners have been selected, and there are no available waiting robots to become winners. Proof: By Lemma IV.6, coordinatePath deadlocks when L(t) i = W (t) i = W (t−1) i = {∅} and C(t) i ̸= {∅}. As C(t) i ̸= {∅}, and C(t) i = L(t) i ∪W (t) i ∪S(t) i , this implies that C(t) i = S(t) i . Line 29 tests for this condition when all robots remaining in the collision shape have a priority of ∞. This causes joint configuration space planning to be called, which is guaranteed to find paths for all robots, if such paths exist, or a failure when paths do not exist, but does not deadlock. If paths do exist, all of the robots in the suspended state S(t) i are transitioned into the waiting state W (t+1) i , on line 34. In other words S(t+1) i = {∅} and W (t+1) i = S(t) i . The conditions for deadlock given in S W L X Fig. 5. State diagram for Algorithm coordinatePath. Lemma IV.6 no longer holds as W (t+1) i ̸= {∅}. Definition 2. A livelock occurs for algorithm coordinatePath when there is no robot in the collision shape that is able to reach the sink state X through a finite set of state transitions across the W, L, and S states. Lemma IV.8. Any cycle in the state transition graph that contains state S cannot be a livelocking cycle Proof: Once a robot enters state S, there are only two scenarios, stay in S or exit state S by going only to state W. In the first case, there will come a time where either all robots have entered S. This triggers a joint planning, which is guaranteed to not result in livelock. In the second case, the only way a robot in S is allowed to reach W is if another robot makes a transition from L to X. This implies, from the definition of livelock, that algorithm coordinatePath is not in livelock, because robots outside of the coordinate path algorithm can make progress towards their tasks uninhibited. Lemma IV.9. There are only a finite number of cycles or paths through the state graph in Figure 5 that could feasibly result in a livelock scenario Proof: Based on Lemma IV.8, any cycle that moves through S can be eliminated from the list, which leaves only two possible states to move through. Then based on the properties of finite state machines, all cycles which result in multiple visits of the same state in a row can be collapsed into the equivalent cycle visiting that state only once. As such, this leaves us with only three potential cycles, W, L, and W →L. Lemma IV.10. It is not possible for robots to eternally stay in state L. Proof: All robots in L must remain moving and may not remain stopped for any time not required for planning or sensing. The paths returned by PRM are the shortest path in the roadmap to the goal possition. Based on these two facts, the robot must either move out of the collision shape, or enter a collision with another robot causing its transition either into W or S depending on if the collision was with a higher priority robot, or if the collision caused all feasible paths for the robot to become blocked. Lemma IV.11. It is not possible for robots to eternally cycle between sets W to L and back to W. Proof: Transitions from W to L occur when a robot is the highest priority robot in its collision circle O. Transitions from L back to W occur when a robot loses its position as the highest priority robot in its collision circle, when it was moving. The robot would still be able to move towards its goal if the higher priority robot was not there. Also, consider the worst case scenario, where, as each robot selected from W to get into L, moves and encounters a robot of higher priorty, causing it to transition from L back to W. Even assuming that L contains only one robot at a time, as the set of robots is finite, and all robot ids are unique inside W or L, there must exist at least one robot with highest priority, which, once entering L, can not be moved back to W as there is no other higher prority robot to prevent its movement. This highest priority robot only has two destination states that it can reach, S or X; either of which breaks the loop W −L−W. Therefore, a scenario where a robot cycles between W −L −W states cannot exist. Lemma IV.12. It is not possible for robots to eternally stay in the state W Proof: This would be considered a deadlock state. For robots to stay in W, there must be a higher priority robot inside their collision circle that is selected to move to L. All members of L must remain in motion, otherwise they become members of S, and no longer the highest priority robot in the collision circle, causing a new robot from W to be selected. Robots in L can not remain perennially inside the collision circle as the path planner PRM generates a path to goal. If a feasible path is found by the planner, the robot will leave the collision circle by transitioning from L to X. On the other hand, if a feasible path cannot be determined, the robot enters S. Theorem IV.13. Algorithm coordinatePath does not result in livelock. Proof: Livelock is when the robots are still able to move, however they make no progress towards completing their goals. To do this, the robots must remain in states that allow motion, but never reach their goal locations (tasks), or exit their collision shapes. Figure 5 shows all of the possible transitions between states of the coordinatePath algorithm. Through inspection of this figure, we can determine that livelock occurs when the robots follow any cycle in the graph without any robots entering state X. This provides the following livelock possibilities for the robots: 1) Stay in W 2) Stay in L 3) Cycle between states states W and L Lemma IV.10 shows that the option 2 cannot exist. Lemma IV.11 shows that option 3 cannot exist. And lemma IV.12 shows that option 1 cannot exist. Based on this, we can conclude that coordinatePath does not livelock. V. EXPERIMENTAL RESULTS To verify our approach, we have run several experiments on the Webots robot simulator using a model of the Coroware Corobot robot, as well as on physical Corobot robots. The TABLE I MAPPING BETWEEN ENVIRONMENT PARAMETERS AND LOAD VALUES Tasks Robots Visits Load 5 3 1 1.67 10 3 1 3.33 5 1 1 5 10 3 2 6.67 10 1 1 10 15 3 2 10 15 1 1 15 15 3 3 15 0 5 10 15 20 0 5 10 15 Switching Replans Load TRG CFNU (a) (b) 0 5 10 15 20 4 6 8 10 12 14 16 Switching replans Load TRG CFNU (c) (d) Fig. 6. Average number of replans by each robot which (a) results in a task switch, and (b) does not result in a task switch, using the TRG and CFNU approaches for different values of T, R and V is. Values on the x-axis are arranged in increasing orders for average robot task load L. (c) Switching replans single robot case (d) Non-switching replans single robot case Corobot is a 4-wheel 30cm × 30cm, skid-steer robot which we control in a differential wheel manner. It is equipped with an indoor Stargazer localization system [26] which provides 2D location and heading, Hokuyo laser distance sensor which provides a 270◦range up to 5m in distance at a resolution of 1 3 ◦, and wireless communications. The on-board computer has an 1.6GHz Intel Atom 330 processor with 4GB of RAM. The simulator provides a Gaussian distributed noise to sensor and actuator motions. The environments have a size of 10 × 10 m2; obstacles were placed at different locations within the environment. The number of tasks (T ) was varied over 5, 10 and 15. The task locations were selected in a way that if there were a single robot in the environment following a closest first task selection strategy, it would cause the robot to switch tasks for 50% of the replans it does. When there are multiple robots, one of them is selected arbitrarily and placed at a location that would effect the aforementioned 50% task switching for replans. Any other robots are placed randomly in the envi- ronment, while keeping an even distribution. For the different settings we have used, the robot positions are the same for all runs in an environment. The number of robots (R) used were 1 and 3 . For each task a certain number of robots had to visit it to complete it; the variable for visits per task (V is) was varied over 1, 2, and 3. To present our results in a concise manner, we have used a parameter representing each robot’s average task load, given by L = V is×T R . The different combinations of T, R and V is, and corresponding values of L used for our experiments are given in Table I. User-defined constants were set to γ = 0.8 (Discount rate of MDP in Equation 5) and Γ = 1.5 (PLL parameter in Equation 3), unless otherwise stated. Probability values used in the HMM were similar to those given in Figure 2, which were based on average times for a robot for avoiding static and mobile obstacles within the environments used for the experiments. All results were aver- aged over 10 simulation runs. We have compared the quality of task ordering performed by our proposed technique with an approach where the task ordering is done using CFNU [27] - each robot selects the task that has the least cost on its TRG from its current location, without modeling uncertainties in the inter-task paths or updating TRG edge availabilities. The CFNU algorithm switches tasks when another task is closer to it than the currently selected task, as measured using straight line distance. To enable comparison, both task ordering approaches use the same underlying path planner and multi-robot coordination mechanism, when necessary. We have reported three metrics for quantifying the performance of the algorithms - the distances traveled by the robots, the number of replans (with and without task switches) and the times (planning and locomotion) taken to visit all tasks. Also, to understand only the performance of our MDP-based task selection method, we have reported the metrics separately for single robot scenarios, where |R| = |V is| = 1. That is,there is only 1 robot that has to visit all tasks once, and edge availability is affected only by previously unknown obstacles; effects due to communication uncertainties of TaskComplete messages sent by other robots do not arise. Figures 6(a) and (b) show the number of replans made by each robot, resulting in and not resulting in task switches respectively for the two algorithms for the different load values shown in Table I. We observe that, on average, the TRG-based approach results in 40% less replanning and 61% less task switching than the CFNU approach. The reduced planning and task switching by the TRG-based algorithm can be attributed to its ability to reason more efficiently about task availabilities using its costs and beliefs about paths in the MDP based approach, along with real-time sensor data incorporated into its decisions using the HMM. In contrast, the CFNU approach uses only Euclidean distances to select tasks and consequently performs poorly. In Figures 6(c) and (d), the number of replans resulting in and not resulting in tasks switches are shown for the single robot cases. We see that the switching replans are the same for both approaches because there are no other robots in the environment which could complete the task before the robot reaches it first. We also see that the TRG performs fewer non-switching replans. In Figures 7(a) and (b), we show the average time taken for both approaches, this includes path planning and task ordering times, and, locomotion times for each robot, which includes time taken to resolve inter-robot path conflicts using Algorithm 3. The TRG-based approach takes much less time (a) (b) (c) (d) Fig. 7. Average times taken by TRG and CFNU approaches. Top row is for multi-robot scenarios, bottom row is for single robot scenarios. The left column shows time for planning paths and solving the MDP for the TOP- U problem. The right column shows time taken traveling between tasks and performing collision avoidance. Experiments performed for different values of T, T and V is. Values on the x-axis are arranged in increasing orders for average robot task load L. (a) (b) Fig. 8. Percent of replans resulting in a task switch (a) all cases (b) single robot case for both planning and navigation compared to the CFNU approach. In this case, the TRG-based approach requires up to 51% less planning time, and, up to 75% lower locomotion and coordination times than the CFNU approach. This is because the TRG-based approach accounts for both the known obstacles between tasks and the likelihood that the task will become unavailable. The CFNU approach behaves myopically and selects the closest task to visit, which could be on the other side of a large obstacle and require considerable planning and locomotion times to reach. In contrast, the TRG-based approach uses the robots perception of the environment to weight the path costs to tasks with the corresponding path belief to reduce the overall path costs. Note that when the number of tasks is small, or the average task load per robot is close to 1, both algorithms have comparable performance for all three metrics as each robot has to visit only one task and there is no task ordering required. Figures 7(c) and (d) show the average time taken by TRG and CFNU approaches in the single robot case as the average robot load increases. We can see that our TRG approach takes less planning time compared to the CFNU approach. We also observe that as the average task load L of the robots increase (from left to right on the x-axis), the distances traveled by the robots increases. The robots using the TRG- based approach travel distances between −16% (more) to 32% (less) than the CFNU approach, with an average improvement of about 6% (less distance traveled) across all experiments. The TRG-based approach sometimes travels a small amount more than the CFNU approach when the robot decides to abandon its current task for another task. In that case, the CFNU approach will switch as soon as the other task becomes closer, which in some cases is the best decision, where as the TRG-based approach will continue to follow its previous task even though another task is closer. In some cases this might be the best thing to do because the closer task might be on the other side of a wall that the robot has yet to discover and actually require more distance to explore enough of the wall to realize that the task is no longer the closest; TRG performs better in such scnearios. In some of the environments with a larger load value, the TRG-based approach starts to perform better. We also tested our proposed algorithm on physical Corobot robots. We used the environment shown in Table II (left) where the white dots represent the tasks that the robot must visit. This environment was also designed such that a CFNU closest- task-first algorithm would switch 50% of the time. We tested for a fixed value of Γ = 5.0, and tested for |R| = {1, 2}, V is = {1, 2}, and |T | = 5. Results were averaged over three runs, and are shown in Table II (right). In both environments, the robots were able to navigate and visit the tasks with the required number of visits. In comparing these results to those shown in Figure 6, we can see that the hardware performed fairly similar. For example, Table II shows that the five tasks, one robot, one visit environment had four switching replans, which is within the margin of error for the switching replans. VI. CONCLUSIONS AND FUTURE WORK In this paper, we introduced the TOP-U problem where robots have to determine the order to visit a set of task locations when the path costs between a pair of tasks can vary dynamically as the robot discovers obstacles while navigating between tasks. We proposed a data structure called a task reachability graph (TRG) along with techniques to integrate the task ordering with uncertainty in path costs and availabil- ities. Our analytical results show that our proposed algorithm is optimal and complete. The algorithm’s performance was also evaluated extensively through experiments and was shown to result in reduced time and fewer computations (less task switching) as compared to an algorithm that does not consider uncertainty in path costs and task availabilities. As future work, we are considering analyzing situations with stricter constraints such as a partial ordering over the task set. In the present work, multi-robot coordination is handled using a light-weight coordination mechanism where all robots, except one, stop. We are investigating techniques to integrate tighter, but fast, coordination approaches to improve the coordination of robots, such as exchanging path plans and calculating a plan TABLE II OVERHEAD VIEW OF ENVIRONMENT USED FOR TESTING WITH 2 COROBOT ROBOTS AND 5 TASKS; WHITE DOTS REPRESENT TASK LOCATIONS (LEFT) AND EXPERIMENTAL RESULTS Rob. Vis. Dist. Traveled (cm) # Switches # Non-switches Plan Time (s) Navigation Time (s) 1 1 2102 ± (349.94) 4 ± (0) 0.67 ± (1.15) 613.96 ± (144.54) 1415.96 ± (238.87) 2 1 1130.45 ± (86.92) 2.83 ± (0.58) 2.5 ± (1.32) 835.6 ± (269.11) 746.86 ± (48.66) 2 2 6118.95 ± (2689.5) 4.5 ± (1.5) 3.17 ± (2.52) 760 ± (372.83) 3921.43 ± (1704.98) in the joint configuration space, only when robots get within close proximity of each other. REFERENCES [1] R. Zlot and A. Stentz, “Market-based multirobot coordination for complex tasks,” I. J. Robotic Res., vol. 25, no. 1, pp. 73–101, 2006. [2] W. Lenagh, P. Dasgupta, and A. Mu˜noz-Mel´endez, “A spatial queuing- based algorithm for multi-robot task allocation,” Robotics, vol. 4, no. 3, pp. 316–340, 2015. [3] S. Rutishauser, N. Correll, and A. Martinoli, “Collaborative coverage us- ing a swarm of networked miniature robots,” Robotics and Autonomous Systems, vol. 57, no. 5, pp. 517–525, 2009. [4] G. A. Korsah, A. Stentz, and M. B. Dias, “A comprehensive taxonomy for multi-robot task allocation,” I. J. Robotic Res., vol. 32, no. 12, pp. 1495–1512, 2013. [5] G. Wagner and H. Choset, “Subdimensional expansion for multirobot path planning,” Artif. Intell., vol. 219, pp. 1–24, 2015. [Online]. Available: http://dx.doi.org/10.1016/j.artint.2014.11.001 [6] I. Sucan and L. Kavraki, “Accounting for uncertainty in simultaneous task and motion planning using task motion multigraphs,” in IEEE International Conference on Robotics and Automation, St. Paul, May 2012, pp. 4822–4828. [7] H. Choset, W. Burgard, S. Hutchinson, G. Kantor, L. Kavraki, K. Lynch, and S. Thrun, Principles of Robot Motion: Theory, Algorithms, and Implementation. MIT Press, June 2005. [8] R. Bohlin and L. Kavraki, “Path planning using lazy prm,” in ICRA, vol. 1, 2000, pp. 521–528. [9] S. Lavalle, “Rapidly-exploring random trees: A new tool for path planning,” Tech. Rep., 1998. [10] J. Gammell, S. Srinivasa, and T. Barfoot, “Informed rrt*: Optimal sampling-based path planning focused via direct sampling of an admissible ellipsoidal heuristic,” in 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, IL, USA, September 14-18, 2014, 2014, pp. 2997–3004. [Online]. Available: http://dx.doi.org/10.1109/IROS.2014.6942976 [11] K. Hauser, “Lazy collision checking in asymptotically-optimal motion planning,” in Robotics and Automation (ICRA), 2015 IEEE International Conference on, May 2015, pp. 2951–2957. [12] C. Voss, M. Moll, and L. Kavraki, “A heuristic approach to finding diverse short paths,” in Robotics and Automation (ICRA), 2015 IEEE International Conference on, May 2015, pp. 4173–4179. [13] M. Kneebone and R. Dearden, “Navigation planning in probabilistic roadmaps with uncertainty,” in International Conference on Automated Planning and Scheduling, 2009. [Online]. Available: http://aaai.org/ocs/index.php/ICAPS/ICAPS09/paper/view/717/1113 [14] P. Missiuro and N. Roy, “Adapting probabilistic roadmaps to handle uncertain maps,” in Robotics and Automation, 2006. ICRA 2006. Pro- ceedings 2006 IEEE International Conference on, 2006, pp. 1261–1267. [15] V. Desaraju and J. How, “Decentralized path planning for multi-agent teams with complex constraints,” Autonomous Robots, vol. 32, no. 4, pp. 385–403, 2012. [16] R. Luna and K. Bekris, “Efficient and complete centralized multi-robot path planning,” in International Conference on Intelligent Robots and Systems, IROS 2011, 2011, pp. 3268–3275. [17] B. Marthi, “Robust navigation execution by planning in belief space,” in Proceedings of Robotics: Science and Systems, Sydney, Australia, July 2012. [18] S. Loibl, D. Meyer-Delius, and P. Pfaff, “Probabilistic time-dependent models for mobile robot path planning in changing environments,” in 2013 IEEE International Conference on Robotics and Automation, Karlsruhe, Germany, May 6-10, 2013, 2013, pp. 5545–5550. [19] B. Gerkey and M. Mataric, “A formal analysis and taxonomy of task allocation in multi-robot systems,” The International Journal of Robotics Research, vol. 23, no. 9, pp. 939–954, 2004. [Online]. Available: http://ijr.sagepub.com/content/23/9/939.abstract [20] M. B. Dias, R. Zlot, N. Kalra, and A. Stentz, “Market-based multirobot coordination: A survey and analysis,” Proceedings of the IEEE, vol. 94, no. 7, pp. 1257–1270, July 2006. [21] L. Kaelbling and T. Lozano-P´erez, “Integrated task and motion planning in belief space,” International Journal of Robotics Research, vol. 32, no. 9-10, 2013. [22] J. Wolfe, B. Marthi, and S. Russell, “Combined task and motion planning for mobile manipulation,” in International Conference on Automated Planning and Scheduling, Toronto, Canada, 2010. [23] T. Cormen, C. Leiserson, R. Rivest, and C. Stein, Introduction to Algorithms. MIT Press, June 2009. [24] S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach. Prentice Hall, June 2009. [25] H. Garcia-Molina, “Elections in a distributed computing system,” IEEE Trans. Computers, vol. 31, no. 1, pp. 48–59, 1982. [26] Stargazer from hagisonic inc. [Online]. Available: http://eng.hagisonic.kr/cnt/prod/prod010102?uid=11$&$cateID=2$&$fieldName=$&$or [27] B. Woosley and P. Dasgupta, “Multirobot task allocation with real-time path planning,” in FLAIRS Conference, 2013, pp. 574–579.