Reactive Integrated Mission and Motion Planning Alireza Partovi, Rafael Rodrigues da Silva1 and Hai Lin Abstract— Correct-by-construction manipulation planning in a dynamic environment, where other agents can manipulate objects in the workspace, is a challenging problem. The tight coupling of actions and motions between agents and complexity of mission specifications makes the problem computationally intractable. This paper presents a reactive integrated mission and motion planning for mobile-robot manipulator systems op- erating in a partially known environment. We introduce a multi- layered synergistic framework that receives high-level mission specifications expressed in linear temporal logic and generates dynamically-feasible and collision-free motion trajectories to achieve it. In the high-level layer, a mission planner constructs a symbolic two-player game between the robots and their environment to synthesis a strategy that adapts to changes in the workspace imposed by other robots. A bilateral synergistic layer is developed to map the designed mission plan to an integrated task and motion planner, constructing a set of robot tasks to move the objects according to the mission strategy. In the low-level planning stage, verifiable motion controllers are designed that can be incrementally composed to guarantee a safe motion planning for each high-level induced task. The proposed framework is illustrated with a multi-robot warehouse example with the mission of moving objects to various locations. I. INTRODUCTION In the recent years, there has been an increased interest in using autonomous manipulator robots in factory automation [1]. In a warehouse, for instance, manipulator mobile robots are tasked to perform sequences of pick-up/drop-off objects and meanwhile are expected to have a collision-free path planning in the operating workspace. Exhibiting such a complex behavior requires integration of a high-level mission planner to synthesize long-term strategies, and a motion plan- ner to generate feasible motion trajectories concerning the robot’s dynamic model and workspace structure. Seamlessly combining these layers, however, leads to challenging class of planning problems [2], [3]. A key challenge is to adapt the task and motion planning with a change of the workspace structure. Transitionally, manipulation planning is performed in a highly structured environment that allows using pre-computed motion trajec- tories [4]. In the multi-robot manipulator systems, however, each robot may change the workspace structure by carrying the objects to different locations which may create infeasible motion trajectories and fails the pre-computed task plans. The partial support of the National Science Foundation (Grant No. CNS-1446288, ECCS-1253488, IIS-1724070) and of the Army Research Laboratory (Grant No. W911NF- 17-1-0072) is gratefully acknowledged. The authors are with the Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, 46556 USA. apartovi@nd.edu, rrodri17@nd.edu, hlin1@nd.edu 1 The second author would like to appreciate the scholarship support by CAPES/BR, BEX 13242/13-0 The robots, therefore, at the high-level planning are required to fulfill the mission objective by reactively adapt the strategy concerning other robots behavior, and at the low-level motion planning, to ensure a collision-free movement in a dynamic workspace. Another important challenge is the complexity of defining the mission objectives. In the multi-robot manipulation plan- ning with a complex mission, including a detailed description of each robot’s task in the mission specification can be tedious [3]. This demands a correct-by-construction design that the user only expresses the mission requirement over the objects’ location of interest, and the robots synthesize the necessary detailed tasks and movements to fulfill it. This paper aims to develop a correct-by-construction integrated mission and motion planning framework for robotic ma- nipulator system with mission objective expressed in linear temporal logic. Integrated mission and motion planning in a dynamic envi- ronment are demonstrated in other mobile robot applications. Authors in [5], [6], develops a framework for sensor-based temporal logic motion planning where the reactive mission planner responds to changes in the environment. However, they are limited to reach-avoid specifications. Various works leveraged AI approaches for high-level task planning [2], [7]–[9]. AsYmov [2] combines Proba- bilistic Roadmaps Methods (PRM) with a heuristic-search task planner based on metric-FF. However, these heuristic search algorithms are limited, because it ignores the geo- metric details in the symbolic layer. Thus, Hierarchical Task Networks (HTN) is extended in several related approaches [7]–[9] with shared literals to control backtracking between task and motion layer. For example, Hierarchical Task in the Now (HTN) [7] introduces an aggressive hierarchical approach which takes long-term choices first and commits them in a top-down fashion to reduces the amount of search required. Nonetheless, these works do not have a reactive high-level planning to adapts with a dynamic environment. Recently, there is an interest to combine formal approaches to the robot manipulation problem. In the work [10], it is presented an approach which combines Satisfiability Modulo Theories (SMT) in the discrete layer with sampling-based motion planning to achieve probabilistic completeness. It proposes a notation-independent task language to incorporate action effects. In [3], the task is specified in LTL formulas allowing richer expressiveness. These methods also do not consider changes in the object locations due to actions from the environment. Furthermore, the collision-free aspect of motion planning for moving obstacles is not considered. These obstacles appear in heterogeneous multi-robotic sys- arXiv:1711.02201v1 [cs.RO] 6 Nov 2017 tems since other robots can be seen as moving obstacles to each other. In this paper, we leverage a broad range of methods from literature to develop a reactive integrated mission and motion planning framework for multi-robot manipulation tasks. Our proposed framework has a hierarchical structure that accepts an LTL mission specification over the desired location of objects. At the high-level, the mission planner considers uncontrollable robots in the workspace as an adversary players and synthesizes a strategy that adapts to changes in the workspace enforced by uncontrollable robots. In the synergistic layer, according to a mission road-map, a robot manipulation plan is automatically synthesized. It encodes an integrated task and motion planning to a Bounded Satisfiability Checking (BSC) [11] specified by Bounded Prefix LTL formulas which guarantee dynamically feasible trajectories in the known dynamic workspace. BSC models consist of temporal logic with arithmetic terms rather than transition systems; thus, it allows integration of geometric details. During execution of the synthesized tasks, we employ safe motion primitive verified in Differential Dynamic Logic (dL) [12]. dL language can model non-deterministic hybrid systems to guarantee safety formally, considering delays, of local motion planning algorithms such that the robot does not actively collide with obstacles observed by the sensors. Our main contribution lies in two folds: adapting reactive synthesis in the mission planner that enables robots to respond to change of workspace enforced by uncontrollable agents such human or other uncontrollable robots; designing a synergistic layer that automatically constructs robot manip- ulation tasks ensuring dynamically-feasible and collision-free motion trajectories. The remaining of this paper is organized as follows. The background and preliminaries are introduced in Section II which follows with the problem formulation in Section III. Section IV presents the reactive integrated task and motion planning layer. The reactive mission planning is addressed in Section V. A warehouse manipulation example is presented in Section VI. The paper is concluded in Section VII. II. PRELIMINARIES In this section, we introduce the preliminary terminology and notations that are used throughout this paper. A. K-Bounded Prefix LTL with Arithmetic Temporal Terms (LTLK) We express the specification of an integrated task and mo- tion planning based on Counter Linear Temporal Logic Over Constraint System CLTLB(D) defined in [13] and Bounded Linear Temporal Logic defined in [14]. This language is interpreted over arithmetic and boolean terms S ∶= V ∪Q, where V is a set of continuous variables such that a variable v ⊆V at instant k is real-valued v[k] ∈R, or integer- valued v[k] ∈N; and Q is a set of boolean state variables such that a state variable q ⊆Q at instant k is boolean- valued q[k] ∈{⊥,⊺}. Furthermore, arithmetic terms v ∈V are valuated using temporal operators, this valuation is named arithmetic temporal term ϕ such that the valuation function I ∶ϕ × [1..K] ↦R × N. For short, we name LTLK for the K-Bounded Prefix LTL with Arithmetic Temporal Terms. Definition 1 (Arithmetic Temporal Term). A K-bounded prefix arithmetic temporal term ϕ is defined as: ϕ ∶∶= s ∣◯ϕ, where s ⊆S such that s ∈R or s ∈N; and ◯stands for the next operator, i.e. I(◯r,k) = r[k+1]. The terms are interpreted using a labeling function L ∶ DV×DQ ↦2Π, where Π is a finite set of atomic propositions. Each atomic proposition π ∈Π is associated to a boolean term or to a linear arithmetic predicate over arithmetic temporal terms. An atomic proposition π ∈Π associated to a boolean term JπK ∶= {q ∈Q} holds true at instant k if and only if q[k] holds true. On the other hand, an atomic proposition π ∈Π associated to a linear arithmetic predicate over arithmetic temporal terms JπK ∶= {h⊺ϕ & c} holds at instant k true if and only if h⊺I(ϕ,k) &c holds true, where & is a relation operator (& ∈{≤∣<∣=∣>∣≥}), ϕ is a vector of n arithmetic temporal terms such that I(ϕ) ∈RnR × NnN and n = nR + nN, h ∈Rn and c ∈R. Therefore, a LTLK formula is defined as below. Definition 2 (Formula). A LTLK formula φ is defined as: φ ∶∶= π ∣¬φ ∣φ1 ∧φ2 ∣◯φ ∣φ1Uφ2, where π ∈Π is a atomic proposition, ◯stands for usual next operator, and U stands for usual until operators. Each formula defines a set of infinite words ρ over 2Π, i.e. ρ ∈(2Π)ω is an infinite word. A K-bounded prefix is a sequence ξK = s[0]s[1]s[2] ...s[K] and its word is ρK = L(s0)L(s[1])L(s[2])...L(s[K]). We consider a K- bounded prefix fragment of LTL language, meaning that we check if a formula φ defines a K-bounded prefix. When this formula defines a K-bounded prefix at instant k, we denote ρK ⊧k φ. If the infinite path ρ is K-loop, then ρK ⊧0 φ will imply that φ defines the infinite word ρ [15]. Thus, we enforce that ρ is a K loop by assuming a loop in the last state s[K] to define the following semantics. Definition 3 (Semantics). The semantics of a LTLK formula φ at an instant k ∈[0..K] is as follow: ●ρK ⊧k π ⇐⇒π ∈2Π ∧π ∈L(v[k]), ●ρK ⊧k ¬φ ⇐⇒ρ ⊭K k φ, ●ρK ⊧k φ1 ∧φ2 ⇐⇒ρ ⊧K k φ1 ∧ρ ⊧K k φ2, ●ρK ⊧k ◯φ ⇐⇒ρ ⊧K k φ ∨k ∉[1..K −1], ●ρK ⊧k φ1Uφ2 ⇐⇒ ⎧⎪⎪⎨⎪⎪⎩ ∃i ∈[k..K] ∶ρ ⊧K i φ2∧ ∀j ∈[k..i −1] ρ ⊧K j φ1 . Remark: I(◯r,K) = r[K]. Based on the grammar in Def.2, it can also use others common abbreviations, including: standard boolean, such as ⊥, ⊺, ∨and →, ◇φ that stands for ⊺Uφ, and it means that φ eventually holds before the last instant (included), ◻φ that stands for ¬ ◇¬φ, and it means that φ always holds until the last instant, Last that stands for ◯⊥holds true only at last instant K. LTLK formula is used to specify properties over finite runs. In order to define properties over infinite words, we use LTL formulas. The semantic of an LTL formula is similar to LTLK but for infinite bound. The language of an LTL formula ψ over variable V is defined as L(ψ) = {ω ∈ (DV)ω∣ω ⊧φ}. For further details on LTL model checking and it’s semantic, the reader can refer to [16]. III. PROBLEM FORMULATION To motivate this work, we provide a warehouse robotic example that will be used through this paper. A. Motivating Example We consider fully actuated ground mobile robots and, in this example, we used a Dublin’s vehicle model, ˙x = f(x,u), y = Hx, (1) where x = (px,py,θ,v,ω) ∈DX ⊆R5 is the state, u = (a,α) ∈DU ⊆R2 is the control input, f(x,u) = (v cosθ,v sinθ,ω,a,α), H = diag(1,1,1) ∈R3,5, and y ∈ DY ⊆R3 is the output. The states describe the robot positions p = (px,py), orientation θ and velocities v = (v,ω), and the outputs describe the robot configuration y = (px,py,θ). The translational acceleration a is bounded a ∈[−B,A], and the translational v and angular ω are linked by the nonholomonic constraint −˙px sinθ + ˙py cosθ = 0. We call x(t) as the state trajectory and y(t) as the output trajectory such that the robot dynamic model is under the initial condition x(0) = x0. Now consider a warehouse layout depicted in Figure 1, with having manipulator robots R1, R2, and R3. We consider robot R1 as our controllable system and other robots as part of the environment. The user does not know precisely the layout map, although, the following abstracted scenario features are known to him: the robots initial positions R1H and R2H; the scanning area W1 and W2, where the objects are sorted and placed by robot R2; and the transportation locations W3 and W4 where objects are supposed to be dropped off. Robot R1 mission is to move the unloaded objects to the transportation places W3 and W4, and meanwhile avoid collision with other robots and any unexpected moving obstacles. This manipulation mission, at the high-level requires an strategy to adapt with robot R2 behavior, and at the low-level, needs motion planning based on the warehouse layout and other moving robots. Fig. 1: Warehouse layout and abstracted features. B. Reactive Mission and Motion Planning We first, give a general overview of the integrated mission and motion planning framework depicted in Figure 2. Con- sider a group of manipulator robots R = {R1,...,R∣R∣} and a workspace W ⊂R2 that contains a finite set of obstacles, and movable objects. In the workspace there exist a set of locations of interest for the objects to be placed, and it is assumed each location is large enough to contain one and only one object. The mission requirement for the robots is to move the objects in the workspace to the desired locations. We formalize this problem in a hierarchical structure with a reactive mission planner and an integrated task and motion planning layers. The location of objects and robots positions constructs a space V with a bounded domain DV, that will be called task space in the rest of the paper. The objects and robots’ location of interest in the task space are abstracted as discrete states in the mission planning layer. Let the set of all mission states be M = {m1,...,m∣M∣}, where mi ∈ DV, and they are mutually exclusive ⋂i∈[1..∣M∣] mi = ∅. This abstraction helps the user to be able to declaratively define the mission specification over location of objects. For instance, in the warehouse example, object o1 eventually be at transportation area W2. The transition between these states is an integrated task and motion planning problem for a manipulator robot that can be designed and verified off- line. We call these transitions, bounded-time tasks that are specified with LTLK formulas. A bounded-time task can be seen as robot’s skill on moving freely or carrying an object to a specific location in the workspace. These skills are designed for a set of motion primitives that jointly satisfies the bounded-time tasks. Motion primitives itself are reactive hybrid controllers that induce collision-free and dynamically- feasible motion trajectories in the described workspace, even when the robot meets an unexpected moving obstacles with maximum velocity VB′. Considering M as a set of nodes and the transition between them as an edge set Γ, we can construct a mission graph, capturing all feasible integrated task and motion planning for the given task space valuation, i.e M. We consider a set of robots in the workspace as our controllable system and other robots as the system’s environment which their behaviors are uncontrollable but known to the system. The mission graph can be characterized as a turned-based two-player game arena with node set M that is partitioned to system states Ms, and environment mission states Me, meaning the system takes actions at mission state m ∈Ms, and, otherwise, it is the environment robots turn. The edge set also is partitioned as Γ = Γs ⋃Γe representing the integrated task and motion planning specifications that can be realized by the system and its environment. With these ingredients, the reactive mission planning problem is defined as synthesis of system robots winning strategy in the mission graph that satisfies the mission specification. Problem 1 (Reactive Mission and Motion Planning). Given a high-level mission Ψ, a set of mission states M, the behavior of the environment abstracted by Me and Γe, a set of manipulator robots R with dynamic model (1), an initial state for robot x0, a workspace W, a set of static obstacles CB, a maximum velocity for moving obstacles VB′, synthesize a mission strategy which ensures Ψ and yields trajectories that are collision-free and dynamical feasible in W. Reactive Mission Planner Discrete Mission Abstraction Gm Task and Motion Planner LTL Mission Specification Ψ Environment Tasks (Me,Γe) Robot Model Workspace W,CB Online Motion Planner Collision-free dynamically-feasible trajectory satisfying mission specification Fig. 2: Integrated mission and motion planning framework. IV. REACTIVE TASK AND MOTION PLANNING The transition between two mission states γi ∈Γ is implemented as a task and motion planning which must ensure collision-free, dynamically-feasible trajectories, and actions to manipulate objects. This task and motion planning considers the known workspace to synthesize a robust plan for a local motion planning which avoids collision with unexpected obstacles during runtime. Hence, we call it Reactive Integrated Task and Motion Planning (RITMP). The robots workspace W is a full-dimensional polytope defined in a two dimensional Euclidean space W ⊂R2 which specifies a fixed Cartesian frame where objects can move. A configuration space Yi of a robot Ri is defined by yi = (px i ,py i ,θi), where p ∈R2 and θ ∈(−π,π] are the robot position and orientation in the workspace, respectively. This workspace has a set of known static and rigid obstacles B = {B1,...,B∣B∣}, which by considering a circular shaped robot, C-obstacle open convex polytopes regions CBi ∈CB can be precomputed such that CBi = {yi ∈Yi ∶Ri(yi(t)) ∩ Bi = ∅}. Besides moving, the robot can execute some tasks such as grasping and placing movable objects which changes the workspace. Task domains typically use a variety of notations defined over states and actions [17], [18]. We model these actions into LTLK specifications using a notation- independent task domain. Definition 4 (Task Language). A task language is a set of strings of actions defined by the tuple T ∶= ⟨V,A,E,v0,vf⟩, where, ●V is a set of state variables (v1,...,v∣V∣) ∈DV which is the conjunction of robot configuration state space (v1,...,v3) = (px,py,θ) ∈DY and non-motion state space (v3,...,v∣V∣) ∈DVt, ●A is a set of actions a ∈A, ●E ⊆(DV × DA × DV) is the set of symbolic tran- sitions e ∈ E denoting pre(a) aÐ→ eff(a), where a ∈A is the operator, pre(a),eff(a) ⊆DV are the precondition and effect set. We specify pre(a) and eff(a) at instant k by linear arithmetic pred- icates ψpre,a(v[k]) and ψeff,a(v[k],v[k+1]), respec- tively, such that v[k] ∈pre(a) ≡ψpre,a(v[k]) and v[k+1] ∈eff(a) ≡ψeff,a(v[k],v[k+1]), ●v0,vf ∈DV is the initial and final condition of the task state variables. We study the problem of synthesizing a hybrid control system which generates collision-free output trajectories of (1) to guarantee task requirements T in a partially known workspace with moving obstacles. Definition 5 (Collision-free Dynamically Feasible Trajec- tory). A robot output trajectory yi(t) is called collision-free dynamically feasible trajectory if the following conditions hold. 1) It does not leave the workspace: yi(t) ∈W, 2) It does not collide with known static obstacles B: Ri(yi(t)) ∩B = ∅; 3) It does not collide with any unexpected obstacle B′ moving with velocity up to VB′ detected by sensors: Ri(yi(t)) ∩B′ = ∅, 4) It is dynamically feasible: ∀t ∈R≥0 there exist u(t) ∈ DU such that ˙x = f((x),u) and x(0) = x0. A valid task plan should generates a continuous output trajectory yi(t) which is feasible in the dynamic model (1) and does not collide with any obstacle or other agent in the workspace. Definition 6 (Valid Task Plan). A task plan T i = {a[1],...,a[K]} for the robot Ri ∈R is valid if the following conditions hold. 1) Any robot trajectory yi(t) generated with local mo- tion planning following the plan T is a collision-free dynamically feasible trajectory, 2) The plan T i satisfies the task language Ti, i.e. v[0] = v0, v[K] = vf, ψpre,a[k](v[k]) and ψeff,a[k](v[k],v[k+1]). The integrated task and motion planning is solved hi- erarchically with two layers: offline discrete planning and continuous planning. First, the discrete planning finds an integrated task and motion plan ξK ∶= ⟨T ,Y ,P ⟩for the known obstacles, including the objects placed in workspace, where T = {a[1],...,a[K]} is a task plan, Y and P forms a motion plan composed by an ordered set of target robot configuration Y ∶= {y[1],...,y[K]} and a collision free tunnel in the workspace P ∶= {P[1],...,P[K]}. A tunnel P is a sequence of open and bounded convex polytopes P[k] ∈P which there exist at least one feasible trajectory x(t) to the target y[k] generated by the action a[k]. We encode the discrete problem with LTLK formulas that are solvable by an SMT solver such as Z3 [19] using linear arithmetic constraints. Second, a continuous trajectory x(t) is generated for this tunnel using a local motion planning such that y(t) is passive safe for dynamic environment with moving obstacles, meaning that the robot stops before a collision happens. This safety property is verified offline and abstracted to the discrete planner using safety verification with Differential Dynamic Logic dL [12]. In the following subsections we will present the discrete and continuous planning methods. A. Motion Primitives We aim to address the extreme computation of combining logic with dynamical constraints via compositional verifi- cation and synthesis. The Continuous Planner consists of a library of certified, reactive motion primitives in the form of a tunnel P in the workspace and target configurations Y in the robot output space. These primitives are certified and re- active as it provides an online collision avoidance algorithm by taking into account sensor readings when synthesizing continuous trajectories. For a robot with dynamic model (1), a passive safety property can be proved using Differential Dynamic Logic (dL) [20], which can be implemented with local motion planning such as Dynamic Window Approach (DWA) [21]. Furthermore, all free configuration space DY is feasible for this robot, and a local optimization can find a feasible trajectory in a convex polytope. Lemma 1. There exist a collision-free trajectory between an initial y0 to a target configuration y∗inside a convex polytope P using Dynamic Window Approach (DWA) if the following conditions hold true: ●P is obstacle free, or there exist moving obstacles inside this polytope which will eventually move away from the robot path, ●the initial state x0 is safe, meaning that the robot can stop before a collision, ●the initial configuration y0 is inside P. Proof. The system (1) can be feedback linearized as a double integrator ¨˜x = ˜u [22]; thus, a trajectory to any position (px,py) inside an obstacle free polytope P can be found by local optimization. Furthermore, if the robot is stopped, it can rotate to any orientation θ. Therefore, if the initial state x0 is safe, and the initial y0 and the target configuration y∗ are inside an obstacle free convex polytope P, then the DWA can find a feasible trajectory. We ensure that a non-convex tunnel P of obstacle free configuration space is feasible using DWA by requiring that this tunnel is composed by a sequence of convex polytopes P ∶= {P[1],...,P[K]} such that P[k] ∩P[k+1] ≠∅. This condition can be specified in the following LTLK formula. φsafe(W,CB) ≡◻[ ⋀ i∈[1..nw F ] wi∧ ⋀ i∈[1..∣CB∣] ⋁ j∈[1..nb,i F ] (bi,j ∧◯bi,j)], (2) where wi ≡h⊺ w,i[px,py]⊺≤cw,i, bi,j ≡h⊺ b,i,j[px,py]⊺≤ cb,i,j, hw,i ∈R2 and cw,i ∈R are facet i ∈[1..nw F ] parameters of the polytope W, and hb,i,j ∈R2 and cb,i,j ∈R are facet j ∈[1..nb,i F ] parameters of the polytope CBi. Lemma 2. If a task and motion plan ξK satisfies φsafe, it will generate a collision free trajectory y(t) for a given workspace W with known static obstacles B. Proof. A polytope P is obstacle free inside the workspace W if it is inside of at least one of the half-plane of each C-obstacle region CBi ∈CB. Furthermore, two se- quent polytopes P[l] and P[l+1] intersects with each other (P[l] ∩P[l+1] ≠∅) if and only if there exist a point that is inside both polytopes. Therefore, a obstacle free polytope is formed by one of active hyper-plane (a hyper plan which bi,j holds true) for each obstacle CBi and the workspace hyper-planes. Moreover, sequent polytopes has non-empty intersection because for each active obstacle hyper-plan is active for next robot position, i.e. (bi,j ∧◯bi,j). Local motion planning, such as DWA, not only efficiently computes trajectories, but also is robust to unknown obsta- cles. However, dynamic obstacles can turn these trajectories unsafe before the robot can react due delays such as sensors sampling time and computation. Hence, we design a mini- mally invasive safety supervisor which ensure that the robot stops before the collision occurs. This supervisor gets data from radar sensor to detect when the collision is imminent to take over and stop the robot. A similar strategy is used in other collision avoidance controllers in the literature [23]. Definition 7 (Safe Motion Primitive). A safe motion prim- itive is hybrid control system with two modes of operation: inactive and override. The dynamic model is defined in (1) for a circular shaped robot with radius Ds. The moving obstacle B′ dynamic model is unknown, but its maximum velocity ∥vB′∥≤VB′ is known, and its closest point p∗ B′ ∈B′ to the robot is detected by a radar sensor. In the drive mode, an over-the-shelf local motion planning generates the robot trajectory when safe ≡∥p −p∗ B′∥∞> v2 2B + VB′(ϵ + v+Aϵ B ) + ( A B +1)( A 2 ϵ2 +ϵv)+Ds holds true, where ϵ is the maximum reaction delay. Otherwise, the mode override stops the robot (a = −A). This supervisor is synthesized offline and abstracted to the high-level discrete planner using Differential Dynamic Logic dL [12] verification. Lemma 3. Any state trajectory x(t) of a Safe Motion Primitive stops before the collision, i.e. φpf ≡(v = 0) ∨ (∥p −p∗ B′∥> v2 2B + VB′ v B + Ds), if φpf holds true initially. Proof. In [20], it was proved with KeYmaera that the translational acceleration a ensures the invariance of φpf if a ∈[−B,A] if safe holds true, otherwise a = −B. Moreover, (v = 0) ∨(∥p −p∗ B′∥> v2 2B + VB′ v B + Ds) →(v = 0) ∨∀pB′ ∈ B′ (∥p −pB′∥> Ds); thus, the robot will never actively collide with a moving obstacle. B. High-Level Discrete Planner The High-Level Discrete Planner shown in Algorithm 1 synthesizes a sequence of motion primitives ξK ∶= ⟨T ,Y ,P⟩, which each action a[k] ∈T is parametrized by a target configuration y[k] ∈Y and a feasible convex polytope P[k] ⊆W in the workspace. A feasible output space satisfies the LTLK formula φsafe. Moreover, besides moving, the robot can also realize some tasks that change the environment which are specified in the task language T . Algorithm 1 Discrete Planner Algorithm 1: function ITMP(φ,Kmax) 2: K ←1 ▷Initialize the horizon. 3: Satus ←UNSAT 4: while Status = UNSAT ∧K ≤Kmax do 5: (Status, ξK) ←Check(K,φ) 6: K ←K + 1 7: end while 8: return Status, ξK 9: end function The task language is encoded in LTLK formula φT over state variables V ∪a, where the variable a ∈Z is a state variable defining the motion primitive taken at each time instant in the discrete plan. This formula states that the primitive must be always valid (1 ≤a ≤∣A∣) and every time that a primitive is selected, its preconditions and effects must hold true (ψpre,ai(v) ∧ψeff,ai(v,◯v)). Since at last time instant in the discrete plan we do not select any primitive, we ignore their constraints at this instant, i.e. ¬Last. φT (v0,vf) ≡(v = v0) ∧◻(Last →v = vf) ∧◻[1 ≤a ≤∣A∣∧ ⋀ i∈[1..∣A∣] φ[i] A ] φ[i] A ≡((a = i) ∧¬Last) →(ψpre,ai(v) ∧ψeff,ai(v,◯v)). (3) Therefore, the task language is encoded in the formula φT , and the workspace W and known static obstacles B to the formula φsafe. Theorem 1. For a circular shaped robot with dynamic model (1), given an initial state x0, a task language T , a workspace W, a set of static obstacles B and a maximum velocity for moving obstacles VB′, a task and motion plan ξL which satisfies φsafe ∧φT generates a valid task plan T i. Proof. φsafe ensures that a collision free trajectory exists for a robot with dynamic model (1) in the workspace W with static obstacles B, i.e. condition 1, 2 and 4 in Def. 5. Moreover, all primitives implement the minimally invasive safety supervisor; thus, they are passive safe for moving obstacles with a known maximum velocity VB′, i.e. condition 3 and 4 in Def. 5. Finally, φT ensures the action-change behaviors, i.e. condition 2 in Def. 6. Therefore, any discrete task and motion plan which satisfies φsafe∧φT will generate a valid task plan. V. REACTIVE MISSION STRATEGY SYNTHESIS The solution of task and motion planning problem pro- vides a set of safe motion primitives that jointly satisfy a bounded-time task specification. The mission specification, however, can be defined over more complex scenario and possibly requires an infinite sequence of bounded-time tasks to be accomplished. We consider the controllable robot team, Rs ⊆R, and the environment robots as Re ∶= R/Rs. The mission planning problem can be characterized to a reactive synthesis formalism by constructing a two-player game arena, denoted as mission graph Gm with having Re and Rs as the players. A two-player symbolic turned-based game arena is a directed graph which models a reactive system interacting with its environment [24]. It defines over variable set V that at each turn either system or environment takes an action and updates the vtask space V. The mission graph is a tuple Gm = (V,Γ,∆), where Γ is finite set of actions, and ∆is predicate over V ∪Γ ∪V′ defining the transition relation in Gm. Let’s assume there exist an auxiliary variable with domain σ ∈V,Dσ = {1,2} defines which player’s turn is to take an action. Let’s denote player- 1 as environment robots Re with a set of state Me = {m ∈ DV∣m∣σ = 1}, and player-2 as system robots Rs with set of states Ms = {m ∈DV∣m∣σ = 2}. Let’s also define state set M = Me∪Ms, A run ω = m0m1 ... is a sequence of states mi ∈M where all pairs (si,a,s′ i+1) ⊧∆for i ≥0 and a ∈Γ. A transition between mission states (m,m′) ∈M is an integrated task and motion problem defined over a task language T = ⟨V,A,E,m,m′⟩. Here the aim is to construct an LTLK task specification φ, over the task language such that there exist a valid task and motion plan that satisfies φ with initial task space m and final task space m′. Given mission states and the all the corresponding task languages, the mission planning problem is to construct the mission graph, and then find a strategy for Rs player, that satisfies the mission specification. Definition 8 (Strategy). A strategy for player-σ is a partial function Sσ ∶DV ∗⋅Dσ V →Γ such that for all prefix of runs ending in player-σ’s state, {r.mσ ∈DV ∗∣mσ ∈Dσ V}, if A(mσ) ≠∅, choose a successor state from A(mσ), that (mσ,Sσ(r,s),m′) ∈∆holds, where m′ is a prime of mσ. A mission game with winning condition is (Gm,Ψinit,Ψ), where Ψinit is a predicate over V specifies initial state of the game, and Ψ is LTL formula specifies mission specification for Rs. Consider Sσ be a strategy for player-σ, and m ∈ M, the outcomes(S1,S2,m) is all the runs in the form of m0m1⋯that m0 = m, and (mi,Sσ(m0⋯mi),mi+1) ∈ ∆. A strategy S2 is winning if for all strategies of player−1, S1, and all the states that satisfies the initial conditions, m ⊧Ψinit, the outcomes(S1,S2,m) ⊆L(Ψ). An LTL specification Ψ is called realizable in Gm with initial condition ψinit if and only if there exist a winning strategy S2. We call the winning strategy S mission strategy. The mission strategy synthesis is designed in two steps. First, we construct a mission graph that captures all the feasible transitions over mission states for a known workspace. The next step is to find a winning strategy over mission graph that enforces the mission specification. A. Mission Graph Mission graph is a symbolic turned-based two-player game arena Gm = (V,Γ,∆), defined over task space V. The variable states set is M with partitions of environment state Me = {me 1,...,me ∣Me∣} and system states Ms = {ms 1,...,ms ∣Ms∣}. Let’s define variable α ∈{s,e} and ¯α = {e,s}/α for simplicity on notations. Let succ(m) ⊆M be set of all mission state that a transition from state m ∈M is feasible. The action set Γ = Γe ⋃Γs, and the transition relation ∆= ∆e ⋃∆s can be partitioned to the environment and system players, where ∆α = V ⋃Γα ⋃V′. We assume the tuple Ge = (V,Γe,∆e) is pre-designed and known to the mission planner. The system’s action set, Γs = ⋃γi, is designed over a bounded-time task specification φi that at ms i ∈Ms, can enforce task space to me j ∈succ(ms i). If there exist such φi, we use a symbolic representation of it as an action, denoted by a function symb ∶φ →γ , and add transition tuple (ms i,γi,me j) to ∆s. This procedure is presented in Algorithm 2. Algorithm 2 Mission Graph Synthesis Algorithm Input: Ms,Ge,W,CB,Kmax Output: Gm = (V,Γ,∆) 1: i,j ←1 ▷Initialize the counter. 2: while i < ∣Ms∣do 3: while j ≤∣Me∣do 4: φk ←φsafe(W,CB) ∧φT (ms i,me j) 5: (Status,ξK) ←ITMP(φk,Kmax) 6: if Status = SAT then 7: γi ∶= Symbl(φk) ▷symbolic action for φi. 8: Γs ←Γs ∪γi 9: ∆←∆∪(mi,γi,mj) 10: end if 11: j ←j + 1 12: end while 13: i ←i + 1 14: end while 15: return Gm = (V,Γs ⋃Γe,∆⋃∆e) The synthesized Gm has a deterministic system’s tran- sition function ∆S, since for any ms ∈Ms and all me ∈succ(ms k), the corresponding task specification, φk ∶= φsafe ∧φT (ms,me), if exist, is unique. The environment transition function, ∆e, however can be non-deterministic. B. Mission Strategy Synthesis Construction of Gm is a bottom-up approach that provides an abstracted model of all robots behavior for the mission planning layer. The mission planning, on the other hand, is top-down design over mission graph that synthesis a finite mission strategy model that satisfies a given mission specification. Synthesis of mission strategy can be converted to a standard problem of solving symbolic two-players game [24]. Proposition 1 (Existence of Mission Strategy [24]). Given initial configuration of task space described by predicate Ψinit over task space V, and let mission specification Ψ be an LTL formula defined over mission state M, mission strategy S exist if the system player has a winning strategy in (Gm,Ψinit,Ψ) . If there exist a mission strategy, a finite-state transducer can be synthesized that accepts all the runs induced by S [25]. Let Gm S be mission strategy transducer that satisfies the mission specification, L(Gm S ) ⊆L(Ψ), it is important that the induced motion trajectory from Gm S be feasible in the workspace. Theorem 2 (Correctness of Mission Strategy). L(Gm S ) in- duces collision-free and dynamically-feasible trajectories. Proof. We proof by construction that all the runs over mission states that induced by strategy model Gm S generates collision-free and dynamically-feasible trajectories as it is defined in Def 5. Let a run be ω ∈L(Gm S ), mission states m = ωi and m′ = ωi+1, for i ≥0 and {m,m′} ∈M. We can define a task language T = ⟨V,A,E,m,m′⟩based on Def. 6 and construct an LTLK task specification by using Theorem 1, as φ = φsafe ∧φT (m,m′). Since m′ ∈succ(m) and by definition of Gm given in Algorithm 2, we know there exist a valid task plan that it’s integrated task and motion plan, ξL, satisfies φ. Hence, by Def. 6 the induced trajectories are collision-free and dynamically-feasible. VI. SIMULATION We present results of our approach in simulation to evalu- ate an automatic synthesis of controllers for a multi-robotic system in a dynamic environment with a moving obstacle. The synthesis of task and motion planning is encoded to an SMT problem and solved with Z3 [19] for Pioneer P3- DX Robots 1 equipped with gripper and a laser sensor. We simulate the execution of the generated controllers with MobileSim 2. Our evaluation is illustrated in a warehouse scenario presented in Sec. III-A where a set of location of objects L = {W0,...,W4} is given as in Fig. 3a. A task and motion plan is generated offline for all (ms i,λi,me j) ∈∆s. For example, Fig. 3b, 3c and 3d shows a plan for moving an object in the scanning area W1 to the transportation area W4. The blue line is a continuous output trajectory y(t) for the robot R1 which is generated at runtime considering the laser sensors such that ensures a task and motion plan synthesized offline and no collision with a priori unknown obstacles moving up to 0.8m/s. This robot crosses with the robot R3 trajectory, the green line in Fig. 3c and 3d, while moving to 1http://www.mobilerobots.com/ResearchRobots/ PioneerP3DX.aspx 2http://robots.mobilerobots.com/MobileSim/ download/current/README.html (a) Pre-defined locations. (b) Pick Up (c) Drop Off (d) Go Home Fig. 3: An execution of a task and motion plan to fetch an object at W0 to W2. drop the object off and returning home. At this moment, this robot automatically finds a collision-safe detour. Hence, the designed controller is reactive not only in the mission layer but also in the motion layer. Consequently, this controller is robust to unexpected changes in the environment. VII. CONCLUSION In this paper, we addressed integrated mission and motion planning problem for robotic manipulator systems operating in a partially known environment. The main advantages of the proposed approach are in two-folds: a) The high-level mission specification can be defined declaratively on desired robots and objects locations. The framework synthesis the required robot task to move the objects according to the mission states; b) The framework offers reactivity in both motion planning layer to handle unknown moving obstacles, and in the mission planning to adjust the robot’s strategy with respect to change of environment enforced by others robots operating in the workspace. REFERENCES [1] P. R. Wurman, R. D’Andrea, and M. Mountz, “Coordinating hundreds of cooperative, autonomous vehicles in warehouses,” AI magazine, vol. 29, no. 1, p. 9, 2008. [2] S. Cambon, R. Alami, and F. Gravot, “A hybrid approach to intricate motion, manipulation and task planning,” The International Journal of Robotics Research, vol. 28, no. 1, pp. 104–126, 2009. [3] K. He, M. Lahijanian, L. E. Kavraki, and M. Y. Vardi, “Towards manipulation planning with temporal logic specifications,” in Robotics and Automation (ICRA), 2015 IEEE International Conference on. IEEE, 2015, pp. 346–352. [4] C. Belta, A. Bicchi, M. Egerstedt, E. Frazzoli, E. Klavins, and G. J. Pappas, “Symbolic planning and control of robot motion [grand challenges of robotics],” IEEE Robotics & Automation Magazine, vol. 14, no. 1, pp. 61–70, 2007. [5] H. Kress-Gazit, T. Wongpiromsarn, and U. Topcu, “Correct, reactive, high-level robot control,” IEEE Robotics & Automation Magazine, vol. 18, no. 3, pp. 65–74, 2011. [6] J. Alonso-Mora, J. A. DeCastro, V. Raman, D. Rus, and H. Kress- Gazit, “Reactive mission and motion planning with deadlock reso- lution avoiding dynamic obstacles,” Autonomous Robots, pp. 1–24, 2017. [7] L. P. Kaelbling and T. Lozano-P´erez, “Hierarchical task and motion planning in the now,” in Robotics and Automation (ICRA), 2011 IEEE International Conference on. IEEE, 2011, pp. 1470–1477. [8] ——, “Integrated task and motion planning in belief space,” The International Journal of Robotics Research, p. 0278364913484072, 2013. [9] M. Gharbi, R. Lallement, and R. Alami, “Combining symbolic and geometric planning to synthesize human-aware plans: toward more efficient combined search.” in Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on. IEEE, 2015, pp. 6360– 6365. [10] N. T. Dantam, Z. K. Kingston, S. Chaudhuri, and L. E. Kavraki, “Incremental task and motion planning: a constraint-based approach,” in Proceedings of robotics: science and systems, 2016. [11] M. Pradella, A. Morzenti, and P. S. Pietro, “Bounded satisfiability checking of metric temporal logic specifications,” ACM Transactions on Software Engineering and Methodology (TOSEM), vol. 22, no. 3, p. 20, 2013. [12] A. Platzer, Logical analysis of hybrid systems: proving theorems for complex dynamics. Springer Science & Business Media, 2010. [13] M. M. Bersani, A. Frigeri, A. Morzenti, M. Pradella, M. Rossi, and P. S. Pietro, “Bounded reachability for temporal logic over constraint systems,” in Temporal Representation and Reasoning (TIME), 2010 17th International Symposium on. IEEE, 2010, pp. 43–50. [14] T. Latvala, A. Biere, K. Heljanko, and T. Junttila, “Simple bounded ltl model checking,” in International Conference on Formal Methods in Computer-Aided Design. Springer, 2004, pp. 186–200. [15] A. Biere, A. Cimatti, E. Clarke, and Y. Zhu, “Symbolic model checking without bdds,” Tools and Algorithms for the Construction and Analysis of Systems, pp. 193–207, 1999. [16] Clarke, O. Grumberg, and D. A. Peled, Model checking. MIT Press, 1999. [17] E. Erdem, E. Aker, and V. Patoglu, “Answer set programming for collaborative housekeeping robotics: representation, reasoning, and execution,” Intelligent Service Robotics, vol. 5, no. 4, pp. 275–291, 2012. [18] S. Srivastava, E. Fang, L. Riano, R. Chitnis, S. Russell, and P. Abbeel, “Combined task and motion planning through an extensible planner- independent interface layer,” in Robotics and Automation (ICRA), 2014 IEEE International Conference on. IEEE, 2014, pp. 639–646. [19] L. De Moura and N. Bjørner, “Z3: An efficient SMT solver,” in Tools and Algorithms for the Construction and Analysis of Systems. Springer, 2008, pp. 337–340. [20] S. Mitsch, K. Ghorbal, D. Vogelbacher, and A. Platzer, “Formal verification of obstacle avoidance and navigation of ground robots,” arXiv preprint arXiv:1605.00604, 2016. [21] D. Fox, W. Burgard, S. Thrun et al., “The dynamic window approach to collision avoidance,” IEEE Robotics & Automation Magazine, vol. 4, no. 1, pp. 23–33, 1997. [22] G. Oriolo, A. De Luca, and M. Vendittelli, “Wmr control via dynamic feedback linearization: design, implementation, and experimental val- idation,” IEEE Transactions on control systems technology, vol. 10, no. 6, pp. 835–852, 2002. [23] U. Borrmann, L. Wang, A. D. Ames, and M. Egerstedt, “Control barrier certificates for safe swarm behavior,” IFAC-PapersOnLine, vol. 48, no. 27, pp. 68–73, 2015. [24] S. Moarref, Compositional reactive synthesis for multi-agent systems. University of Pennsylvania, 2016. [25] M. Y. Vardi, “An automata-theoretic approach to linear temporal logic,” in Logics for concurrency. Springer, 1996, pp. 238–266.