arXiv:1008.3760v1 [cs.RO] 23 Aug 2010 Formal-language-theoretic Optimal Path Planning For Accommodation of Amortized Uncertainties and Dynamic Effects✩ I. Chattopadhyaya,∗, A. Casconea, A. Raya aDepartment of Mechanical Engineering, The Pennsylvnaia State University, University Park, 16802 Abstract We report a globally-optimal approach to robotic path planning under uncertainty, based on the theory of quantitative measures of formal languages. A significant generalization to the language-measure-theoretic path planning algorithm ν⋆is presented that explicitly accounts for average dynamic uncertainties and estimation errors in plan execution. The notion of the navigation automa- ton is generalized to include probabilistic uncontrollable transitions, which account for uncertainties by modeling and planning for probabilistic deviations from the computed policy in the course of execution. The planning problem is solved by casting it in the form of a performance maximization problem for probabilistic finite state automata. In essence we solve the following optimization problem: Compute the navigation policy which maximizes the probability of reaching the goal, while simultaneously minimiz- ing the probability of hitting an obstacle. Key novelties of the proposed approach include the modeling of uncertainties using the concept of uncontrollable transitions, and the solution of the ensuing optimization problem using a highly efficient search-free com- binatorial approach to maximize quantitative measures of probabilistic regular languages. Applicability of the algorithm in various models of robot navigation has been shown with experimental validation on a two-wheeled mobile robotic platform (SEGWAY RMP 200) in a laboratory environment. Keywords: Language Measure, Probabilistic Finite State Machines, Robotics, Path Planning, Supervisory Control 1. Introduction & Motivation The objective of this paper is to report a globally-optimal ap- proach to path planning under uncertainty, based on the theory of quantitative measures of formal languages. The field of tra- jectory and motion planning is enormous, with applications in such diverse areas as industrial robots, mobile robot navigation, spacecraft reentry, video games and even drug design. Many of the basic concepts are presented in [1] and in recent com- prehensive surveys [2]. In the context of planning for mobile robots and manipulators much of the literature on path and mo- tion planning is concerned with finding collision-free trajecto- ries [3]. A great deal of the complexity in these problems arises from the topology of the robot’s configuration space, called the C-Space. Various analytical techniques, such as wavefront ex- pansion [4] and cellular decomposition, have been reported in recent literature [5], which partition the C-Space into a finite number of regions with the objective of reducing the motion planning problem as identification of a sequence of neighboring cells between the initial and final (i.e., goal) regions. Graph- theoretic search-based techniques have been used somewhat successfully in many wheeled ground robot path planning prob- lems and have been used for some UAV planning problems, typ- ically radar evasion [6]. These approaches typically suffer from ✩This work has been supported in part by the U.S. Army Research Labora- tory and the U.S. Army Research Office under Grant No. W911NF-07-1-0376 and by the Office of Naval Research under Grant No. N00014-08-1-380 ∗Corresponding Author complexity issues arising from expensive searches, particularly in complicated configuration spaces. To circumvent the com- plexity associated with graph-based planning, sampling based planning methods [7] such as probabilistic roadmaps have been proposed. However, sampling based approaches are only prob- abilistically complete (i.e. if a feasible solution exists it will be found, given enough time) but there is no guarantee of find- ing a solution within a specified time, and more often than not, global route optimality is not guaranteed. Distinct from these general approaches, there exist reported techniques that explic- itly make use of physical aspects of specific problems for plan- ning e.g. use of vertical wind component for generating optimal trajectories for UAVs [8], feasible collision-free trajectory gen- eration for cable driven platforms [9], and the recently reported approach employing angular processing [10]. 1.1. Potential Field-based Planning Methodology Among reported deterministic approaches, methods based on artificial potential fields have been extensively investigated, of- ten referred to cumulatively as potential field methods (PFM). The idea of imaginary forces acting on a robot were suggested by several authors inclding [11] and [12]. In these approaches obstacles exert repulsive forces onto the robot, while the tar- get applies an attractive force to the robot. The resultant of all forces determines the subsequent direction and speed of travel. One of the reasons for the popularity of this method is its sim- plicity and elegance. Simple PFMs can be implemented quickly Preprint submitted to Elsevier August 26, 2018 and initially provide acceptable results without requiring many refinements. [13] has suggested a generalized potential field method that combines global and local path planning. Potential field based techniques have been also successfully employed in multi-robot co-operative planning scenarios [14, 15], where other techniques prove to be inefficient and impractical. While the potential field principle is particularly attractive because of its elegance and simplicity, substantial shortcom- ings have been identified that are inherent to this principle. The interested reader is referred to [16] for a systematic criticism of PFM-based planning, where the authors cite the underlying dif- ferential equation based analysis as the source of the problems, and the fact that it combines the robot and the environment into one unified system. Key problems inherent to PFMs, indepen- dent of the particular implementation, are: 1. Trap situations due to local minima: Perhaps the best- known problem with PFMs are possible trap-situations [11, 17], which occur when when the robot runs into a dead end, due to the existence of a local extrema in the po- tential field. Trap-situations can be remedied with heuris- tic recovery rules, which are likely to result in non-optimal paths. 2. No passage between closely spaced obstacles: A severe problem with PFMs occurs when the robot attempts to travel through narrow corridors thereby experiencing re- pulsive forces simultaneously from opposite sides, leading to wavy trajectories, no passage etc. 3. Oscillations in the presence of obstacles: Presence of high obstacle clutter often leads to unstable motion, due to the complexity of the resultant potential. 4. Effect of past obstacles: Even after the robot has already passed an obstacle, the latter keeps affecting the robot mo- tion for a significant period of time (until the repulsive po- tential dies down). These disadvantages become more apparent when the PFM- based methods are implemented in high-speed real-time sys- tems; simulations and slow speed experiments often conceal the issues; probably contributing to the widespread popularity of potential planners. 1.2. The ν⋆Planning Algorithm Recently, the authors reported a novel path planning algo- rithm ν⋆[18], that models the navigation problem in the frame- work of Probabilistic Finite State Automata (PFSA) and com- putes optimal plans via optimization of the PFSA from a strictly control-theoretic viewpoint. ν⋆uses cellular decomposition of the workspace, and assumes that the blocked grid locations can be easily estimated, upon which the planner computes an opti- mal navigation gradient that is used to obtain the routes. This navigation gradient is computed by optimizing the quantitative measure of the probabilistic formal language generated by the associated navigation automaton. The key advantages can be enumerated as: 1. ν⋆is fundamentally distinct from a search: The search problem is replaced by a sequential solution of sparse lin- ear systems. On completion of cellular decomposition, ν⋆ optimizes the resultant PFSA via a iterative sequence of combinatorial operations which elementwise maximizes the language measure vector [19][20]. Note that although ν⋆involves probabilistic reasoning, the final waypoint se- quence obtained is deterministic. 2. Computational efficiency: The intensive step in ν⋆is a special sparse matrix inversion to compute the language measure. The time complexity of each iteration step can be shown to be linear in problem size implying significant numerical advantage over search-based methods for high- dimensional problems. 3. Global monotonicity: The solution iterations are globally monotonic, i.e, each iteration yields a better approxima- tion to the final optimal solution. The final waypoint se- quence is generated essentially by following the measure gradient which has a unique maxima at the goal. 4. Global Optimality: It can be shown that trap-situations are a mathematical impossibility for ν⋆. The optimal navigation gradient produced by ν⋆is reminiscent of potential field methods [7]. However, ν⋆automatically gen- erates, and optimizes this gradient; no ad-hoc potential function is necessary. 1.3. Focus of Current Work & Key Contributions The key focus of this paper is extension of the ν⋆planning algorithm to optimally handle execution uncertainties. It is well recognized by domain experts that merely coming up with a navigation plan is not sufficient; the computed plan must be executed in the real world by the mobile robot, which often cannot be done exactly and precisely due to measurement noise in the exteroceptive sensors, imperfect actuations, and external disturbances. The idea of planning under uncertainties is not particularly new, and good surveys of reported methodologies exist [21]. In chronological order, the main family of reported approaches can be enumerated as follows: • Pre-image Back-chaining [22, 23, 24] where the plan is synthesized by computing a set of configurations from which the robot can possibly reach the goal, and then propgating this preimage recursively backward or back- chaining, a problem solving approach originally proposed in [25]. • Approach based on sensory uncertainty fields (SUF) [26, 27, 28, 29] computed over the collision-free subset of the robot’s configuration space, which reflects expected uncer- tainty (distribution of possible errors) in the sensed config- uration that would be computed by matching the sensory data against a known environment model (e.g. landmark locations). A planner then makes use of the computed SUF to generate paths that minimize expected errors. • Sensor-based planning approaches [30, 31, 32], which consider explicit uncertainty models of various motion primitives to compute a feasible robust plan composed of sensor-based motion commands in polygonal envi- ronments, with significant emphasis on wall-following schemes. 2 • Information space based approach using the Bellman prin- ciple of stochastic dynamic programming [33], which in- troduced key concepts such as setting up the problem in a probabilistic framework, and demanding that the optimal plan maximize the probability of reaching the goal. How- ever, the main drawback was the exponential dependence on the dimension of the computed information space. • The set-membership approach [34] which performs a lo- cal search, trying to deform a path into one that respects uncertainty constraints imposed by arbitrarily shaped un- certainty sets. Each hard constraint is turned into a soft penalty function, and the gradient descent algorithm is em- ployed, hoping convergence to an admissible solution. • Probabilistic approaches based on disjunctive linear pro- gramming [35, 36], with emphasis on UAV applications. The key limitation is the inability to take into account exteroceptive sensors, and also the assumption that dead- reckoning is independent of the path executed. Later ex- tensions of this approach use particle representations of the distributions, implying wider applicability. • Adaptation of search strategies in extended spaces [37, 38, 39, 40], which consider the classical search problem in configuration spaces augmented with uncertainty informa- tion. • Approach based on Stochastic Motion Roadmaps (SRM) [41], which combines sampling-based roadmap representation of the configuration space, with the theory of Markov Decision Processes, to yield models that can be subsequently optimized via value-iteration based infinite horizon dynamic programming, leading to plans that maximize the probability of reaching the goal. The current work adds a new member to the family of exist- ing approaches to address globally optimal path planning under uncertainties. The key novelty of this paper is the association of uncertainty with the notion of uncontrollability in a controlled system. The navigation automaton introduced in [18] is aug- mented with uncontrollable transitions which essentially cap- tures the possibility that the agent may execute actuation se- quences (or motion primitives) that are not coincident with the planned moves. The planning objective is simple: Maximize the probability of reaching the goal, while simultaneously min- imizing the probability of hitting any obstacle. Note that, in this respect, we are essentially solving the exact same problem in- vestigated by [41]. However our solution approach is very dif- ferent. Instead of using value iteration based dynamic program- ming, we use the theory of language-measure-theoretic opti- mization of probabilistic finite state automata [20]. Unlike the SRM approach, the proposed algorithm does not require the use of local auxiliary planners, and also needs to make no assump- tions on the structure of the configuration space to guarantee it- erative convergence. The use of arbitrary penalties for reducing the weight on longer paths is also unnecessary, which makes the proposed ν⋆under uncertainties completely free from heuris- tics. We show that all the key advantages that ν⋆has over the state-of-art carries over to this more general case; namely that of significantly better computational efficiency, simplicity of implementation, and achieving global optimality via monotonic sequence of search-free combinatorial iterative improvements, with guaranteed polynomial convergence. The proposed ap- proach thus solves the inherently non-convex optimization [42] by mapping the physical specification to an optimal control problem for probabilistic finite state machines (the navigation automata), which admits efficient combinatorial solutions via the language-measure-theoretic approach. The source of many uncertainties, namely modeling uncertainty, disturbances, and uncertain localization, is averaged over (or amortized) for ade- quate representation in the automaton framework. This may be viewed as a source of approximation in the proposed approach; however we show in simulation and in actual experimentation that the amortization is indeed a good approach to reduce plan- ning complexity and results in highly robust planning decisions. Thus the modified language-measure-theoretic approach pre- sented in this paper, potentially lays the framework for seam- less integration of data-driven and physics-based models with the high-level decision processes; this is a crucial advantage, and goes to address a key issue in autonomous robotics, e.g., in a path-planning scenario with mobile robots, the optimal path may be very different for different speeds, platform capabilities and mission specifications. Previously reported approaches to handle these effects using exact differential models of platform dynamics results in overtly complex solutions that do not re- spond well to modeling uncertainties, and more importantly to possibly non-stationary environmental dynamics and evolving mission contexts. Thus the measure-theoretic approach enables the developmentof true Cyber-Physical algorithms for control of autonomous systems; algorithms that operate in the logical domain while optimally integrating, and responding to, physi- cal information in the planning process. The rest of the paper is organized in seven sections. Sec- tion 2 briefly explains the language-theoretic models consid- ered in this paper, reviews the language-measure-theoretic op- timal control of probabilistic finite state machines and presents the necessary details of the reported ν⋆algorithm. Section 3 presents the modifications to the navigation model to incor- porate the effects of dynamic uncertainties within the frame- work of probabilistic automata. Section 4 presents the perti- nent theoretical results and establishes the main planning al- gorithm. Section 5 develops a formulation to identify the key amortized uncertainty parameters of the PFSA-based naviga- tion model from an observed dynamical response of a given platform. The proposed algorithm is summarized with pertinent comments in Section 6. The theoretical development is verified in high-fidelity simulations on different navigation models and validated in experimental runs on the SEGWAY RMP 200 in section 7. The paper is summarized and concluded in Section 8 with recommendations for future work. 3 2. Preliminaries: Language Measure-theoretic Optimiza- tion Of Probabilistic Automata This section summarizes the signed real measure of reg- ular languages; the details are reported in [43]. Let Gi ≡ ⟨Q, Σ, δ, qi, Qm⟩be a trim (i.e., accessible and co-accessible) finite-state automaton model that represents the discrete-event dynamics of a physical plant, where Q = {qk : k ∈IQ} is the set of states and IQ ≡{1, 2, · · · , n} is the index set of states; the automaton starts with the initial state qi; the alphabet of events is Σ = {σk : k ∈IΣ}, having Σ T IQ = ∅and IΣ ≡{1, 2, · · · , ℓ} is the index set of events; δ : Q×Σ →Q is the (possibly partial) function of state transitions; and Qm ≡{qm1, qm2, · · · , qml} ⊆Q is the set of marked (i.e., accepted) states with qmk = q j for some j ∈IQ. Let Σ∗be the Kleene closure of Σ, i.e., the set of all finite-length strings made of the events belonging to Σ as well as the empty string ǫ that is viewed as the identity of the monoid Σ∗under the operation of string concatenation, i.e., ǫs = s = sǫ. The state transition map δ is recursively extended to its reflexive and transitive closure δ : Q×Σ∗→Q by defining ∀q j ∈Q, δ(q j, ǫ) = q j and ∀q j ∈Q, σ ∈Σ, s ∈Σ⋆, δ(qi, σs) = δ(δ(qi, σ), s) Definition 1. The language L(qi) generated by a DFSA G ini- tialized at the state qi ∈Q is defined as: L(qi) = {s ∈ Σ∗| δ∗(qi, s) ∈Q} The language Lm(qi) marked by the DFSA G initialized at the state qi ∈Q is defined as: Lm(qi) = {s ∈ Σ∗| δ∗(qi, s) ∈Qm} Definition 2. For every q j ∈Q, let L(qi, q j) denote the set of all strings that, starting from the state qi, terminate at the state q j, i.e., Li, j = {s ∈Σ∗| δ∗(qi, s) = q j ∈Q} The formal language measure is first defined for terminating plants [44, 45] with sub-stochastic event generation probabili- ties i.e. the event generation probabilities at each state summing to strictly less than unity. Definition 3. The event generation probabilities are specified by the function ˜π : Σ⋆× Q →[0, 1] such that ∀q j ∈Q, ∀σk ∈ Σ, ∀s ∈Σ⋆, (1) ˜π(σk, q j) ≜˜π jk ∈[0, 1); P k ˜π jk = 1 −θ, with θ ∈(0, 1); (2) ˜π(σ, q j) = 0 if δ(q j, σ) is undefined; ˜π(ǫ, q j) = 1; (3) ˜π(σks, q j) = ˜π(σk, q j) ˜π(s, δ(q j, σk)). The n × ℓevent cost matrix is defined as: eΠ|i j = ˜π(qi, σj) Definition 4. The state transition probability π : Q × Q → [0, 1), of the DFSA Gi is defined as follows: ∀qi, q j ∈Q, πi j = X σ∈Σ s.t. δ(qi,σ)=qj ˜π(σ, qi) The n×n state transition probability ma- trix is defined as Π| jk = π(qi, q j) The set Qm of marked states is partitioned into Q+ m and Q− m, i.e., Qm = Q+ m ∪Q− m and Q+ m ∩Q− m = ∅, where Q+ m contains all good marked states that we desire to reach, and Q− m contains all bad marked states that we want to avoid, although it may not always be possible to completely avoid the bad states while attempting to reach the good states. To characterize this, each marked state is assigned a real value based on the designer’s perception of its impact on the system performance. Definition 5. The characteristic function χ : Q →[−1, 1] that assigns a signed real weight to state-based sublanguages L(qi, q) is defined as: ∀q ∈Q, χ(q) ∈  [−1, 0), q ∈Q− m {0}, q < Qm (0, 1], q ∈Q+ m (1) The state weighting vector, denoted by χ = [χ1 χ2 · · · χn]T, where χ j ≡χ(q j) ∀j ∈IQ, is called the χ-vector. The j-th ele- ment χ j of χ-vector is the weight assigned to the corresponding terminal state q j. In general, the marked language Lm(qi) consists of both good and bad event strings that, starting from the initial state qi, lead to Q+ m and Q− m respectively. Any event string belonging to the language L0 = L(qi) −Lm(qi) leads to one of the non-marked states belonging to Q −Qm and L0 does not contain any one of the good or bad strings. Based on the equivalence classes defined in the Myhill-Nerode Theorem, the regular languages L(qi) and Lm(qi) can be expressed as: L(qi) = S qk∈Q Li,k and Lm(qi) = S qk∈Qm Li,k = L+ m∪L− m where the sublanguage Li,k ⊆Gi having the initial state qi is uniquely labelled by the terminal state qk, k ∈IQ and Li, j ∩Li,k = ∅∀j , k; and L+ m ≡S qk∈Q+m Li,k and L− m ≡S qk∈Q−m Li,k are good and bad sublanguages of Lm(qi), respectively. Then, L0 = S qk(Elementwise) ν‡. Definition 10 (Optimal Supervision Problem). Given a (non- terminating) plant G = (Q, Σ, δ, eΠ, χ, C ), the problem is to com- pute a supervisor that disables a subset D⋆⊆C , such that ν⋆≧(Elementwise) ν† ∀D† ⊆C where ν⋆and ν† are the mea- sure vectors of the supervised plants G⋆and G† under D⋆and D†, respectively. Remark 1. The solution to the optimal supervision problem is obtained in [20, 47] by designing an optimal policy for a ter- minating plant [45] with a sub-stochastic transition probability matrix (1 −θ)eΠ with θ ∈(0, 1). To ensure that the computed optimal policy coincides with the one for θ = 0, the suggested algorithm chooses a small value for θ in each iteration step of the design algorithm. However, choosing θ too small may cause numerical problems in convergence. Algorithms reported in [20, 47] computes how small a θ is actually required, i.e., computes the critical lower bound θ⋆, thus solving the optimal supervision problem for a generic PFSA. It is further shown that the solution obtained is optimal and unique and can be computed by an effective algorithm. Definition 11. Following Remark 1, we note that algorithms reported in [20, 47] compute a lower bound for the criti- cal termination probability for each iteration of such that the disabling/enabling decisions for the terminating plant coin- cide with the given non-terminating model. We define θmin = mink θ[k] ⋆where θ[k] ⋆is the termination probability computed in the kth iteration. Definition 12. If G and G⋆are the unsupervised and super- vised PFSA respectively then we denote the renormalized mea- sure of the terminating plant G⋆(θmin) as νi # : 2L(qi) →[−1, 1] (See Definition 7). Hence, in vector notation we have: ν# = θmin[I −(1 −θmin)Π#]−1χ where Π# is the transition probability matrix of the supervised plant G⋆, we note that ν# = ν[K] where K is the total number of iterations required for convergence. For the sake of completeness, the algorithmic approach is shown in Algorithms 1 and 2. 2.3. Problem Formulation: A PFSA Model of Autonomous Navigation Figure 1: (a) shows the vehicle (marked ”R”) with the obstacle positions shown as black squares. The Green4 dot identifies the goal (b) shows the finite state representation of the possible one-step moves from the current position. (d) shows uncontrollable transitions ”u” from states corresponding to blocked grid locations to ”q⊖” We consider a 2D workspace for the mobile agents. This restriction on workspace dimensionality serves to simplify the exposition and can be easily relaxed. To set up the problem, the 5 Algorithm 1: Computation of Optimal Supervisor input : P, χ, C output: Optimal set of disabled transitions D⋆ begin Set D[0] = ∅; /* Initial disabling set */ Set eΠ[0] = eΠ ; /* Initial event prob. matrix */ Set θ[0] ⋆= 0.99, Set k = 1 , Set Terminate = false; while (Terminate == false) do Compute θ[k] ⋆; /* Algorithm 2 */ Set eΠ[k] = 1−θ[k] ⋆ 1−θ[k−1] ⋆ eΠ[k−1]; Compute ν[k] ; for j = 1 to n do for i = 1 to n do Disable all controllable qi σ−→qj s.t. ν[k] j < ν[k] i ; Enable all controllable qi σ−→qj s.t. ν[k] j ≧ν[k] i ; Collect all disabled transitions in D[k]; if D[k] == D[k−1] then Terminate = true; else k = k + 1 ; D⋆= D[k] ; /* Optimal disabling set */ end workspace is first discretized into a finite grid and hence the ap- proach developed in this paper falls under the generic category of discrete planning. The underlying theory does not require the grid to be regular; however for the sake of clarity we shall present the formulation under the assumption of a regular grid. The obstacles are represented as blocked-off grid locations in Algorithm 2: Computation of the Critical Lower Bound θ⋆ input : P, χ output: θ⋆ begin Set θ⋆= 1, Set θcurr = 0; Compute P , M0 , M1, M2; for j = 1 to n do for i = 1 to n do if Pχ i −Pχ j , 0 then θcurr = 1 8M2 Pχ i −Pχ j else for r = 0 to n do if M0χ i , M0χ j then Break; else if  M0Mr 1χ  i ,  M0Mr 1χ  j then Break; if r == 0 then θcurr = |{(M0−P)χ}i−{(M0−P)χ}j| 8M2 ; else if r > 0 AND r ≤n then θcurr = |(M0M1χ)i−(M0M1χ)j| 2r+3M2 ; else θcurr = 1 ; θ⋆= min(θ⋆, θcurr) ; end the discretized workspace. We specify a particular location as the fixed goal and consider the problem of finding optimal and feasible paths from arbitrary initial grid locations in the workspace. Figure 1(a) illustrates the basic problem setup. We further assume that at any given time instant the robot occupies one particular location (i.e. a particular square in Figure 1(a)). As shown in Figure 1, the robot has eight possible moves from any interior location. The boundaries are handled by removing the moves that take the robot out of the workspace. The possi- ble moves are modeled as controllable transitions between grid locations since the robot can ”choose” to execute a particular move from the available set. We note that the number of pos- sible moves (8 in this case) depends on the chosen fidelity of discretization of the robot motion and also on the intrinsic ve- hicle dynamics. The complexity results presented in this paper only assumes that the number of available moves is significantly smaller compared to the number of grid squares, i.e., the dis- cretized position states. Specification of inter-grid transitions in this manner allows us to generate a finite state automaton (FSA) description of the navigation problem. Each square in the discretized workspace is modeled as a FSA state with the controllable transitions defining the corresponding state transi- tion map. The formal description of the model is as follows: Let GNAV = (Q, Σ, δ, ˜Π, χ) be a Probabilistic Finite State Au- tomaton (PFSA). The state set Q consists of states that corre- spond to grid locations and one extra state denoted by q⊖. The necessity of this special state q⊖is explained in the sequel. The grid squares are numbered in a pre-determined scheme such that each qi ∈Q \ {q⊖} denotes a specific square in the discretized workspace. The particular numbering scheme cho- sen is irrelevant. In the absence of dynamic uncertainties and state estimation errors, the alphabet contains one uncontrollable event i.e. Σ = ΣC S{u} such that ΣC is the set of controllable events corresponding to the possible moves of the robot. The uncontrollable event u is defined from each of the blocked states and leads to q⊖which is a deadlock state. All other transitions (i.e. moves) are removed from the blocked states. Thus, if a robot moves into a blocked state, it uncontrollably transitions to the deadlock state q⊖which is physically interpreted to be a collision. We further assume that the robot fails to recover from collisions which is reflected by making q⊖a deadlock state. We note that q⊖does not correspond to any physical grid loca- tion. The set of blocked grid locations along with the obstacle state q⊖is denoted as QOBSTACLE ⫅Q. Figure 1 illustrates the navigation automaton for a nine state discretized workspace with two blocked squares. Note that the only outgoing transi- tion from the blocked states q1 and q8 is u. Next we augment the navigation FSA by specifying event generation probabilities defined by the map ˜π : Q × Σ →[0, 1] and the characteristic state-weight vector specified as χ : Q →[−1, 1]. The character- istic state-weight vector [20] assigns scalar weights to the PFSA states to capture the desirability of ending up in each state. Definition 13. The characteristic weights are specified for the 6 navigation automaton as follows: χ(qi) =  −1 if qi ≡q⊖ 1 if qi is the goal 0 otherwise (3) In the absence of dynamic constraints and state estimation uncertainties, the robot can ”choose” the particular controllable transition to execute at any grid location. Hence we assume that the probability of generation of controllable events is uniform over the set of moves defined at any particular state. Definition 14. Since there is no uncontrollable events defined at any of the unblocked states and no controllable events defined at any of the blocked states, we have the following consistent specification of event generation probabilities: ∀qi ∈Q, σj ∈ Σ, ˜π(qi, σj) = ( 1 No. of controllable events at qi , if σj ∈ΣC 1, otherwise The boundaries are handled by ”surrounding” the workspace with blocked position states shown as ”boundary obstacles” in the upper part of Figure 1(c). Definition 15. The navigation model id defined to have identi- cal connectivity as far as controllable transitions are concerned implying that every controllable transition or move (i.e. every element of ΣC) is defined from each of the unblocked states. 2.4. Decision-theoretic Optimization of PFSA The above-described probabilistic finite state automaton (PFSA) based navigation model allows us to compute opti- mally feasible path plans via the language-measure-theoretic optimization algorithm [20] described in Section 2. Keeping in line with nomenclature in the path-planning literature, we refer to the language-measure-theoretic algorithm as ν⋆in the sequel. For the unsupervised model, the robot is free to exe- cute any one of the defined controllable events from any given grid location (See Figure 1(b)). The optimization algorithm se- lectively disables controllable transitions to ensure that the for- mal measure vector of the navigation automaton is elementwise maximized. Physically, this implies that the supervised robot is constrained to choose among only the enabled moves at each state such that the probability of collision is minimized with the probability of reaching the goal simultaneously maximized. Al- though ν⋆is based on optimization of probabilistic finite state machines, it is shown that an optimal and feasible path plan can be obtained that is executable in a purely deterministic sense. Let GNAV be the unsupervised navigation automaton and G⋆ NAV be the optimally supervised PFSA obtained by ν⋆. We note that νi # is the renormalized measure of the terminating plant G⋆ NAV(θmin) with substochastic event generation probability ma- trix eΠθmin = (1 −θmin)eΠ. Denoting the event generating function (See Definition 3) for G⋆ NAV and G⋆ NAV(θmin) as ˜π : Q × Σ →Q and ˜πθmin : Q × Σ →Q respectively, we have ˜πθmin(qi, ǫ) = 1 (4a) ∀qi ∈Q, σj ∈Σ, ˜πθmin(qi, σj) = (1 −θmin)˜π(qi, σj) (4b) Notation 2.1. For notational simplicity, we use νi #(L(qi)) = ν#(qi) = ν#|i where ν# = θmin[I −(1 −θmin)Π#]−1χ Definition 16 (ν⋆-path). A ν⋆-path ρ(qi, q j) from state qi ∈Q to state q j ∈Q is defined to be an ordered set of PFSA states ρ = {qr1, · · · , qrM} with qrs ∈Q, ∀s ∈{1, · · · , M}, M ≤CARD(Q) such that qr1 = qi (5a) qrM = qj (5b) ∀i, j ∈{1, · · · , M}, qri , qrj (5c) ∀s ∈{1, · · · , M}, ∀t ≦s, ν#(qrt) ≦ν#(qrs) (5d) We reproduce without proof the following key results per- taining to ν⋆- planning as reported in [18]. Lemma 1. There exists an enabled sequence of transitions from state qi ∈Q \ QOBSTACLE to q j ∈Q \ {q⊖} in G⋆ NAV if and only if there exists a ν⋆-path ρ(qi, q j) in G⋆ NAV. Proposition 1. For the optimally supervised navigation au- tomaton G⋆ NAV, we have ∀qi ∈Q \ QOBSTACLE, L(qi) ⫅Σ⋆ C Corollary 1. (Obstacle Avoidance:) There exists no ν⋆-path from any unblocked state to any blocked state in the optimally supervised navigation automaton G⋆ NAV. Proposition 2 (Existence of ν⋆-paths). There exists a ν⋆-path ρ(qi, qGOAL) from any state qi ∈Q to the goal qGOAL ∈Q if and only if ν#(qi) > 0. Corollary 2. (Absence of Local Maxima:) If there exists a ν⋆- path from qi ∈Q to q j ∈Q and a ν⋆-path from qi to qGOAL then there exists a ν⋆-path from q j to qGOAL, i.e., ∀qi, q j ∈Q  ∃ρ1(qi, qGOAL) ^ ∃ρ2(qi, q j) ⇒∃ρ(q j, qGOAL)  2.5. Optimal Tradeoff between Computed Path Length & Avail- ability Of Alternate Routes qi q j qk qGOAL σ1 σ2 ω ω3 ω4 ω2 ω1 ν#(q j) > ν#(qk) Figure 2: Tradeoff between path-length and robustness under dynamic uncertainty: σ2ω is the shortest path to qGOAL from qi; but the ν⋆plan may be σ1ω1 due to the availability of larger number of feasible paths through qj. 7 Majority of reported path planning algorithms consider min- imization of the computed feasible path length as the sole op- timization objective. However, the ν⋆algorithm can be shown to achieve an optimal trade-off between path lengths and avail- ability of feasible alternate routes. If ω is the shortest path to goal from state qk, then the shortest path from state qi (with qi σ2 −−→qk) is given by σ2ω. However, a larger number of feasi- ble paths may be available from state q j (with qi σ1 −−→q j) which may result in the optimal ν⋆plan to be σ1ω1. Mathematically, each feasible path from state q j has a positive measure which may sum to be greater than the measure of the single path ω from state qk. The condition ν#(q j) > ν#(qk) would then imply that the next state from qi would be computed to be q j and not qk. Physically it can be interpreted that the mobile gent is bet- ter off going to q j since the goal remains reachable even if one or more paths become unavailable. The key results [18] are as follows: Lemma 2. For the optimally supervised navigation automaton G⋆ NAV, we have ∀qi ∈Q \ QOBSTACLE, ∀ω ∈L(qi), νi #({ω}) = θmin  1 −θmin CARD(ΣC) |ω| χ(δ#(qi, ω)) Proposition 3. For qi ∈Q \ QOBSTACLE, let qi σ1 −−→q j →· · · → qGOAL be the shortest path to the goal. If there exists qk ∈Q \ QOBSTACLE with qi σ2 −−→qk for some σ2 ∈ΣC such that ν#(qk) > ν#(q j), then the number of distinct paths to goal from state qk is at least CARD(ΣC) + 1. The lower bound computed in Proposition 3 is not tight and if the alternate paths are longer or if there are multiple ’short- est’ paths then the number of alternate routes required is sig- nificantly higher. Detailed examples can be easily presented to illustrate situation where ν⋆opts for a longer but more robust plan. 3. Generalizing The Navigation Automaton To Accommo- date Uncertain Execution In this paper, we modify the PFSA-based navigation model to explicitly reflect uncertainties arising from imperfect local- ization and the dynamic response of the platform to navigation commands. These effects manifest as uncontrollable transitions in the navigation automaton as illustrated in Figure 4. Note, while in absence of uncertainties and dynamic effects, one can disable transitions perfectly, in the modified model, such dis- abling is only partial. Choosing the probabilities of the uncon- trollable transitions correctly allows the model to incorporate physical movement errors and sensing noise in an amortized fashion. A sample run with a SEGWAY RMP at NRSL is shown in Figure 3. Note that the robot is unable to follow the plan ex- actly due to cellular discretization and dynamic effects. Such effects can be conceptually modeled by decomposing trajectory fragments into sequential combinations of controllable and un- controllable inter-cellular moves as illustrated in Figure 4(c). We do not need to actually decompose trajectories, it is merely a conceptual construct that gives us a theoretical basis for com- puting the probabilities of uncontrollable transitions from ob- served robot dynamics (as described later in Section 5, and therefore incorporate the amortized effect of uncertainties in the navigation automaton. 0 1 2 3 4 −8 −7 −6 −5 −4 −3 −2 −1 0 Plan Actual Trajectory 1.4 1.6 1.8 −6.4 −6.2 −6 −5.8 −5.6 −5.4 XY Scale in meters Experimental Run at NRSL with SEGWAY RMP Figure 3: Plan execution with SEGWAY RMP at NRSL, Pennstate 3.1. The Modified Navigation Automaton The modified navigation automaton GMOD NAV = (Q, Σ, δ, eΠ, χ) is defined similar to the formulation in Section 2.3, with the exception that the alphabet Σ is defined as follows: Σ = ΣC ∪ΣUC ∪{u} (6) where ΣC is the set of controllable moves from any unblocked navigation state (as before), while ΣUC is the set of uncontrol- lable transitions that can occur as an effect of the platform dy- namics and oather uncertainty effects. We assume that for each σ ∈ΣC, we have a corresponding event σu in ΣUC, such that both σ and σu represent the same physical move from a given navigation state; but while σ is controllable and may be dis- abled, σu is uncontrollable. Although for 2D circular robots we have: CARD(ΣC) = CARD(ΣUC), in general, there can exist un- controllable moves reflecting estimation errors that cannot be realized via a single controllable move. For example, for pla- nar rectangular robots with a non-zero minimum turn radius, there can be an uncontrollable shift in the heading without any change in the xy-positional coordinates, which may reflect er- rors in heading estimation, but such a move cannot be executed via controllable transitions due to the restriction on the mini- mum turn radius. We will discuss these issues in more details in the sequel. 8 C R L F RF B LF RB LB e1 e5 e3 e7 e2 e4 e6 e8 (a) C R L F RF B LF RB LB e1 e7 e8 e2, e3, e4, e5, e6 (b) eplacements J0 J4 J7 Trajectory Controllable Move Uncontrollable Move (c) C R L F RF B LF RB LB e′ 5 e′ 3 e′ 2 e′ 4 e′ 6 e1 e7 e8 ( Partially Disabled (d) Figure 4: (a) shows available moves from the current state (C) in unsupervised navigation automaton. (b) shows the enabled moves in the optimally supervised PFSA with no dynamic uncertainty, (c) illustrates the case with dynamic uncer- tainty, so that the robot can still uncontrollably (and hence unwillingly) make the disabled transitions, albeit with a small probability, i.e., probability of tran- sitions e′ 2, e′ 3, e′ 4 etc. is small. (d) illustrates the concept of using uncontrollable transitions to model dynamical response for a 2D circular robot: J0 is the target cell from J7, while the actual trajectory of the robot (shown in dotted line) ends up in J4. We can model this trajectory fragment as first executing a controllable move to J0 and then uncontrollably moving to J4. Definition 17. The coefficient of dynamic deviation γ(GMOD NAV ) is defined as follows: γ(GMOD NAV ) = 1 −max qi∈Q X σu∈ΣUC ˜π(qi, σu) (7) Definition 18. The event generation probabilities for GMOD NAV is defined as follows: ∀qi ∈Q \ {qGOAL}, σj ∈Σ, ˜π(qi, σj) =  1−P σu∈ΣUC ˜π(qi,σu) No. of controllable events at qi , if σj ∈ΣC ˜π(qi, σj), if σj ∈ΣUC 1, otherwise and for the goal, we define as before: ˜π(qGOAL, σj) = ( 1 No. of controllable events at qi , if σj ∈ΣC 1, otherwise Note that we assume there is no uncontrollability at the goal. This assumption is made for technical reasons clarified in the sequel and also to reflect the fact that once we reach the goal, we terminate the mission and hence such effects can be ne- glected. We note the following: • In the idealized case where we assume platform dynam- ics is completely absent, we have ˜π(qi, σu) = 0, ∀qi ∈ Q, ∀σu ∈ΣUC implying that γ(GMOD NAV ) = 1, while in prac- tice, we expect γ(GMOD NAV ) < 1. • In Definition 17, we allowed for the possibility of ˜π(qi, σu) being dependent on the particular navigation states qi ∈Q. A significantly simpler approach would be to redefine the probability of the uncontrollable events ˜π(qi, σu) as fol- lows: ˜πAV(σu) = 1 CARD(Q) X qi∈Q ˜π(qi, σu) (8) where ˜πAV(σu) is the average probability of the uncontrol- lable event σu being generated. The averaging of the probabilities of uncontrollable transitions is justified in situations where we can assume that the dynamic response of the platform is not dependent on the location of the platform in the workspace. In this simplified case, the event generation probabilities for GMOD NAV can be stated as: ∀qi ∈Q \ {qGOAL}, σj ∈Σ, ˜π(qi, σj) =  γ(GMOD NAV ) No. of controllable events at qi , if σj ∈ΣC ˜πAV(σj), if σj ∈ΣUC 1, otherwise The key difficulty is allowing the aforementioned dependence on states is not the decision optimization that would follow, but the complexity of identifying the probabilities; averaging re- sults in significant simplification as shown in the sequel. Thus, even if we cannot realistically average out the uncontrollable transition probabilities over the entire state space, we could de- compose the workspace to identify subregions where such an assumption is locally valid. In this paper, we do not address formal approaches to such decomposition, and will generally assume that the afore-mentioned averaging is valid through- out the workspace; the explicit identification of the sub-regions is more a matter of implementation specifics, and has little to do with the details of the planning algorithm presented here, and hence will be discussed elsewhere. In Section 5, we will address the computation of the probabilities of uncontrollable transitions from observed dynamics. First, we will establish the main planning algorithm as a solution to the performance opti- mization of the navigation automaton in the next section. 4. Optimal Planning Via Decision Optimization Under Dy- namic Effects The modified model GMOD NAV can be optimized via the measure- theoretic technique in a straightforward manner, using the ν⋆- algorithm reported in [18]. The presence of uncontrollable tran- sitions in GMOD NAV poses no problem (as far as the automaton opti- mization is concerned), since the underlying measure-theoretic optimization is already capable of handling such effects [20]. However the presence of uncontrollable transitions weakens some of the theoretical results obtained in [18] pertaining to navigation, specifically the absence of local maxima. We show 9 that this causes the ν⋆planner to lose some of its crucial advan- tages, and therefore must be explicitly addressed via a recursive decomposition of the planning problem. Proposition 4 (Weaker Version of Proposition 2). There exists a ν⋆-path ρ(qi, qGOAL) from any state qi ∈Q to the goal qGOAL ∈ Q if ν#(qi) > 0. Proof. We note that ν#(qi) > 0 implies that there necessarily exists at least one string ω of positive measure initiating from qi and hence there exists at least one string that terminates on qGOAL. The proof then follows from the definition of ν⋆-paths (See Definition 16). Remark 2. Comparing with Proposition 2, we note that the only if part of the result is lost in the modified case. Remark 3. We note that under the modified model, ν#(qi) < 0 needs to be interpreted somewhat differently. In absence of any dynamic uncertainty, ν#(qi) < 0 implies that no path to goal exists. However, due to weakening of Proposition 1 (See Proposition 4), ν#(qi) < 0 implies that the measure of the set of strings reaching the goal is smaller to that of the set of strings hitting an obstacle from the state qi. The ν⋆-planning algorithm is based on several abstract con- cepts such as the navigation automaton and the formal mea- sure of symbolic strings. It is important to realize that in spite of the somewhat elaborate framework presented here, ν⋆- optimization is free from heuristics, which is often not the case with competing approaches. In this light, the next proposition is critically important as it elucidates this concrete physical con- nection. PSfrag replacements No Connecting Path Goal A B Obstacles Figure 5: Absence of uncontrollable transitions at the goal imply that there is no path in the optimally disabled navigation automaton from point A to point B (or vice versa), since all controllable transitions will necessarily be disabled at the goal. It follows that the stationary probability vector may be different depending on whether one starts left or right to the goal. However, note that that any two points on the same side have a path (possibly made of uncontrollable transitions) between them; implying that the stationary probability vector will be identical if either of them is chosen as the start locations. Proposition 5. Given that a feasible path exists from the start- ing state to the goal, the ν⋆planning algorithm under non- trivial dynamic uncertainty (i.e. with γ(GMOD NAV ) < 1) maximizes the probability ℘GOAL of reaching the goal while simultaneously minimizing the probability ℘⊖of hitting an obstacle. Proof. Let ℘be the stationary probability vector for the stochastic transition probability matrix corresponding to the navigation automaton GMOD NAV , for a starting state from which a feasible path to goal exists. (Note that ℘may depend on the starting state; Figure 5 illustrates one such example. However, once we fix a particular starting state, the stationary vector ℘is uniquely determined). The selective disabling of controllable events modifies the transition matrix and in effect alters ℘, such that ℘Tχ is maximized [20], where χ is the characteristic weight vector, i.e. , χi = χ(qi). Recalling that χ(qGOAL) = 1, χ(q⊖) = −1 and χ(qi) = 0 if qi is neither the goal nor the abstract ob- stacle state q⊖, we conclude that the optimization, in effect, maximizes the quantity: ψ = ℘GOAL −℘⊖ (9) Also, note that the optimized navigation automaton has only two dump states, namely the goal qGOAL and the abstract ob- stacle state q⊖. That the goal qGOAL is in fact a dump state is ensured by not having uncontrollable transitions at the goal (See Definition 18). Hence we must have ∀qi ∈Q \ {qGOAL, q⊖}, ℘i = 0 (10) implying that ℘GOAL + ℘⊖= 1 (11a) =⇒ψ = 2℘GOAL −1 = 1 −2℘⊖ (11b) Hence it follows that the optimization maximizes ℘GOAL and simultaneously minimizes ℘⊖. Remark 4. It is easy to see that Proposition 5 remains valid if χ(qGOAL) = χGOAL > 1. In fact, the result remains valid as long as the characteristic weight of the goal is positive and the char- acteristic weight of the abstract obstacle state q⊖is negative. 4.1. Recursive Problem Decomposition For Maxima Elimina- tion Weakening of Proposition 1 (See Proposition 4) has the cru- cial consequence that Corollary 2 is no longer valid. Local maxima can occur under the modified model. This is a seri- ous problem for autonomous planning and must be remedied. The problem becomes critically important when applied to so- lution of mazes; larger the number of obstables, higher is the chance of ending up in a local maxima. While elimination of local maxima is notoriously difficult for potential based plan- ning approaches, ν⋆can be modified with ease into a recursive scheme that yields maxima-free plans in models with non-zero dynamic effects (i.e. with γ(GMOD NAV ) < 1). It will be shown in the sequel that for successful execution of the algorithm, we may need to assign a larger than unity char- acteristic weight χGOAL to the goal qGOAL. A sufficient lower bound for χGOAL, with possible dependence on the recursion step, is given in Proposition 6. The basic recursion scheme can be described as follows (Also see the flowchart illustration in Algorithm 3): 10 1. In the first step (i.e., at recursion step k = 1) we execute ν⋆-optimization on the given navigation automaton GMOD NAV and obtain the measure vector ν#[k]. 2. We denote the set of states with strictly positive measure as Qk (k denotes the recursion step), i.e., Qk = {qi ∈Q : ν# [k]|i > 0} (12) 3. If Qk = Qk−1, the recursion terminates; else we update the characteristic weights as follows: ∀qi ∈Qk, χ(qi) = χ[k] GOAL (13) and continue the recursion by going back to the first step and incrementing the step number k. Algorithm 3: Flowchart for recursive ν⋆-planning ag replacements Problem: GMOD NAV Execute ν⋆ Qk = Qk−1? Terminate Set k = k + 1 Save vector ν#[k] 1. Set: ∀qi ∈Qk χ(qi) = χ[k] GOAL 2. Eliminate Uncont. transitions from all qi ∈Qk No Yes Define: Qk = {qi ∈Q : ν#[k]|i > 0} Initialize: k = 0, Q0 = {qGOAL} Proposition 6. If θ[k] min is the critical termination probability (See Definition 11) for the ν⋆-optimization in the kth recursion step of Algorithm 3, then the following condition χ[k] GOAL > CARD(ΣC) 1 −θ[k] min 1 γ −1 ! (14) is sufficient to guarantee that the following statements are true: 1. If there exists a state qi ∈Q \ Qk from which at least one state qℓ∈Qk is reachable in one hop, then ν#[k]|i > 0. 2. The recursion terminates in at most CARD(Q) steps. 3. For the kth recursion step, either Qk ⫌Qk−1 or no feasible path exists to qGOAL from any state qi ∈Q \ Qk−1. PSfrag replacements qGOAL qi 1 −θmin (1 −γ)(1 −θmin) (1 −θmin) γ CARD(ΣC) qj qk q⊖ Figure 6: Illustration for Proposition 6. Uncontrollable events and strings are shown in dashed. Proof. Statement 1: We first consider the first recursion step, i.e., the case where k = 0 and Qk = {qGOAL} (See Algorithm 3). We note that the goal qGOAL achieves the maximum measure on account of the fact that only qGOAL has a positive characteristic weight, i.e., we have ∀qi ∈Q, ν#[1]|GOAL ≧ν#[1]|i (15) It follows that all controllable transitions from the goal will be disabled in the optimized navigation automaton obtained at the end of the first recursion step (See Definition 8 and Algo- rithms 1 & 2), which in turn implies that the non-renormalized measure of the goal (at the end of the first recursion step) is given by χGOAL 1 θmin . The Hahn Decomposition Theorem [46], allows us to write: ν#|i = ν#(L+(qi)) + ν#(L−(qi)) (16) where L+(qi), L−(qi) are the sets of strings initiating from state qi that have positive and negative measures respectively. Let qi ∈Q \ {qGOAL} such that qGOAL is reachable from qi in one hop. We note that since it is possible to reach the goal in one hop from qi, we have: ν#(L+(qi)) ≧θmin × γ(1 −θmin) CARD(ΣC) × χGOAL θmin (17) where the first term arises due to renormalization (See Defini- tion 7), the second term denotes the probability of the transition leading to the goal and the third term is the non-renormalized measure of the goal itself (as argued above). Since it is obvious that the goal achieves the maximum measure, the transition to the goal will obviously be enabled in the optimized automaton, which justifies the second term. It is clear that there are many more strings of positive measure (e.g. arising due to the self loops at the state qi that correspond to the disabled controllable events that do not transition to the goal from qi) which are not considered in the above inequality (which contributes to mak- ing the left hand side even larger); therefore guaranteeing the correctness of the lower bound stated in Eq.(17). Next, we compute a lower bound for ν#(L−(qi)). To that ef- fect, we consider an automaton G′ identical to the navigation automaton at hand in ever respect, but the fact that the qGOAL has zero characteristic. We denote the state corresponding to qi in this hypothesized automaton as q′ i, and the set of al states in G′ as Q′. We claim that, after a measure-theoretic optimiza- tion (i.e. after applying Algorithms 1 and 2), the measure of q′ i, denoted as ν⋆(q′ i), satisfies: ν⋆(q′ i) ≧−(1 −γ) (18) To prove the claim in Eq. (18), we first note that denoting the renormalized measure vector for G′ before any optimization as ν′, the characteristic vector as χ′ and for any termination prob- ability θ ∈(0, 1), we have: ||ν′||∞= ||θ[I −(1 −θ)Π]−1χ′||∞ ≦||θ[I −(1 −θ)Π]−1||∞× 1 = 1 (19) which follows from the following facts: 11 1. For all θ ∈(0, 1], θ[I −(1 −θ)Π]−1 is a row-stochastic matrix and therefore has unity infinity norm [19] 2. ||χ′||∞= 1, since all entries of χ′ are 0 except for the state corresponding to the obstacle state in the navigation au- tomaton, which has a characteristic of −1. Since the only non-zero characteristic is −1, it follows that no state in G′ can have a positive measure and we conclude from Eq. (19) that: ∀q′ j ∈Q′, ν(q′ j) ∈[−1, 0] (20) Note that q′ i is not blocked itself (since we chose qi such that a feasible 1-hop path to the goal exists from qi). Next, we sub- ject G′ to the measure-theoretic optimization (See Algorithms 1 & 2), which disables all controllable transitions to the blocked states. In order to compute a lower bound on the optimized measure for the state q′ i, (denoted by ν⋆(q′ i) ), we consider the worst case scenario where all neighboring states that can be reached from q′ i in single hops are blocked. Denoting the set of all such neighboring states of qi by N(q′ i), we have: ν⋆(q′ i) = X q′ j∈N(q′ i) Πu i jν(q′ j) ≧−1 × X q′ j∈N(q′ i) Πu i j = −1 × (1 −γ) (21) where Πu i j is the probability of the uncontrollable transition from q′ i to the neighboring state q′ j. Note that we can write Eq. (21) in the worst case scenario where each state in N(q′ i) is blocked, since all controllable transitions from q′ i will be dis- abled in the optimized plant under such a scenario, and only the uncontrollable transitions will remain enabled; and the proba- bilities of all uncontrollable transitions defined at state qi sums to 1 −γ. It is obvious that the lower bound computed in Eq. (21) also reflects a lower bound for ν#(L−(qi)), since addi- tion of state(s) with positive characteristic or eliminating obsta- cles cannot possibly make strings more negative. Furthermore, recalling that the goal qGOAL is actually reachable from state qi by a single hop, it follows that not all neighbors of qi in the navigation automaton are blocked, and hence we have the strict inequality: ν#(L−(qi)) > −(1 −γ) (22) Combining Eqns. (17) and (22), we note that the following condition is sufficient for guaranteeing ν#|i > 0. γ(1 −θmin) CARD(ΣC) × χGOAL > 1 −γ (23) which after a straightforward calculation yields the bound stated in Eq. 14, and the Statement 1 is proved for the first re- cursion step, i.e. for Qk = {qGOAL}. To extend the argument to later recursion steps of Algo- rithm 3, i.e., for k > 0, we argue as follows. Let Qk ⫌{qGOAL} and we have eliminated all uncontrollable transitions from all q j ∈Qk (as required in Algorithm 3). Further, let qi ∈Q \ Qk such that it is possible to reach some q j ∈Q in a single control- lable hop, i.e. qi σ −−−−−−−−→ controllable q j, q j ∈Qk (24) PSfrag replacements qi qi σ σ σ1 σ1 σ2 σ1, σ2 Qk q j1 q j2 q j3 qGOAL Figure 7: Illustration for Statement 1 of Proposition 6. Note that even if Qk has multiple states, qj1, qj2, qj3, the measure of any string (say σσ1σ1σ2) from qi is the same as if qi was directly connected to the goal qGOAL with all con- trollable events disabled at qGOAL. The bottom plate illustrates this by showing the hypothetical scenario where qi is connected to qGOAL by σ and σ1, σ2 are controllable events disabled at qGOAL. Note that for this argument to work, we must eliminate uncontrollable transitions from all states in Qk. We first claim that ∀q j ∈Qk, qr < Qk, ν#[k]| j > ν#[k]|r (25) which immediately follows from the fact that the optimal con- figuration (of transitions from states in Qk) at the end of the ν⋆-optimization at the kth step would be to have all control- lable transitions from states q j ∈Qk enabled if and only if the transition goes to some state in Qk, since in that case ev- ery string initiating from q j terminates on a state having char- acteristic χGOAL (since there is no uncontrollability from states within Qk by construction), whereas if a transition q j σ −−−−−−−−→ controllable qr, where q j ∈Qk, qr < Qk allows strings which end up in zero- characteristic states and also (via uncontrollable transitions) on negative-characteristic states. Eq. (25) implies that no enabled string exits Qk. It therefore follows that every string σω starting from the state qi, with ω ∈ Σ⋆ C and δ(qi, σ) ∈Qk (i.e., σ leads to some state within Qk) has exactly the same measure as if qi is directly connected to qGOAL and all controllable transitions are disabled at qGOAL (See Figure 7 for an illustration). This conceptual reduction implies that Eq. (17) is valid when Qk ⫌{qGOAL} since the lower bound for ν#(L+(qi)) can be computed exactly as already done for the case with Qk = {qGOAL}. The argument for obtaining the lower bound for ν#(L−(qi)) is the same as before, thus completing the proof for Statement 1 for all recursion steps of Algorithm 3 . Statement 2: Let QR ⫋Q be the set of states from which a feasible path to the goal exists. If CARD(QR) = 1, then we must have QR = {qGOAL} and the recursion terminates in one step. In general, for the kth recursion step, let CARD(Qk) < CARD(QR). Since there exists at least one state, not in Qk, from which a feasible path to the goal exists, it follows that there exists at least one state q j from which it is possible to reach a state in Qk in one hop. Using Statement 1, we can then conclude: Qk+1 , Qk ⇒CARD(Qk+1) ≧CARD(Qk) + 1 12 ⇒CARD(Qk+1) ≧k + 1 (26) which immediately implies that the recursion must terminate in at most CARD(Q) steps. Statement 3: Follows immediately from the argument used for proving State- ment 2. Remark 5. The generality of Eq. (14) is remarkable. Note that the lower bound is not directly dependent on the exact structure of the navigation automaton; what only matters is the number of controllable moves available at each state, the coefficient of dynamic deviation γ(GMOD NAV ) and the critical termination prob- ability θmin. Although the exact automaton structure and the probability distribution of the uncontrollable transitions are not directly important, their effect enters, in a somewhat non-trivial fashion, through the value of the critical termination probabil- ity. The reader might want to review Algorithm 2 (See also [19, 20]) which computes the critical termination probability in each step of the ν⋆-optimization for a better elucidation of the aforementioned connection between the structure of the navi- gation automaton and χGOAL. The dependencies of the acceptable lower bound for χGOAL with the coefficient of dynamic deviation γ(GMOD NAV ), as com- puted in Proposition 6, is illustrated in Figures 8(a) and (b). The key points to be noted are: 1. As γ(GMOD NAV ) →0+, χGOAL →+∞; which reflects the phys- ical fact that if no events are controllable, then we cannot optimize the mission plan no matter how large χGOAL is chosen. 2. As γ(GMOD NAV ) →1, χGOAL →0; which implies that in the absence of dynamic effects any positive value of χGOAL suffices. This reiterates the result obtained with χGOAL = 1 in [18]. 3. As the number of available controllable moves increases (See Figure 8(a)), we need a larger value of χGOAL; sim- ilarly if the critical termination probability θmin is large, then the value of χGOAL required is also large (See Fig- ure 8(b)). 4. The functional relationships in Figures 8(a) and (b) estab- lish the fact that for relatively smaller number of control- lable moves, a large value of γ(GMOD NAV ) and a small termi- nation probability, a constant value of χGOAL = 1 may be sufficient. 4.2. Plan Assembly & Execution Approach The plan vectors ν#[k] (Say, there are K of them, i.e., k ∈ {1, · · · , K}) obtained via the recursive planning algorithm de- scribed above, can be used for subsequent mission execution in two rather distinct ways: 1. (The Direct Approach:) • At any point during execution, if the current state qi ∈Qk for some k ∈{1, · · · , K}, then use the gra- dient defined by the plan vector ν#[k] to decide on Figure 8: Variation of the acceptable lower bound for χGOAL with γ(GMOD NAV ). (a) The set of controllable moves is expanded from CARD(ΣC) = 4 to CARD(ΣC) = 100 while holding θmin = 0.01 (b) The critical termination probability θmin is varied from 0.001 to 0.1 while holding CARD(ΣC) = 8. Note the lines are almost coincident in this case. the next move, i.e., q j is an acceptable next state if ν#[k]| j > ν#[k]|i and for states qℓthat can be reached from the current state qi via controllable events, we have ν#[k]| j ≧ν#[k]|ℓ. • if ∀k ∈{1, · · · , K}, qi < Qk, then terminate operation because there is no feasible path to the goal. • Note that this entails keeping K vectors in memory. 2. (The Assembled Plan Approach:) • Use ν#[k], k ∈{1, · · · , K} to obtain the assembled plan vector ν#A following Algorithm 4, which assigns a real value ν#A|i to each state qi in the workspace. We refer to this map as the assembled plan. • Make use of the gradient defined by ν#A to reach the goal, by sequentially moving to states with increas- ing values specified by the assembled plan, i.e.,. if the current state is qi ∈Q, then q j is an acceptable next state if ν#A| j > ν#A|i and for states qℓthat can be reached from the current state qi via controllable events, we have ν#A| j ≧ν#A|ℓ. 13 Algorithm 4: Assembly of Plan Vectors input : ν#[k], k = 1, · · · , K (Plan Vectors) output: ν#A (Assembled Plan) 1 begin 2 Set ν#A = 0; /* Zero vector */ 3 for k = 1 : K do 4 for i ∈Q do 5 ν#tmp|i = 0; 6 if ν#[k−1]|i > 0 then 7 νtmp # |i = 1; 8 else 9 if ν#[k]|i > 0 then 10 νtmp # |i = ν#[k]|i; 11 endif 12 endif 13 endfor 14 ν#A = ν#A + νtmp # ; 15 endfor 16 end 17 • We show in the sequel that if ν#A|i < 0, then no fea- sible path exists to the goal. Before we can proceed further, we need to formally establish some key properties of the assembled plan approach. In partic- ular, we have the following proposition: Proposition 7. 1. For a state qi ∈Q, a feasible path to the goal exists from the state qi, if and only if ν#A|i > 0. 2. The assembled plan ν#A is free from local maxima, i.e., if there exists a ν⋆-path (w.r.t. to ν#A) from qi ∈Q to q j ∈Q and a ν⋆-path from qi to qGOAL then there exists a ν⋆-path from q j to qGOAL, i.e., ∀qi, q j ∈Q  ∃ρ1(qi, qGOAL) ^ ∃ρ2(qi, q j) ⇒∃ρ(q j, qGOAL)  3. If a feasible path to the goal exists from the state qi, then the agent can reach the goal optimally by following the gradient of ν#A, where the optimality is to be understood as maximizing the probability of reaching the goal while simultaneously minimizing the probability of hitting an ob- stacle (i.e. in the sense stated in Proposition 5). Proof. Statement 1: Let the plan vectors obtained by the recursive procedure stated in the previous section be ν#[k] (Say, there are K of them, i.e., k ∈{1, · · · , K}) and further let the current state qi ∈Qk for some k ∈{1, · · · , K}. We observe that on account of Propo- sition 4, if k = 1, then ν#[k]|i > 0 is sufficient to guaran- tee that there exists a ν⋆-path ρ(qi, qGOAL) w.r.t the plan vec- tor ν#[1]. We further note that ν#[1]|i <= 0 ⇒qi < Q1 (See Eq. (12)), implying that ν#[1]|i > 0 is also necessary for the ex- istence of ρ(qi, qGOAL). Extending this argument, we note that, for k > 1, a ν⋆-path ρ(qi, q j) with q j ∈Qk−1 exists (with re- spect to the plan vector ν#[k]) if and only if ν#[k]|i > 0. Not- ing that ν#[k]|i > 0 ⇔ν#A|i > 0, (See Algorithm 4) we con- clude that a ν⋆-path ρ(qi, q j) with q j ∈Qk−1 exists (with re- spect to the plan vector ν#[k]) if and only if ν#A|i > 0. Also, since q j ∈Qk−1 ∧qi ∈Qk, it follows from Algorithm 4, that ν#A| j ≧1 + ν#A|i > 0. It follows that the same argument can be used recursively to find ν⋆-paths ρ(q j, qℓ1), · · · , ρ(q j, qGOAL) if and only if ν#A|i > 0. To complete the proof, we still need to show that if there exists a feasible path from a state qi to the goal qGOAL, then there exists a ν⋆-path ρ(qi, qGOAL). We argue as follows: Let qi = qr1 →qr2 →· · · →qrm−1 →qrm = qGOAL be a feasible path from the state qi to qGOAL. Furthermore, assume if possible that ∀k ν# [k]|i ≦0 (27) i.e., there exists no ν⋆-path from qi to qGOAL w.r.t ν#A. We observe that since it is possible to reach qGOAL from qrm−1 in one hop, using Proposition 6 we have: ν# [1]|rm−1 > 0 ⇒qrm−1 ∈Q1 (28) We further note: ν# [1]|rm−2 > 0 ⇒qrm−2 ∈Q1 (29) ν#[1]|rm−2 ≦0 ⇒ν#[2]|rm−2 > 0 ⇒qrm−2 ∈Q2 (30) Hence, we conclude either qrm−2 ∈Q2 or qrm−2 ∈Q1. It fol- lows by straightforward induction that either q1 ∈Qm−1 or q1 ∈Qm−2, which contradicts the statement in Eq. (27). There- fore, we conclude that if a feasible path to the goal exists from any state qi, then a ν⋆-path ρ(qi, qGOAL) (w.r.t ν#A) exists as well. This completes the proof of Statement 1. Statement 2: Given states qi, q j ∈Q, assume that we have the ν⋆-paths ρ1(qi, qGOAL) and ρ2(qi, q j). We observe that: ∃ρ1(qi, qGOAL) ⇒ν#A|i > 0 (See Statement 1) (31a) ∃ρ2(qi, q j) ⇒ν#A| j ≧ν#A|i (See Definition 16) (31b) ⇒ν#A| j > 0 ⇒∃ρ(q j, qGOAL) (See Statement 1) (31c) which proves Statement 2. Statement 3: Statements 1 and 2 guarantee that if a feasible path to the goal exists from a state qi ∈Q, then an agent can reach the goal by following a ν⋆-path (w.r.t ν#A) from qi, i.e., by sequentially moving to states which have a better measure as compared to the current state. We further note that a ν⋆-path ω w.r.t ν#A from any state qi to qGOAL can be represented as a concatenated sequence ω1ω2 · · · ωr · · · ωm where ωr is a ν⋆-path from some interme- diate state q j ∈Qs, for some s ∈{1, · · · , K}, to some state qℓ∈Qs−1. Since the recursive procedure optimizes all such intermediate plans, and since the outcome “reached goal from qi” can be visualized as the intersection of the mutually inde- pendent outcomes “reached Qs from qi ∈Qs−1”, “reached Qs+1 from q j ∈Qs” , · · · , “reached qGOAL from qℓ∈Q1”, the overall path must be optimal as well. This completes the proof. We compute the set of acceptable next states from the fol- lowing definition. 14 Definition 19. Given the current state qi ∈Q, Qnext is the set of states satisfying the strict inequality: Qnext = {q j ∈Q : ν#A| j > ν#A|i} (32) We note that Proposition 7 implies that Qnext is empty if and only if the current state is the goal or if no feasible path to the goal exists from the current state. 5. Computation of Amortized Uncertainty Parameters eplacements Actuation & Localization Uncertainty Grid Decomposition R Trajectory q1 q1 q2 q2 q3 q3 q4 q4 q5 q5 q6 q6 q7 q7 q8 q8 Q (a) ag replacements on & Localization Uncertainty Grid Decomposition R Trajectory q1 q2 q3 q4 q5 q6 q7 q8 Q X Y A B C D J0 J1 J1 J2 J2 J3 J3 J4 J4 J5 J5 J6 J6 J7 J7 J8 J8 Discretization Deviation Contour(D) → (b) Figure 9: (a) Model for 2D circular robot(b) Numerical integration technique for computing the dynamic parameters for the case of a circular robot modele.g. a SEGWAY RMP 200 Specific numerical values of the uncertainty parameters, i.e. the probability of uncontrollable transitions in the navigation automaton can be computed from a knowledge of the average uncertainty in the robot localization and actuation in the config- uration space. For simplicity of exposition, we assume a 2D cir- cular robot; however the proposed techniques are applicable to more general scenarios. The complexity of this identification is related to the dynamic model assumed for the platform (e.g. cir- cular robot in a 2D space, rectangular robot with explicit head- ing in the configuration etc.), the simplifying assumptions made for the possible errors, and the degree of averaging that we are willing to make. Uncertainties arise from two key sources: 1. Actuation errors: Inability of the robot to execute planned motion primitives exactly, primarily due to the dynamic response of the physical platform. 2. Localization errors: Estimation errors arising from sensor noise, and the limited time available for post-processing exteroceptive data for a moving platform. Even if we as- sume that the platform is capable of processing sensor data to eventually localize perfectly for a static robot, the fact that we have to get the estimates while the robot pose is changing in real time, implies that the estimates lag the actual robot configuration. Thus, this effect cannot be ne- glected even for the best case scenario of a 2D robot with an accurate global positioning system (unless the platform speed is sufficiently small). In our approach, we do not distinguish between the differ- ent sources of uncertainty, and attempt to represent the overall amortized effect as uncontrollability in the navigation automa- ton. The rationale for this approach is straightforward: we visu- alize actuation errors as the uncontrollable execution of transi- tions before the controllable planned move can be executed, and for localization errors, we assume that any controllable planned move is followed by an uncontrollable transition to the actual configuration. Smaller is the probability of the uncontrollable transitions in the navigation automaton, i.e., larger is the coef- ficient of dynamic deviation for each state, smaller is the un- certainty in navigation. From a history of observed dynamics or from prior knowledge, one can compute the distribution of the robot pose around the estimated configuration (in an amor- tized sense). Then the probability of uncontrollable transitions can be estimated by computing the probabilities with which the robot must move to the neighboring cells to approximate this distribution. The situation for a 2D circular robot is illustrated in Figure 9(a), where we assume that averaging over the ob- servations lead to a distribution with zero mean-error; i.e., the distribution is centered around the estimated location in the con- figuration space. For more complex scenarios (as we show in the simulated examples), this assumption can be easily relaxed. We call this distribution the deviation contour (D) in the sequel. The amortization or averaging is involved purely in estimating the deviation contour from observed dynamics (or from prior knowledge); a simple methodology for which will be presented in the sequel. However, we first formalize the computation of the uncertainty parameters from a knowledge of the deviation contour. For that purpose, we consider the current state in the naviga- tion automaton to be qi. Recall that qi maps to a set of possible configurations in the workspace. For a 2D circular robot, qi cor- responds to a set of x −y coordinates that the robot can occupy, while for a rectangular robot, qi maps to a set of (x, y, θ) coor- dinates. The footprint of the navigation automaton states in the configuration space can be specified via the map ξ : Q →2C, where C is the configuration space of the robot. In general, for a given current state qi, we can identify the set N(qi) ⊂Q of neighboring states that the robot can transition to in one move. The current state qi is also included in N(qi) for notational sim- plicity. In case of the 2D circular robot model considered in this paper, the cardinality of N(qi) is 8 (provided of course that qi is not blocked and is not a boundary state). For a position s ∈ξ(qi) of the robot, we denote a neighborhood of radius r of 15 the position s in the configuration space as Bs,r. The normalized “volume” intersections of Bs,r with the footprints of the states included in N(qi) in the configuration space can be expressed as : F j(s, r) = R A j dx Z ∪qjA j dx , ∀q j ∈N(qi) (33) where A j = Bs,r \ ξ(q j) and dx is the appropriate Lebesgue measure for the continuous configuration space. We observe that the expected or the average probability of the robot deviating to a neighboring state q j ∈N(qi) from a location s ∈ξ(qi) is given by: Z ∞ 0 F j(s, r)Ddr (34) Hence, the probability Πuc i j of uncontrollably transitioning to a neighboring state q j from the current state qi is obtained by considering the integrated effect of all possible positions of the robot within ξ(qi), i.e. we have: Πuc i j = Z ξ(qi) Z ∞ 0 F j(s, r)Ddrds X qj∈N(qi) Z ξ(qi) Z ∞ 0 F j(s, r)Ddrds (35) where dr, ds are appropriate Lebesgue measures on the contin- uous configuration space of the robot. It is important to note that the above formulation is completely general and makes no assumption on the structure of the configuration space, e.g., the calculations can be carried out for 2D circular robots, rectan- gular robots or platforms with more complex kinematic con- straints equally well. Figures 13(a)-(c) illustrate the computa- tion for a circular robot with eight controllable moves, .e.g., the situation for a SEGWAY RMP. The 2D circular case is however the simplest, where any state that can be reached by an uncon- trollable transition, can also be reached by a controllable move. For more complex scenarios, this may not be the case. For ex- ample, in the rectangular model, with constraints on minimum turn radius, the robot may not be able to move via a controllable transition from (x, y, h1) to (x, y, h2), where hi, i = 1, 2 is the heading in the initial and final configurations. However, there most likely will be an uncontrollable transition that causes this change, reflecting uncertainty in the heading estimation (See Figure 10). Also, one can reduce the averaging effect by con- sidering more complex navigation automata. For example, for a 2D circular robot, the configuration state can be defined to be (xprevious, yprevious, xcurrent, ycurrent), i.e. essentially considering a 4D problem. The identification of the uncertainty parameters on such a model will capture the differences in the uncontrol- lable transition probabilities arising from arriving at a given state from different directions. While the 2D model averages out the differences, the 4D model will make it explicit in the specification of the navigation automaton (See Figure 15). In the sequel, we will present comparative simulation results for these models. Note, in absence of uncertainty, the 4D imple- mentation is superfluous; (xprevious, yprevious, xcurrent, ycurrent) has no more information than (xcurrent, ycurrent) in that case. PSfrag replacements 15◦ Available Controllable → Transitions Controllable Move Uncontrollable Move Figure 10: A rectangular robot unable to execute zero-radius turns. There exists uncontrollable transitions that alter heading in place, which reflects uncertain- ties in heading estimation, although there are no controllable moves that can achieve this transition Next, we present a methodology for computing the relevant uncertainty parameters as a function of the robot dynamics. We assume a modular plan execution framework, in which the low- level continuous controller on-board the robotic platform is se- quentially given a target cell (neighboring to the current cell) to go to, as it executes the plan. The robot may be able to reach the cell and subsequently receives the next target, or may end up in a different cell due to dynamic constraints, when it receives the next target from this deviated cell as dictated by the computed plan. The inherent dynamical response of the particular robot determines how well the patform is able to stick to the plan. We formulate a framework to compute the probabilities of uncon- trollable transitions that best describe these deviations. Definition 20. The raw deviation ∆R(t) as a function of the op- eration time t is defined as follows: ∆R(t) = Θ(p(t), ζ(t)) (36) where p(t) is the current location of the robot in the workspace coordinates, ζ(t) is the location of the point within the current target cell which is nearest to the robot position p(t) (See Fig- ure 11), and Θ(·, ·) is an appropriate distance metric in the con- figuration space. The robot will obviously take some time to reach the target cell, assuming it is actually able to do so. We wish to eliminate the effect of this delay from our calculations, since a platform that is able to sequentially reach each target cell, albeit with some delay, does not need the plan to be modified. Further- more, unless velocity states are incorporated in the navigation automata, the plan cannot be improved for reducing this delay. We note that the raw deviation ∆R(t) incorporates the effect of this possibly variable delay and needs to be corrected for. We do so by introducing the delay corrected deviation ∆(t) as fol- lows: Definition 21. The delayed deviation ∆d(t, η) is defined as: ∆d(t, η) = Θ(p(t + η(t)), ζ(t)) (37) 16 S PSfrag replacements J0 J1 J2 J3 J4 J5 J6 J7 J8 σu A B C D ∆R(t0) ∆R(t) Robot Trajectory Target Cell Current Location (time = t0) (a) S PSfrag replacements J0 J1 J2 J3 J4 J5 J6 J7 J8 σu σu A B C D ∆R(t0) ∆R(t) Robot Trajectory Target Cell Current Location (time = t0) (b) Figure 11: Illustration for computation of amortized dynamic uncertainty pa- rameters where η(t) is some delay function satisfying ∀τ ∈R, η(τ) ≥0. Definition 22. The delay corrected deviation ∆(t) as a function of the operation time t is defined as: ∀t ∈R, ∆(t) = argin f ∆d(t, η) : ∀τ ∈R, η(τ) ≥0 ∆d(t, η) (38) Note that Definition 22 incorporates the possibility that the delay may vary in the course of mission execution. We will make the assumption that although the delay may vary, it does so slowly enough to be approximated as a constant function over relatively short intervals. If we further assume that we can make observations only at discrete intervals, we can approximately compute ∆(t) over a short interval I = [tinit, t final] as follows: ∀t ∈I, ∆(t) = argmin Θ(p(t+η),ζ(t)):η∈N∪{0} Θ(p(t + η), ζ(t)) (39) Furthermore, the approximately constant average delay η⋆over the interval I can be expressed as: η⋆= argmin η∈N∪{0} Θ(p(t + η), ζ(t)) (40) Since the delay may vary slowly, the computed value of η⋆may vary from one observation interval to another. For each interval Ik ∈{I1, · · · , IM}, one can obtain the approximate probability distribution of the delay corrected deviation ∆(t), which is de- noted as D[k]. Therefore, from a computational point of view, D[k] is just a histogram constructed from the ∆(t) values for the interval Ik (for a set of appropriately chosen histogram bins or intervals). For a sufficiently large number of observation in- tervals {I1, · · · , IM}, one can capture the deviation dynamics of the robotic platform by computing the expected distribution of ∆(t), i.e. computing D, which can be estimated simply by: D = 1 M M X k=1 D[k] (41) Once the distribution for the delay corrected deviation has been computed, we can proceed to estimate the probabilities of the uncontrollable transitions, as described before. Determination of the uncertainty parameters in the navigation model then al- lows us to use the proposed optimization to compute optimal plans which the robot can execute. We summarize the sequen- tial steps in the next section. 6. Summarizing ν⋆Planning & Subsequent Execution The complete approach is summarized in Algorithm 5. The planning and plan assembly steps (Lines 2 & 3) are to be done either offline or from time to time when deemed necessary in the course of mission execution. Replanning may be necessary if the dynamic model changes either due to change in the en- vironment or due to variation in the operational parameters of the robot itself, e.g., unforeseen faults that alter or restrict the kinematic or dynamic degrees of freedom. Onwards from Line 4 in Algorithm 5 is the set of instructions needed for mission execution. Line 5 computes the set of states to which the robot can possibly move from the current state. We select one state from the set of possible next states which have a strictly higher measure compared to the current state in the computed plan. It is possible that the set of such states Qnext (See Line 6) has more than one entry. Choice of any element in Qnext as the next desired state is optimal in the sense of Proposition 5 . However, one may use some additional criteria to aid this choice to yield plans suited to the application at hand, e.g., to minimize change of current direction of travel. For example one may choose the state from Qnext that requires least deviation from the current direction of movement, to minimize control effort. In general, we can penalize turning using a specified penalty β ∈[0, 1] as follows: Definition 23. Given a turning penalty β ∈[0, 1], the turn pe- nalized measure values on the set of possible next states Qnext is computed as follows: ∀q ∈Qnext, νβ(q) = (1 −β)ν#A(q) + β cos(h(q)) (42) where h(q) is the heading correction required for transitioning to q ∈Qnext, which for 2D circular robots is calculated as the 17 X: 8 Y: 15 gamma = 0.98 replacements γ = 0.75 5 5 10 10 15 15 20 20 25 25 30 35 40 45 50 (a) β = 0 X: 8 Y: 15 gamma = 0.98 PSfrag replacements γ = 0.75 5 5 10 10 15 15 20 20 25 25 30 35 40 45 50 (b) β = 0.35 X: 8 Y: 15 gamma = 0.98 PSfrag replacements γ = 0.75 5 5 10 10 15 15 20 20 25 25 30 35 40 45 50 (c) β = 0.75 X: 8 Y: 15 gamma = 0.98 PSfrag replacements γ = 0.75 5 5 10 10 15 15 20 20 25 25 30 35 40 45 50 (d) β = 1 Figure 12: Effect of increasing turning penalty β on making turns in choice of local transitions post ν⋆-optimization. Note that the potential gradient is identical in all four cases. The start state is (8, 15) as shown. The gradient shown (by the arrows) is for the computed measure vector ν#A, and hence is identical for all four cases angular correction between the line joining the center of the current state to that of q, and the line joining the center of the last state with that of the current one. the direction of the last state. The robot then chooses qnext as the state q ∈Qnext which has the maximum value for νβ(q). The effect of penalizing turns is shown in the Figure 12. Note that for maximum turn penalty, the computed plan is almost completely free from kinks. Also, note that the ν⋆optimization ensures that all these plans have the same probability of success and collision. As stated in Line 8, the robot may not be successful to actu- ally transition to the next chosen state due to dynamic effects. In particular, if the state that the robot actually ends up in has Algorithm 5: Summarized Planning & Mission Execution input : Model GMOD NAV begin 1 /* ←−Planning & Plan Assembly −→ */; Compute decomposed plans ν#[k] ; /* Algorithm 3 */ 2 Compute assembled plan ν#A ; /* Algorithm 4 */ 3 /* ←−Mission Execution −→ */; while true do 4 Find set of neighbors: N(qi) = q ∈Q : ∃σ ∈ΣC s.t. qi σ−→q ; 5 Compute Qnext = qj ∈N(qi) : ∀qk ∈N(qi), ν#A|j > ν#A|k ; 6 Choose one state qnext from set Qnext; 7 Attempt to move to qnext; /* May be unsuccessful 8 */ Read current state qi; /* Possibly qi , qnext */ 9 if qi == qGOAL then 10 Mission Successful Terminate Loop; 11 else 12 if ν#A|i ≦0 then 13 Mission Failed Terminate Loop; 14 endif 15 endif 16 endw 17 end 18 a non-positive measure (due to uncertainties and execution er- rors), then execution is terminated and the mission is declared unfeasible from that point (See Line 14). It is important to note that if a particular configuration maps to a navigation state with non-positive measure, then no feasi- ble path to goal exists from that configuration, irrespective of uncertainty effects. This underscores the property of the pro- posed algorithm that it finds optimal feasible paths; even if the only feasible path is very unsafe, it still is the only feasible path; and is therefore the optimal course of action (See Proposition 7 Statement 3). 7. Verification & Validation In this section we validate the proposed planning algorithm via detailed high fidelity simulation studies and in experimen- tal runs on a heavily instrumented SEGWAY RMP 200 at the robotic testbed at the Networked Robotics & Systems Labora- tory (NRSL), Pennstate. The results of these experiments ad- equately verify the theoretical formulations and the key claims made in the preceding sections. Remark 6. In depicting simulation results in the sequel, we re- fer to “computed paths/plans”. It is important to clarify, what we mean by such a computed or simulated path. The computed path is the sequence of configuration states that the robot would enter, if the uncertainties do not force it to deviate, i.e., the path depicts the best case scenario under the uncertainty model. Thus, the depictions merely give us a feel for the kind of paths the robot would take; in actual implementation, the trajectories would differ between runs. Also, when we refer to lengths of the computed paths, we are referring to the lengths of the paths in the best case scenario, i.e., the tight lower bound on the path length that will actually be encountered. 18 g replacements Probability 0.4 0.35 0.35 0.35 0.3 0.3 0.3 0.3 0.3 0.3 0.25 0.25 0.25 0.2 0.2 0.2 0.2 0.2 0.2 0.15 0.15 0.15 0.1 0.1 0.1 0.1 0.1 0.1 0.05 0.05 0.05 0 0 0 0 0 0 Run 1 Run 2 Run 3 Deviation ∆ (a) 0 Run 1 Run 2 Run 3 Deviation ∆ η⋆ Run Number η⋆ Probability 0.5 0.4 0.3 0.2 0.1 0 0 16 16 17 17 18 18 19 19 25 5 10 15 20 (b) Probability 16 17 18 19 25 5 10 15 20 γ = .93 .003 .003 .003 .003 .015 .015 .015 .015 ∆ 0.4 0.35 0.35 0.3 0.3 0.25 0.25 0.2 0.2 0.15 0.15 0.1 0.1 0.05 0.05 0 0 Probability E(D) Histogram (c) Figure 13: Computation of dynamic uncertainty parameters from observed dynamics for SEGWAY RMP 200 at NRSL, Pennstate: (a) shows the observed distribu- tion D of the deviation ∆for different runs, (b) shows the distribution of the delay η⋆for the various runs, Lower plate in (c) illustrates the expected distribution E(D) for deviation ∆while the upper plate in (c) enumerates the probabilities of the uncontrollable transitions to the neighboring cells, and the coefficient of dynamic deviation γ(GMOD NAV ) g replacements 5 5 10 10 15 15 20 20 25 25 30 35 40 45 50 (a) γ(GMOD NAV ) = 1 PSfrag replacements 5 5 10 10 15 15 20 20 25 25 30 35 40 45 50 (b) γ(GMOD NAV ) = 0.98 PSfrag replacements 5 5 10 10 15 15 20 20 25 25 30 35 40 45 50 (c) γ(GMOD NAV ) = 0.9 Figure 14: Simulation results with a circular robot model and different values of the coefficient of dynamic deviation γ(GMOD NAV ). Note the response of the navigation gradient as γ(GMOD NAV ) is decreased. 7.1. Simulation Results for Circular Robots The recursive version of the modified ν⋆planning algorithm presented in this paper (See Algorithm 5) is first validated in a detailed simulation example as illustrated in Figures 13(a-c) and 14(a-c). The workspace is chosen to be a 50 ×30 grid, with obstacles placed at shaded locations, as illustrated. The size of the workspace is chosen to correspond with the size of the ac- tual test-bed, where experimental runs on the robotic platform would be performed subsequently. Plates (a)-(c) in Figure 14 il- lustrates the gradient of the optimized measure vector (by short arrows) and a sample optimal path from location (15, 10) (up- per,left) to the goal (40, 20) (down,right). We note that the “po- tential field” defined by the measure gradient converges (i.e. has an unique sink) at the goal. Also, note that the coefficient of dy- namic deviation γ(GMOD NAV ) is decreased, the algorithm responds by altering the optimal route to the goal. In particular, the opti- mal path for smaller values of γ(GMOD NAV ) stay further away from the obstacles. The key point to note here is that the the pro- posed algorithm guarantees that this lengthening of the route to account for dynamic uncertainty is optimal, i.e., further length- ening by staying even further from the obstacles yields no ad- vantage in a probabilistic sense. This point has a direct practical 19 implication; one that can be verified experimentally as follows. Let us assume that we have a real-world robot equipped with on-board reactive collision avoidance, by which we can ensure that the platform does not actually collide with obstacles under dynamic uncertainty, but executes corrections dictated by local reactive avoidance. Then, the preceding result would imply that a using the correct value of dynamic deviation (for the specific platform) in the planning algorithm would result in routes that require the least number of local corrections; which in turn en- sures minimum time route traversals on the average. It is important to note that the assumption of a circular robot poses no critical restrictions. Similar results can be obtained for more complex models as well, i.e. rectangular platform with constrained turn radius. However, extension to multi-body mo- tion planning would require addressing the algorithmic com- plexity issues that become important even for the recursive ν⋆ for very large configuration spaces, and is a topic of future work. 7.2. Simulation Results for Non-symmetric Uncertainty As stated in the course of the theoretical development, it is possible to choose the degree of amortization or averaging that one is willing to allow in the specification of the navigation au- tomaton. As a specific example, one may choose to compute the probabilities of uncontrollable transitions with respect to some length of trajectory history; the simplest case is using the pre- vious state information to yield non-symmetric deviation con- tours (See Figure 15). The particular type of deviation contours illustrated in Figure 15 is obtained if the platform has a large stopping distance and inertia, and the heading and positional estimates are more or less accurate, i.e., the uncontrollability in the model is a stronger function of the dynamic response, rather than the estimation errors. A typical scenario is the SEGWAY RMP 200 with good global positioning capability, in which fac- toring in the dynamic response is important due to the inverted- pendulum two-wheel kinematics. For this simulation, we use a navigation automaton obtained from discretizing an essentially 4D underlying continuous configuration space. Each state (ex- cept the obstacle state) in the navigation automaton maps to a discretized pair of locations, reflecting the current robot loca- tion and the one from which it transitioned to the current loca- tion. We call this the 4D Model to distinguish it from the the significantly smaller and simpler 2D model. Note that the 2D model can be obtained from the 4D model by merging states with the same current location via averaging the probabilities of uncontrollable transitions over all possible previous states. Also note that (as stated earlier), in the absence of uncertainty, the 4D formulation adds nothing new; explicitly encoding the previous location in the automaton state gives us no new infor- mation. Table 1 enumerates the comparative model sizes. Table 1: Comparison of 4D and 2D Models Map Size No. of States Alphabet Size 2D Model 40 × 40 1600 8 4D Model 40 × 40 256 × 104 8 PSfrag replacements Deviation Contour Trajectory (a) (b) (c) Figure 15: Non-symmetric deviation contours arising from explicit dependence on the last discrete position for the robot PSfrag replacements GOAL INIT 4D model 2D model No uncertainty Figure 16: Comparative pathlengths for the chosen initial and goal configura- tions: 65 steps for 2D model, 51 steps for 4D model, 36 steps for the case with no uncertainty Comparisons of computed plans for a particular set of initial and goal configurations is shown in Figure 16. To accentuate the differences in the computed plans, the deviation contours were chosen so that the probability of uncontrollable transition in the current direction of travel is significantly more compared to that of deviating to left or right, i.e., the contours are re- ally narrow ellipses. Under such a scenario, the platform is more capable of navigating narrow corridors as compared to the amortized 2D counterpart. This is reflected in the paths shown in Figure 16, where the path for the 4D model is shorter, and goes through some of the narrow bottlenecks, while the path for the 2D model takes a safer path. Note, the path for the no- uncertainty case is even more aggressive, and shorter. In prac- tical implementation, when the uncontrollable probabilities are identified from observed dynamics or pre-existing continuous models, the differences in the two cases are often significantly less. 7.3. Simulation Results for Rectangular Robots The proposed planning algorithm is next applied to the case of rectangular robots, specifically ones that have a minimum non-zero turn radius. We further impose the constraint that the 20 platform cannot travel backwards, which is a good assumption for robots that have no ranging devices in the rear, and also for aerial vehicles (UAVs). Even assuming planar operation, this problem is essentially 3D, with the navigation automaton re- flecting the underlying configuration states of the form (x, y, h) where h is the current heading, which can no longer be ne- glected due to the inability of the patform to turn in place. A visual comparison of the models for the circular and rectangu- lar cases is shown in Figure 17(e). The heading is discretized at 15◦intervals, implying we have 24 discrete headings. This also means that for the same planar workspace, the number of states in the rectangular model is about 24 times larger the num- ber of states for the circular model. Also, while in the circular case, we had 8 neighbors, the number of neighboring configura- tions increases to 8 × 24 + 24 = 216 However, not all neighbors can be reached via controllable transitions due to the restriction on the turn radius; we assume a maximum turn of ±45◦in the model considered for the simulation. As explained earlier, all the neighbors may be reachable via uncontrollable transitions, which reflects uncertainties in estimation (See Figure 10). We test the algorithm with different values of γ(GMOD NAV ) as il- lustrated in Figure 17(a-d). Note the trajectories become more rounded and less aggressive (as expected) as the uncertainty is increased. Also note that the heading at the goal is different for the different cases. This is because, in the model , we specified as goals any state that maps to the goal location in the planar grid irrespective of the heading, i.e., the problem was solved with essentially 24 goals. Although for simplicity, the theo- retical development was presented assuming a single goal, the results can be trivially shown to extend to such scenarios. The trajectories however, will be significantly different if we insist on having a particular heading at the goal (See Figure 18). In the simulations for the rectangular model, we deliberately assume that any neighbor that cannot be reached via a con- trolled move is not reachable by an uncontrollable transition as well. Although this is not what we expect to encounter in field, the purpose of this assumption is to bring out an interest- ing consequence that we illustrate in Figures 19(a-b). As ex- plained above, this assumption implies that we have little or no uncertainty in local heading estimations (since the robot can- not turn in place, so there is no uncontrollable transition that alters heading in place). It therefore follows, that under this scenario, the platform would find it relatively safe to navigate narrow passages. This is exactly what we see in Figure 19(b), where the circular robot with same coefficient of dynamic un- certainty, really goes out of way to avoid the narrow passage, while the rectangular robot goes through. 7.4. Simulation Results for Mazes We simulate planning in a maze of randomly placed static obstacles. A sample case with optimal paths computed for dif- ferent coefficients of dynamic deviation is illustrated in Fig- ure 20(a). A key point to note is that the optimal path is length- ens first as γ(GMOD NAV ) is decreased, and then starts shortening again, which may seem paradoxical at first sight. However, this is exactly what we expect. Recall that the proposed al- gorithm minimizes the probability of collision. Also note that there are two opposing effects in play here; while a longer path that stays away from the obstacles influences to decrease the collision probability, the very fact that the path is longer has an increasing influence arising from the increased probability that an uncontrollable sequence would execute from some point in the path that leads to a collision. At relatively high values of γ(GMOD NAV ), the first effect dominates and we can effectively de- crease the collision probability by staying away from the ob- stacles thereby increasing the path length. However, at low values of γ(GMOD NAV ), the latter effect dominates, implying that increased path lengths are no longer advantageous. This in- teresting phenomenon is illustrated in Figure 20(b), where we clearly see that the path lengths peak in the γ(GMOD NAV ) = 0.72 to γ(GMOD NAV ) = 0.85 range (for the maze considered in Fig- ure 20(a)). Also note that the configuration space has to be sufficiently complex to actually see this effect; which is why we do not see this phenomenon in the simulation results pre- sented in Figures 14(a-c). 7.5. Experimental Runs on SEGWAY RMP 200 The proposed algorithm is validated on a SEGWAY RMP 200 which is a two-wheeled robot with significant dynamic un- certainty. In particular, the inverted-pendulum dynamics pre- vents the platform from halting instantaneously, and making sharp turns at higher velocities. At low velocities, however, the platform can make zero radius turns. The global positional fix is provided via an (in-house developed) over-head multi-camera vision system, which identifies the position and orientation of the robot in the testbed. The vision system yields a positional accuracy of ±7.5 cm, and a heading accuracy or ±0.1 rad for a stationary robot. The accuracy deteriorates significantly for a mobile target, but noise correction is intentionally not applied to simulate a high noise uncertain work environment. Further- more, the cameras communicate over a shared wireless network and randomly suffers from communication delays from time to time, leading to delayed positional updates to the platform. In the experimental runs conducted at NRSL the workspace dis- cretized into a 53 × 29 grid. Each grid location is about 4 sq. ft. allowing the SEGWAY to fit completely inside each such discretized positional state which justifies the simplified circu- lar robot modeling. The runs are illustrated in Figure 22. The robot was run at various allowed top speeds (vmax) ranging from 0.5 mph to over 2.25 mph. Only the extreme cases are illus- trated in the figure. For each speed, the uncertainty parameters were estimated using the formulation presented in Section 5. The sequence of computational steps for the low velocity case (vmax = 0.5 mph) are shown in Figure 13. Note the coefficient of dynamic deviation for the low velocity case turns out to be γlow = 0.973. For the high velocity case, (vmax = 2.25 mph), the coefficient is computed to have a value of γhigh = 0.93 (calcula- tion not shown). Also, the robot is equipped with an on-board low-level reactive collision avoidance algorithm, which ensures that the platform does not actually collide due path deviations; but executes local reactive corrections when faced with such sit- uations. The platform is equipped with multiple high frequency sonars, infra-red range finders and a high-precision SICK LMS 21 0 5 10 15 20 25 30 0 5 10 15 20 25 (a) γ(GMOD NAV ) = 1 0 5 10 15 20 25 30 0 5 10 15 20 25 (b) γ(GMOD NAV ) = 0.8 0 5 10 15 20 25 30 0 5 10 15 20 25 (c) γ(GMOD NAV ) = 0.5 0 5 10 15 20 25 30 0 5 10 15 20 25 (d) γ(GMOD NAV ) = 0.1 PSfrag replacements 15◦ Discretized Neighboring Configurations Circular Model Rectangular Model (e) Figure 17: Comparative trajectories for rectangular model 200 laser range finder. The data from these multiple ranging de- vices, mounted at various key locations on the platform, must be fused to obtain correct situational awareness. In this paper, we skip the details of this on-board information processing for the sake of brevity. The overall scheme is illustrated in Fig- ure 21 In the experimental runs, we choose two waypoints (marked A and B) in Figure 22 (plates a,b,d,e), and the mission is to plan and execute the optimal routes in sequence from A to B, back to A and repeat the sequence a specified number of times (thirty). This particular mission is executed for each top speed for a range of γ values, namely with γ ∈[0.75, 1] with incre- ments of 0.1. The expectation is that using the correct coef- ficient of dynamic deviation (as computed above for the two chosen speeds) for the given speed, would result in the mini- mum number of local corrections, leading to minimum average traversal times over thirty laps. The results are summarized in plates (c) (for the low ve- locity case) and (f) (for the high velocity case) in Figure 22. Note the fitted curve in both cases attain the minimal point very close to the corresponding computed γ values, namely, 0.97 for vmax = 0.5 mph and 0.92 for vmax = 2.25 mph. A visual com- parison of the trajectories in the plates (a) and (d) clearly re- veal that the path execution has significantly more uncertainties 22 PSfrag replacements Goal Location Goal Heading Note Difference (a) Strict Goal Heading Requirement PSfrag replacements Goal Location Goal Heading Note Difference (b) No Heading Requirement Figure 18: Comparative trajectories for rectangular model for γ(GMOD NAV ) = 0.8: Case (a) we demand that the robot reach the goal with a specified heading (−150◦). Case (b): Any heading at the goal is acceptable PSfrag replacements Circular Robot Rectangular Robot Note Note (a) γ(GMOD NAV ) = 1 PSfrag replacements Circular Robot Rectangular Robot Note Note (b) γ(GMOD NAV ) = 0.8 Figure 19: Comparative trajectories for rectangular model and circular model to illustrate the effect of different uncertainty assumptions: (a) Trajectories in the absence of uncertainty (b) Trajectories with γ(GMOD NAV ) = 0.8. in the high velocity case. Also note, that the higher average speed leads to repeated loss of position fix information in lo- cations around (row = 20, column = 35). Plates (b) and (e) illustrate the sequence of waypoints invoked by the robot in the two cases, being the centers of the states in the navigation au- tomaton that the robot visits during mission execution. Note that in the high velocity case, the variance of the trajectories is higher leading to a larger set of waypoints been invoked. Note that three distinct zones (denoted as Zone A, Zone B and Zone C) can be identified in the plates (e) and (f) of Figure 22. Zone A reflects the operation when γ is (incorrectly assumed to be) too large, leading to too many corrections, and hence execution time can be reduced by reducing γ. In Zone B, reducing γ in- creases execution time, since now the trajectories becomes un- necessarily safe, i.e. stays away from obstacles way more than necessary leading to longer than required paths and hence in- creased execution time. Zone C represents a sort of saturation zone where reducing γ has no significant effect, arising from the fact that the paths cannot be made arbitrarily safe by increasing path lengths. Although the experimental runs were not done for smaller values of γ, we can say from the experience with maze simulations (See Section 7.4), that the execution times will start reducing again as γ is further reduced. These results clearly show that the approach presented in this paper successfully integrates amortized dynamical uncertainty with autonomous planning, and establishes a computationally efficient framework to cyber-physical motion planning. 8. Summary & Future Research The recently proposed PFSA-based path planning algorithm ν⋆is generalized to handle amortized dynamic uncertainties in plan execution, arising from the physical limitations of sensing and actuation, and the inherent dynamic response of the physi- cal platforms. The key to this generalization is the introduction of uncontrollable transitions in the modified navigation automa- ton, and showing that ν⋆can be implemented in a recursive fashion to guarantee plan optimality under such circumstances. 23 PSfrag replacements γ = 1 γ = 0.9 γ = 0.8 γ = 0.6 5 10 15 20 25 30 35 40 45 50 60 70 80 G 1 0.8 0.4 0.1 S G (a) 60 γ(GMOD NAV ) Path Lengths Simulated Data Polynomial Fit 1 0.95 0.9 .85 0.8 0.75 0.7 0.2 0.6 0.1 0.5 0.4 0.3 90 70 100 105 110 80 120 (b) Figure 20: Effect of dynamic uncertainty on the optimal path lengths computed by ν⋆. Plate (a) illustrates the optimal paths for γ(GMOD NAV ) = 1.0, 0.8, 0.4, 0.1 from the start location marked by S and the goal marked by G. Plate (b) illustrates the variation of the length of the optimal paths as a function of γ(GMOD NAV ) for the maze illustrated in (a). Figure 21: Autonomous navigation scheme with ν⋆implementation on heavily instrumented Segway RMP (shown in inset). The theoretical algorithmic results is verified in detailed high- fidelity simulations and subsequently validated in experimental runs on the SEGWAY RMP 200 at NRSL, Pennstate. 8.1. Future Work Future work will extend the language-measure theoretic planning algorithm to address the following problems: 1. Multi-robot coordinated planning: Run-time complex- ity grows exponentially with the number of agents if one attempts to solve the full Cartesian product problem. How- ever ν⋆can be potentially used to plan individually fol- lowed by an intelligent assembly of the plans to take inter- action into account. 2. Hierarchical implementation to handle very large workspaces: Large workspaces can be solved more effi- ciently if planning is done when needed rather than solving the whole problem at once; however care must be taken to ensure that the computed solution is not too far from the optimal one. One the areas of current research is an al- gorithmic decomposition of the configuration space such that individual blocks are solved in parallel on communi- cating processors, with the interprocessor communication ensuring close-to-global optimality. We envision such an approach to be ideaally suited to scenarios involving multi- ple agents distributed over a large workspace which coop- eratively solve the global planning problem in an efficient resource-constrained manner. 3. Handling partially observable dynamic events: In this paper all uncontrollable transitions are assumed to be per- fectly observable. Physical errors and onboard sensor fail- ures may need to be modeled as unobservable transitions and will be addressed in future publications. A generaliza- tion of the measure-theoretic optimization technique under partial observation has been already reported [48]. The fu- ture goal in this direction is to incorporate the modifica- tions to allow ν⋆handle loss of observation and feedback information. References [1] J.-C. Latombe, Robot Motion Planning, International Series in Engineer- ing and Computer Science; Robotics: Vision, Manipulation and Sensors, Kluwer Academic Publishers, Boston, MA, U.S.A., 1991, 651 pages. [2] S. M. LaValle, Planning Algorithms, Cambridge University Press, Cam- bridge, U.K., 2006, available at http://planning.cs.uiuc.edu/. [3] K. Kondo, Motion planning with six degrees of freedom by multistrate- gicbidirectional heuristic free-space enumeration, IEEE Transactions on Robotics and Automation 7 (3) (1991) 267–277. 24 Sfrag replacements A B 5 10 10 15 15 20 20 25 25 30 35 40 45 50 60 70 80 (a) PSfrag replacements A B 5 10 10 15 15 20 20 25 25 30 35 40 45 50 60 70 80 (b) PSfrag replacements A B 5 10 10 15 15 20 20 25 25 30 35 40 45 50 60 70 80 (c) PSfrag replacements A B 5 10 10 15 15 20 20 25 25 30 35 40 45 50 60 70 80 (d) 70 80 0.75 0.8 0.85 0.91 0.93 0.95 0.97 0.99 145 150 155 160 165 170 175 180 185 190 195 200 205 210 Zone A Zone B Zone C Mean Runtime Smooth Spline Mean Runtime Smooth Spline γ(GMOD NAV ) Time (sec) (e) 80 0.75 0.8 0.85 0.91 0.93 0.95 0.97 0.99 145 150 155 160 165 170 175 180 185 190 195 200 205 210 Zone A Zone B Zone C Mean Runtime Smooth Spline Mean Runtime Smooth Spline γ(GMOD NAV ) Time (sec) 0.9 (f) Figure 22: Experimental runs on SEGWAY RMP 200: (a)-(c) Low speed runs and (d)-(f) High speed runs: Plates (a) and (d) shows the trace of the robot positions as read by the overhead vision system at NRSL for the low and high speed runs respectively. (b) and (e) shows the waypoints invoked by the robot in course of executing the specified mission in the low and high speed cases respectively. Plates (c) and (f) illustrate the variation of the mean mission execution times with the coefficient of dynamic deviation used fro planning in the low and high speed cases respectively. [4] J. Borenstein, Y. Koren, The vector field histogram-fast obstacle avoidance for mobile robots, Robotics and Automation, IEEE Transactions on 7 (3) (2002) 278–288. doi:10.1109/70.88137. URL http://dx.doi.org/10.1109/70.88137 [5] T. Lozano-Perez, A simple motion-planning algorithm for general robot manipulators, IEEE Transactions on Robotics and Automation 3 (3) (1987) 224–238. [6] D. A. Anisi, J. Hamberg, X. Hu, Nearly time-optimal paths for a ground vehicle, Journal of Control Theory and Applications. [7] J. Barraquand, B. Langlois, J.-C. Latombe, Robot motion planning with many degrees of freedom and dynamic constraints, MIT Press, Cam- bridge, MA, USA, 1990. [8] J. Langelaan, Tree-based trajectory planning to exploit atmospheric en- ergy, in: American Control Conference, 2008, 2008, pp. 2328–2333. doi:10.1109/ACC.2008.4586839. [9] S. Lahouar, E. Ottaviano, S. Zeghoul, L. Romdhane, M. Cecca- relli, Collision free path-planning for cable-driven parallel robots, Robotics and Autonomous Systems 57 (11) (2009) 1083 – 1093. doi:DOI:10.1016/j.robot.2009.07.006. [10] L. M. Ortega, A. J. Rueda, F. R. Feito, A solution to the path planning problem using angle preprocessing, Robotics and Autonomous Systems In Press, Corrected Proof (2009) –. doi:DOI:10.1016/j.robot.2009.07.028. [11] J. R. Andrews, N. Hogan, Impedance Control as a Framework for Im- plementing Obstacle Avoidance in a Manipulator, ASME, Boston, MA, 1983, pp. 243–251. [12] O. Khatib, Real-time obstacle avoidance for manipulators and mobile robots, in: IEEE International Conference on Robotics and Automation, Vol. 2, St. Louis, MI, 1985, pp. 500–505. [13] B. H. Krogh, A generalized potential field approach to obstacle avoid- ance control, in: International Robotics Research Conference, Bethlehem, 1984. [14] M. Kumar, D. Garg, R. Zachery, Multiple mobile agents control via artifi- cial potential functions and random motion, in: Proceedings of the ASME International Mechanical Engineering Congress and Exposition, ASME, Seattle, WA, 2007. doi:PaperNo.IMECE2007-41521. [15] S. Sarkar, E. Halland, M. Kumar, Mobile robot path plan- ning using support vector machines, in: ASME Dynamic Sys- tems and Control Conference, ASME, Ann Arbor, Michigan, 2008. doi:PaperNo.DSCC2008-2200. [16] J. Borenstein, Y. Koren, Potential field methods and their inherent lim- itations for mobile robot navigation, in: Proceedings of the 1991 IEEE International Conference on Robotics and Automation, 1991, pp. 1398– 1404. 25 [17] R. Tilove, Local obstacle avoidance for mobile robots based on the method of artificial potentials, Robotics and Automation, 1990. Pro- ceedings., 1990 IEEE International Conference on (1990) 566–571 vol.1doi:10.1109/ROBOT.1990.126041. [18] I. Chattopadhyay, G. Mallapragada, A. Ray, ν⋆: a robot path planning algorithm based on renormalized measure of probabilistic regular lan- guages, International Journal of Control 82 (5) (2008) 849–867. [19] I. Chattopadhyay, A. Ray, Renormalized measure of regular languages, Int. J. Control 79 (9) (2006) 1107–1117. [20] I. Chattopadhyay, A. Ray, Language-measure-theoretic optimal control of probabilistic finite-state systems, Int. J. Control. [21] J. M. O’kane, B. Tovar, P. Cheng, S. M. Lavalle, Algorithms for plan- ning under uncertainty in prediction and sensing, in: Chapter 18 in Au- tonomous Mobile Robots: Sensing, Control, Decision-Making, and Ap- plications, Marcel Dekker, 2005, pp. 501–547. [22] T. Lozano-Perez, M. T. Mason, R. H. Taylor, Automatic Synthesis of Fine-Motion Strategies for Robots, The Inter- national Journal of Robotics Research 3 (1) (1984) 3–24. doi:10.1177/027836498400300101. [23] A. Lazanas, J. Latombe, Landmark-based robot navigation, Vol. 92, AAAI Press, San Jose, California, 1992, pp. 816–822. [24] T. Fraichard, R. Mermond, Path planning with uncertainty for car- like robots, in Proc. of the IEEE Intl. conf. on Robotics & Automation (1998) 27–32. [25] N. J. Nilsson, Principles of Artificial Intelligence, Tioga, 1980. [26] H. Takeda, J.-C. Latombe, Sensory uncertainty field for mobile robot nav- igation, in Proc. of the IEEE Intl. conf. on Robotics & Automation (1992) 2465–2472. [27] P. E. Trahanias, Y. Komninos, Robot motion planning: Multi- sensory uncertainty fields enhanced with obstacle avoidance, in: Proc. of the IEEE/RSJ Intl. conf. on Intelligent Robots and Systems, 1996. [28] N. A. Vlassis, P. Tsanakas, A sensory uncertainty field model for un- known and non-stationary mobile robot environments, in: Proceedings of the IEEE Intl. conf. on Robotics & Automation, 1998. [29] N. Roy, S. Thrun, Coastal navigation with mobile robots, in: Advances in Neural Information Processing, Systems (NIPS, 1999. [30] R. Alami, T. Simeon, Planning robust motion strategies for a mobile robot, in: Proc. of the IEEE Intl. conf. on Robotics & Automation, 1994. [31] B. Bouilly, T. Simeon, R. Alami, A numerical technique for plan- ning motion strategies of a mobile robot in presence of uncertainty, in: Proc. of the IEEE Intl. conf. on Robotics & Automation, 1995. [32] M. Khatib, B. Bouilly, T. Simeon, R. Chatila, Indoor navigation with un- certainty using sensor-based motions, in Proc. of the IEEE Intl. conf. on Robotics & Automation 4 (1997) 3379–3384. [33] J. Barraquand, P. Ferbach, Motion planning with uncertainty: The infor- mation space approach, in: Proc. of the IEEE Intl. conf. on Robotics & Automation, 1995. [34] L. A. Page, A. C. Sanderson, Robot motion planning for sensor-based control with uncertainties, Vol. 2, Nagoya, Japan, 1995, pp. 1333–1340. [35] L. Blackmore, H. Li, B. Williams, A probabilistic approach to optimal ro- bust path planning with obstacles, in: Proceedings of the AIAA Guidance, Navigation and Control ConferenceNavigation and Control Conference, 2006. [36] L. Blackmore, A probabilistic particle control approach to optimal, robust predictive control, in: Proceedings of the AIAA Guidance, Navigation and Control ConferenceNavigation and Control Conference, 2006. [37] A. Lambert, N. L. Fort-Piat, Safe task planning integrating uncertainties and local maps federations, International Journal of Robotics Research, volume 19 (2000) 597–611. [38] A. Lambert, D. Gruyer, Safe path planning in an uncertain-configuration space, in: Robotics and Automation, 2003. Proceedings. ICRA ’03. IEEE International Conference on, Vol. 3, 2003, pp. 4185–4190. doi:10.1109/ROBOT.2003.1242246. [39] J. P. Gonzalez, A. T. Stentz, Planning with uncertainty in position: An optimal and efficient planner, in Proc. of the IEEE/RSJ Intl. conf. on In- telligent Robots and Systems (2005) 2435–2442. [40] J. P. Gonzalez, A. Stentz, Planning with uncertainty in position using high-resolution maps, in: Proc, of the IEEE Intl. conf. on Robotics & Automation, Rome, Italy, 2007. [41] R. Alterovitz, T. Sim´eon, K. Y. Goldberg, The stochastic motion roadmap: A sampling framework for planning with markov motion uncertainty, in: Robotics: Science and Systems, 2007. [42] P. Singla, T. Singh, A novel coordinate transformation for obstacle avoid- ance and optimal trajectory planning, in: 2008 AAS/AIAA Astrodynam- ics Specialist Conference and Exhibit, 2008. [43] A. Ray, Signed real measure of regular languages for discrete-event su- pervisory control, Int. J. Control 78 (12) (2005) 949–967. [44] V. Garg, An algebraic approach to modeling probabilistic discrete event systems, Proceedings of 1992 IEEE Conference on Decision and Control (Tucson, AZ, December 1992) 2348–2353. [45] V. Garg, Probabilistic lnaguages for modeling of DEDs, Proceedings of 1992 IEEE Conference on Information and Sciences (Princeton, NJ, March 1992) 198–203. [46] W. Rudin, Real and Complex Analysis, 3rd ed., McGraw Hill, New York, 1988. [47] I. Chattopadhyay, Quantitative control of probabilistic discrete event sys- tems, PhD Dissertation, Dept. of Mech. Engg. Pennsylvania State Univer- sity, http:// etda.libraries.psu.edu / theses / approved / WorldWideIndex / ETD-1443. [48] I. Chattopadhyay, A. Ray, Optimal control of infinite horizon partially ob- servable decision processes modeled as generatorsof probabilistic regular languages, International Journal of Control In Press. 26