1 Robust Belief Roadmap: Planning Under Intermittent Sensing Shaunak D. Bopardikar Brendan J. Englot Alberto Speranzon Abstract—In this paper, we extend the recent body of work on planning under uncertainty to include the fact that sensors may not provide any measurement owing to misdetection. This is caused either by adverse environmental conditions that prevent the sensors from making measurements or by the fundamental limitations of the sensors. Examples include RF-based ranging devices that intermittently do not receive the signal from beacons because of obstacles; the misdetection of features by a camera system in detrimental lighting conditions; a LIDAR sensor that is pointed at a glass-based material such as a window, etc. The main contribution of this paper is twofold. We first show that it is possible to obtain an analytical bound on the performance of a state estimator under sensor misdetection occurring stochastically over time in the environment. We then show how this bound can be used in a sample-based path planning algorithm to produce a path that trades off accuracy and robustness. Computational results demonstrate the benefit of the approach and comparisons are made with the state of the art in path planning under state uncertainty. Keywords—Path planning, Belief space planning, Autonomous systems, Localization. I. INTRODUCTION Map-based, GPS-denied navigation often relies on the mea- surement of environmental features to perform state estimation. Whether these features are extracted from camera or LIDAR data, or supplied by range beacons, their measurement will likely be corrupted by noise. Producing a consistent estimate in the presence of sensor noise is often the primary concern in designing a state estimator, but an important related question is how to navigate robustly when the sensor does not produce a measurement at all, either because the sensor itself is corrupted or environmental conditions prevent the sensor from making a measurement. Examples of such situations include RF-based ranging devices that intermittently do not receive signals from beacons due to obstacles, and misdetection of features by a camera system in textureless areas of the environment or due to adverse lighting conditions. The main contribution of this paper is twofold. We first show that it is possible to obtain an analytical bound on the performance of a state estimator under probabilistic sensor misdetections. We then show how this bound can be used in a sample-based path planning algorithm that minimizes goal- state uncertainty under such a stochastic mode of misdetection. Recent work in robotics has emphasized robust path plan- ning under various sources of uncertainty. The objective is The authors are with United Technologies Research Center. This work was supported by United Technologies Research Center under the Autonomy ini- tiative. Email: {bopardsd, englotbj, sperana}@utrc.utc.com. frequently to identify a feasible start-to-goal path that mini- mizes uncertainty along the path, uncertainty at the goal, or some combination of these and the length of the path. Actions and measurements affected by noise are often considered, as well as uncertain maps. The stochastic motion roadmap is a foundational work in modeling path planning under process noise as a Markov decision process (MDP), which is solved optimally using dynamic programming [1]. 
Using the framework of Partially Observable Markov Decision Processes (POMDP), Marthi addresses path planning in environments in which obstacles appear/disappear dynamically over time [2]. If stochastic measurements are considered in addition to actions, the planning problem, also a POMDP, is intractable over most state spaces relevant to problems in robotics. As a result, a variety of algorithms make simplifying assumptions and find high-quality feasible paths that manage uncertainty. Sample-based motion planning is often utilized to generate a set of collision-free feasible paths from which a minimum-uncertainty path is selected. The belief roadmap (BRM) [3] builds a probabilistic roadmap (PRM) [4] in a robot's state space, propagates beliefs over the roadmap using an extended Kalman filter (EKF) [5], and plans a path of minimum goal-state uncertainty. This approach has been extended [6] to bias the PRM samples using a Sensory Uncertainty Field (SUF) [7], which expresses the spatial variation in sensor performance over the workspace. Rapidly-exploring random belief trees (RRBTs) [8] use the EKF to propagate belief states over a rapidly-exploring random graph (RRG) [9], to find asymptotically optimal paths that minimize goal-state uncertainty subject to chance constraints. Linear quadratic Gaussian motion planning (LQG-MP) [10] pairs an LQG controller-estimator duo with trajectories planned using rapidly-exploring random trees (RRTs) [11], seeking a path that minimizes the product of collision probabilities at all states. Instead of sample-based planning, continuous optimization is used by Platt et al. [12]; locally optimal paths are computed directly in belief space under the assumption that the maximum-likelihood measurement is always obtained, and LQG estimation and control are applied. Some algorithms assume uncertainty in the map used for navigation. Missiuro and Roy [13] employ PRM path planning over a roadmap in which the positions of obstacle vertices are uncertain, and Guibas et al. [14] extend this approach to 3D workspaces. This methodology is combined with assumptions of uncertain actions and measurements by Kurniawati et al. [15], who use a point-based POMDP planner to obtain an approximate minimum-cost solution, where the cost is a combination of movement and collision risk. A hierarchical approach is adopted by Vitus et al. [16], who manage all three sources of uncertainty by decomposing the workspace into a graph and optimizing over the graph in several steps. Wellman et al. [17] consider a setting in which the edge costs on the graph are uncertain, with potential probabilistic dependency between the costs, and provide a path planning algorithm that produces optimal paths under time-dependent uncertainty. Acar et al. [18] present an approach that uses geometric and topological features instead of sensor uncertainty models.

Our problem is related to that of planning over a stochastic map. Uncertainty in the arrangement of obstacles in a map may adversely impact a robot's navigation process, and similarly, the precision of a robot's state estimate will be hindered if its sensors do not detect a measurement. Our aim is to develop a principled method for planning under uncertainty when misdetection by the robot's sensors is a primary concern. In total, we consider three sources of uncertainty: process noise, sensor noise, and sensor intermittency, for which sensor misdetections occur with a known probability.
Like the BRM method of Prentice and Roy [3], an EKF is used to propagate belief states over a PRM, and a path of minimum goal-state uncertainty is selected. Performance guarantees are developed that bound the filter's performance under probabilistic misdetections by the sensors, and the likelihood of these misdetections is considered explicitly, along with process and sensor noise, in selecting a minimum-uncertainty path.

Regarding the estimation aspect of the problem, we demonstrate that choosing the expected maximum eigenvalue of the error covariance matrix as a metric allows us to compute a novel upper bound on its evolution, which we further extend to the case of stochastic misdetections by the robot's sensors. These bounds are distinct from existing results in the literature, surveyed in [19], which are mainly for the algebraic Riccati equation and therefore characterize the steady-state value of the expected error covariance rather than the instantaneous value that we are concerned with. Other metrics, such as the trace of the expected error covariance matrix, have commonly been considered in the past, cf. [3]. However, to the best of our knowledge, the trace does not offer a tractable means to bound its evolution over time, especially in the stochastic setting of sensor misdetections. Further, while it introduces conservativeness, in the sense that we consider the maximum mode of the uncertainty, propagating the maximum eigenvalue bound offers a computational advantage, since we only need to propagate a scalar quantity instead of the entire covariance matrix.

This paper is organized as follows. In Section II, we formulate the problem. An analytical bound on the performance of the state estimator with probabilistic sensor misdetections is derived in Section III. In Section IV, we describe in detail how the analytical bound is used for robust path planning. Computational results are reported in Section V. Finally, conclusions and future directions are discussed in Section VI.

II. PROBLEM FORMULATION

We consider a general model of an agent whose state evolves as per a non-linear discrete-time dynamical system

x(t + 1) = f(x(t), n(t)),   (1)

where x ∈ R^{n_x} is the state describing the system at time t, f : R^{n_x} × R^{n_n} → R^{n_x} describes the state transition map of the system, and n ∈ R^{n_n} is the process noise. The agent is equipped with m sensors in order to estimate the state x. The output of the j-th sensor is modeled as

y_j(t) = h_j(x(t), v_j(t)),   for all j ∈ {1, ..., m},   (2)

where v_j ∈ R^{n_j} is the measurement noise of the j-th sensor and h_j : R^{n_x} × R^{n_j} → R^{n_{y_j}} describes the relation between state and measurement. We assume that the noise vectors n and v_j are independently generated, zero-mean Gaussian random vectors.

In this paper, we consider situations where sensors can misdetect features and thus do not produce a measurement at certain time instants. Examples include RF-based signals from beacons that are not detected by the agent because of low SNR, misdetection of features using cameras because of abrupt changes in lighting conditions, no LIDAR returns from certain materials, etc.
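For illustration, the following Python sketch simulates one realization of the process (1) and the intermittent sensor outputs (2), where a Bernoulli indicator decides whether sensor j returns a measurement at each step. The unicycle dynamics, the range-beacon measurement function, the beacon positions, and all numerical values are illustrative assumptions, not the models used in the experiments of Section V.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x, n, dt=0.1, v=1.0, omega=0.1):
    """Illustrative unicycle process model x(t+1) = f(x(t), n(t))."""
    px, py, th = x
    return np.array([px + dt * v * np.cos(th),
                     py + dt * v * np.sin(th),
                     th + dt * omega]) + n

def h_range(x, v_j, beacon):
    """Illustrative range-beacon measurement y_j = h_j(x, v_j)."""
    return np.linalg.norm(x[:2] - beacon) + v_j

beacons = [np.array([2.0, 3.0]), np.array([8.0, 1.0])]   # assumed beacon positions
p_detect = [0.9, 0.4]             # assumed per-sensor detection probabilities p_j
Q = np.diag([1e-3, 1e-3, 1e-4])   # process noise covariance
R = [0.05, 0.05]                  # per-beacon range noise variances

x = np.array([0.0, 0.0, 0.3])
for t in range(5):
    x = f(x, rng.multivariate_normal(np.zeros(3), Q))
    for j, b in enumerate(beacons):
        gamma = rng.random() < p_detect[j]   # Bernoulli misdetection indicator
        if gamma:
            y = h_range(x, rng.normal(0.0, np.sqrt(R[j])), b)
            print(f"t={t} sensor {j}: y = {y:.3f}")
        else:
            print(f"t={t} sensor {j}: no measurement (misdetection)")
```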
Under these circumstances, and analogous to the Kalman filter case (see equations 185 and 186 in [20]), an extended Kalman filter (EKF) based estimator of the state x can be written in information form as

P_{t+1}^{-1} = (F_t P_t F_t' + Q_t)^{-1} + Σ_{j=1}^{m} γ_{j,t+1} H_j' R_{j,t+1}^{-1} H_j,   (3)

x̂_{t+1} = P_{t+1} [ (F_t P_t F_t' + Q_t)^{-1} f(x̂_t, 0) + Σ_{j=1}^{m} γ_{j,t+1} H_j' R_{j,t+1}^{-1} (y_j(t + 1) − h_j(f(x̂_t, 0), 0)) ],

where x̂_t is the state estimate, P_t is the expected error covariance with respect to the process and sensor noise terms, F_t is the linearization of f around (x̂_t, 0), and H_j is the linearization of h_j around (f(x̂_t, 0), 0). The matrix Q_t is the process noise covariance and R_{j,t+1} is the measurement noise covariance associated with the j-th sensor. The variables γ_{j,t+1} are binary (0–1) stochastic variables that model misdetection by the j-th sensor. A schematic implementation of this update is sketched at the end of this section.

We are interested in computing a belief roadmap that trades off accuracy and robustness when sensors can stochastically be in misdetection mode during the mission. We will call this the Robust Belief Roadmap, or RBRM. In this work, we make the following assumptions.

Assumption II.1 (Misdetection map). For each sensor j, we can characterize the misdetection probability (1 − p_j) at each location in the environment.

Assumption II.2 (Independence). We assume the indicators γ_j, namely the misdetections, to be independent over time and between sensors. Specifically, the γ_j are Bernoulli random variables with p_j being the probability that γ_j = 1.

For constant values of the p_j, Assumption II.2 is common in the literature on estimation with intermittent observations, e.g., see [21]. However, most results in this field are concerned with the stability of the estimation algorithms, while our focus in this paper is to characterize the evolution of the estimation performance. Further, in our set-up, the parameters p_j of the random variables γ_{j,t} in (3) are functions of the state x, whereas in the estimation literature this state dependence is not explicitly considered.

Assumption II.3 (Consistency). We assume that the state estimate x̂_t is identical to the nominal trajectory to be followed by the vehicle.

The last assumption implies that we have a reasonable nominal model for the motion of the vehicle, and that the sensors are accurate enough for the vehicle to move close to the nominal trajectory. This work is concerned with the level of confidence, measured through P_t, that we can obtain in our state estimate x̂_t. With these assumptions, we address the following problems.

Problem II.1. Given the sensor accuracies, their misdetection probabilities, and a nominal trajectory, determine a bound on the evolution of the expected value of M(P_t), where M is any function that captures the uncertainty through P_t. Here, the expectation is taken over the joint distribution of the random variables γ_{j,t} representing sensor misdetection.

Equipped with this theoretical bound, the second goal is to apply it to the following problem of path planning under uncertainty.

Problem II.2. Given a set of candidate trajectories from a start location to a goal location, develop an algorithm that propagates the bound on E[M(P_t)] over the PRM, to output a path having minimum goal-state uncertainty.

Remark II.1. The probability of misdetection of a sensor (Assumption II.1) is in general difficult to know precisely. However, thanks to increasingly realistic simulation environments capable of simulating both sensor responses and the environment, one can foresee the possibility of obtaining rather realistic models. For example, given a geometric model of the environment, edges and/or corners of buildings could be marked with different misdetection probabilities if information such as the type of material, the texture, or the time of day at which the mission is carried out is considered. Entire areas of the environment could be marked with misdetection probabilities, e.g., modeling the fact that an RF signal cannot easily be detected behind buildings; this could be obtained with ray-tracing based algorithms, see, e.g., Figure 1. The misdetection probability of some sensor could also be determined from historical data collected during missions, for example by correlating information about the environment, such as obstacle density or the time of day the mission was carried out, with the misdetection state of a sensor.

Fig. 1. Illustration of the model of the environment which could be used for planning. In this case, not only the geometric information about the obstacles and the position of beacons is considered, but also information about the accuracy certain sensors will have in detecting, e.g., edges of buildings and the signal from a radio tower (beacon). Note that in this scenario a rather elaborate model is used, where edges are categorized into three classes: detectable, detectable with 85% probability, and detectable with 60% probability. This difference in detection accuracy could be caused by the sun's position. The signal strength map from the radio tower could be determined using ray-tracing algorithms; in this example, just for illustration purposes, there are three regions with different detection probabilities.

Of course, it is impossible to capture all sources of uncertainty. However, if some of this information is available, the RBRM method can take it into consideration, trading off accuracy and robustness. When such models are only of a qualitative nature, the RBRM can still be used to assess the robustness of the solution. In practice, the misdetection probabilities can be used as user parameters: by varying them, the user can study how the RBRM changes under, for example, more pessimistic hypotheses on the behavior of certain sensors, or higher misdetection probabilities in more remote regions of the environment where more uncertainty is expected.
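As a companion to (3), the following Python sketch shows one EKF cycle in which a sensor whose misdetection indicator is zero simply contributes no information. It is written in the sequential covariance form, a standard variant of the information-form update in (3); the models, Jacobians, and detection probabilities are supplied by the caller, and the function name and interface are illustrative rather than the implementation used in Section V.

```python
import numpy as np

def ekf_step(x_hat, P, f, F, Q, sensors, rng):
    """One EKF cycle with intermittent measurements.

    `sensors` is a list of dicts with keys h, H, R, p (detection prob.), y.
    A sensor whose Bernoulli draw fails contributes no information,
    mirroring a zero gamma_{j,t+1} in (3)."""
    # prediction
    x_post = f(x_hat, np.zeros_like(x_hat))
    P_post = F @ P @ F.T + Q
    # sequential measurement updates, skipping misdetections
    for s in sensors:
        if rng.random() >= s["p"]:
            continue                            # gamma_j = 0: misdetection
        H = s["H"](x_post)
        S = H @ P_post @ H.T + s["R"]           # innovation covariance
        K = P_post @ H.T @ np.linalg.inv(S)     # Kalman gain
        x_post = x_post + K @ (s["y"] - s["h"](x_post))
        P_post = (np.eye(P.shape[0]) - K @ H) @ P_post
    return x_post, P_post
```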
III. ANALYTICAL BOUND ON PERFORMANCE

In the seminal work by Prentice and Roy [3], it was shown that if the covariance matrix is factorized as P_t = B_t C_t^{-1}, then the time evolution of the terms B_t, C_t is linear. This enabled the authors to develop and demonstrate a computationally efficient algorithm to compute a roadmap that captures the estimation accuracy. The function M used in [3] is the trace of the matrix. Leveraging the same factorization, one arrives at the following equation:

[ B_t ; C_t ] = [ F_t ,  Q_t (F_t^{-1})' ; M(γ_t) F_t ,  (F_t^{-1})' + M(γ_t) Q_t (F_t^{-1})' ] [ B_{t-1} ; C_{t-1} ],   (4)

where M(γ_t) = Σ_{j=1}^{m} γ_{j,t} H_j' R_{j,t}^{-1} H_j, which depends on the stochastic variables γ_{j,t}. This can be thought of as a transfer function that maps the matrices B_t and C_t from one node of the roadmap to the next [3]. In this stochastic setting, however, a direct application of the BRM proposed by Prentice and Roy in [3] requires extensive Monte Carlo simulations over all the variables γ_{j,t}. Even if the factorization provides a faster computation of the covariance, the method very quickly becomes intractable, especially if γ_{j,t} changes spatially; see, e.g., the signal received from a radio tower as in Figure 1.
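For concreteness, the following Python sketch performs one step of the factored propagation in (4) and checks it against the direct information-form covariance update; the system matrices are random illustrative values, and the initialization B_0 = P_0, C_0 = I is one common choice for the factorization.

```python
import numpy as np

def brm_factor_step(B, C, F, Q, M_gamma):
    """One step of the factored covariance propagation in (4).

    Returns (B_t, C_t) such that P_t = B_t @ inv(C_t)."""
    Finv_T = np.linalg.inv(F).T
    top = np.hstack([F, Q @ Finv_T])
    bottom = np.hstack([M_gamma @ F, Finv_T + M_gamma @ Q @ Finv_T])
    T = np.vstack([top, bottom])                  # transfer matrix in (4)
    n = B.shape[0]
    stacked = T @ np.vstack([B, C])
    return stacked[:n], stacked[n:]

# quick self-check against the direct Riccati/information update (illustrative numbers)
rng = np.random.default_rng(1)
n = 3
F = np.eye(n) + 0.05 * rng.standard_normal((n, n))
Q = 0.01 * np.eye(n)
H = rng.standard_normal((2, n))
R = 0.1 * np.eye(2)
M = H.T @ np.linalg.inv(R) @ H            # M(gamma_t) with this sensor detected
P0 = 0.2 * np.eye(n)

B, C = brm_factor_step(P0, np.eye(n), F, Q, M)
P_factored = B @ np.linalg.inv(C)
P_direct = np.linalg.inv(np.linalg.inv(F @ P0 @ F.T + Q) + M)
print(np.allclose(P_factored, P_direct))   # True: both give the same P_t
```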
A way to mitigate this intractability is to take the expectation with respect to the γ_{j,t}, namely to compute E[B_t] and E[C_t]. This would enable us to compute an expected transfer function between two nodes in the roadmap. However, note that E[P_t] = E[B_t C_t^{-1}] ≠ E[B_t](E[C_t])^{-1}, which prevents us from computing the expected state covariance at each node of the roadmap.

In order to both obtain a meaningful metric on the expected error covariance, one that captures the estimation accuracy when there are sensor misdetections, and retain computational tractability, we establish a bound on the largest eigenvalue of the covariance. Intuitively, we are approximating the uncertainty at each node of the roadmap with a ball whose radius is the largest eigenvalue of the covariance, and we determine a bound on that radius.

A. Bound with all sensors functioning

In order to derive a bound on the maximum eigenvalue of the expected error covariance, we first consider the case when all the sensors are in working condition at all times. This analysis is instrumental in deriving a bound for the case of sensor misdetection. Without loss of generality, and for the sake of simpler notation, we assume that there is only one sensor in working condition; the bound is easily generalized to multiple sensors. In the following, we denote by λ_min(A) and λ̄(A) the minimum and maximum eigenvalues of A, respectively, where A is a positive definite matrix. For the expected error covariance, the following result provides a recursion to compute an upper bound on λ̄(P_t).

Theorem III.1. At every time instant t,

λ̄(P_t) ≤ ( λ̄²(F_t) λ̄(P_{t−1}) + λ̄(Q_t) ) / ( λ_min(H_t' R_t^{-1} H_t)·( λ̄²(F_t) λ̄(P_{t−1}) + λ̄(Q_t) ) + 1 ),   (5)

where the Jacobians F_t, H_t are evaluated at f(x̂_{t−1}, 0).

The proof of Theorem III.1 is reported in Section A of the Appendix. This result is stated in the form of a recursion mainly because the terms λ̄²(F_t) and H_t' R_t^{-1} H_t are functions of the estimate x̂_{t−1}. If we can uniformly upper and lower bound them, respectively, e.g., in the linear time-invariant case, then we can state a uniform upper bound for λ̄(P_t) as a function of the initial value P_0, given by the next result. For this result, we introduce the following notation. Let X := {x̂_0, x̂_1, ..., x̂_T} denote the set of estimates at different times; under Assumption II.3, this set is identical to the nominal trajectory sampled at the corresponding times. Let

X_S := { x̂ ∈ X : λ_min( H(f(x̂, 0))' R_t^{-1} H(f(x̂, 0)) ) = 0 }

denote the subset of X at which the sensor provides no useful information; the filter runs open-loop at these locations. Then, the following result holds.

Theorem III.2. Suppose that the cardinality of X_S is κ. Then, under Assumption II.3, at the final estimation time instant T,

λ̄(P_T) ≤ b·Σ_{j=1}^{κ} a^{j−1} − a^{κ} ζ + a^{κ} · [ ((d − ζc)/(ζc + a))^{T−κ} · 1/(ζ + λ̄(P_0)) + (c/(ζc + a)) · (1 − ((d − ζc)/(ζc + a))^{T−κ}) / (1 − (d − ζc)/(ζc + a)) ]^{−1},

where

a := sup_{x̂ ∈ X} λ̄(F(x̂)),   b := sup_t λ̄(Q_t),   c := inf_{x̂ ∈ X∖X_S} λ_min( H(f(x̂, 0))' R_t^{-1} H(f(x̂, 0)) ),   d := bc/a + 1,   ζ := ( d − a + √((d − a)² + 4bc) ) / (2c).

The proof of this result is presented in Section A of the Appendix. To extend both of these results to m functioning sensors, we simply replace the term H_t' R_t^{-1} H_t by Σ_{j=1}^{m} H_{j,t}' R_{j,t}^{-1} H_{j,t}.
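The following Python sketch propagates the scalar recursion (5) alongside the exact covariance update for an illustrative LTI system and checks that the recursion dominates the true maximum eigenvalue at every step. Here λ̄(F) is interpreted as the spectral norm of F, which keeps the bound valid for a general (possibly non-symmetric) F; all numerical values are assumptions made for the example.

```python
import numpy as np

def eig_bound_step(ell_prev, F, Q, H, R):
    """One step of the scalar recursion (5) bounding lambda_max(P_t)."""
    num = np.linalg.norm(F, 2) ** 2 * ell_prev + np.max(np.linalg.eigvalsh(Q))
    c_t = np.min(np.linalg.eigvalsh(H.T @ np.linalg.inv(R) @ H))
    return num / (c_t * num + 1.0)

def riccati_step(P, F, Q, H, R):
    """Exact information-form covariance update, for comparison."""
    P_pred = F @ P @ F.T + Q
    return np.linalg.inv(np.linalg.inv(P_pred) + H.T @ np.linalg.inv(R) @ H)

# illustrative LTI system: the recursion should dominate lambda_max(P_t) at every step
F = np.array([[1.0, 0.1], [0.0, 1.0]])
Q = 0.01 * np.eye(2)
H = np.eye(2)
R = 0.5 * np.eye(2)
P = np.diag([0.3, 0.1])
ell = np.max(np.linalg.eigvalsh(P))
for t in range(10):
    P = riccati_step(P, F, Q, H, R)
    ell = eig_bound_step(ell, F, Q, H, R)
    assert np.max(np.linalg.eigvalsh(P)) <= ell + 1e-12
print("recursion (5) upper-bounds lambda_max(P_t) on this example")
```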
B. Bound under Stochastic Sensor Misdetections

In this section, we extend the analysis to the case of multiple sensors, added to improve robustness, which can nevertheless produce misdetections as per Assumptions II.1 and II.2. The metric we analyze is E[λ̄(P_t)], where the expectation is taken over the stochastic process of sensor misdetections. For brevity, let ℓ_t := λ̄(P_t). For k ∈ {1, ..., m}, define

c_{i_1,...,i_k} := a·λ_min( Σ_{j=1}^{k} H_{i_j,t}' R_{i_j,t}^{-1} H_{i_j,t} ),   d_{i_1,...,i_k} := b·c_{i_1,...,i_k}/a + 1,   (6)

where the tuple i_1, ..., i_k is a subset of {1, ..., m}, and a, b are as defined in Theorem III.2. Using this notation, we have the following recursion, which provides an upper bound on E[ℓ_t]; the bound is referred to hereafter as Ē[ℓ_t].

Theorem III.3 (Stochastic misdetections). Under Assumptions II.1 and II.2, at any given time instant t, the quantity Ē[ℓ_t] generated by the recursion

Ē[ℓ_t] = (a·Ē[ℓ_{t−1}] + b) · [ (1 − p_1)⋯(1 − p_m)
  + p_1(1 − p_2)⋯(1 − p_m) / (c_1·Ē[ℓ_{t−1}] + d_1) + ⋯ + (1 − p_1)(1 − p_2)⋯p_m / (c_m·Ē[ℓ_{t−1}] + d_m)
  + p_1 p_2(1 − p_3)⋯(1 − p_m) / (c_{12}·Ē[ℓ_{t−1}] + d_{12}) + ⋯ + (1 − p_1)⋯p_{m−1} p_m / (c_{m−1,m}·Ē[ℓ_{t−1}] + d_{m−1,m})
  + ⋯ + p_1⋯p_m / (c_{1,...,m}·Ē[ℓ_{t−1}] + d_{1,...,m}) ],

initialized with Ē[ℓ_0] = E[ℓ_0], is an upper bound on E[ℓ_t].

The proof of this result is presented in Section A of the Appendix.

This bound requires the enumeration of all 2^m combinations of sensors, and therefore its computational cost scales undesirably with m. One way to derive a more efficient bound is to obtain a uniform lower bound c̄ on each of the c's. In that case, the common denominator of the right-hand-side terms becomes c̄·Ē[ℓ_{t−1}] + d̄, and the recursion simplifies to

Ē[ℓ_t] = (a·Ē[ℓ_{t−1}] + b) · [ Π_{j=1}^{m}(1 − p_j) + (1 − Π_{j=1}^{m}(1 − p_j)) / (c̄·Ē[ℓ_{t−1}] + d̄) ].

This recursion can be evaluated along the same lines as the proof of Theorem III.2 to obtain Ē[ℓ_t] as a function of Ē[ℓ_0]. For certain types of sensor suites, one may also derive the following slightly more conservative, but computationally efficient, upper bound.

Corollary III.1 (Simplified bound). Under Assumptions II.1 and II.2, at any given time instant t, the quantity Ē[ℓ_t] generated by the recursion

Ē[ℓ_t] = (a·Ē[ℓ_{t−1}] + b) · [ Π_{j=1}^{m}(1 − p_j) + Σ_{j=1}^{m} p_j / (c_j·Ē[ℓ_{t−1}] + d_j) ]

is an upper bound on E[ℓ_t].

The proof is reported in Section A of the Appendix. The main advantage of this bound over the one in Theorem III.3 is its computational efficiency, since only m terms need to be evaluated. However, this bound requires at least one of the sensors to have a strictly positive c value, and may therefore become too conservative. We will use Theorem III.3 in our proposed RBRM approach.
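The following Python sketch evaluates one step of the Theorem III.3 recursion by enumerating the 2^m subsets of detected sensors, together with the m-term simplified recursion of Corollary III.1. The helper names thm33_step and corollary_step, the constants a and b, and the two-sensor information matrices are illustrative assumptions for this example.

```python
import itertools
import numpy as np

def thm33_step(E_prev, a, b, p, M):
    """One step of the Theorem III.3 recursion bounding E[lambda_max(P_t)].

    p[j]: detection probability of sensor j; M[j]: information matrix
    H_j' R_j^{-1} H_j of sensor j; a, b: constants from Theorem III.2."""
    m = len(p)
    total = 0.0
    for detected in itertools.product([0, 1], repeat=m):       # 2^m outcomes
        prob = np.prod([pj if g else 1.0 - pj for pj, g in zip(p, detected)])
        if not any(detected):
            total += prob                                       # all misdetect: open loop
            continue
        info = sum(M[j] for j in range(m) if detected[j])
        c_S = a * np.min(np.linalg.eigvalsh(info))              # c in (6)
        d_S = b * c_S / a + 1.0                                 # d in (6)
        total += prob / (c_S * E_prev + d_S)
    return (a * E_prev + b) * total

def corollary_step(E_prev, a, b, p, M):
    """Simplified Corollary III.1 recursion (m terms instead of 2^m)."""
    cs = [a * np.min(np.linalg.eigvalsh(Mj)) for Mj in M]
    ds = [b * c / a + 1.0 for c in cs]
    out = np.prod([1.0 - pj for pj in p])
    out += sum(pj / (c * E_prev + d) for pj, c, d in zip(p, cs, ds))
    return (a * E_prev + b) * out

# illustrative two-sensor example: each sensor observes one state component
a, b = 1.05, 0.01
M = [np.diag([2.0, 0.0]), np.diag([0.0, 1.0])]
p = [0.1, 0.9]
E_full, E_simple = 0.3, 0.3
for t in range(5):
    E_full = thm33_step(E_full, a, b, p, M)
    E_simple = corollary_step(E_simple, a, b, p, M)
print(f"Theorem III.3 bound: {E_full:.4f}   Corollary III.1 bound: {E_simple:.4f}")
```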
IV. APPLICATION TO PATH PLANNING MISSIONS

The upper bound Ē[ℓ_t] given in Theorem III.3 may be used to plan paths of minimum expected goal-state uncertainty in a manner similar to the belief roadmap algorithm [3]. We assume that a probabilistic roadmap with node set N and edge set E is provided as input, along with beliefs µ_0 and µ_goal defining a start state and goal state on the roadmap. We also assume that for every node n ∈ N, the triple n = {µ, Ē[ℓ], π} is stored, which contains the belief, the eigenvalue bound, and the path π (beginning at µ_0) associated with this node. We refer to individual members of the triple using the notation n[µ], n[Ē[ℓ]], and n[π]. Belief propagation and graph search proceed similarly to the standard BRM algorithm; Ē[ℓ_t] is propagated according to the recursion given in Theorem III.3, and is used in place of the nominal-trajectory expected error covariance matrix that is propagated in the standard BRM.

We assume the bound is used to compute a transfer function Ē[ℓ]_l = ξ(i, l, Ē[ℓ]_i), which takes as input the indices of an edge e_{i,l} in the roadmap and the eigenvalue bound associated with node n_i. In the context of the graph search, we treat Ē[ℓ] independently of time, and assume that n_i[Ē[ℓ]] represents the best-yet covariance eigenvalue bound identified at n_i. The search process is shown in Algorithm 1; a schematic implementation is sketched at the end of this section.

Algorithm 1 n_goal[π] = RBRM(µ_0, µ_goal, Ē[ℓ]_0, N, E)
  for e_{i,l} ∈ E do
    ξ(i, l, Ē[ℓ]_i) ← PropagateBound(e_{i,l})
  end for
  Q ← n_0 = {µ_0, Ē[ℓ]_0, ∅}
  while Q ≠ ∅ do
    n_i ← Pop(Q)
    for n_l ∈ e_{i,l} do
      if n_l ∉ n_i[π] then
        Ē[ℓ]_l ← ξ(i, l, n_i[Ē[ℓ]])
        if Ē[ℓ]_l < n_l[Ē[ℓ]] then
          n_l ← {n_l[µ], Ē[ℓ]_l, n_i[π] ∪ n_l}
          Q ← Push(Q, n_l)
        end if
      end if
    end for
  end while
  return n_goal[π]

The use of our proposed approach, i.e., propagation of Ē[ℓ_t], provides a significant computational advantage over existing methods such as [3]. If we were to use the factorization in (4), then we would have to: 1) compute 2^m realizations of the matrix B_t (one for each subset of misdetecting sensors), 2) compute the inverses of the 2^m realizations of the matrix C_t, 3) multiply each realization of B_t with the corresponding C_t^{-1}, and, finally, 4) sum up the 2^m terms to compute E[P_t] or its trace. Instead, our approach requires the computation of minimum eigenvalues of 2^m much smaller matrices, namely the sums of the terms H_j' R_{j,t}^{-1} H_j, for which efficient algorithms exist even for large sizes [22], together with step 4) above. This provides significant savings in high-dimensional state spaces.
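The following Python sketch mirrors Algorithm 1 using a min-priority queue as one possible realization of the Pop/Push queue (the pseudocode does not fix a queue discipline). The transfer function ξ is abstracted as a callable propagate_bound, which in practice would apply the Theorem III.3 recursion repeatedly along the edge; the toy roadmap and the affine stand-in for ξ are illustrative assumptions.

```python
import heapq

def rbrm_search(nodes, edges, start, goal, E0, propagate_bound):
    """Schematic version of Algorithm 1 with a min-priority queue.

    edges: dict node -> list of neighbor ids.
    propagate_bound(i, l, E_i): transfer function xi returning the eigenvalue
    bound at node l when node i is reached with bound E_i."""
    best = {n: float("inf") for n in nodes}     # best-yet bound at each node
    path = {n: None for n in nodes}
    best[start] = E0
    path[start] = [start]
    queue = [(E0, start)]
    while queue:
        E_i, i = heapq.heappop(queue)
        if E_i > best[i]:
            continue                            # stale queue entry
        for l in edges.get(i, []):
            if l in path[i]:
                continue                        # do not revisit nodes on the path
            E_l = propagate_bound(i, l, E_i)
            if E_l < best[l]:
                best[l] = E_l
                path[l] = path[i] + [l]
                heapq.heappush(queue, (E_l, l))
    return path[goal], best[goal]

# toy roadmap: two routes from 'a' to 'd'; the route with the lower bound wins
edges = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
xi = lambda i, l, E: 0.5 * E + (0.05 if (i, l) in {("a", "b"), ("b", "d")} else 0.2)
print(rbrm_search(edges.keys(), edges, "a", "d", 0.3, xi))
```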
V. COMPUTATIONAL RESULTS

We next plan paths of minimum uncertainty under process noise, sensor noise, and probabilistic misdetections for a planar Dubins vehicle [23] in an environment populated with obstacles. We assume the robot uses three sensors for navigation: ultra-wideband (UWB) range beacons, a laser rangefinder for measuring obstacle vertices, and odometry that is subject to drift over time. The beacons provide measurements throughout the workspace, but their noise properties are assumed to vary as a function of distance to the robot, according to

v_j(t) ∼ N(0, σ(d_j(t))²),   (7)
σ(d_j(t)) = α·d_j(t) + σ_0.   (8)

The noise associated with the range measurement of beacon j has a standard deviation that varies linearly with the Euclidean distance d_j(t) between the robot and beacon j: the standard deviation takes the value σ_0 at range zero and increases according to the coefficient α. (A more general model could also include a bias term, as described in [3]; this could easily be added in our framework.) For the laser rangefinder, we assume the measurement of range to an obstacle vertex is corrupted by Gaussian white noise with properties that do not vary spatially, and that the measured vertices are always correctly associated with a prior map. The maximum range of the laser is limited, however, and obstacles can only be detected in close proximity to the robot.

A start state and goal state are designated for the robot, and a PRM is used to identify feasible paths between the start and goal. To select the path of minimum goal-state uncertainty, two methodologies are compared: the original BRM algorithm, with no notion of sensor misdetections, and the proposed RBRM algorithm, which uses Ē[ℓ_t] as a cost metric instead of tr(P_t). For all path planning scenarios investigated, the standard BRM algorithm was found to choose the same path regardless of whether tr(P_t) or ℓ_t is used as the cost metric. Evaluating ℓ_t over the roadmap offers a more direct comparison with Ē[ℓ_t], and so both tr(P_t) and ℓ_t are computed for comparison with Ē[ℓ_t] in the results that follow.

Fig. 2. Planned paths in a workspace populated with obstacles (measured by laser) and UWB beacons. The robot receives the beacon measurements with probability 0.1, and extracts obstacle corners from laser data with probability 0.9. At top, a path planned using ℓ_t as a performance metric, neglecting all probabilistic sensor misdetections (goal-state eigenvalue bound with failures considered: 0.006). At bottom, a path planned using Ē[ℓ_t] as a performance metric, which considers the misdetection probability of each sensor (goal-state eigenvalue bound: 0.003). The UWB beacons are queried at every measurement iteration; the laser has a range of one unit and its planned measurements are rendered (green for a successful measurement and red for a misdetection) for a representative failure scenario. Ninety-five percent confidence covariance ellipses are plotted at regular intervals along each path.

Fig. 3. At top, the propagation in time of the eigenvalue performance metrics over the paths in Figure 2. At bottom, tr(P_t) is also given for both paths. All quantities except Ē[ℓ_t] represent the mean over one hundred Monte Carlo trials in which different sequences of sensor misdetections occur according to the prescribed probabilities.

The first scenario considered is illustrated in Figure 2, in which a robot must plan from start to goal in a workspace populated with three obstacles and four range beacons. Very simple collision-free paths are evident through the upper reaches of the workspace, but when the probability of UWB misdetection is high, it is advantageous for the robot to travel through the obstacles to collect laser measurements that reduce position uncertainty. For the specific case plotted in Figure 2, the beacons have a ten percent probability of delivering a successful measurement to the robot, and the laser (with a maximum range of one unit) has a ninety percent probability of successfully extracting an obstacle vertex and measuring its range to the robot. When intermittent sensing is neglected, the robot takes a short path to the goal that collects measurements from the UWB beacons only. When intermittent sensing is considered, the robot takes a detour through the obstacles to reduce the uncertainty of its state estimate. For both paths, one hundred Monte Carlo simulations were performed in which sensors fail according to the prescribed probabilities, and the resulting mean values of tr(P_t) and ℓ_t are compared with Ē[ℓ_t] in Figure 3.
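The following Python sketch combines the distance-dependent noise model (7)-(8) with a Monte Carlo simulation of Bernoulli misdetections for a scalar toy system with a single range beacon, and compares the resulting mean maximum eigenvalue with the single-sensor Theorem III.3 recursion. The constants, the fixed robot-beacon distance, and the scalar model are illustrative assumptions, not the Dubins-vehicle setup used to produce Figures 2 and 3.

```python
import numpy as np

rng = np.random.default_rng(2)

def beacon_sigma(dist, alpha=0.05, sigma0=0.1):
    """Distance-dependent range noise standard deviation, as in (7)-(8)."""
    return alpha * dist + sigma0

# scalar toy system: one beacon detected with probability p at fixed distance
a, b_proc = 1.0, 0.02           # lambda_bar(F)^2 and process noise variance
p = 0.3                          # detection probability
dist = 4.0                       # assumed fixed robot-beacon distance
r = beacon_sigma(dist) ** 2      # measurement noise variance from (8)
c = a * (1.0 / r)                # c in (6) for the single sensor (H = 1)
d_const = b_proc * c / a + 1.0   # d in (6)

P0, T, trials = 0.5, 40, 2000
# Monte Carlo of the true filter under Bernoulli misdetections
P = np.full(trials, P0)
for t in range(T):
    P = a * P + b_proc                                   # predict
    detected = rng.random(trials) < p
    P[detected] = 1.0 / (1.0 / P[detected] + 1.0 / r)    # update when detected
mc_mean = P.mean()

# Theorem III.3 recursion, single-sensor case
E = P0
for t in range(T):
    E = (a * E + b_proc) * ((1 - p) + p / (c * E + d_const))
print(f"Monte Carlo mean lambda_max: {mc_mean:.4f}   bound: {E:.4f}")
```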
This planning scenario is next considered over a range of different misdetection probabilities, for both the laser and the UWB beacons, and the results are summarized in Figure 4. The number of planned laser measurements in the minimum-uncertainty path, computed using Ē[ℓ_t], is given at top, and the value of Ē[ℓ_t] at each path's goal state is given at bottom.

Fig. 4. Characteristics of a planned path plotted as a function of sensor reliability, for the two-sensor example shown in Figure 2. At top, the covariance eigenvalue bound at the goal state is illustrated, and at bottom, the number of planned laser measurements along the selected path is illustrated. The upper left corner of each plot corresponds to the parametrization used in Figures 2 and 3.

The zero-range noise level σ_0² selected for the UWB beacons is an order of magnitude lower than the constant variance representing the laser noise, and so the UWB beacons are used exclusively for all scenarios in which they are more than fifty percent reliable, even if the laser is more reliable.

A final path planning test case with continuously varying sensor intermittency is considered in Figure 5. In a workspace populated with eight obstacles and no UWB beacons, we assume that a light source causes the expected sensing intermittency to vary continuously along the vertical axis of the workspace, with a high detection probability at the bottom and a low detection probability at the top. Neglecting sensor intermittency, the standard BRM algorithm plans a path through the upper region of the workspace; considering sensor intermittency, planning with Ē[ℓ_t] yields a path that collects many high-probability measurements from the lower region of the workspace to minimize uncertainty at the goal state. The candidate metrics, averaged over one hundred simulated cases of sensor intermittency, are given in Figure 6.

Fig. 5. Planned paths in a workspace over which the probability of a successful corner detection varies spatially along the vertical axis. Obstacles measured at the bottom have the highest probability of a successful measurement, and obstacles measured at the top have a near-zero probability of a successful measurement. At top, a path planned using ℓ_t as a performance metric, neglecting all probabilistic sensor misdetections (goal-state eigenvalue bound with failures considered: 0.012). At bottom, a path planned using Ē[ℓ_t] as a performance metric, which considers the misdetection probability of each sensor (goal-state eigenvalue bound: 0.002). The laser has a range of one unit and its planned measurements are rendered (green for a successful measurement and red for a misdetection) for a representative scenario. Ninety-five percent confidence covariance ellipses are plotted at regular intervals along each path.
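As a small illustration of the kind of sweep behind Figure 4, the following Python sketch varies the detection probabilities of a two-sensor scalar model and evaluates the goal-state value of the Theorem III.3 recursion for each combination. The function goal_bound, the horizon, and the constants are illustrative assumptions and are not the values used to generate Figure 4.

```python
import numpy as np

def goal_bound(p_uwb, p_laser, a=1.05, b=0.01, c_uwb=5.0, c_laser=0.5, T=50, E0=0.3):
    """Goal-state value of the Theorem III.3 bound for a two-sensor scalar model."""
    d = lambda c: b * c / a + 1.0     # d in (6) for a given c
    E = E0
    for _ in range(T):
        E = (a * E + b) * ((1 - p_uwb) * (1 - p_laser)
                           + p_uwb * (1 - p_laser) / (c_uwb * E + d(c_uwb))
                           + (1 - p_uwb) * p_laser / (c_laser * E + d(c_laser))
                           + p_uwb * p_laser / ((c_uwb + c_laser) * E + d(c_uwb + c_laser)))
    return E

# sweep detection probabilities on a coarse grid, analogous to Figure 4
for p_u in np.linspace(0.2, 1.0, 5):
    row = [goal_bound(p_u, p_l) for p_l in np.linspace(0.2, 1.0, 5)]
    print(" ".join(f"{v:.4f}" for v in row))
```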
VI. CONCLUSION AND FUTURE DIRECTIONS

In this paper we have described how to plan when the sensors used for state estimation are not only noisy but may also fail to produce measurements because of misdetections. Being able to trade off accuracy and robustness is very appealing, as autonomous vehicles rely heavily on complex sensors such as cameras and LIDAR whose ability to extract relevant information, such as features and point clouds, depends strongly on environmental conditions that can be predicted only to a certain extent, such as the effect of lighting conditions, the type of surfaces, etc. Even when such information is not fully available, the proposed methodology can be very beneficial for studying the robustness of a path to intermittent sensing, for example by testing the robust roadmap under various choices of intermittency probabilities at various locations in the map. Future directions include the incorporation of map uncertainty within the current framework and the use of the current analysis for multi-objective path planning.

Fig. 6. At top, the propagation in time of the eigenvalue performance metrics over the paths in Figure 5. At bottom, the trace of the expected error covariance matrix is given for both paths. All quantities except the eigenvalue bound represent the mean over one hundred Monte Carlo trials in which different sequences of sensor misdetections occur according to the prescribed probabilities.

REFERENCES

[1] R. Alterovitz, T. Simeon, and K. Goldberg, "The stochastic motion roadmap: A sampling framework for planning with Markov motion uncertainty," in Proceedings of the Robotics: Science and Systems Conference, Atlanta, GA, 2007.
[2] B. Marthi, "Robust navigation execution by planning in belief space," in Proceedings of the Robotics: Science and Systems Conference, Zurich, Switzerland, 2008.
[3] S. Prentice and N. Roy, "The belief roadmap: Efficient planning in belief space by factoring the covariance," International Journal of Robotics Research, vol. 28, no. 11-12, pp. 1448-1465, 2009.
[4] L. Kavraki, P. Svestka, J.-C. Latombe, and M. Overmars, "Probabilistic roadmaps for path planning in high-dimensional configuration spaces," IEEE Transactions on Robotics, vol. 12, no. 4, pp. 556-580, 1996.
[5] A. Gelb, Ed., Applied Optimal Estimation. MIT Press, 1974.
[6] R. He, S. Prentice, and N. Roy, "Planning in information space for a quadrotor helicopter in a GPS-denied environment," in Proceedings of the IEEE International Conference on Robotics and Automation, Pasadena, CA, 2008, pp. 814-820.
[7] H. Takeda, C. Facchinetti, and J.-C. Latombe, "Planning the motions of a mobile robot in a sensory uncertainty field," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, no. 10, pp. 1002-1017, 1994.
[8] A. Bry and N. Roy, "Rapidly-exploring random belief trees for motion planning under uncertainty," in Proceedings of the IEEE International Conference on Robotics and Automation, Shanghai, China, 2011.
[9] S. Karaman and E. Frazzoli, "Sampling-based algorithms for optimal motion planning," International Journal of Robotics Research, vol. 30, no. 7, pp. 846-894, 2011.
[10] J. van den Berg, P. Abbeel, and K.
Goldberg, “LQG- MP: Optimized path planning for robots with motion un- certainty and imperfect state information,” International Journal of Robotics Research, vol. 30, no. 7, pp. 895– 913, 2011. [11] S. LaValle and J. Kuffner, “Randomized kinodynamic planning,” International Journal of Robotics Research, vol. 20, no. 5, pp. 378–400, 2001. [12] R. P. Jr., R. Tedrake, L. Kaelbling, and T. Lozano-Perez, “Belief space planning assuming maximum likelihood observations,” in Proceedings of Robotics: Science and Systems Conference, Zaragoza, Spain, 2010. [13] P. Missiuro and N. Roy, “Adapting probabilistic roadmaps to handle uncertain maps,” in Proceedings of the IEEE International Conference on Robotics and Automation, Orlando, FL, 2006, pp. 1261–1267. [14] L. Guibas, D. Hsu, H. Kurniawati, and E. Rehman, “Bounded uncertainty roadmaps for path planning,” in Proceedings of the Workshop on the Algorithms Founda- tions of Robotics, Guanajuato, Mexico, 2008. [15] H. Kurniawati, T. Bandyopadhyay, and N. Patrikalakis, “Global motion planning under uncertain motion, sens- ing, and environment map,” in Proceeding of Robotics: Science and Systems Conference, Los Angeles, CA, 2011. [16] M. Vitus, W. Zhang, and C. Tomlin, “A hierarchical method for stochastic motion planning in uncertain envi- ronments,” in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Algarve, Portugal, 2012. [17] M. P. Wellman, M. Ford, and K. Larson, “Path planning under time-dependent uncertainty,” in Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelli- gence, Montreal, Canada, 1995. [18] E. U. Acar, H. Choset, Y. Zhang, and M. Schervish, “Path planning for robotic demining: Robust sensor-based coverage of unstructured environments and probabilistic methods,” International Journal of Robotics Research, vol. 22, no. 7-8, pp. 441–466, 2003. [19] W. H. Kwon, Y. S. Moon, and S. C. Ahn, “Bounds in algebraic Riccati and Lyapunov equations: A survey and some new results,” Int. Jour. of Control, vol. 64, no. 3, 1996. [20] H. Durrant-Whyte, “Multi sensor data fu- sion,” http://www.acfr.usyd.edu.au/pdfs/training/ multiSensorDataFusion/dataFusionNotes.pdf, 2001. [21] B. Sinopoli, L. Schenato, M. Franceschetti, K. Poolla, M. Jordan, and S. S. Sastry, “Kalman filtering with inter- mittent observations,” IEEE Transactions on Automatic Control, vol. 49, no. 9, pp. 1453–1464, 2004. [22] Y. Okamoto and H. J. Maris, “A novel algorithm for calculation of the extreme eigenvalues of large Hermitian matrices,” Computer Phy. Comm., vol. 76, no. 2, 1993. 9 [23] L. Dubins, “On curves of minimal length with a con- straint on average curvature, and with prescribed initial and terminal positions and tangents,” American Journal of Mathematics, vol. 79, no. 3, pp. 497–516, 1957. [24] R. A. Horn and C. R. Johnson, Matrix Analysis. Cam- bridge University Press, 1985. APPENDIX In this appendix, we have included the proofs of all technical results presented in this paper. To prove Theorem III.1, we will use the following set of matrix identities. Although these inequalities are fairly standard, we provide a proof for the sake of completeness. Lemma A.1 ([24]). Given n × n positive definite matrices X, Y, Z, λ(X + Y ) ≥λ(X) + λ(Y ) (9) ¯λ(X + Y ) ≤¯λ(X) + ¯λ(Y ) (10) ¯λ(XY ) ≤¯λ(X)¯λ(Y ) (11) where λ, ¯λ denote the minimum and the maximum eigenvalues of the matrix, respectively. Proof: We prove only the max. The min part follows analogously by reversing signs. 
To prove (10), recall that ¯λ(X + Y ) = max x̸=0 ∥(X + Y )x∥ ∥x∥ ≤max x̸=0 ∥Xx∥ ∥x∥+ max x̸=0 ∥Y x∥ ∥x∥ = ¯λ(X) + ¯λ(Y ), where we used the triangle inequality in the second step. (11) follows by using the same steps, except that we use the sub- multiplicativity of the norm to obtain the inequality. We are now ready to establish Theorem III.1. Proof of Theorem III.1: For the recursion, we use the following inverse covariance form [20] P−1 t = (Ft−1Pt−1F′ t−1 + Qt−1)−1 + H′ tR−1 t Ht, where F, H are Jacobians of f, h around the state prediction at time t −1. Using the bounds in Lemma A.1, we obtain ¯λ(Pt) = ¯λ h H′ tR−1 t Ht + (Ft−1Pt−1F′ t−1 + Qt−1)−1i−1 = 1 λ  H′ tR−1 t Ht + (Ft−1Pt−1F′ t−1 + Qt−1)−1  (9) ≤ 1 λ H′ tR−1 t Ht  + λ((Ft−1Pt−1F′ t−1 + Qt−1)−1) = 1 λ H′ tR−1 t Ht  + (¯λ(Ft−1Pt−1F′ t−1 + Qt−1))−1 (10),(11) ≤ ¯λ2(Ft−1)¯λ(Pt−1) + ¯λ(Qt−1) λ H′ tR−1 t Ht  (¯λ2(Ft−1)¯λ(Pt−1) + ¯λ(Qt−1)) + 1. To prove Theorem III.2, we require the following interme- diate result, which establishes a bound for the value of a scalar variable l which may evolve as per one out of two equations at any given time. Lemma A.2. Suppose that in the time interval [0, 1, . . . , T], a scalar variable ℓevolves as per ℓt+1 =    aℓt + b cℓt + d, for some T −κ instants, alt + b, for the remaining κ instants, where a, b, c, d are some finite positive scalars, then ℓT ≤b κ X j=1 aj−1 −ζaκ + aκ.   d −ζc ζc + a T −κ 1 ζ + l0 + c ζc + a  1 −(d−ζc)T −κ (ζc+a)T −κ 1 −(d−ζc) (ζc+a)     where ζ is defined in Theorem III.1. Proof: Observing that for the above evolution of l, al + b cl + d + b ≥a(al + b) + b cl + d , which means that for any sequence of κ occurrences of the second equation, one can always upper bound the resulting l trajectory by considering all the occurrences of evolution by the first equation, followed by the second. The evolution given by the first equation in the time interval [0, T −κ] can be simplified as follows. Set µt := 1 ζ + ℓt Since bc > 0, we get µt ≥d −ζc ζc + aµt−1 + c ζc + a ⇒µt ≥ d −ζc ζc + a k µ0 + c ζc + a 1 −(d −ζc)k/(ζc + a)k 1 −(d −ζc)/(ζc + a)  , Therefore, ℓT −κ ≤1 .   d −ζc ζc + a T −κ 1 ζ + l0 + c ζc + a  1 −(d−ζc)T −κ (ζc+a)T −κ 1 −(d−ζc) (ζc+a)    −ζ. The claim now follows since lT can be at most the above right hand side subject to the second, linear evolution for κ time steps. We can now prove Theorem III.2. Proof of Theorem III.2: Consider the recursion from Theorem III.1. Substituting zt := ¯λ(Pt), we obtain the following linear rational recurrence, zt ≤azt−1 + b czt−1 + d (12) 10 where we used the definition of a, b, c and d. Now, whenever ˆx ∈X \ XS, z will evolve as per (12). Otherwise, z evolves as per zt ≤azt−1 + b, which happens at most κ times as per the assumption. There- fore, applying Lemma A.2, the claim is established. Proof of Theorem III.3: Conditioning on the value of ℓt−1, we can write the following equality upon enumerating all possibilities of the sensors misdetecting. E[ℓt|ℓt−1] = (aℓt−1 + b)  (1 −p1) . . . (1 −pm) + p1(1 −p2) . . . (1 −pm) c1ℓt−1 + d1 + · · · + (1 −p1)(1 −p2) . . . pm cmℓt−1 + dm + p1p2 . . . (1 −pm) c12ℓt−1 + d12 + · · · + (1 −p1) . . . pm−1pm cm−1,mℓt−1 + dm−1,m ... + p1 . . . pm c1,...,mℓt−1 + d1,...,m  . Now, the function g(x) := (ax + b)/(cx + d), is concave in x ∈R+ for any positive c, d such that ad > bc (This can be checked by computing ∂2g/∂x2). Therefore, from (6), each of the summing terms on the right hand side of the above equality are concave in the argument ℓt−1. 
Unconditioning on the random variable ℓt−1, and applying Jensen’s inequality2, we obtain E[ℓt] ≤(aE[ℓt−1] + b)  (1 −p1) . . . (1 −pm) + p1(1 −p2) . . . (1 −pm) c1E[ℓt−1] + d1 + · · · + (1 −p1)(1 −p2) . . . pm cmE[ℓt−1] + dm + p1p2 . . . (1 −pm) c12E[ℓt−1] + d12 + · · · + (1 −p1) . . . pm−1pm cm−1,mE[ℓt−1] + dm−1,m ... + p1 . . . pm c1,...,mE[ℓt−1] + d1,...,m  . (13) The final step now is to apply mathematical induction on t. Clearly the claim holds for t = 1, since E[ℓ0] = E[ℓ0]. Now assume that the claim holds for t −1, i.e., E[ℓt−1] is an upper bound on E[ℓt−1] obtained as per the equality in Theorem III.3. At time t, we can write inequality (13). Now, the function g(x) := (ax + b)/(cx + d), is monotonically increasing with x ∈R+ for any positive c, d such that ad > bc (This can be checked by verifying that ∂g/∂x > 0). Therefore, each of the summing terms on the right hand side of (13) are monotonically increasing functions of their argument, i.e., E[ℓt−1]. Therefore, substituting E[ℓt−1] in place of E[ℓt−1], we obtain an upper bound on the right hand side, which is precisely the definition of E[ℓt]. This completes the proof. 2Jensen’s inequality: For a random variable X with finite expectation, and a concave, real-valued function φ(X), E[φ(X)] ≤φ(E[X]). Proof of Corollary III.1: We begin with the equality E[ℓt+1|lt] = (aℓt + b)(1 −p1) . . . (1 −pm) + p1(1 −p2) . . . (1 −pm) c1ℓt + d1 + · · · + (1 −p1)(1 −p2) . . . pm cmℓt + dm + p1p2 . . . (1 −pm) c12lt + d12 + · · · + (1 −p1) . . . pm−1pm cm−1,mℓt + dm−1,m ... + p1 . . . pm c1,...,mℓt + d1,...,m . We first collect all the terms that contain the term p1 in their numerator. For all of these terms, observe that the denominator terms can be lower bounded by c1ℓt +d1, since each of the c’s are at least equal to c1 and likewise for the d’s. The numerator of those terms is each less than or equal to p1. Thus, we have E[ℓt+1|ℓt] ≤(aℓt + b)(1 −p1) . . . (1 −pm) + p1 c1ℓt + d1 + (1 −p1)p2 . . . (1 −pm) c2ℓt + d1 + · · · + (1 −p1)(1 −p2) . . . pm cmlt + dm + (1 −p1)p2p3 . . . (1 −pm) c23ℓt + d23 + · · · + (1 −p1) . . . pm−1pm cm−1,mℓt + dm−1,m ... Now, we repeat this procedure successively from the indices 2 through m. Now, the function g(x) := (ax + b)/(cx + d), is concave in x ∈R+ for any positive c, d such that ad > bc (This can be checked by computing ∂2g/∂x2). Therefore, from (6), each of the summing terms on the right hand side of the above equality are concave in the argument ℓt−1. Unconditioning on the random variable ℓt−1, and applying Jensen’s inequality for each of the summing terms, we obtain E[ℓt] ≤(aE[ℓt−1] + b)×   m Y j=1 (1 −pj) + m X j=1 pj cjE[ℓt−1] + dj  . (14) The final step now is to apply mathematical induction on t. Clearly the claim holds for t = 1, since E[ℓ0] = E[ℓ0]. Now assume that the claim holds for t −1, i.e., E[ℓt−1] is an upper bound on E[ℓt−1] obtained as per the statement of Corollary III.1. At time t, we can write inequality (14). Now, the function g(x) := (ax + b)/(cx + d), is monotonically increasing with x ∈R+ for any positive c, d such that ad > bc (This can be checked by verifying that ∂g/∂x > 0). Therefore, each of the summing terms on the right hand side of (13) are monotonically increasing functions of their argument, i.e., E[ℓt−1]. Therefore, substituting E[ℓt−1] in place of E[ℓt−1], we obtain an upper bound on the right hand side, which is precisely the definition of E[ℓt]. This completes the proof.
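As an informal numerical check of Lemma A.2 (and hence of the closed form in Theorem III.2), the following Python sketch iterates the two-mode scalar recursion with the worst-case ordering identified in the proof, all open-loop steps last, and compares the result against the closed-form expression; it also shows that an alternative ordering stays below that value. The constants a, b, c, the horizon, and the initial value are arbitrary illustrative choices.

```python
import numpy as np

a, b, c = 1.1, 0.05, 2.0
d = b * c / a + 1.0
zeta = (d - a + np.sqrt((d - a) ** 2 + 4 * b * c)) / (2 * c)

T, kappa, ell0 = 30, 5, 0.4

# worst-case ordering: T - kappa rational steps followed by kappa open-loop steps
ell = ell0
for _ in range(T - kappa):
    ell = (a * ell + b) / (c * ell + d)
for _ in range(kappa):
    ell = a * ell + b
iterated = ell

# closed-form bound from Lemma A.2 / Theorem III.2
rho = (d - zeta * c) / (zeta * c + a)
bracket = (rho ** (T - kappa) / (zeta + ell0)
           + (c / (zeta * c + a)) * (1 - rho ** (T - kappa)) / (1 - rho))
closed_form = (b * sum(a ** (j - 1) for j in range(1, kappa + 1))
               - a ** kappa * zeta + a ** kappa / bracket)

# an alternative ordering (open-loop steps first) should stay below the closed form
ell_alt = ell0
for _ in range(kappa):
    ell_alt = a * ell_alt + b
for _ in range(T - kappa):
    ell_alt = (a * ell_alt + b) / (c * ell_alt + d)

print(f"worst-case iteration:   {iterated:.6f}")
print(f"closed-form bound:      {closed_form:.6f}")   # matches the worst case (up to roundoff)
print(f"alternative ordering:   {ell_alt:.6f}")       # smaller, as the lemma predicts
```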