Completeness of Randomized Kinodynamic Planners with State-based Steering St ́ ephane Caron a,c , Quang-Cuong Pham b , Yoshihiko Nakamura a a Department of Mechano-Informatics, The University of Tokyo, Japan. b School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore. c Corresponding author: stephane.caron@normalesup.org Abstract Probabilistic completeness is an important property in motion planning. Although it has been established with clear assumptions for geometric planners, the panorama of completeness results for kinodynamic planners is still incomplete, as most exist- ing proofs rely on strong assumptions that are difficult, if not impossible, to verify on practical systems. In this paper, we focus on an important class of kinodynamic planners, namely those that interpolate trajectories in the state space. We provide a proof of probabilistic completeness for these planners under assumptions that can be readily verified from the system’s equations of motion and the user-defined interpolation function. Our proof relies crucially on a property of interpolated tra- jectories, termed second-order continuity (SOC), which we show is tightly related to the ability of a planner to benefit from denser sampling. We analyze the impact of this property in simulations on a low-torque pendulum. Our results show that a simple RRT using a second-order continuous interpolation swiftly finds solution, while it is impossible for the same planner using standard Bezier curves (which are not SOC) to find any solution. 1 Keywords: kinodynamic planning, probabilistic completeness 1. Introduction A deterministic motion planner is said to be complete if it returns a solution whenever one exists [2]. A randomized planner is said to be probabilistically com- plete if the probability of returning a solution, when there is one, tends to one as execution time goes to infinity [3]. Theoretical as they may seem, these two notions 1 This paper is a revised and expanded version of [1], which was presented at the International Conference on Robotics and Automation , 2014. Preprint submitted to Robotics and Autonomous Systems November 23, 2015 arXiv:1511.05259v2 [cs.RO] 20 Nov 2015 are of notable practical interest, as proving completeness requires one to formalize the problem by hypotheses on the robot, the environment, etc. While experiments can show that a planner works for a given robot, in a given environment, for a given query, etc., a proof of completeness is a certificate that the planner works for a pre- cise set of problems. The size of this set depends on how strong the assumptions required to make the proof are: the weaker the assumptions, the larger the set of solvable problems. Probabilistic completeness has been established for systems with geometric constraints [4, 3] such as e.g. obstacle avoidance [5]. However, proofs for systems with kinodynamic constraints [6, 7, 8] have yet to reach the same level of generality. Proofs available in the literature often rely on strong assumptions that are difficult to verify on practical systems (as a matter of fact, none of the previously mentioned works verified their hypotheses on non-trivial systems). In this paper, we establish probabilistic completeness (Section 3) for a large class of kinodynamic planners, namely those that interpolate trajectories in the state space. Unlike previous works, our assumptions can be readily verified from the system’s equations of motion and the user-defined interpolation function. The most important of these properties is second-order continuity (SOC), which states that the interpolation function varies smoothly and locally between states that are close. We evaluate the impact of this property in simulations (Section 4) on a low-torque pendulum. Experiments validate our completeness theorem, and suggest that SOC is an important design guideline for kinodynamic planners that interpolate in the state space. 2. Background 2.1. Kinodynamic Constraints Motion planning was first concerned only with geometric constraints such as obstacle avoidance or those imposed by the kinematic structures of manipula- tors [9, 4, 8, 6]. More recently, kinodynamic constraints, which stem from differ- ential equations of dynamic systems, have also been taken into account [10, 6, 5]. Kinodynamic constraints are more difficult to deal with than geometric con- straints because they cannot in general be expressed using only configuration-space variables – such as the joint angles of a manipulator, the position and the orienta- tion of a mobile robot, etc. Rather, they involve higher-order derivatives such as velocities and accelerations. There are two types of kinodynamic constraints: Non-holonomic constraints: non-integrable equality constraints on higher-order derivatives, such as found in wheeled vehicles [11], under-actuated manipu- lators [12] or space robots. 2 Hard bounds: inequality constraints on higher-order derivatives such as torque bounds for manipulators [13], support areas [14] or wrench cones for hu- manoid stability [15], etc. Some authors have considered systems that are subject to both types of constraints, such as under-actuated manipulators with torque bounds [12]. 2.2. Randomized Planners Randomized planners such as such as Probabilistic Roadmaps (PRM) [4] or Rapidly-exploring Random Trees (RRT) [6] build a roadmap on the state space. Both rely on repeated random sampling of the free state space, i.e. states with non- colliding configurations and velocities within the system bounds. New states are connected to the roadmap using a steering function, which is a method used to drive the system from an initial to a goal configuration. The steering method may be imperfect, e.g. it may not reach the goal exactly, not take environment collisions into account, only apply to states that are sufficiently close, etc. The objective of the motion planner is to overcome these limitations, turning a local steering function into a global planning method. PRM builds a roadmap that is later used to generate motions between many initial and final states (many-to-many queries). When new samples are drawn, they are connected to all neighboring states in the roadmap using the steering function, resulting in a connected graph. Meanwhile, RRT focuses on driving the system from one initial state x init towards a goal area (one-to-one queries). It grows a tree by connecting new samples to one neighboring state, usually their closest neighbor. Both PRM’s and RRT’s extension step are represented by Algorithm 1, which relies on the following sub-routines (see Fig. 1 for an illustration): • SAMPLE ( S ) : randomly sample an element from a set S ; • PARENTS ( x, V ) : return a set of states in the roadmap V from which steer- ing towards x will be attempted; • STEER ( x, x ′ ) : generate a system trajectory from x towards x ′ . If success- ful, return a new node x steer ready to be added to the roadmap. Depending on the planner, the successfulness criterion may be “reach x ′ exactly” or “reach a vicinity of x ′ ”. The design of each sub-routine greatly impacts the quality and even the com- pleteness of the resulting planner. In the literature, SAMPLE ( S ) is usually im- plemented as uniform random sampling over S , but some authors have suggested adaptive sampling as a way to improve planner performance [16]. In geometric 3 Algorithm 1 Extension step of randomized planners (PRM or RRT) Require: initial node x init , number of iterations N 1: ( V, E ) ← ( { x init } , ∅ ) 2: for N steps do 3: x rand ← SAMPLE ( X free ) 4: X parents ← PARENTS ( x rand , V ) 5: for x parent in X parents do 6: x steer ← STEER ( x parent , x rand ) 7: if x steer is a valid state then 8: V ← V ∪ { x steer } 9: E ← E ∪ { ( x parent , x steer ) } 10: end if 11: end for 12: end for 13: return ( V, E ) planners, PARENTS ( q, V ) is usually implemented from the Euclidean norm over C as PARENTS ( q, V ) := arg min q ′ ∈ V ‖ q ′ − q ‖ . This choice results in the so-called Voronoi bias of RRTs [6]. Both experiments and theoretical analysis support this choice for geometric planning, however it becomes inefficient for kinodynamic planning, as was showed by Shkolnik et al. [17] on systems as simple as the torque-limited pendulum. 2.3. Steering Methods This paper focuses on steering functions. These can be classified into three categories: analytical, state-based and control-based steering. Analytical steering. This category corresponds to the ideal case when one can compute analytical trajectories respecting the system’s differential constraints, which are usually called (perfect) steering functions in the literature [6, 18]. Unfortu- nately, it only applies to a handful of systems. Reeds and Shepp curves for cars are a notorious example of this [11]. Control-based steering. Generate a control u : [0 , ∆ t ] → U adm , where U adm de- notes the set of admissible controls, and compute the corresponding trajectory by forward dynamics . This approach has been called incremental simulation [19], control application [6] or control-space sampling [18] in the literature. It is widely 4 x' =SAMPLE() ROOT P 1 STEER( P , x' ) 1 P 2 STEER( P , x' ) 2 P 3 STEER( P , x' ) 3 Figure 1: Illustration of the extension routine of randomized planners. To grow the roadmap toward the sample x ′ , the planner selects a number of parents PARENTS ( x ′ ) = { P 1 , P 2 , P 3 } from which it applies the STEER ( P i , x ′ ) method. applicable, as it only requires forward-dynamic calculations, but usually results in weak steering functions as the user has no or limited control over the destination state. In works such as [6, 5], random functions u are sampled from a family of primitives ( e.g. piecewise-constant functions), a number of them are tried and only the one bringing the system closest to the target is retained. Linear-Quadratic Reg- ulation (LQR) [20, 21] also qualifies as control-based steering: in this case, u is computed as the optimal policy for a linear approximation of the system given a quadratic cost function. State-based steering. Interpolate a trajectory γ int : [0 , ∆ t ] → C , for instance a Bezier curve matching the initial and target configurations and velocities, and com- pute a control that makes the system track that trajectory. For fully-actuated sys- tem, this is typically done using inverse dynamics . An interpolated trajectory is rejected if no suitable control can be found. Compared to control-based steering, this approach applies to a more limited range of systems, but delivers more control over the destination state. Algorithm 2 gives the prototype of state-based steering functions. 2.4. Previous works Randomized planners such as RRT and PRM are both simple to implement 2 yet efficient for geometric planning. The completeness of these planners has been es- tablished for geometric planning in [6, 7, 8]. In their proof, Hsu et al. [8] quantified 2 For instance, the RRT used in the simulations of this paper was implemented in less than a hunder lines of Python code. 5 Algorithm 2 Prototype of state-based steering functions STEER ( x, x ′ ) 1: γ int ← INTERPOLATE ( x, x ′ ) 2: u int := INVERSE DYNAMICS ( γ int ( t ) , ̇ γ int ( t ) , ̈ γ int ( t )) 3: if ∀ t ∈ [0 , ∆ t ] , u int ( t ) ⊂ U adm then 4: return the last state of γ int 5: end if 6: return failure the problem of narrow passages in configuration space with the notion of ( α, β ) - expansiveness. The two constants α and β express a geometric lower bound on the rate of expansion of reachability areas. There is, however, a gap between geometric and kinodynamic planning [10] in terms of proving probabilistic completeness. When Hsu et al. extended their solution to kinodynamic planning [5], they applied the same notion of expansive- ness, but this time in the X × T (state and time) space with control-based steering. Their proof states that, when α > 0 and β > 0 , their planner is probabilistically complete. However, whether α > 0 or α = 0 in the non-geometric space X × T remains an open question. As a matter of fact, the problem of evaluating ( α, β ) has been deemed as difficult as the initial planning problem [8]. In a parallel line of work, LaValle et al. [6] provided a completeness argument for kinodynamic planning, based on the hypothesis of an attraction sequence , i.e. a covering of the state space where two major problems of kinodynamic planning are already solved: steering and antecedent selection. Unfortunately, the existence of such a sequence was not established. In the two previous examples, completeness is established under assumptions whose verification is at least as difficult as the motion planning problem itself. Ar- guably, too much of the complexity of kinodynamic planning has been abstracted into hypotheses, and these results are not strong enough to hold the claim that their planners are probabilistically complete in general. This was exemplified re- cently when Kunz and Stilman [22] showed that RRTs with control-based steering were not probabilistically complete for a family of control inputs (namely, those with fixed time step and best-input extension). At the same time, Papadopoulos et al. [18] established probabilistic completeness for the same planner using a differ- ent family of control inputs (randomly sampled piecewise-constant functions). The picture of completeness for kinodynamic planners therefore seems to be a nuanced one. Karaman et al. [7] introduced the RRT* path planner an extended it to kinody- namic planning with differential constraints in [23], providing a sketch of proof for the completeness of their solution. However, they assumed that their planner had 6 access to the optimal cost metric and optimal local steering, which restricts their analysis to systems for which these ideal solutions are known. The same authors tackled the problem from a slightly different perspective in [24] where they sup- posed that the PARENTS function had access to w -weighted boxes, an abstraction of the system’s local controlability. However, they did not show how these boxes can be computed in practice 3 and did not prove their theorem, arguing that the reasoning was similar to the one in [7] for kinematic systems. To the best of our knowledge, the present paper is the first to provide a proof of probabilistic completeness for kinodynamic planners using state-based steering. 2.5. Terminology A function is smooth when all its derivatives exist and are continuous. Let ‖ · ‖ denote the Euclidean norm. A function f : A → B between metric spaces is Lipschitz when there exists a constant K f such that ∀ ( x, y ) ∈ A, ‖ f ( x ) − f ( y ) ‖ ≤ K f ‖ x − y ‖ . The (smallest) constant K f is called the Lipschitz constant of the function f . Let C denote n -dimensional configuration space, where n is the number of degrees of freedom of the robot. The state space X is the 2 n -dimensional manifold of configuration and velocity coordinates x = ( q, ̇ q ) . A trajectory is a continuous function γ : [0 , ∆ t ] → C , and the distance of a state x ∈ X to a trajectory γ is dist γ ( x ) := min t ∈ [0 , ∆ t ] ‖ ( γ, ̇ γ )( t ) − x ‖ . A kinodynamic system can be written as a time-invariant differential system: ̇ x ( t ) = f ( x ( t ) , u ( t )) , (1) where u ∈ U denotes the control input and x ( t ) ∈ X . Let U adm ⊂ U denote the subset of admissible controls. (For instance, U adm = [ τ min , τ max ] ⊂ U = R represents bounded torques for a single joint.) A control function u : [0 , ∆ t ] → U has δ -clearance when its image is in the δ -interior of U adm , i.e. for any time t , B ( u ( t ) , δ ) ⊂ U adm . A trajectory γ that is solution to the differential system (1) us- ing only controls u ( t ) ∈ U adm is called an admissible trajectory. The kinodynamic motion planning problem is to find an admissible trajectory from q init to q goal . 3 The definition of w -weighted boxes is quite involved: it depends on the joint flow of vector fields spanning the tangent space of the system’s manifold. 7 3. Completeness Theorem 3.1. System assumptions Our model for an X -state randomized planner is given by Algorithm 1 using state-based steering. We first assume that: Assumption 1. The system is fully actuated. Full actuation allows us to write the equations of motion of the system in gen- eralized coordinates as: M ( q ) ̈ q + C ( q, ̇ q ) ̇ q + g ( q ) = u, (2) where u ∈ U adm and we assume that the set of admissible controls U adm is com- pact. Since torque constraints are our main concern, we will focus on U adm := { u ∈ U , | u | ≤ τ max } , (3) which is indeed compact. 4 (Vector comparisons are component-wise.) Finally, we suppose that forward and inverse dynamics mappings have Lipschitz smoothness: Assumption 2. The forward dynamics function f is Lipschitz continuous in both of its arguments, and its inverse f − 1 (the inverse dynamics function u = f − 1 ( x, ̇ x ) ) is Lipschitz in both of its arguments. These two assumptions are satisfied when f is given by (2) as long as the matrices M ( q ) and C ( q, ̇ q ) are bounded and the gravity term g ( q ) is Lipschitz. Indeed, for a small displacement between x and x ′ , ∥ ∥ u ′ − u ∥ ∥ ≤ ‖ M ‖ ∥ ∥ ̈ q ′ − ̈ q ∥ ∥ + ‖ C ( q, ̇ q ) ‖ ∥ ∥ ̇ q ′ − ̇ q ∥ ∥ + K g ∥ ∥ q ′ − q ∥ ∥ (4) Let us illustrate this on the double pendulum depicted in Figure 2. When both links have mass m and length l , the gravity term g ( θ 1 , θ 2 ) = mgl 2 [sin θ 1 + sin( θ 1 + θ 2 ) sin( θ 1 + θ 2 )] is Lipschitz with constant K g = 2 mgl , while the inertial term is bounded by ‖ M ‖ ≤ 3 ml 2 . When joint angular velocities are bounded by ω , the norm of the Coriolis tensor is bounded by 2 ωml 2 . Using (4), one can therefore derive the Lipschitz constant K f − 1 of the inverse dynamics function. 4 The application of our proof of completeness to an arbitrary compact set presents no technical difficulty: one can just replace | u | ≤ τ max with d ( u, ∂ U adm ) , with ∂ U adm the boundary of U adm . Using Equation (3) avoids this level of verbosity. 8 (A) g 2 1 (B) Figure 2: Single (A) and double (B) pendulums. Under torque bounds, these systems must swing back and forth several times before they can reach for the upright position, as depicted in (B) (lighter images represent earlier times). 3.2. Interpolation assumptions We also require smoothness for the interpolated trajectories: Assumption 3. Interpolated trajectories γ int are smooth Lipschitz functions, and their time-derivatives ̇ γ int ( i.e. interpolated velocities) are also Lipschitz. The following two assumptions ensure a continuous behaviour of the interpo- lation procedure: Assumption 4 (Local boundedness) . Interpolated trajectories stay within a neigh- borhood of their start and end states, i.e. there exists a constant η such that, for any ( x, x ′ ) ∈ X 2 , the interpolated trajectory γ int : [0 , ∆ t ] → C resulting from INTERPOLATE ( x, x ′ ) is included in a ball of center x and radius η ‖ x ′ − x ‖ . Assumption 5 (Discrete-acceleration convergence) . When start and end states be- come close, accelerations of interpolated trajectories uniformly converge to the discrete acceleration between them, i.e. there exists some ν > 0 such that, if γ int : [0 , ∆ t ] → C results from INTERPOLATE ( x, x ′ ) , then ∀ τ ∈ [0 , ∆ t ] , ∥ ∥ ∥ ∥ ̈ γ int ( τ ) − ∆ ̇ q ∆ t disc ∥ ∥ ∥ ∥ ≤ ν ‖ ∆ x ‖ , where ∆ t disc := ‖ ∆ q ‖ / ‖ ̇ q ‖ . 9 Note that the expression ∆ ̇ q ∆ t disc above represents the discrete acceleration between x and x ′ . Its continuous analog would be ‖ ̇ q ‖ d ̇ q ‖ d q ‖ = ‖ ̇ q ‖ d ̇ q ‖ ̇ q ‖ d t = d ̇ q d t . These three assumptions ensure that the planner interpolates trajectories lo- cally and “continuously” when x and x ′ are close. We will call them altogether second-order continuity , where “second-order” refers to the discrete acceleration encoded in small variations (∆ q, ∆ ̇ q ) . This continuous behavior plays a key role in our proof of completeness, as it ensures that denser sampling will allow finding arbitrarily narrow state-space passages. Let us consider again the example the double pendulum, for the interpolation function γ = INTERPOLATE ( x, x ′ ) given by γ : [0 , ∆ t ] → C t 7 → ∆ ̇ q 2∆ t t 2 + ( ∆ q ∆ t − ∆ ̇ q 2 ) t + q. (5) The duration ∆ t is taken as ∆ t disc , so that γ (0) = q , γ (∆ t ) = q ′ and ̈ γ is the discrete acceleration. This interpolation, like any polynomial function, is Lips- chitz smooth; Assumption 5 is verified by construction, and Assumption 4 can be checked as follows: ‖ γ ( t ) − γ (0) ‖ ≤ t ∥ ∥ ∥ ∥ 1 2 ∆ ̇ q ∆ t t + ∆ q ∆ t − ∆ ̇ q 2 ∥ ∥ ∥ ∥ ≤ ∆ t ∥ ∥ ∥ ∥ ∆ ̇ q 2∆ t t + ∆ q ∆ t − ∆ ̇ q 2 ∥ ∥ ∥ ∥ ≤ 3 2 ‖ ∆ ̇ q ‖ ∆ t + ‖ ∆ q ‖ ≤ ‖ ∆ q ‖ ( 1 + ‖ ∆ ̇ q ‖ ‖ ̇ q ‖ ) ≤ ‖ ∆ q ‖ ( 1 + o ‖ ∆ x ‖ (1) ) . 3.3. Completeness theorem In order to prove the theorem, we will use the following two lemmas, which are proved in Appendix A. Lemma 1. Let g : [0 , ∆ t ] → R k denote a smooth Lipschitz function. Then, for any ( t, t ′ ) ∈ [0 , ∆ t ] 2 , ∥ ∥ ∥ ∥ ̇ g ( t ) − g ( t ′ ) − g ( t ) | t ′ − t | ∥ ∥ ∥ ∥ ≤ K g 2 | t ′ − t | . 10 Lemma 2. If there exists an admissible trajectory γ with δ -clearance in control space, then there exists δ ′ < δ and a neighboring admissible trajectory γ ′ with δ ′ - clearance in control space whose acceleration never vanishes, i.e. such that ‖ ̈ γ ′ ‖ is always greater than some constant ̈ m > 0 . We can now state our main theorem: Theorem 1. Consider a time-invariant differential system (1) with Lipschitz-continuous f and full actuation over a compact set of admissible controls U adm . Suppose that the kinodynamic planning problem between two states x init and x goal admits a smooth Lipschitz solution γ : [0 , T ] → C with δ -clearance in control space. A randomized motion planner (Algorithm 1) using a second-order continuous inter- polation is probabilistically complete. Proof. Let γ : [0 , ∆ t ] → C , t 7 → γ ( t ) denote a smooth Lipschitz admissible tra- jectory from x init to x goal , and u : [0 , ∆ t ] → U adm its associated control trajectory with δ -clearance in control space. Consider two states x = ( q, ̇ q ) and x ′ = ( q ′ , ̇ q ′ ) , as well as their corresponding time instants on the trajectory t := arg min t ‖ ( γ ( t ) , ̇ γ ( t )) − x ‖ , t ′ := arg min t ∥ ∥ ( γ ( t ) , ̇ γ ( t )) − x ′ ∥ ∥ . Supposing without loss of generality that t ′ > t , we denote by ∆ t = t ′ − t and ∆ t disc = ‖ ̇ q ‖ / ‖ ∆ q ‖ . Given a sufficiently dense sampling of the state space, we suppose that dist γ ( x ) ≤ ρ and dist γ ( x ′ ) ≤ ρ for a radius ρ such that ρ/ ∆ t = O (∆ t ) and ρ/ ∆ t disc = O (∆ t ) ; i.e. the radius ρ is quadratic in the time difference. Let γ int : [0 , ∆ t ] → C denote the result of the interpolation between x and x ′ . For τ ∈ [0 , ∆ t ] , the torque required to follow the trajectory γ int is u int ( τ ) := f ( γ int ( τ ) , ̇ γ int ( τ ) , ̈ γ int ( τ )) . Since u has δ -clearance in control space, | u int ( τ ) | ≤ | u int ( τ ) − u ( t ) | + | u ( t ) | ≤ | f ( γ int ( τ ) , ̇ γ int ( τ ) , ̈ γ int ( τ )) − f ( γ ( t ) , ̇ γ ( t ) , ̈ γ ( t )) | + (1 − δ ) τ max , (As previously, vector inequalities are component-wise.) Let us denote by | ̃ u int | the first term of this inequality. We will now show that | ̃ u int | = O (∆ t ) → 0 when ∆ t → 0 , and therefore that | u int ( τ ) | ≤ τ max for a small enough ∆ t ( i.e. when 11 sampling density is high enough). Let us first rewrite it as follows: | ̃ u int | = | f ( γ int ( τ ) , ̇ γ int ( τ ) , ̈ γ int ( τ )) − f ( γ ( t ) , ̇ γ ( t ) , ̈ γ ( t )) | ≤ ‖ f ( γ int ( τ ) , ̇ γ int ( τ ) , ̈ γ int ( τ )) − f ( γ ( t ) , ̇ γ ( t ) , ̈ γ ( t )) ‖ ∞ ≤ K f ‖ ( γ int ( τ ) , ̇ γ int ( τ )) − ( γ ( t ) , ̇ γ ( t )) ‖ + K f ‖ ̈ γ int ( τ ) − ̈ γ ( t ) ‖ ≤ K f [( η + ν ) ‖ ∆ x ‖ + dist γ ( x )] ︸︷︷︸ position-velocity term (PV) + K f ∥ ∥ ∥ ∥ ‖ ̇ q ‖ ‖ ∆ q ‖ ∆ ̇ q − ̈ γ ( t ) ∥ ∥ ∥ ∥ ︸︷︷︸ acceleration term (A) . The replacement of the norm ‖·‖ by ‖·‖ ∞ is possible because all norms of R n are equivalent (a change in norm will be reflected by a different constant K f ). The transition from the second to the third row uses Lipschitz smoothness of f , as well as the triangular inequality to separate position-velocity and acceleration coordi- nates. The transition from the third to the fourth row relies on the two interpolation assumptions: local boundedness (yields the η factor in the distance term) and con- vergence to the discrete-acceleration (yields the ν factor in the distance term, as well as the acceleration term). The position-velocity term (PV) satisfies: ( D ) ≤ (2 ρ + ‖ ∆ γ ‖ )( η + ν ) + ρ ≤ 1 2 K γ ( η + ν )∆ t + (1 + 2( η + ν )) ρ. Since ρ = O (∆ t ) , we have ( PV ) = O (∆ t ) and thus | ̃ u | ≤ ( A ) + O (∆ t ) . Next, the difference (A) can be bounded as: ( A ) ≤ ∥ ∥ ∥ ∥ ∆ ̇ q ‖ ̇ q ‖ ‖ ∆ q ‖ − ∆ ̇ γ ‖ ̇ γ ( t ) ‖ ‖ ∆ γ ‖ ∥ ∥ ∥ ∥ ︸︷︷︸ (∆) + ‖ ∆ ̇ γ ‖ ‖ ∆ γ ‖ ∣ ∣ ∣ ∣ ‖ ̇ γ ( t ) ‖ − ‖ ∆ γ ‖ ∆ t ∣ ∣ ∣ ∣ ︸︷︷︸ (A’) + ∥ ∥ ∥ ∥ ∆ ̇ γ ∆ t − ̈ γ ( t ) ∥ ∥ ∥ ∥ . ︸︷︷︸ (A”) From Lemma 1, the two terms (A’) and (A”) satisfy: ( A’ ) ≤ K ̇ γ 2 ‖ ∆ ̇ γ ‖ ‖ ∆ γ ‖ ∆ t = O (∆ t ) , ( A” ) ≤ K ̇ γ 2 ∆ t = O (∆ t ) , where the first upper bound O (∆ t ) comes from the fact that ‖ ∆ ̇ γ ‖ ‖ ∆ γ ‖ ∼ ∆ t → 0 ∆ t . We now have | ̃ u | ≤ (∆) + O (∆ t ) . The term ( ∆ ) can be seen as the deviation between 12 the discrete accelerations of γ int and γ . Let us decompose it in terms of norm and angular deviations: (∆) ≤ ∥ ∥ ∥ ∥ ( ∆ ̇ γ ‖ ∆ ̇ γ ‖ − ∆ ̇ q ‖ ∆ ̇ q ‖ ) ‖ ̇ γ ‖ ‖ ∆ ̇ γ ‖ ‖ ∆ γ ‖ + ∆ ̇ q ‖ ∆ ̇ q ‖ ( ‖ ∆ ̇ γ ‖ ‖ ̇ γ ‖ ‖ ∆ γ ‖ − ‖ ∆ ̇ q ‖ ‖ ̇ q ‖ ‖ ∆ q ‖ )∥ ∥ ∥ ∥ ≤ 2 ‖ ̇ γ ‖ ‖ ∆ ̇ γ ‖ ‖ ∆ γ ‖ ( 1 − cos ̂ (∆ ̇ q, ∆ ̇ γ ) ) ︸︷︷︸ angular deviation term ( θ ) + ∣ ∣ ∣ ∣ ‖ ̇ γ ‖ ‖ ∆ ̇ γ ‖ ‖ ∆ γ ‖ − ‖ ∆ ̇ q ‖ ‖ ̇ q ‖ ‖ ∆ q ‖ ∣ ∣ ∣ ∣ ︸︷︷︸ norm deviation term (N) The factor 2 ‖ ̇ γ ‖‖ ∆ ̇ γ ‖ ‖ ∆ γ ‖ before ( θ ) is O (1) when ∆ t → 0 , while simple vector geom- etry then shows that sin ̂ (∆ ̇ q, ∆ ̇ γ ) ≤ dist γ ( x ) + dist γ ( x ′ ) ‖ ∆ ̇ γ ‖ ≤ ρ ̈ m ∆ t , where ̈ m := min t ‖ ̈ γ ( t ) ‖ . From Lemma 2, we can assume this minimum acceler- ation to be strictly positive. Then, it follows from ρ = O (∆ t 2 ) that the sine above is O (∆ t ) . Recalling the fact that 1 − cos θ < sin θ for any θ ∈ [0 , π/ 2] , we have ( θ ) = O (∆ t ) . Finally, ( N ) ≤ ‖ ∆ ̇ γ ‖ ‖ ∆ γ ‖ |‖ ̇ γ ‖ − ‖ ̇ q ‖| + ‖ ̇ q ‖ ∣ ∣ ∣ ∣ ‖ ∆ ̇ γ ‖ ‖ ∆ γ ‖ − ‖ ∆ ̇ q ‖ ‖ ∆ q ‖ ∣ ∣ ∣ ∣ ≤ O (∆ t · ρ ) + ‖ ̇ q ‖ ( ‖ ∆ q ‖ + ‖ ∆ ̇ q ‖ ) O ( ρ ) ‖ ∆ q ‖ ( ‖ ∆ q ‖ + O ( ρ )) ≤ O (∆ t · ρ ) + ‖ ̇ q ‖ ρ ‖ ∆ q ‖ + O ( ρ ) + ‖ ̇ q ‖ ‖ ∆ ̇ q ‖ ‖ ∆ q ‖ O ( ρ ) ‖ ∆ q ‖ + O ( ρ ) Where we used the fact that ‖ ∆ γ ‖ ≤ dist γ ( x )+ ‖ ∆ q ‖ + dist γ ( x ′ ) = ‖ ∆ q ‖ + O ( ρ ) , and similarly for ‖ ∆ ̇ γ ‖ . Because ‖ ∆ q ‖ = ‖ ̇ q ‖ ∆ t disc + O (∆ t 2 disc ) and ρ/ ∆ t disc = O (∆ t ) , the last two fractions are O (∆ t ) , so our last term ( N ) = O (∆ t ) . Overall, we have derived an upper bound | u ( τ ) | ≤ (1 − δ ) τ max + O (∆ t ) . As a consequence, there exists a constant δt > 0 such that, whenever ∆ t ≤ δt , interpolated torques satisfy | u | ≤ τ max and the interpolated trajectory γ int = INTERPOLATE ( x, x ′ ) is admissible. Note that the constant δt is uniform, in the sense that it does not depend on the index t on the trajectory. Conclusion of the Proof. We have effectively constructed the attraction sequence conjectured in [6]. We can now conclude the proof similarly to the strategy sketched in that paper. Let us denote by B t := B (( γ, ̇ γ )( t ) , δρ ) , the ball of radius δρ cen- tered on ( γ, ̇ γ )( t ) ∈ X , where δρ = O ( δt 2 ) as before. Suppose that the roadmap 13 contains a state x ∈ B t , and let t ′ := t + δt . If the planner samples a state x ′ ∈ B t ′ , the interpolation between x and x ′ will be successful and x ′ will be added to the roadmap. Since the volume of B t ′ is non-zero, the event { SAMPLE ( X free ) ∈ B t ′ } will happen with probability one as the number of extensions goes to infinity. At the initialization of the planner, the roadmap is reduced to x init = ( γ (0) , ̇ γ (0)) . Therefore, using the property above, by induction on the number of time steps δt , the last state x goal = ( γ ( T ) , ̇ γ ( T )) will be eventually added to the roadmap with probability one, and the planner will find an admissible trajectory connecting x init to x goal . 4. Completeness and state-based steering in practice Shkolnik et al. [17] showed how RRTs could not be directly applied to kinody- namic planning due to their poor expansion rate at the boundaries of the roadmap. They illustrated this phenomenon on the planning problem of swinging up a (sin- gle) pendulum vertically against gravity. Let us consider the same system, i.e. the 1-DOF single pendulum depicted in Figure 2 (A), with length l = 20 cm and mass m = 8 kg. It satisfies the system assumptions of Theorem 1 a fortiori , as we saw that they apply to the double pendulum. We assume that the single actuator of the pendulum, corresponding to the joint angle θ in Figure 2, has limited actuation power: | τ | ≤ τ max . The static equilibrium of the system requiring the most torque is given at θ = ± π/ 2 with τ = 1 2 lmg ≈ 7 . 84 Nm. Therefore, when τ max < 7 . 84 Nm, it is impossible for the system to raise upright directly, and the pendulum rather needs to swing back and forth to accumulate kinetic energy before it can swing up. For any τ max > 0 , the pendulum can achieve the swingup in a finite number of swings N , with N → ∞ as τ max → 0 . 4.1. Bezier interpolation A common solution [25, 26, 27] to connect two states ( q, ̇ q ) and ( q ′ , ̇ q ′ ) is the cubic Bezier curve (also called “Hermit curve”) which is the quadratic function B ( t ) such that B (0) = q , ̇ B (0) = ̇ q , B ( T ) = q ′ and ̇ B ( T ) = q ′ , where T is the fixed duration of the interpolated trajectory. Its expression is given by: B ( t ) = − 2∆ q + T ( ̇ q + ̇ q ′ ) T 3 t 3 + 3∆ q − 2 ̇ q − ̇ q ′ T 2 t 2 + ̇ qt + q This interpolation is straightforward to implement, however it does not verify our Assumption 5, as for instance ̈ B (0) = 6∆ q − 4 ̇ q − 2 ̇ q ′ T 2 ∆ x → 0 − − − − → − 6 ̇ q T 2 6 = 0 . (6) 14 Our proof of completeness does not apply to such interpolators: even though a feasible trajectory is sampled as closely as possible (∆ x → 0) , the interpolated acceleration will not approximate the smooth acceleration underlying the feasible trajectory. Proposition 1. A randomized motion planner interpolating pendulum trajectories by Bezier curves with a fixed duration T cannot find non-quasi static solutions by increasing sampling density. Proof. When actuation power decreases, the pendulum needs to store kinetic en- ergy in order to swing up, which implies that all swingup trajectories go through velocities | ̇ θ | > ̇ θ swingup ( τ max ) . The function ̇ θ swingup increases to a positive limit ̇ θ lim swingup as τ max → 0 , where ̇ θ lim swingup > √ 8 g/l from energetic considerations. 5 Yet, feasible accelerations are also bound by | ̈ θ | ≤ Kτ max for some constant K > 0 . Combining both observations in (6) yields: Kτ max ≥ 6 | ̇ θ | T 2 > 6 ̇ θ swingup ( τ max ) T 2 ⇒ ̇ θ swingup ( τ max ) ≤ KT 2 6 τ max . Since the planner uses a constant T and ̇ θ swingup increases to ̇ θ lim swingup > √ 8 g/l when τ max decreases to 0, this inequality cannot be satisfied for arbitrary small ac- tuation power τ max . Hence, even with an arbitrarily high sampling density around a feasible trajectory γ ( t ) , the planner will not be able to reconstruct a feasible ap- proximation γ int ( t ) . 4.2. Second-order continuous interpolation Let ̇ q avg := 1 2 ( ̇ q + ̇ q ′ ) denote the average velocity between ( q, ̇ q ) and ( q ′ , ̇ q ′ ) . Since the system has only one degree of freedom, one can interpolate trajectories that comply with our Assumption 5 using constant accelerations with a suitable trajectory duration: C : [0 , ∆ t C ] → ] − π, π ] t 7 → C ( t ) = q + t ̇ q + t 2 2 (∆ ̇ q/ ∆ t C ) . One can check that choosing ∆ t C = (∆ q/ ̇ q avg ) results in ̇ C (0) = ̇ q , ̇ C (∆ t C ) = ̇ q ′ , C (0) = q and C (∆ t C ) = q ′ . This duration is similar to the term ∆ t disc in 5 The expression ̇ θ = √ 8 g/l corresponds to the kinetic energy 1 4 ml ̇ θ 2 = mgl , the latter being the (potential) energy of the system at rest in the upward equilibrium. During a successful last swing, the kinetic energy at θ = 0 is 1 4 ml ̇ θ 2 swingup + W g + W τ = mgl , with W g < 0 the work of gravity and W τ the work of actuation forces between θ = 0 and θ = π . The work W τ vanishes when τ max → 0 . 15 Assumption 5, with both expressions converging to the same value as ∆ x → 0 . We call C ( t ) the second-order continuous 1-DOF (SOC1) interpolation. Note that this interpolation function only applies to single-DOF systems. For multi-DOF systems, the correct duration ∆ t C used to transfer from one state to another is different for each DOF, hence constant accelerations cannot be used. One can then apply optimization techniques [20, 28] or use a richer family of curves such as piecewise linear-quadratic segments [29]. 4.3. Comparison in simulations According to Theorem 1 and our previous discussion, a randomized planner based on Bezier interpolation is not expected to be probabilistically complete as τ max → 0 , while the same planner using the SOC1 interpolation will be complete at any rate. We asserted this statement in simulations of the pendulum with RRT [30]. Our implementation of RRT is that described in Algorithm 1, with the addi- tion of the steer-to-goal heuristic: every m = 100 steps, the planner tries to steer to x goal rather than x rand . This extra step speeds up convergence when the sys- tem reaches the vicinity of the goal area. We use uniform random sampling for SAMPLE ( S ) , while for PARENTS ( x ′ , V ) returns the k = 10 nearest neighbors of x ′ in the roadmap V . All the source code used in these experiments can be accessed at [31]. We compared the performance of RRT with the Bezier and SOC1 interpola- tions, all other parameters being the same, on a single pendulum with τ max = 5 Nm. The RRT-SOC1 combo found a four-swing solution after 26,300 RRT ex- tensions, building a roadmap with 6434 nodes (Figure 3). Meanwhile, even after one day of computations and more than 200,000 RRT extensions, the RRT-Bezier combo could not find any solution. Figure 4 shows the roadmap at 100,000 extensions (26,663 nodes). Interestingly, we can distinguish two zones in this roadmap. The first one is a dense, diamond-shape area near the downward equilibrium θ = 0 . It corresponds to states that are straightforward to connect by Bezier interpolation, and as expected from Proposition 1, velocities ̇ θ in this area decrease sharply with θ . The second one consists of two cones directed torwards the goal. Both areas exhibit a higher density near the axis ̇ θ = 0 , which is also consistent with Proposition 1. The comparison of the two roadmaps is clear: with a second-order continu- ous interpolation, the RRT-SOC1 planner leverages additional sampling into ex- ploration of the state space. Conversely, RRT-Bezier lacks this property (Propo- sition 1), and its roadmap stays confined to a subset of the pendulum’s reachable space. 16 Figure 3: Phase-space portrait of the roadmap constructed by RRT using the second-order contin- uous (SOC1) interpolation. The planner found a successful trajectory (red line) after 26,300 exten- sions. This planner is probabilistically complete (Theorem 1) thanks to the fact that SOC1 curves satisfy Assumption 5. Figure 4: Roadmap constructed by RRT after 100,000 extensions using the Bezier interpolation. Reachable states are distributed in two major areas: a central, diamond shape corresponding to the states that the planner can connect at any rate, and two cones directed towards the goal ( θ = π or θ = − π ). Even after several days of computations, this planner could not find a successful motion plan. Our completeness theorem does not apply to this planner because Bezier curves do not satisfy Assumption 5. 17 5. Conclusion In this paper, we provided the first “operational” proof of probabilistic com- pleteness for a large class of randomized kinodynamic planners, namely those that interpolate state-space trajectories. We observed that an important ingredient for completeness is the “continuity” of the interpolation procedure, which we char- acterized by the second-order continuity (SOC) property. In particular, we found in simulation experiments that this property is critical to planner performances: a standard RRT with second-order continuous interpolation has no difficulty finding swingup trajectories for a low-torque pendulum, while the same RRT with Bezier interpolation (which are not SOC) could not find any solution. This experimentally confirms our completeness theorem and suggests that second-order continuity is an important design guideline for kinodynamic planners with state-based steering. References References [1] S. Caron, Q.-C. Pham, Y. Nakamura, Completeness of randomized kino- dynamic planners with state-based steering, in: Robotics and Automation (ICRA), 2014 IEEE International Conference on, IEEE, 2014, pp. 5818– 5823. [2] J.-C. Latombe, Robot motion planning, Vol. 124, Springer US, 1991. [3] S. LaValle, Planning algorithms, Cambridge Univ Press, 2006. [4] L. E. Kavraki, P. Svestka, J.-C. Latombe, M. H. Overmars, Probabilis- tic roadmaps for path planning in high-dimensional configuration spaces, Robotics and Automation, IEEE Transactions on 12 (4) (1996) 566–580. [5] D. Hsu, R. Kindel, J.-C. Latombe, S. Rock, Randomized kinodynamic mo- tion planning with moving obstacles, The International Journal of Robotics Research 21 (3) (2002) 233–255. [6] S. M. LaValle, J. J. Kuffner, Randomized kinodynamic planning, The Inter- national Journal of Robotics Research 20 (5) (2001) 378–400. [7] S. Karaman, E. Frazzoli, Sampling-based algorithms for optimal motion plan- ning, The International Journal of Robotics Research 30 (7) (2011) 846–894. [8] D. Hsu, J.-C. Latombe, R. Motwani, Path planning in expansive configura- tion spaces, in: Robotics and Automation, 1997. Proceedings., 1997 IEEE International Conference on, Vol. 3, IEEE, 1997, pp. 2719–2726. 18 [9] T. Lozano-Perez, Spatial planning: A configuration space approach, Comput- ers, IEEE Transactions on 100 (2) (1983) 108–120. [10] B. Donald, P. Xavier, J. Canny, J. Reif, Kinodynamic motion planning, Jour- nal of the ACM (JACM) 40 (5) (1993) 1048–1066. [11] J.-P. Laumond, Robot Motion Planning and Control, Springer-Verlag, New York, 1998. [12] F. Bullo, K. M. Lynch, Kinematic controllability for decoupled trajectory planning in underactuated mechanical systems, Robotics and Automation, IEEE Transactions on 17 (4) (2001) 402–412. [13] J. Bobrow, S. Dubowsky, J. Gibson, Time-optimal control of robotic manip- ulators along specified paths, The International Journal of Robotics Research 4 (3) (1985) 3–17. [14] P.-B. Wieber, On the stability of walking systems, in: Proceedings of the international workshop on humanoid and human friendly robotics, 2002. [15] S. Caron, Q.-C. Pham, Y. Nakamura, Leveraging cone double description for multi-contact stability of humanoids with applications to statics and dynam- ics, in: Robotics: Science and System, 2015. [16] J. Bialkowski, M. Otte, E. Frazzoli, Free-configuration biased sampling for motion planning, in: Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on, IEEE, 2013, pp. 1272–1279. [17] A. Shkolnik, M. Walter, R. Tedrake, Reachability-guided sampling for plan- ning under differential constraints, in: Robotics and Automation, 2009. ICRA’09. IEEE International Conference on, IEEE, 2009, pp. 2859–2865. [18] G. Papadopoulos, H. Kurniawati, N. M. Patrikalakis, Analysis of asymptot- ically optimal sampling-based motion planning algorithms for lipschitz con- tinuous dynamical systems, arXiv preprint arXiv:1405.2872. [19] T. Kunz, M. Stilman, Probabilistically complete kinodynamic planning for robot manipulators with acceleration limits, in: Intelligent Robots and Sys- tems (IROS 2014), 2014 IEEE/RSJ International Conference on, IEEE, 2014, pp. 3713–3719. [20] A. Perez, R. Platt, G. Konidaris, L. Kaelbling, T. Lozano-Perez, Lqr-rrt*: Op- timal sampling-based motion planning with automatically derived extension heuristics, in: Robotics and Automation (ICRA), 2012 IEEE International Conference on, IEEE, 2012, pp. 2537–2542. 19 [21] R. Tedrake, Lqr-trees: Feedback motion planning on sparse randomized trees. [22] T. Kunz, M. Stilman, Kinodynamic rrts with fixed time step and best-input extension are not probabilistically complete, in: Algorithmic Foundations of Robotics XI, Springer, 2015, pp. 233–244. [23] S. Karaman, E. Frazzoli, Optimal kinodynamic motion planning using incre- mental sampling-based methods, in: Decision and Control (CDC), 2010 49th IEEE Conference on, IEEE, 2010, pp. 7681–7687. [24] S. Karaman, E. Frazzoli, Sampling-based optimal motion planning for non- holonomic dynamical systems, in: IEEE Conference on Robotics and Au- tomation (ICRA), 2013. [25] K. Jolly, R. S. Kumar, R. Vijayakumar, A bezier curve based path planning in a multi-agent robot soccer system without violating the acceleration limits, Robotics and Autonomous Systems 57 (1) (2009) 23–33. [26] I. ˇ Skrjanc, G. Klanˇ car, Optimal cooperative collision avoidance between mul- tiple robots based on bernstein–b ́ ezier curves, Robotics and Autonomous sys- tems 58 (1) (2010) 1–9. [27] K. Hauser, Fast interpolation and time-optimization on implicit contact sub- manifolds., in: Robotics: Science and Systems, Citeseer, 2013. [28] Q.-C. Pham, S. Caron, Y. Nakamura, Kinodynamic planning in the configu- ration space via velocity interval propagation, Robotics: Science and System. [29] K. Hauser, V. Ng-Thow-Hing, Fast smoothing of manipulator trajectories us- ing optimal bounded-acceleration shortcuts, in: Robotics and Automation (ICRA), 2010 IEEE International Conference on, IEEE, 2010, pp. 2493– 2498. [30] S. M. LaValle, J. J. Kuffner Jr, Rapidly-exploring random trees: Progress and prospects. [31] Source code to be published online. 20 Appendix A. Proofs of the lemmas Lemma 1. Let g : [0 , ∆ t ] → R k denote a smooth Lipschitz function. Then, for any ( t, t ′ ) ∈ [0 , ∆ t ] 2 , ∥ ∥ ∥ ∥ ̇ g ( t ) − g ( t ′ ) − g ( t ) | t ′ − t | ∥ ∥ ∥ ∥ ≤ K g 2 | t ′ − t | . Proof. For t ′ > t , ∥ ∥ ∥ ∥ ̇ g ( t ) − g ( t ′ ) − g ( t ) t ′ − t ∥ ∥ ∥ ∥ ≤ 1 t ′ − t ∥ ∥ ∥ ∥ ∥ ∫ t ′ t ( ̇ g ( t ) − ̇ g ( w )) d w ∥ ∥ ∥ ∥ ∥ ≤ 1 t ′ − t ∫ t ′ t ‖ ̇ g ( t ) − ̇ g ( w ) ‖ d w ≤ K g t ′ − t ∫ t ′ t | t − w | d w ≤ K g 2 ( t ′ − t ) . Lemma 2. If there exists an admissible trajectory γ with δ -clearance in control space, then there exists δ ′ < δ and a neighboring admissible trajectory γ ′ with δ ′ -clearance in control space which is always accelerating, i.e. such that ‖ ̈ γ ′ ‖ is always greater than some constant ̈ m > 0 . Proof. If there is a time interval [ t, t ′ ] on which ̈ γ ≡ 0 , suffices to add a wavelet function δ ̈ γ i of arbitrary small amplitude δ ̈ q i and zero integral over [ t, t ′ ] to gen- erate a new trajectory ̈ γ + δ ̈ γ where the acceleration cancels on at most a discrete number of time instants. Adding accelerations δ ̈ γ i directly is possible thanks to full actuation, while δ ′ -clearance can be achieved for δ ′ ≤ δ by taking sufficiently small amplitudes δ ̈ q i . Suppose now that the roots of ̈ γ form a discrete set { t 0 , t 1 , . . . , t m } . Let t 0 be one of these roots, and let [ t, t ′ ] denote a neighbordhood of t 0 . Repeat the process of adding wavelet functions δ ̈ γ i and δ ̈ γ j of zero integral over [ t, t ′ ] and arbitrary small amplitude to two coordinates i and j , but this time enforcing that the sum of the two wavelets satisfies | δ ̈ γ i + δ ̈ γ j | ≥ ij > 0 . This method ensures that the root t 0 is eliminated (either ̈ γ i ( t 0 ) 6 = 0 or ̈ γ j ( t 0 ) 6 = 0 ) without introducing new roots. We conclude by iterating the process on the finite set of roots. 21