arXiv:1812.06132v1 [math.OC] 14 Dec 2018 Bernstein approximation of optimal control problems∗ Venanzio Cichella†, Isaac Kaminer‡, Claire Walton‡, Naira Hovakimyan§, Ant´onio M. Pascoal¶ December 18, 2018 Abstract Bernstein polynomial approximation to a continuous function has a slower rate of convergence as compared to other approximation methods. “The fact seems to have precluded any numerical application of Bernstein polynomials from having been made. Perhaps they will find application when the properties of the approximant in the large are of more importance than the closeness of the approximation.” – has remarked P.J. Davis in his 1963 book Interpolation and Approximation. This paper presents a direct approximation method for nonlinear optimal control problems with mixed input and state constraints based on Bernstein polynomial approximation. We provide a rigorous analysis showing that the proposed method yields consistent approximations of time continuous optimal control problems. Furthermore, we demonstrate that the proposed method can also be used for costate estimation of the optimal control problems. This latter result leads to the formulation of the Covector Mapping Theorem for Bernstein polynomial approximation. Finally, we explore the numerical and geometric properties of Bernstein polynomials, and illustrate the advantages of the proposed approximation method through several numerical examples. 1 Introduction Optimal control problems that arise from most engineering applications are in general very complex. Finding a closed-form solution to these problems can be difficult or even impossible, and therefore they must be solved numerically. Numerical methods include indirect and direct methods [1]. Indirect methods solve the problems by converting them into boundary value problems. Then, the solutions are found by solving systems of differential equations. On the other hand, direct methods are based on transcribing optimal control problems into nonlinear programming problems (NLPs) using some discretization scheme [1–4]. These NLPs can be solved using ready-to-use NLP solvers (e.g. MATLAB, SNOPT, etc.) and do not require calculation of costate and adjoint variables as indirect methods do. A pioneering work in the literature on direct methods is the one of Polak on consistency of approximation theory reported in his book (see [5, Section 3.3]). Borrowing tools from variational analysis, Polak provides a theoretical framework to assess the convergence properties of direct methods. Motivated by the consistency of approximation theory, a wide range of direct methods that use different discretization schemes have been developed, including Euler [5], Runge-Kutta [6], Pseudospectral [7] methods, as well as the method presented in this paper. Pseudospectral methods are the most popular direct methods nowadays, and they have been applied successfully for solving a wide range of optimization problems, e.g. [7–13]. They offer several advantages over many other discretization methods, mainly owing to their spectral (exponential) rate of convergence. ∗This work was supported by AFOSR, ONR, NSF and NASA. †Venanzio Cichella is with the Department of Mechanical Engineering, University of Iowa, Iowa City, 52242 IA venanzio-cichella@uiowa.edu ‡Isaac Kaminer and Claire Walton are with the Department of Mechanical and Aerospace Engineering, Naval Postgraduate School, Monterey, CA 93940 {kaminer, clwalton1 }@nps.edu §Naira Hovakimyan is with the Department of Mechanical Science and Engineering, University of Illinois at Urbana- Champaign, Urbana, IL 61801 nhovakim@illinois.edu ¶Ant´onio M. Pascoal is with the Institute for Systems and Robotics (ISR), Instituto Superior Tecnico (IST), Univ. Lisbon, Portugal. antoniog@isr.ist.utl.pt 1 However, as pointed out in [14–17], there is one salient disadvantage associated with these methods. When discretizing the state and/or the input, the constraints are enforced at the discretization nodes; unfortunately, satisfaction of constraints cannot be guaranteed in between the nodes. To avoid violation of the constraints in between the nodes, the order of approximation (number of nodes) can be increased; however, this leads to larger NLPs, which may become computationally expensive and too inefficient to solve. This problem does not limit itself to pseudospectral methods, but it is common to methods that are based on discretization. This undesirable behaviour becomes obvious, for example, when considering the optimal trajectory gener- ation problem for multi-vehicle missions, where a large number of vehicles have to reach their final destinations by following trajectories that have to guarantee intervehicle separation for safety all the time. Clearly, with a small order of approximation, separation between the trajectories will be hardly satisfied. Increasing the number of nodes will eventually produce spatially separated trajectories, but will also drastically increase the number of collision avoidance constraints and thus also the complexity of the problem. Pseudospectral methods also suffer from a drawback when dealing with non-smooth optimal control problems. This drawback is mainly related to the well known Gibbs phenomenon [18], common to all approximation methods based on orthogonal polynomials. The Gibbs phenomenon, visible in the form of oscillations, reduces the accuracy of the approximation to first order away from discontinuities and to O(1) in the neighborhoods of jumps [19]. Several extensions of pseudospectral methods have been developed to deal with this disadvantage and lessen the effect of the Gibbs phenomenon (e.g. [20–22]). The most accurate methods require the location of the discontinuities to be known a priori, which is often impractical or difficult. Other methods are based on the estimation of these locations, which could result in inefficiency or ill conditioning of the discretized problem, especially when the number of discontinuities is large and unknown. The present article proposes a direct method based on Bernstein approximation. Bernstein approximants have several nice properties. First of all, the approximants converge uniformly to the functions that they approximate – and so do their derivatives [23, Chapter 3]. This, as we will discuss later, is useful for derivation of convergence properties of the proposed computational method. Moreover, Bernstein polynomials behave well, even when the functions being approximated are non-smooth. In fact, as demonstrated in [24], the Gibbs phenomenon does not occur when approximating piecewise smooth, monotone functions with both left and right derivatives at every point by Bernstein polynomials. As a result, the proposed method based on Bernstein approximation lends itself to problems that have discontinuous states and/or controls, e.g. bang-bang optimal control problems (see also [25]). Finally, due to their favorable geometric properties (see [23, Chapter 5]) Bernstein polynomials afford computationally efficient algorithms for the computation of state and input constraints for the whole time interval where optimization takes place, and not only at discretization points (see [16, 26]). Hence, with the proposed approach the solutions can be guaranteed to be feasible and satisfy the constraints for all times, while retaining the computational efficiency of methods based on discretization. “There is a price that must be paid for these beautiful approximation properties: the convergence of the Bernstein polynomials is very slow.” – wrote P.J. Davis in his 1963 book Interpolation and Approximation [27]. He continues: “This fact seems to have precluded any numerical application of Bernstein polynomials from having been made. Perhaps they will find application when the properties of the approximant in the large are of more importance than the closeness of the approximation.” In fact, the slow convergence of the Bernstein approximation implies that the approach proposed in the present paper is outperformed by, for example, pseudospectral methods in terms of convergence rate. This is not surprising, since the choice of nodes and the interpolating polynomials in the pseudospectral methods is dictated by approximation accuracy and convergence speed, while sacrificing satisfaction of constraints in between the nodes. On the other hand, our approach prioritizes constraint satisfaction at the expense of a slower convergence rate. The paper is structured as follows: in Section 2 we present the notation and the mathematical results, which will be used later in the paper. Section 3 introduces the optimal control problem of interest and some related assumptions. Section 4 presents the NLP method based on Bernstein approximation that approximates the optimal control problem. Section 5 demonstrates that the proposed approximation method yields approximate results that converges uniformly to the optimal solution. In Section 6 we derive the KarushKuhnTucker (KKT) conditions associated with the NLP. Section 7 compares these conditions to the first order optimality conditions for the original optimal control problem and states the Covector Mapping 2 Theorem for Bernstein approximation. Numerical examples are discussed in Section 8. The paper ends with conclusions in Section 9. 2 Notation and mathematical background Vector valued functions are denoted by bold letters, x(t) = [x1(t) , . . . , xn(t)]⊤, while vectors are denoted by bold letters with an upper bar, ¯x = [x1 , . . . , xn]⊤∈Rn. The symbol Cr denotes the space of functions with r continuous derivatives. Cr n denotes the space of n-vector valued functions in Cr. || · || denotes the Euclidean norm, ||¯x|| = p x2 1 + . . . + x2n. 2.1 Bernstein polynomials The Bernstein basis polynomials of degree N are defined as bj,N(t) = N j  tj(1 −t)N−j , t ∈[0, 1] , for j = 0, . . . , N, with N j  = N! j!(N −j)! . They were originally introduced by the mathematician Sergei Natanovich Bernstein in 1912 to facilitate a constructive proof of the Weierstrass approximation theorem [28]. An Nth order Bernstein polynomial xN : [0, 1] →R is a linear combination of N + 1 Bernstein basis polynomials of order N, i.e. xN(t) = N X j=0 ¯xjbj,N(t) , t ∈[0, 1] , where ¯xj ∈R, j = 0, . . . , N, are referred to as Bernstein coefficients (also known as control points). For the sake of generality, and with a slight abuse of terminology, in this paper we extend the definition of a Bernstein polynomial given above to a vector of Nth order polynomials xN : [0, 1] →Rn expressed in the following form xN(t) = N X j=0 ¯xj,Nbj,N(t) , t ∈[0, 1] , (1) where ¯x0,N, . . . , ¯xN,N are n-dimensional Bernstein coefficients. Bernstein polynomials were popularized by Pierre B´ezier in the early 1960s as useful tools for geometric design (B´ezier used Bernstein polynomials to design the shape of the cars at the Renault company in France), and are now widely used in computer graphics, animations and type fonts such as postscript fonts and true type fonts. For this reason, the Bernstein polynomial introduced in Equation (1) is often referred to as a B´ezier curve, especially when used to describe a spatial curve. Bernstein polynomials possess favorable geometric and numerical properties that can be exploited in many application domains. For an extensive review on Bernstein polynomials and their properties the reader is referred to [23]. The derivative and integral of a Bernstein polynomial xN(t) can be easily computed as ˙xN(t) = N N−1 X j=0 (¯xj+1,N −¯xj,N)bj,N−1(t) and Z 1 0 xN(t) = w N X j=0 ¯xj,N , w = 1 N + 1 , (2) 3 respectively. Bernstein polynomials can be used to approximate smooth functions. Consider a n-vector valued function x : [0, 1] →Rn. The Nth order Bernstein approximation of x(t) is a vector of Bernstein polynomials xN(t) computed as in (1) with ¯xj,N = x(tj) and tj = j N for all j = 0, . . . , N. Namely, xN(t) = N X j=0 x(tj)bj,N(t) , tj = j N . (3) The following results hold for Bernstein approximations. Lemma 1 (Uniform convergence of Bernstein approximation) Let x(t) ∈C0 n on [0, 1], and xN(t) be computed as in Equation (3). Then, for arbitrary order of approximation N ∈Z+, the Bernstein approxima- tion xN(t) satisfies ||xN(t) −x(t)|| ≤C0Wx(N −1 2 ) , where C0 is a positive constant satisfying C0 < 5n/4, and Wx(·) is the modulus of continuity of x(t) in [0, 1] [29–31]. Moreover, if x(t) ∈C1 n, then ∥˙xN(t) −˙x(t)∥≤C1Wx′(N −1 2 ) , where C1 is a positive constant satisfying C1 < 9n/4 and Wx′(·) is the modulus of continuity of ˙x(t) in [0, 1] [32]. ■ Lemma 2 [33] Assume x(t) ∈Cr+2 n , r ≥0, and let xN(t) be computed as in Equation (3). Let x(r)(t) denote the rth derivative of x(t). Then, the following inequalities hold for all t ∈[0, 1]: ||xN(t) −x(t)|| ≤C0 N ... ||x(r) N (t) −x(r)(t)|| ≤Cr N , where C0, . . . , Cr are independent of N. ■ Lemma 3 If x(t) ∈C0 n on [0, 1], then we have Z 1 0 x(t)dt −w N X j=0 x  j N  ≤CIWx(N −1 2 ) with w = 1 N+1, where CI > 0 is independent of N. Moreover, if x(t) ∈C2 n, then Z 1 0 x(t)dt −w N X j=0 x  j N  ≤CI N . ■ The Lemma above follows directly from Lemmas 1 and 2 and Equation (2). The following properties of Bernstein polynomials are relevant to this paper. Property 1 (End point values) The Bernstein polynomial given by Equation (1) satisfies xN(0) = ¯x0,N and xN(1) = ¯xN,N. Moreover, the tangent of a Bernstein polynomial at the initial and final points lies on the vectors ¯x1,N −¯x0,N and ¯xN,N −¯xN−1,N, respectively. A graphical depiction of this property is illustrated in Figure 1a. □ 4 Property 2 (Convex hull) A Bernstein polynomial is completely contained in the convex hull of its Bern- stein coefficients (see Figure 1b). □ Property 3 (de Casteljau Algorithm) The de Casteljau algorithm is an efficient and numerically stable recursive method to evaluate a Bernstein polynomial at any given point. The de Casteljau algorithm is also used to split a Bernstein polynomial into two independent ones. Given an Nth order Bernstein polynomial xN : [0, 1] →Rd, and a scalar tdiv ∈[0, 1], the Bernstein polynomial at tdiv can be computed using the following recursive relation ¯x[0] i,N = ¯xi,N , i = 0, . . . , N ¯x[j] i,N = ¯x[j−1] i,N (1 −tdiv) + ¯x[j−1] i+1,Ntdiv , i = 0, . . . , N −j , j = 1, . . . , N . Then, the Bernstein polynomial evaluated at tdiv is given by xN(tdiv) = ¯x[N] 0,N . Moreover, the Bernstein polynomial can be subdivided at tdiv into two Nth order Bernstein polynomials with Bernstein coefficients ¯x[0] 0,N, ¯x[1] 0,N, . . . , ¯x[N] 0,N , and ¯x[N] 0,N, ¯x[N−1] 1,N , . . . , ¯x[0] N,N . Figure 1c depicts a 2D curve defined by an 5th order Bernstein polynomial (with Bernstein coefficients described by blue circles). The curve is subdivided into two 5th order Bernstein polynomials, each with Bernstein coefficients described by black and red circles. □ Property 4 (Minimum distance) The minimum distance between two Bernstein polynomials fN(t) and gN(t), with t ∈[0, 1], namely min ta,tb∈[0,1] ||fN(ta) −gN(tb)|| , arg min ta,tb∈[0,1] ||fN(ta) −gN(tb)|| (4) can be efficiently computed by exploiting the convex-hull property and the de Casteljau algorithm [23], in combination with the Gilbert-Johnson-Keerthi (GJK) distance algorithm [34]. The latter is widely used in computer graphics and video games to compute the minimum distance between convex shapes. In [26] the authors propose an iterative procedure that uses the above tools to compute (4) within a desired tolerance. This procedure is extremely useful for motion planning applications to efficiently compute the spatial clear- ance between two paths, or between a path and an obstacle. For example, the minimum distance between the 2D Bernstein polynomial and the point depicted in Figure 1d is computed in less than 5 ms using an implementation in MATLAB, while the minimum distance between the 3D Bernstein polynomials depicted in Figure 1e is computed in less than 30 ms. The same procedure can also be employed to compute the extrema (maximum and minimum) of a Bernstein polynomial [35]. □ 3 Problem formulation This paper considers the following optimal control problem: Problem 1 (Problem P) Determine x : [0, 1] →Rnx and u : [0, 1] →Rnu that minimize I(x(t), u(t)) = E(x(0), x(1)) + Z 1 0 F(x(t), u(t))dt , (5) 5 -2 0 2 4 6 X [m] 0 2 4 6 8 Y [m] (a) End point values property. -2 0 2 4 6 X [m] 0 2 4 6 8 Y [m] (b) Convex hull property. -2 0 2 4 6 X [m] 0 2 4 6 8 Y [m] (c) de Casteljau algorithm. -2 0 2 4 6 X [m] 0 2 4 6 8 Y [m] (d) Minimum distance to a point. 0 10 20 30 40 150 Z [m] 100 Y [m] 50 X [m] 0 20 15 10 5 0 -5 -10 (e) Distance between two 3D Bernstein polynomials Figure 1: 2D and 3D spatial curves defined by Bernstein polynomials. 6 subject to ˙x = f(x(t), u(t)) , ∀t ∈[0, 1], (6) e(x(0), x(1)) = 0 , (7) h(x(t), u(t)) ≤0 , ∀t ∈[0, 1] , (8) where E : Rnx × Rnx →R and F : Rnx × Rnu →R are the terminal and running costs, respectively, f : Rnx × Rnu →Rnx describes the system dynamics, e : Rnx × Rnx →Rne is the vector of boundary conditions, and h : Rnx × Rnu →Rnh is the vector of state and input constraints. ▽ The following assumptions hold: Assumption 1 E, F, f, e, and h are Lipschitz continuous with respect to their arguments. Assumption 2 Problem P admits optimal solutions x∗(t) and u∗(t) that satisfy x∗(t) ∈C1 nx and u∗(t) ∈ C0 nu. 4 Bernstein approximation of Problem P The purpose of this section is to formulate a discretized version of Problem P, here referred to as Problem PN, where N denotes the order of approximation. This requires that we approximate the input and state functions, the cost function, the system dynamics, and the equality and inequality constraints in Problem P. First, consider the following Nth order vectors of Bernstein polynomials: xN(t) = N X j=0 ¯xj,Nbj,N(t) , uN(t) = N X j=0 ¯uj,Nbj,N(t), (9) with xN : [0, 1] →Rnx, uN : [0, 1] →Rnu, ¯xj,N ∈Rnx and ¯uj,N ∈Rnu. Let ¯xN ∈Rnx×(N+1) and ¯uN ∈Rnu×(N+1) be defined as ¯xN = [¯x0,N , . . . , ¯xN,N], ¯uN = [¯u0,N , . . . , ¯uN,N]. Let 0 = t0 < t1 < . . . < tN = 1 be a set of equidistant time nodes, i.e. tj = j N . Then Problem PN can be stated as follows: Problem 2 (Problem PN) Determine ¯xN and ¯uN that minimize IN(¯xN, ¯uN) = E(xN(0), xN(tN)) + w N X j=0 F(xN(tj), uN(tj)) , (10) subject to ∥˙xN(tj) −f(xN(tj), uN(tj))∥≤δN P , ∀j = 0, . . . , N , (11) e(xN(0), xN(tN)) = 0 , (12) h(xN(tj), uN(tj)) ≤δN P 1 , ∀j = 0, . . . , N , (13) where w = 1 N+1, and δN P is a small positive number that depends on N and converges uniformly to 0, i.e. limN→∞δN P = 0. ▽ Remark 1 Compared to the constraints of Problem P, the dynamic and inequality constraints given by Equations (11) and (13) are relaxed. Motivated by previous work on consistency of approximation theory [5], the bound δN P , referred to as relaxation bound, is introduced to guarantee that Problem PN has a feasible solution. As it will become clear later, the relaxation bound can be made arbitrarily small by choosing a sufficiently large order of approximation N. Furthermore, note that when N →∞, then the right hand sides of Equations (11) and (13) equal to zero, i.e. the difference between the constraints imposed by Problems P and PN vanishes. 7 5 Feasibility and consistency of Problem PN The outcome of Problem PN is a set of optimal Bernstein coefficients ¯x∗ N and ¯u∗ N that determine the vectors of Bernstein polynomials x∗ N(t) and u∗ N(t), i.e. x∗ N(t) = N X j=0 ¯x∗ j,Nbj,N(t) , u∗ N(t) = N X j=0 ¯u∗ j,Nbj,N(t) . (14) Now we address the following theoretical issues: 1. existence of a feasible solution to Problem PN, 2. convergence of the pair (x∗ N(t), u∗ N(t)) to the optimal solution of Problem P, given by (x∗(t), u∗(t)). The main results of this section are summarized in Theorems 1 and 2 below. Theorem 1 (Feasibility) Let δN P = CP max{Wx′(N −1 2 ) , Wx(N −1 2 ) , Wu(N −1 2 )} , (15) where CP is a positive constant independent of N, and Wx′(·), Wx(·) and Wu(·) are the moduli of continuity of ˙x(t), x(t) and u(t), respectively. Then Problem PN is feasible for arbitrary order of approximation N ∈Z+. ■ Proof: Let x(t) and u(t) be a feasible solution for Problem P, which exists by Assumption 2. Let us define the Bernstein coefficients ¯xk,N = x(tk) , ¯uk,N = u(tk) , ∀k ∈{0, . . . , N} , (16) and the resulting vectors of Bernstein polynomials as xN(t) = N X j=0 ¯xj,Nbj,N(t) , uN(t) = N X j=0 ¯uj,Nbj,N(t) . (17) In what follows, we show that the above polynomials satisfy the constraints in (11), (12) and (13), with δN P defined in Equation (15), thus proving Theorem 1. Equations (16) and (17), together with Lemma 1 and Assumption 2, imply that xN(t), ˙xN(t) and uN(t) converge uniformly to x(t), ˙x(t) and u(t), respectively. More precisely, for all t ∈[0, 1] from Lemma 1 we have ||xN(t) −x(t)|| ≤CxWx(N −1 2 ) , ||uN(t) −u(t)|| ≤CuWu(N −1 2 ) , || ˙xN(t) −˙x(t)|| ≤Cx′Wx′(N −1 2 ) , (18) where Cx < 5nx/4, Cu < 5nu/4 and Cx′ < 9nx/4 (see Lemma 1). To prove that the dynamic constraint is satisfied, we add and subtract the term ˙x(tk) −f(x(tk), u(tk)) from the left hand side of Equation (11), which yields || ˙xN(tk) −f(xN(tk), uN(tk))|| ≤|| ˙xN(tk) −˙x(tk)|| + || ˙x(tk) −f(x(tk), u(tk))|| + ||f(xN(tk), uN(tk)) −f(x(tk), u(tk))|| . The second term on the right hand side of the inequality above is zero (see Equation (6)). Using Equation (18) and the fact that f is Lipschitz (see Assumption 1) with Lipschitz constant Lf, we get || ˙xN(tk) −f(xN(tk), uN(tk))|| ≤Cx′Wx′(N −1 2 ) + Lf  CxWx(N −1 2 ) + CuWu(N −1 2 )  ≤(Cx′ + Lf(Cx + Cu)) max(Wx′(N −δ), Wx(N −δ), Wu(N −δ)) . 8 Thus, the dynamic constraint in Equation (11) is satisfied with δN P given by Equation (15). Using a similar argument, the satisfaction of the constraint in Equation (13) follows easily by Assumption 1, namely that h is Lipschitz, i.e. h(xN(tj), uN(tj)) ≤h(x(tj), u(tj)) + ||h(xN(tj), uN(tj)) −h(x(tj), u(tj))|| ≤||h(xN(tj), uN(tj)) −h(x(tj), u(tj))|| ≤Lh(Cx + Cu) max(Wx(N −δ), Wu(N −δ)) . Finally, using the end point value property of Bernstein polynomials, i.e. Property 1 in Section 2, we have xN(0) = ¯x0,N and xN(1) = ¯xN,N, which by definition implies that e(xN(0), xN(tN)) = e(x(0), x(1)) = 0, thus proving Equation (12). ♠ Corollary 1 If the optimal state x∗(t) and control u∗(t) solutions to Problem P exist and satisfy ˙x∗(t) ∈ C2 nx and u∗(t) ∈C2 nu in [0, 1], then Theorem 1 holds with δN P = CP N −1 , where CP is a positive constant independent of N. ■ Proof: The proof of Corollary 1 follows easily by applying Lemma 2 to the proof of Theorem 1. ♠ Remark 2 From the definition of δN P in Theorem 1 (and Corollary 1), it follows that for any arbitrarily small scalar ǫP > 0 there exists N1 such that for all N ≥N1, we have δN P ≤ǫP . In other words, the relaxation bound in Problem PN can be made arbitrarily small by choosing sufficiently large N, while retaining the feasibility result (Theorem 1 and Corollary 1). Theorem 2 (Consistency) Let {(¯x∗ N, ¯u∗ N)}∞ N=N1 be a sequence of optimal solutions to Problem PN, and {(x∗ N(t), u∗ N(t))}∞ N=N1 be a sequence of Bernstein polynomials, given by (14). Assume {(x∗ N(t), u∗ N(t))}∞ N=N1 has a uniform accumulation point, i.e. lim N→∞(x∗ N(t), u∗ N(t)) = (x∞(t), u∞(t)) , ∀t ∈[0, 1], (19) and assume that ˙x∞(t) and u∞(t) are continuous on [0, 1]. Then (x∞(t), u∞(t)) is an optimal solution for Problem P. ■ Proof: This proof is divided into three steps: (1) we prove that (x∞(t), u∞(t)) is a feasible solution to Problem P; (2) we show that lim N→∞IN(¯x∗ N, ¯u∗ N) = I(x∞(t), u∞(t)) ; (20) (3) we prove that (x∞(t), u∞(t)) is an optimal solution of Problem P, i.e. I(x∞(t), u∞(t)) = I(x∗(t), u∗(t)) . Step (1). First, we show that (x∞(t), u∞(t)) satisfies the dynamic constraint of Problem P: ˙x∞(t) −f(x∞(t), u∞(t)) = 0 . We show this by contradiction. Assume that the above equality does not hold. Then there exists t′, such that || ˙x∞(t′) −f(x∞(t′), u∞(t′))|| > 0 . (21) Since the nodes {tk}N k=0, tk = k N are dense in [0, 1], there exists a sequence of indices {kN}∞ N=0 such that lim N→∞tkN = t′. 9 Then, from continuity of ˙x∞(t), x∞(t) and u∞(t), the left hand side of Equation (21) satisfies || ˙x∞(t′) −f(x∞(t′), u∞(t′))|| = lim N→∞|| ˙x∗ N(tkN ) −f(x∗ N(tkN ), u∗ N(tkN ))||. However, the dynamic constraint in Problem PN is || ˙x∗ N(tkN ) −f(x∗ N(tkN ), u∗ N(tkN ))|| ≤δN P , which implies that lim N→∞|| ˙x∗ N(tkN ) −f(x∗ N(tkN ), u∗ N(tkN ))|| = 0. The above result contradicts Equation (21), thus proving that (x∞(t), u∞(t)) satisfies the dynamic constraint in Equation (6). The equality and inequality constraints in (12) and (13) follow easily by an identical argument. Step (2). To prove that Equation (20) is satisfied we need to show the following: lim N→∞w N X j=0 F(x∗ N(tj), u∗ N(tj)) = Z 1 0 F(x∞(t), u∞(t))dt , (22) and lim N→∞E(x∗ N(0), x∗ N(tN)) = E(x∞(0), x∞(1)) . (23) Using Lemma 3, together with the Lipschitz assumption on F (see Assumption 1) and the continuity of x∞(t) and u∞(t), we get lim N→∞w N X j=0 F(x∞(tj), u∞(tj)) = Z 1 0 F(x∞(t), u∞(t))dt . Finally, applying the convergence assumption given by Equation (19), the result in Equation (22) follows. Similarly, using the Lipschitz assumption on E, one can show that Equation (23) holds, thus completing the proof of Step (2). Step (3). Finally, it remains to show that I(x∞(t), u∞(t)) = I(x∗(t), u∗(t)) . First, recall that x∗(t) and u∗(t) are optimal solutions of Problem P, while ¯x∗ N and ¯u∗ N are optimal solutions of Problem PN. Let ˜¯xk,N = x∗(tk), ˜¯uk,N = u∗(tk), ∀k ∈{1, . . . , N} and ˜¯xN = [˜¯x0,N , . . . , ˜¯xN,N], ˜¯uN = [˜¯u0,N , . . . , ˜¯uN,N]. Following an argument similar to the one in the proof of Theorem 1, one can show that there exists N1 such that for any N ≥N1 the pair (˜¯xN, ˜¯uN) is a feasible solution of Problem PN. Moreover, an argument similar to the one in the proof of Step (2) yields lim N→∞IN(˜¯xN, ˜¯uN) = I(x∗(t), u∗(t)) . (24) Then we have I(x∗(t), u∗(t)) ≤I(x∞(t), u∞(t)) = lim N→∞IN(¯x∗ N, ¯u∗ N) ≤lim N→∞IN(˜¯xN, ˜¯uN) . (25) The last inequality, combined with (24), gives I(x∗(t), u∗(t)) = I(x∞(t), u∞(t)) , which completes the proof of Theorem 2. ♠ 10 6 Costate estimation for Problem P using Bernstein approxima- tion 6.1 First order optimality conditions of Problem P We start by deriving the first order necessary conditions for Problem P. Let λ(t) : [0, 1] →Rnx be the costate trajectory, and let µ(t) : [0, 1] →Rnh and ν ∈Rne be the multipliers. By defining the Lagrangian of the Hamiltonian (also known as the D-form [36]) as L(x(t), u(t), λ(t), µ(t)) = H(x(t), u(t), λ(t)) + µ⊤(t)h(x(t), u(t)) , where the Hamiltonian H is given by H(x(t), u(t), λ(t)) = F(x(t), u(t)) + λ⊤(t)f(x(t), u(t)) , the dual of Problem P can be formulated as follows [36]. Problem 3 (Problem Pλ) Determine x(t), u(t), λ(t), µ(t) and ν that for all t ∈[0, 1] satisfy Equations (6), (7), (8) and µ⊤(t)h(x(t), u(t)) = 0 , µ(t) ≥0 , (26) ˙λ⊤(t) + Lx(x(t), u(t), λ(t), µ(t)) = ˙λ⊤(t) + Fx(x(t), u(t)) + λ⊤(t)fx(x(t), u(t)) + µ⊤(t)hx(x(t), u(t)) = 0 , (27) λ⊤(0) = −ν⊤ex(0)(x(0), x(1)) −Ex(0)(x(0), x(1)) , (28) λ⊤(1) = ν⊤ex(1)(x(0), x(1)) + Ex(1)(x(0), x(1)) , (29) Lu(x(t), u(t), λ(t), µ(t)) = λ⊤(t)fu(x(t), u(t)) + Fu(x(t), u(t)) + µ⊤(t)hu(x(t), u(t)) = 0 . (30) ▽ In the above problem, subscripts are used to denote partial derivatives, e.g. Fx(x, u) = ∂ ∂xF(x, u). The following assumptions are imposed onto Problem Pλ. Assumption 3 E, F, f, e and h are continuously differentiable with respect to their arguments, and their gradients are Lipschitz continuous over the domain. Assumption 4 Solutions x∗(t), u∗(t), λ∗(t), µ∗(t) and ν∗of Problem Pλ exist and satisfy x∗(t) ∈C1 nx, u∗(t) ∈C0 nu, λ∗(t) ∈C1 nx and µ∗(t) ∈C0 nh in [0, 1]. Remark 3 Notice that Problem Pλ implicitly assumes the absence of pure state constraints in Problem P. If the inequality constraint in Equation (8) is independent of u(t), then the costate λ(t) must also satisfy the following jump condition [36]: λ(t− e ) = λ(t+ e ) + h⊤ x(te)η , where te is the entry or exit time into a constrained arc in which the inequality constraint is active, t− e and t+ e denote the left-hand side and right-hand side limits of the trajectory, respectively, and η is a constant covector. For simplicity, the theoretical results that will be presented in Section 7 do not consider the jump conditions above, i.e., the inequality constraints are dependent on u(t). Nevertheless, numerical examples will be presented in Section 8 showing the applicability of the discretization method to pure state-constrained problems. 6.2 KKT conditions of Problem PN Let us introduce the following Nth order Bernstein polynomials: λN(t) = N X j=0 ¯λj,Nbj,N(t) , µN(t) = N X j=0 ¯µj,Nbj,N(t), (31) 11 with λN : [0, 1] →Rnx, µN : [0, 1] →Rnh, ¯λj,N ∈Rnx and ¯µj,N ∈Rnh, and the vector ¯ν ∈Rne. Finally, let ¯λN ∈Rnx×(N+1) and ¯µN ∈Rnu×(N+1) be defined as ¯λN = [¯λ0,N , . . . , ¯λN,N], ¯µN = [¯µ0,N , . . . , ¯µN,N]. With the above notation, the Lagrangian for problem PN can be written as LN = E(xN(0), xN(tN)) + w N X j=0 F(xN(tj), uN(tj)) + N X j=0 λ⊤ N(tj)(−˙xN(tj) + f(xN(tj), uN(tj))) + N X j=0 µ⊤ N(tj)h(xN(tj), uN(tj)) + ¯ν⊤e(xN(0), xN(tN)) . Then the dual of Problem PN can be stated as follows: Problem 4 (Problem PNλ) Determine ¯xN, ¯uN, ¯λN, ¯µN and ¯ν that satisfy the primal feasibility condi- tions, namely Equations (11), (12) and (13), the complementary slackness and dual feasibility conditions µ⊤ N(tk)h(xN(tk), uN(tk)) ≤N −1δN D , µN(tk) ≥−N −1δN D1 , ∀k = 0, . . . , N , (32) and the stationarity conditions ∂LN ∂¯xk,N ≤δN D , ∂LN ∂¯uk,N ≤δN D , ∀k = 0, . . . , N, (33) where δN D is a small positive number that depends on N and satisfies limN→∞δN D = 0. ▽ At this point one might expect results similar to the ones in Section 5, i.e. feasibility (Theorem 1) and consistency (Theorem 2). Nevertheless, similarly to most results on costate estimation [37–39], this is not the case, and additional conditions must be added to Equations (11)-(13), (32) and (33) in order to obtain consistent approximations of the solutions of Problem Pλ. These conditions, often referred to as closure conditions in the literature, are given as follows: λ⊤ N(0) w + ¯ν⊤ex(0)(xN(0), xN(tN)) + Ex(0)(xN(0), xN(tN)) ≤δN D (34) λ⊤ N(tN) w −¯ν⊤ex(1)(xN(0), xN(tN)) −Ex(1)(xN(0), xN(tN)) ≤δN D . (35) In other words, the closure conditions are constraints that must be added to Problem PNλ so that the solution of this problem approximates the solution of Problem Pλ. We notice that the conditions given above are discrete approximations of the conditions given by Equations (28) and (29). With this setup, we define the following problem: Problem 5 (Problem P clos Nλ ) Determine ¯xN, ¯uN, ¯λN, ¯µN and ¯ν that satisfy the primal feasibility con- ditions, namely Equations (11), (12) and (13), the complementary slackness and dual feasibility conditions (32), the stationarity conditions (33), and the closure conditions (34) and (35). ▽ The solution of Problem P clos Nλ presents a set of optimal Bernstein coefficients ¯x∗ N, ¯u∗ N, ¯λ∗ N, ¯µ∗ N (which determine the Bernstein polynomials x∗ N(t), u∗ N(t), λ∗ N(t) and µ∗ N(t)) and a vector ¯ν∗. 12 7 Feasibility and consistency of Problem P clos Nλ The objective of this section is to investigate the ability of the solutions of Problem P clos Nλ to approximate the solutions of Problem Pλ. In what follows, we first show the existence of a solution for Problem P clos Nλ that satisfies also the closure conditions (feasibility). Second, we investigate the convergence properties of this solution as N →∞(consistency). Third, by combining these two results, we finally formulate the covector mapping theorem for Bernstein approximations, which provides a bijective map (covector mapping) between the solution of Problem P clos Nλ and the solution of Problem Pλ. The main results of this section are summarized in the three theorems below. Theorem 3 (Feasibility) Let δN D = CD max{δN P , Wλ′(N −1 2 ) , Wλ(N −1 2 ) , Wµ(N −1 2 )} , (36) where CD is a positive constant independent of N, δN P was defined in Equation (15), and Wλ′(·), Wλ(·), and Wµ(·) are the moduli of continuity of ˙λ(t), λ(t) and µ(t), respectively. Then Problem P clos Nλ is feasible for arbitrary order of approximation N ∈Z+. ■ Proof: Similar to the proof of Theorem 1, this proof follows by constructing a solution for Problem P clos Nλ , with δN D given by Equation (36). To this end, let x(t), u(t), λ(t), µ(t) and ν be a solution of Problem Pλ, which exists by Assumption 4, and define ¯xj,N = x(tj) , ¯uj,N = u(tj) , (37) ¯λj,N = wλ(tj) , ¯µj,N = wµ(tj) , ¯ν = ν , (38) for all j = 0, . . . , N, tj = j N , w = 1 N+1, with corresponding Bernstein polynomials given by xN(t) = N X j=0 ¯xj,Nbj,N(t) , uN(t) = N X j=0 ¯uj,Nbj,N(t) , λN(t) = N X j=0 ¯λj,Nbj,N(t) , µN(t) = N X j=0 ¯µj,Nbj,N(t). (39) The remainder of this proof shows that xN(t), uN(t), λN(t) , µN(t) and ¯ν given above satisfy Equations (32)-(35) (we notice that the satisfaction of Equations (11)-(13) has already been addressed in the proof of Theorem 1). We start by defining the Bernstein coefficients ˜¯λj,N and ˜¯µj,N as follows ˜¯λj,N = ¯λj,N w , ˜¯µj,N = ¯µj,N w , (40) with corresponding Bernstein polynomials given by ˜λN(t) = N X j=0 ˜¯λj,Nbj,N(t) , ˜µN(t) = N X j=0 ˜¯µj,Nbj,N(t). Notice that ˜λN(t) = λN(t) w , ˜µN(t) = µN(t) w . (41) Combining Equations (37), (38) and (40) and using Assumption 4 and Lemma 1, we get ||xN(t) −x(t)|| ≤CxWx(N −1 2 ) , ||uN(t) −u(t)|| ≤CuWu(N −1 2 ) , || ˙xN(t) −˙x(t)|| ≤Cx′Wx′(N −1 2 ) , ||˜λN(t) −λ(t)|| ≤CλWλ(N −1 2 ) , ||˜µN(t) −µ(t)|| ≤CµWµ(N −1 2 ) , || ˙˜λN(t) −˙λ(t)|| ≤Cλ′Wλ′(N −1 2 ) , (42) where Cλ < 5nx 4 , Cµ < 5nh 4 , Cλ′ < 9nx 4 and Wλ(·), Wµ(·) and Wµ(·) are the moduli of continuity of λ(t), µ(t) and ˙λ(t), respectively. 13 Now we show that the bound in Equation (32) is satisfied. Using Equation (41), and adding and sub- tracting w(µ⊤(tk)h(xN(tk), uN(tk)) + µ⊤(tk)h(x(tk), u(tk))), we get ∥µ⊤ N(tk)h(xN(tk), uN(tk))∥= ∥w ˜µ⊤ N(tk)h(xN(tk), uN(tk))|| ≤w∥(˜µ⊤ N(tk) −µ⊤(tk))h(xN(tk), uN(tk))∥+ w∥µ⊤(tk)h(x(tk), u(tk))∥ + w∥µ⊤(tk)(h(xN(tk), uN(tk)) −h(x(tk), u(tk))∥ Using Equation (26), the above inequality reduces to ∥µ⊤ N(tk)h(xN(tk), uN(tk))∥≤w∥(˜µ⊤ N(tk) −µ⊤(tk))h(xN(tk), uN(tk))∥ + w∥µ⊤(tk)(h(xN(tk), uN(tk)) −h(x(tk), u(tk))∥ ≤w||h(xN(tk), uN(tk))||CµWµ(N −1 2 ) + w∥µ⊤(tk)∥Lh(CxWx(N −1 2 ) + CuWu(N −1 2 )), where we used the bounds in Equation (42) together with the Lipschitz assumption on h (see Assumptions 3). Finally, from using Assumptions 3 and 4, it follows that h and µ are bounded on [0, 1] with bounds hmax and µmax, respectively. Therefore, we get ∥µ⊤ N(tk)h(xN(tk), uN(tk))∥≤w[hmaxCµWµ(N −1 2 ) + µmaxLh(CxWx(N −1 2 ) + CuWu(N −1 2 ))], which implies that the bound in Equation (32) is satisfied with δN D given by Equation (36) and CD > hmaxCµ + µmaxLh(Cx + Cu). Similarly, µN(tk) = w ˜µN(tk) ≥wµ(tk) −w∥µ(tk) −˜µN(tk)∥1 ≥−N −1CµWµ(N −1 2 )1 , which proves that Equation (32) holds. Now consider the left equation in (33). For k = 0 we have ∂LN ∂¯x0,N = Ex(0)(xN(0), xN(tN)) + w N X j=0 Fx(xN(tj), uN(tj))b0,N(tj) + N X j=0 λ⊤ N(tj) h −˙b0,N(tj) + fx(xN(tj), uN(tj))b0,N(tj) i + N X j=0 µ⊤ N(tj)hx(xN(tj), uN(tj))b0,N(tj) + ¯ν⊤ex(0)(xN(0), xN(tN)) . (43) Substituting w˜λN(tj) = λN(tj) and w ˜µN(tj) = µN(tj), the equation above can be written as ∂LN ∂¯x0,N = Ex(0)(xN(0), xN(tN)) + w N X j=0 Fx(xN(tj), uN(tj))b0,N(tj) + w N X j=0 ˜λ⊤ N(tj) h −˙b0,N(tj) + fx(xN(tj), uN(tj))b0,N(tj) i +w N X j=0 ˜µ⊤ N(tj)hx(xN(tj), uN(tj))b0,N(tj) + ¯ν⊤ex(0)(xN(0), xN(tN)) . (44) 14 Notice that the following inequalities are satisfied: w N X j=0 Fx(xN(tj), uN(tj))b0,N(tj) − Z 1 0 Fx(x(t), u(t))b0,N(t)dt ≤¯C1(N −1 2 + Wx(N −1 2 ) + Wu(N −1 2 ))(45a) w N X j=0 ˜λ⊤ N(tj)˙b0,N(tj) − Z 1 0 λ⊤(t)˙b0,N(t)dt ≤¯C2(N −1 2 + Wλ(N −1 2 ))(45b) w N X j=0 ˜λ⊤ N(tj)fx(xN(tj), uN(tj))b0,N(tj) − Z 1 0 λ⊤(t)fx(x(t), u(t))b0,N(t)dt ≤ (45c) ¯C3(N −1 2 + Wλ(N −1 2 ) + Wx(N −1 2 ) + Wu(N −1 2 )) w N X j=0 ˜µ⊤ N(tj)hx(xN(tj), uN(tj))b0,N(tj) − Z 1 0 µ⊤(t)hx(x(t), u(t))b0,N(t)dt ≤ (45d) ¯C4(N −1 2 + Wµ(N −1 2 ) + Wx(N −1 2 ) + Wu(N −1 2 )) , for some positive ¯C1, ¯C2, ¯C3 and ¯C4 independent of N. A proof of the above inequalities is given in Appendix A. Then the combination of Equations (44) and (45) yields the following inequality ∂LN ∂¯x0,N ≤ Ex(0)(x(0), x(1)) + Z 1 0 Fx(x(t), u(t))b0,N(t)dt − Z 1 0 λ⊤(t)˙b0,N(t)dt + Z 1 0 λ⊤(t)fx(x(t), u(t))b0,N(t)dt + Z 1 0 µ⊤(t)hx(x(t), u(t))b0,N(t)dt + ¯ν⊤ex(0)(x(0), x(1)) + ¯C max n N −1 2 , Wx(N −1 2 ), Wu(N −1 2 ), Wλ(N −1 2 ), Wµ(N −1 2 ) o , (46) with ¯C ≥4 max{ ¯C1, ¯C2, ¯C3, ¯C4}. Using integration by parts, we have R 1 0 λ⊤(t)˙b0,N(t)dt = − R 1 0 ˙λ⊤(t)b0,N(t)dt+ [λ⊤(t)b0,N(t)]1 0. Thus, since b0,N(0) = 1, bN,N(0) = 0, the above inequality becomes ∂LN ∂¯x0,N ≤ Ex(0)(x(0), x(1)) + λ⊤(0) + ν⊤ex(0)(x(0), x(1)) + Z 1 0 h ˙λ⊤(t) + Fx(x(t), u(t)) + λ⊤(t)fx(x(t), u(t)) + µ⊤(t)hx(x(t), u(t)) i b0,N(t)dt + ¯C max n N −1 2 , Wx(N −1 2 ), Wu(N −1 2 ), Wλ(N −1 2 ), Wµ(N −1 2 ) o . (47) Finally, using Equations (27) and (28), the above inequality reduces to the left condition in Equation (33) for k = 0, with δN D given by Equation (36) and CD ≥¯C. The same condition for k = 1, . . . , N can be shown to be satisfied using an identical argument. The stationarity condition in the right of Equation (33) can also be verified similarly, and the computations are thus omitted. To show that the closure condition (34) is satisfied we use the definitions in Equations (37) and (38) together with the end point values property of Bernstein polynomials, Property 1 in Section 2, which gives λ⊤ N(0) w + ¯ν⊤ex(0)(xN(0), xN(tN)) + Ex(0)(xN(0), xN(tN)) ≤ λ⊤(0) + ν⊤ex(0)(x(0), x(1)) + Ex(0)(x(0), x(1)) = 0 , where the last equality follows from Equation (28). An identical argument can be used to show that the closure condition (35) holds, thus completing the proof of Theorem 3. ♠ 15 Corollary 2 If solutions x∗(t), u∗(t), λ∗(t), µ∗(t) and ν∗of Problem Pλ exist and satisfy ˙x∗(t) ∈C2 nx, u∗(t) ∈C2 nu, ˙λ∗(t) ∈C2 nx, and µ∗(t) ∈C2 nh in [0, 1], then Theorem 3 holds with δN P = CP N −1 and δN D = CDN −1 , where CP and CD are positive constants independent of N. ■ Proof: The proof of Corollary 2 follows easily by applying Lemma 2 to the proof of Theorem 3. ♠ Remark 4 Similarly to Remark 2, for arbitrarily small scalar ǫD > 0 there exists N1 such that for all N ≥N1, we have δN D ≤ǫD; i.e., the relaxation bound in Problem P clos Nλ can be made arbitrarily small by choosing sufficiently large N. Theorem 4 (Consistency) Let {(¯x∗ N, ¯u∗ u, ¯λ∗ N, ¯µ∗ N, ¯ν∗)}∞ N=N1 be a sequence of solutions of Problem P clos Nλ . Consider the sequence of transformed solutions {(¯x∗ N, ¯u∗ N, ˜¯λ ∗ N, ˜¯µ ∗ N, ¯ν∗)}∞ N=N1, with ˜¯λ ∗ j,N = ¯λ∗ j,N w , ˜¯µ ∗ j,N = ¯λ∗ j,N w , and the corresponding polynomial approximation {(x∗ N(t), u∗ N(t), ˜λ∗ N(t), ˜µ∗ N(t), ¯ν∗)}∞ N=N1. Assume that the latter has a uniform accumulation point, i.e. lim N→∞(x∗ N(t), u∗ N(t), ˜λ∗ N(t), ˜µ∗ N(t), ¯ν∗) = (x∞(t), u∞(t), ˜λ∞(t), ˜µ∞(t), ¯ν∞) , ∀t ∈[0, 1], and assume ˙x∞(t), u∞(t), ˙˜λ∞(t) and ˜µ∞(t) are continuous on [0, 1]. Then, (x∞(t), u∞(t), ˜λ∞(t), ˜µ∞(t), ¯ν∞) is a solution of Problem Pλ. ■ Proof: The objective is to show that x∞(t), u∞(t), ˜λ∞(t), ˜µ∞(t) and ¯ν∞satisfy Equations (6)-(8) and (26)-(30). The satisfaction of Equations (6)-(8) has been demonstrated in the proof of Theorem 2. We start by showing Equation (26), and we do so using a proof by contradiction. Assume that x∞(t), u∞(t), ˜µ∞(t) do not satisfy Equation (26). Then there exists t′ ∈[0, 1], such that ∥˜µ∞⊤(t′)h(x∞(t′), u∞(t′))∥> 0. (48) Since the nodes {tk}N k=0 are dense in [0, 1], there exists a sequence of indices {kN}∞ N=0 such that lim N→∞tkN = t′, which implies lim N→∞∥˜µ∞(t′) −˜µ∞(tkN )∥= 0 , lim N→∞∥x∞(t′) −x∞(tkN )∥= 0 , lim N→∞∥u∞(t′) −u∞(tkN )∥= 0 . Then we have ||˜µ∞⊤(t′)h(x∞(t′), u∞(t′))|| ≤lim N→∞||(˜µ∗⊤ N (t′) −˜µ∗⊤ N (tkN ))h(x∗ N(t′), u∗ N(t′))|| + lim N→∞||˜µ∗⊤ N (tkN )(h(x∗ N(t′), u∗ N(t′)) −h(x∗ N(tkN ), u∗ N(tkN )))|| + lim N→∞||˜µ∗⊤ N (tkN )h(x∗ N(tkN ), u∗ N(tkN ))|| = lim N→∞ 1 w||µ∗⊤ N (tkN )h(x∗ N(tkN ), u∗ N(tkN ))|| = 0 , where we used Equation (32). This contradicts Equation (48). Similarly, we can show that ˜µ∞(t) ≥0, thus proving that x∞(t), u∞(t) and ˜µ∞(t) satisfy Equation (26). 16 Furthermore, we notice that if x∞(t), u∞(t), ˜λ∞(t), ˜µ∞(t) and ¯ν∞satisfy Equations (33)-(35), then the following holds for all k = 0, . . . , N: ˜λ∞⊤(0) + ¯ν∞⊤ex(0)(x∞(0), x∞(1)) + Ex(0)(x∞(0), x∞(1)) = 0 , λ∞⊤(1) −¯ν∞⊤ex(1)(x∞(0), x∞(1)) −Ex(1)(x∞(0), x∞(1)) = 0 , Z 1 0  ˙˜λ ∞⊤ (t) + Fx(x∞(t), u∞(t)) + ˜λ∞⊤(t)fx(x∞(t), u∞(t)) + ˜µ∞⊤(t)hx(x∞(t), u∞(t))  bk,N(t)dt = 0 , Z 1 0 h Fu(x∞(t), u∞(t)) + ˜λ∞⊤(t)fu(x∞(t), u∞(t)) + ˜µ∞⊤(t)hu(x∞(t), u∞(t)) i bk,N(t)dt = 0 . Since {bk,N(t)}N k=0 is a linearly independent basis set, the last two equations above imply ˙˜λ ∞⊤ (t) + Fx(x∞(t), u∞(t)) + ˜λ∞⊤(t)fx(x∞(t), u∞(t)) + ˜µ∞⊤(t)hx(x∞(t), u∞(t)) = 0 , Fu(x∞(t), u∞(t)) + ˜λ∞⊤(t)fu(x∞(t), u∞(t)) + ˜µ∞⊤(t)hu(x∞(t), u∞(t)) = 0 , for all t ∈[0, 1]. This proves that x∞(t), u∞(t), ˜λ∞(t), ˜µ∞(t) and ¯ν∞(t) satisfy Equations (27)-(30). ♠ Theorem 5 (Covector Mapping Theorem) Under the same assumptions of Theorems 3 and 4, when N →∞, the covector mapping x∗ N(t) 7→x∗(t) , u∗ N(t) 7→u∗(t) , λ∗ N w 7→λ∗(t) , µ∗ N(t) w 7→µ∗(t) , ¯ν∗7→ν∗ is a bijective mapping between the solution of Problem P clos Nλ and the solution of Problem Pλ. ■ Proof: The above result follows directly from Theorems 3 and 4. ♠ 8 Numerical examples This section presents four numerical examples aimed at validating the convergence properties of the proposed method based on Bernstein approximation. In the first example we consider a non-linear one-dimensional optimal control problem with smooth state, input and costate. In the second example we investigate the applicability and efficacy of our approach, when solving a bang-bang optimal control problem. In the third and fourth examples we demonstrate the benefits of the proposed method, when solving trajectory generation problems for single-vehicle and multi-vehicle missions, respectively. The results are obtained using MATLAB’s built in fmincon function. 8.1 Example 1 Consider the following optimal control problem taken from [37]: Example 1 Determine y : [0, 5] →R and u : [0, 5] →R that minimize I(y(t), u(t)) = 1 2 Z 5 0 (y(t) + u2(t))dt , 17 0 10 20 30 40 50 60 -10 -8 -6 -4 -2 0 error Figure 2: Error in Bernstein approximation method for Example 1. 0 10 20 30 40 50 60 -10 -8 -6 -4 -2 0 error Figure 3: Error in Radau Pseudospectral Method (RPM) for Example 1 (taken from [37]). subject to ˙y(t) = 2y(t) + 2u(t) p y(t) , ∀t ∈[0, 5], y(0) = 2 , y(5) = 1 . The above example was solved using the Bernstein approximation method for orders N = 5, 10, . . ., 55, 60. Similar to [37], we define the following errors ey = max k=0,...,N log10 |y∗ N(tk) −y∗(tk))| , eu = max k=0,...,N log10 |u∗ N(tk) −u∗(tk))| , eλ = max k=0,...,N log10 |λ∗ N(tk) −λ∗(tk))| , where y∗ N(tk), u∗ N(tk) and λ∗ N(tk) are the state, input and costate evaluated at the equidistant time nodes tk = 5k/N, k = 0, . . . , N, while y∗(tk), u∗(tk) and λ∗(tk) are the exact solutions. Figure 2 illustrates the convergence of the above errors to zero as the order of approximation increases. As expected, the Bernstein approximation method is outperformed by pseudospectral methods in terms of converge rate. For example, Figure 3 presents the results obtained using the Radau Pseudospectral Method (RPM) presented in [37]. The data depicted in Figure 3 are reported from [37], where the authors solved Example 1 using RPM with the software OptimalPrime and the NLP solver SNOPT. Finally, Figure 4 depicts the state, input, and costate obtained using the Bernstein approximation method for order N = 40 alongside the exact solution. 8.2 Example 2 - Bang-bang control This example investigates the efficacy of the proposed method when dealing with a bang-bang optimal control problem. Consider the following problem: 18 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 time [s] -4 -3 -2 -1 0 1 2 solution Figure 4: Solution to Example 1 using the Bernstein approximation method for order N = 40. 0 0.5 1 1.5 2 time [s] -0.5 0 0.5 1 1.5 2 2.5 control input N =10 Bern exact 0 0.5 1 1.5 2 time [s] -0.5 0 0.5 1 1.5 2 2.5 control input N =15 Bern exact 0 0.5 1 1.5 2 time [s] -0.5 0 0.5 1 1.5 2 2.5 control input N =30 Bern exact 0 0.5 1 1.5 2 time [s] -0.5 0 0.5 1 1.5 2 2.5 control input N =55 Bern exact Figure 5: Solution to Example 2 using the Bernstein approximation method. Example 2 Determine y : [0, 2] →R and u : [0, 2] →R that minimize I(y(t), u(t)) = Z 2 0 (3u(t) −2y(t))dt , (49) subject to ˙y(t) = y(t) + u(t) , ∀t ∈[0, 2], (50) y(0) = 4 , (51) y(2) = 39.392 , (52) 0 ≤u(t) ≤2 ∀t ∈[0, 2] . (53) The optimal control for the above example is: u∗(t) = ( 2 0 ≤t ≤1.096 0 1.096 ≤t ≤2. Example 2 is solved using the Bernstein approximation method for orders of approximation N = 10, 15, 30, 55. The results are illustrated in Figure 5. It can be noted that the solutions resulting from the Bernstein approx- imation method converge, albeit slowly, to the exact solution, and behave nicely despite the discontinuity. The solutions have no jumps in the neighborhood of the discontinuities (the Gibbs phenomenon does not 19 occur for function approximation via Bernstein polynomials [24]), and the exact value of the discontinuity (t = 1.096s) is detected with reasonable accuracy even for low orders of approximation. The reader is referred to [20], where the discretization method presented in [40] is implemented to solve Example 2, emphasizing the inefficacy of pseudospectral methods when approximating bang-bang solutions. Additional references dis- cussing the performance of pseudospectral methods when dealing with bang-bang optimal control problems can be found in [19,21,22,41]. 8.3 Example 3 - Trajectory generation for a single vehicle In this section, the 2D trajectory generation problem for a single vehicle is considered. The vehicle, modelled as a single integrator, is required to navigate from the initial position x0 = [−500, −900]m to the final destination xf = [1500, −600]m, while minimizing the time of arrival. The algorithm must ensure a minimum separation of E = 50m with three obstacles positioned at po1 = [0 −800]⊤m, po2 = [450 −750]⊤m and po3 = [850 −730]⊤m. Finally, the norm of the input must remain within minimum and maximum saturation limits umin = 15m/s and umax = 32m/s. This problem can be formally stated as follows: Example 3 Determine x(t), u(t) and tf that minimize I(x(t), u(t)) = Z tf 0 dt , subject to ˙x(t) = u(t) , ∀t ∈[0, tf], x(0) = x0 , x(tf) = xf, ||x(t) −poi|| ≥E , ∀t ∈[0, tf], i = 1, 2, 3 , umin ≤||u(t)|| ≤umax , ∀t ∈[0, tf]. The discretization method proposed in this paper is compared to the Legendre pseudospectral method [42]. The results are enclosed in Figure 6. The top-left, top-center and bottom-left figures show the trajectories, obtained using the pseudospectral method with orders of approximation 5, 20 and 100, respectively. As discussed in the introduction, the pseudospectral method enforces the constraints only at the discretization nodes and not in between them. By increasing the number of nodes, the distance between the entire trajectory and the obstacles increases towards the desired value E = 50m. However, as demonstrated by the top- right figure, which depicts the distance between the trajectories and the obstacles for the three order of approximations indicated above, the minimum separation constraint is never satisfied. On the other hand, the bottom-center figure shows that with the proposed method, even by choosing a small number of nodes (N=5 in this example), the collision avoidance constraint can be computed for the entire curve using, for example, the minimum distance algorithm introduced in Section 2, Property 4. Thus, collision avoidance is satisfied for the entire trajectory. The bottom-right figure supports this claim by showing that the distance between the trajectory and the obstacles is always greater than the required value. 8.4 Example 4 - Trajectory generation for multi-vehicle missions As evidenced by the previous example, the possibility of choosing low order of approximations while guaran- teeing constraint satisfaction is the strength of the proposed approach, which prioritizes safety and feasibility of the trajectories over optimality. The advantage of the method becomes more evident in multi-vehicle mis- sions, where the trajectories assigned to the vehicles must be (spatially or temporally)1 separated. Consider for example a mission scenario in which n vehicles, starting from their initial positions, have to follow spa- tially separated trajectories to reach predefined final destinations. By adopting the pseudospectral method described above, spatial separation would have to be enforced by imposing separation constraints between 1Spatial separation is guaranteed if the minimum distance between any two points on two paths is greater than or equal to a minimum spatial clearance (the paths never intersect); temporal separation is achieved if for any time t the minimum distance between two vehicles is greater than or equal to the minimum spatial clearance. 20 X [m] -500 0 500 1000 1500 Y [m] -1500 -1000 -500 0 PS method - 5th order X [m] -500 0 500 1000 1500 Y [m] -1500 -1000 -500 0 PS method - 20th order X [m] -500 0 500 1000 1500 Y [m] -1500 -1000 -500 0 PS method - 100th order X [m] -500 0 500 1000 1500 Y [m] -1500 -1000 -500 0 Bernestein method - 5th order time [s] 0 10 20 30 40 50 60 Y [m] 20 40 60 80 100 120 140 PS method - Distance N=5 N=20 N=100 time [s] 0 10 20 30 40 50 60 Y [m] 20 40 60 80 100 120 140 Bernstein method - Distance Figure 6: Legendre pseudospectral vs Bernstein approximation method: collision avoidance with multiple obstacles. every node of every trajectory. Thus, the problem would have n 2  N 2 separation constraints, where N is the number of nodes, and  n 2  is the binomial coefficient. An increased number of nodes (dictated, per- haps, by reasons similar to the ones discussed in the previous example) would increase the complexity in the search for the optimal solution, making the pseudospectral approach practically infeasible for these types of applications. On the other hand, with the approach of this paper the constraint satisfaction is achieved independently of the number of nodes. This is discussed in the next simulation scenario. Figure 7 illustrates the results of a multi-vehicle mission in which n = 11 vehicles, starting from their initial positions, have to reach a ‘>’ shaped formation while minimizing the time of arrival. The Bernstein approximation method is employed with the order of approximation N = 8. The dynamics of the ith vehicle are governed by the following differential equations      ˙x1,i(t) = Vi(t) cos(x3,i(t)) ˙x2,i(t) = Vi(t) sin(x3,i(t)) ˙x3,i(t) = ωi(t) , (54) with input ui(t) = [Vi(t) , ωi(t)]⊤and state xi(t) = [x1,i(t) , x2,i(t) , x3,i(t)]⊤. The input constraints are: V 2 min ≤V 2 i (t) ≤V 2 max , (55) −ωmax ≤ωi(t) ≤ωmax , (56) with Vmin = 15m/s, Vmax = 32m/s and ωmax = 0.3rad/s. Finally, temporal separation constraints are imposed between each pair of trajectories, i.e. ||pi(t) −pj(t)|| ≥E , (57) ∀i, j = 1, . . . , 11, i ̸= j , ∀t ∈[0, tf], where E = 50m and pi(t) = [x1,i(t) , x2,i(t)]⊤. The constraints in Equations (55), (56) and (57) are computed by exploiting the properties of Bernstein polynomials. In particular, we used the minimum distance and the computation of extrema algorithms from [16,26]. Figures 8 and 9 show the time history of the speeds and angular rates, respectively, demonstrating that the input saturation constraints are satisfied for all times. 21 At last, the same simulation is repeated, but the temporal separation constraint given by Equation (57) is replaced by the (more stringent) spatial separation requirement ||pi(tk) −pj(tp)|| ≥E , (58) ∀i, j = 1, . . . , 11, i ̸= j ∀tk, tp ∈[0, tf]. Figure 10 depicts the 2D trajectories. Figures 11 and 12 illustrate the speed and angular rate commands, respectively. X [m] -500 0 500 1000 1500 2000 2500 Y [m] -500 0 500 1000 1500 2000 Figure 7: Multiple vehicles mission - temporal separation: 2D paths. time [s] 0 10 20 30 40 50 60 70 80 90 speed [m/s] 15 20 25 30 35 Figure 8: Multiple vehicles mission - temporal separation: speed profiles. time [s] 0 10 20 30 40 50 60 70 80 90 ang rate [rad/s] -0.1 -0.05 0 0.05 0.1 0.15 0.2 0.25 0.3 Figure 9: Multiple vehicles mission - temporal separation: angular rates. 9 Conclusions This paper proposed a numerical method to approximate nonlinear constrained optimal control problems by nonlinear programming (NLPs) using Bernstein polynomial approximation. A rigorous analysis is provided that shows convergence of the solution of the NLP to the solution of the continuous-time problem. A set of 22 X [m] -500 0 500 1000 1500 2000 2500 Y [m] -500 0 500 1000 1500 2000 Figure 10: Multiple vehicles mission - spatial separation: 2D paths. time [s] 0 10 20 30 40 50 60 70 80 90 speed [m/s] 15 20 25 30 35 Figure 11: Multiple vehicles mission - spatial separation: speed profiles. conditions are derived under which the Karush-Kuhn-Tucker multipliers of the NLP converge to the costates of the optimal control problem. This led to the formulation of the Covector Mapping Theorem for Bernstein approximation, which enables numerical computation of the costates. The theoretical findings are validated through several numerical examples, and the advantages and disadvantages of the proposed method are discussed. Appendix A Proof of Equation (45) Let us focus on Equation (45a). Adding and subtracting R 1 0 Fx(xN(t), uN(t))b0,N(t)dt, we have w N X j=0 Fx(xN(tj), uN(tj))b0,N(tj) − Z 1 0 Fx(x(t), u(t))b0,N(t)dt ≤ w N X j=0 Fx(xN(tj), uN(tj))b0,N(tj) − Z 1 0 Fx(xN(t), uN(t))b0,N(t)dt + Z 1 0 Fx(xN(t), uN(t))b0,N(t)dt − Z 1 0 Fx(x(t), u(t))b0,N(t)dt ≤ w N X j=0 Fx(xN(tj), uN(tj))b0,N(tj) − Z 1 0 Fx(xN(t), uN(t))b0,N(t)dt + Z 1 0 Fx(xN(t), uN(t))b0,N(t)dt − Z 1 0 Fx(x(t), u(t))b0,N(t)dt . (59) 23 time [s] 0 10 20 30 40 50 60 70 80 90 ang rate [rad/s] -0.2 -0.1 0 0.1 0.2 0.3 Figure 12: Multiple vehicles mission - spatial separation: angular rates. Using Lemma 3 and continuity of Fx(xN(t), uN(t)) and b0,N(t), the first term on the right hand side of the inequality above satisfies w N X j=0 Fx(xN(tj), uN(tj))b0,N(tj) − Z 1 0 Fx(xN(t), uN(t))b0,N(t)dt ≤CIWFxb0,N (N −1 2 ), where WFxb0,N (·) is used to denote the modulus of continuity of the product Fx(xN(t), uN(t))b0,N(t), with Fx(xN(t), uN(t)) being a bounded function due to its continuity over a bounded domain. Denote its bound as Fx,max. Notice that also b0,N(t) is bounded, as maxt∈[0,1] b0,N(t) ≤1. Then using the properties of the modulus of continuity, we get w N X j=0 Fx(xN(tj), uN(tj))b0,N(tj) − Z 1 0 Fx(xN(t), uN(t))b0,N(t)dt ≤CIFx,maxWb0,N (N −1 2 ) + CIWFx(N −1 2 ) ≤CIFx,maxN −1 2 + CIWFx(N −1 2 ), (60) where WFx(·) is the modulus of continuity of Fx, and CI is a positive constant independent of N. Furthermore, we have Z 1 0 Fx(xN(t), uN(t))b0,N(t)dt − Z 1 0 Fx(x(t), u(t))b0,N(t)dt ≤ Z 1 0 ∥Fx(xN(t), uN(t))b0,N(t) −Fx(x(t), u(t))b0,N(t)∥dt ≤LFx(CxWx(N −1 2 ) + CuWu(N −1 2 )) , (61) where LFx is the Lipschitz constant of Fx, Cx < 5nx/4, Cu < 5nu/4, and Wx(·) and Wu(·) are the moduli of continuity of x and u, respectively. Combining Equations (60) and (61) with Equation (59), yields w N X j=0 Fx(xN(tj), uN(tj))b0,N(tj) − Z 1 0 Fx(x(t), u(t))b0,N(t)dt ≤CIFx,maxN −1 2 + CIWFx(N −1 2 ) + LFx(CxWx(N −1 2 ) + CuWu(N −1 2 )) , (62) which proves the bound in Equation (45a). The bounds in Equations (45b)-(45d) follow easily using an identical argument. References [1] A. V. Rao, “A survey of numerical methods for optimal control,” Advances in the Astronautical Sciences, vol. 135, no. 1, pp. 497–528, 2009. [2] J. T. Betts, Practical methods for optimal control and estimation using nonlinear programming. SIAM, 2010. [3] ——, “Survey of numerical methods for trajectory optimization,” Journal of guidance, control, and dynamics, vol. 21, no. 2, pp. 193–207, 1998. [4] B. A. Conway, “A survey of methods available for the numerical optimization of continuous dynamic systems,” Journal of Optimization Theory and Applications, vol. 152, no. 2, pp. 271–306, 2012. 24 [5] E. Polak, “Optimization: Algorithms and consistent approximations,” 1997. [6] A. Schwartz and E. Polak, “Consistent approximations for optimal control problems based on runge–kutta integration,” SIAM Journal on Control and Optimization, vol. 34, no. 4, pp. 1235–1269, 1996. [7] I. M. Ross and M. Karpenko, “A review of pseudospectral optimal control: From theory to flight,” Annual Reviews in Control, vol. 36, no. 2, pp. 182–197, 2012. [8] F. Fahroo and I. M. Ross, “On discrete-time optimality conditions for pseudospectral methods,” in AIAA/AAS Astrodynamics Specialist Conference and Exhibit, 2006, p. 6304. [9] K. Bollino, L. R. Lewis, P. Sekhavat, and I. M. Ross, “Pseudospectral optimal control: a clear road for autonomous intelligent path planning,” in AIAA Infotech@ Aerospace 2007 Conference and Exhibit, 2007, p. 2831. [10] Q. Gong, R. Lewis, and M. Ross, “Pseudospectral motion planning for autonomous vehicles,” Journal of Guid- ance, Control, and Dynamics, vol. 32, no. 3, pp. 1039–1045, 2009. [11] N. S. Bedrossian, S. Bhatt, W. Kang, and I. M. Ross, “Zero-propellant maneuver guidance,” IEEE Control Systems, vol. 29, no. 5, 2009. [12] K. Bollino and L. R. Lewis, “Collision-free multi-uav optimal path planning and cooperative control for tactical applications,” in AIAA Guidance, Navigation and Control Conference and Exhibit, 2008, p. 7134. [13] N. Bedrossian, S. Bhatt, M. Lammers, L. Nguyen, and Y. Zhang, “First ever flight demonstration of zero propellant maneuver (tm) attitute control concept,” in AIAA Guidance, Navigation and Control Conference and Exhibit, 2007, p. 6734. [14] Y. Chen, M. Cutler, and J. P. How, “Decoupled multiagent path planning via incremental sequential convex programming,” in Robotics and Automation (ICRA), 2015 IEEE International Conference on. IEEE, 2015, pp. 5954–5961. [15] F. Augugliaro, A. P. Schoellig, and R. D’Andrea, “Generation of collision-free trajectories for a quadrocopter fleet: A sequential convex programming approach,” in Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on. IEEE, 2012, pp. 1917–1922. [16] R. Choe, “Distributed cooperative trajectory generation for multiple autonomous vehicles using Pythagorean Hodograph B´ezier curves,” Ph.D. dissertation, University of Illinois at Urbana-Champaign, Urbana, IL, United States, 2017, submitted. [17] V. Cichella, I. Kaminer, C. Walton, and N. Hovakimyan, “Optimal motion planning for differentially flat systems using Bernstein approximation,” IEEE Control Systems Letters, vol. 2, no. 1, pp. 181–186, 2018. [18] E. Hewitt and R. E. Hewitt, “The Gibbs-Wilbraham phenomenon: an episode in fourier analysis,” Archive for history of Exact Sciences, vol. 21, no. 2, pp. 129–160, 1979. [19] A. Gelb and J. Tanner, “Robust reprojection methods for the resolution of the gibbs phenomenon,” Applied and Computational Harmonic Analysis, vol. 20, no. 1, pp. 3–25, 2006. [20] E. Tohidi, A. Pasban, A. Kilicman, and S. L. Noghabi, “An efficient pseudospectral method for solving a class of nonlinear optimal control problems,” in Abstract and Applied Analysis, vol. 2013. Hindawi, 2013. [21] I. M. Ross and F. Fahroo, “Pseudospectral knotting methods for solving nonsmooth optimal control problems,” Journal of Guidance, Control, and Dynamics, vol. 27, no. 3, pp. 397–405, 2004. [22] C. L. Darby, W. W. Hager, and A. V. Rao, “An hp-adaptive pseudospectral method for solving optimal control problems,” Optimal Control Applications and Methods, vol. 32, no. 4, pp. 476–502, 2011. [23] R. T. Farouki, “The Bernstein polynomial basis: a centennial retrospective,” Computer Aided Geometric Design, vol. 29, no. 6, pp. 379–419, 2012. [24] H. Gzyl and J. L. Palacios, “On the approximation properties of Bernstein polynomials via probabilistic tools,” Boletın de la Asociaci´on Matem´atica Venezolana, vol. 10, no. 1, pp. 5–13, 2003. [25] L. A. Ricciardi and M. Vasile, “Direct transcription of optimal control problems with finite elements on Bernstein basis,” Journal of Guidance, Control and Dynamics, 2018. [26] J.-W. Chang, Y.-K. Choi, M.-S. Kim, and W. Wang, “Computation of the minimum distance between two B´ezier curves/surfaces,” Computers & Graphics, vol. 35, no. 3, pp. 677–684, 2011. [27] P. J. Davis, “Interpolation and approximation,” Blaisdell Pub, Co., New York, vol. 1, 1963. [28] S. N. Bernstein, “D´emonstration du th´eor`eme de Weierstrass fond´ee sur le calcul des probabilit´es (demonstration of a theorem of Weierstrass based on the calculus of probabilities),” Communications de la Soci´et´e Math´ematique de Kharkov, vol. 13, no. 1, pp. 1–2, 1912. 25 [29] R. Bojanic and F. Cheng, “Rate of convergence of Bernstein polynomials for functions with derivatives of bounded variation,” Journal of Mathematical Analysis and Applications, vol. 141, no. 1, pp. 136–151, 1989. [30] T. Popoviciu, “Sur lapproximation des fonctions convexes dordre sup´erieur,” Mathematica (Cluj), vol. 10, pp. 49–54, 1935. [31] P. Sikkema, “Der wert einiger konstanten in der theorie der approximation mit bernstein-polynomen,” Numerische Mathematik, vol. 3, no. 1, pp. 107–116, 1961. [32] M. J. D. Powell, Approximation theory and methods. Cambridge university press, 1981. [33] M. S. Floater, “On the convergence of derivatives of Bernstein approximation,” Journal of Approximation Theory, vol. 134, no. 1, pp. 130–135, 2005. [34] E. G. Gilbert, D. W. Johnson, and S. S. Keerthi, “A fast procedure for computing the distance between complex objects in three-dimensional space,” IEEE Journal on Robotics and Automation, vol. 4, no. 2, pp. 193–203, 1988. [35] R. Choe, “Distributed cooperative trajectory generation for multiple autonomous vehicles using pythagorean hodograph B´ezier curves,” Ph.D. dissertation, University of Illinois at Urbana-Champaign, 2017. [36] R. F. Hartl, S. P. Sethi, and R. G. Vickson, “A survey of the maximum principles for optimal control problems with state constraints,” SIAM review, vol. 37, no. 2, pp. 181–218, 1995. [37] D. Garg, M. A. Patterson, C. Francolin, C. L. Darby, G. T. Huntington, W. W. Hager, and A. V. Rao, “Direct trajectory optimization and costate estimation of finite-horizon and infinite-horizon optimal control problems using a radau pseudospectral method,” Computational Optimization and Applications, vol. 49, no. 2, pp. 335– 358, 2011. [38] Q. Gong, I. M. Ross, W. Kang, and F. Fahroo, “Connections between the covector mapping theorem and con- vergence of pseudospectral methods for optimal control,” Computational Optimization and Applications, vol. 41, no. 3, pp. 307–335, 2008. [39] B. Singh, R. Bhattacharya, and S. R. Vadali, “Verification of optimality and costate estimation using hilbert space projection,” Journal of guidance, control, and dynamics, vol. 32, no. 4, pp. 1345–1355, 2009. [40] E. Tohidi, O. R. N. Samadi, and M. H. Farahi, “Legendre approximation for solving a class of nonlinear optimal control problems,” Journal of Mathematical Finance, vol. 1, no. 01, p. 8, 2011. [41] W. Kang, Q. Gong, and I. M. Ross, “Convergence of Pseudospectral Methods for a Class of Discontinuous Optimal Control,” in 2005 IEEE 44th Conference on Decision and Control (CDC), Seville, Spain, 2005, pp. 2799–2804. [42] Q. Gong, W. Kang, N. S. Bedrossian, F. Fahroo, P. Sekhavat, and K. Bollino, “Pseudospectral optimal control for military and industrial applications,” in Decision and Control, 2007 46th IEEE Conference on. IEEE, 2007, pp. 4128–4142. 26