Towards Cooperative Motion Planning for Automated Vehicles in Mixed Traffic Maximilian Naumann1 and Christoph Stiller1,2 Abstract— While motion planning techniques for au- tomated vehicles in a reactive and anticipatory manner are already widely presented, approaches to cooperative motion planning are still remaining. In this paper, we present an approach to enhance common motion planning algorithms, that allows for cooperation with human-driven vehicles. Unlike previous approaches, we integrate the prediction of other traffic participants into the motion planning, such that the influence of the ego vehicle’s behavior on the other traffic participants can be taken into account. For this purpose, a new cost functional is presented, containing the cost for all relevant traffic participants in the scene. Finally, we propose a path-velocity-decomposing sampling-based implementation of our approach for selected scenarios, which is evaluated in a simulation. I. INTRODUCTION In the field of intelligent vehicles, tremendous progress has been achieved in the last decades [1]. With the first successful experiments of close-to-production cars in real traffic [2], automated driving has gained more and more attention in public. In order to improve the reliability and thus the safety of automated vehicles, but also to increase their effi- ciency, cooperation is focused on in recent research. Here, cooperation through explicit communication of (fused) sensor information and desired driving behaviour [3] as well as negotiation of possible solutions [4], [5], [6] or centralized approaches [7] are frequently addressed. However, as reported in [8], cooperative behavior does not require V2X-communication. Furthermore, as auto- mated vehicles will share the road with human-driven cars at least at the beginning, cooperation with human drivers in non-V2X-equipped cars is essential. Also, a natural, cooperative, human-like behavior of automated vehicles potentially increases their social acceptance. To the best of the authors’ knowledge, previous motion planning approaches treated other traffic participants as obstacles which are to be avoided, similar to static obstacles like parked cars [2], [9]. While such approaches can deal with many everyday situations, such as driv- ing autonomously or following other vehicles, some ma- neuvers, such as overtaking with oncoming traffic or *We gratefully acknowledge support of this work by the Tech Center a-drive 1The authors are with FZI Research Center for Information Technology, Mobile Perception Systems, 76131 Karlsruhe, Germany naumann@fzi.de 2The author is also with Karlsruhe Institute of Technology (KIT), Institute of Measurement and Control, 76131 Karlsruhe, Germany stiller@kit.edu (a) without signposted right of way (b) with signposted right of way Fig. 1: Narrowing, with and without signposted right of way. passing a narrowing (cf. Figure 1), require combinato- rial approaches, as already reported by [2]. Still, even with combinatorial considerations as proposed by [10], [11], cooperative behavior cannot be implemented: If the motion prediction of other traffic participants is done isolated from the motion planning for the ego-vehicle, the behavior can be foresighted, but not cooperative in a bidirectional manner [8]. According to a study about German road traffic, cooperative behavior on average only occurs in the scale of one cooperative action per hour per traffic participant [12]. Thus, their treatment by a separate method, besides the conventional motion planning, is reasonable. This paper addresses the problem of cooperative mo- tion planning without V2X-communication. We propose a cost functional for trajectory ensembles, consisting of one trajectory per vehicle. Thereby, we acknowledge the fact that not only the behavior of other traffic participants affects us, but also our behavior affects the others in a closed loop. We consider the motion planning problem as the problem to find a globally optimal solution for a spe- cific situation, knowing that every traffic participant has a different viewpoint considering optimality. The costs depend on vehicle dynamics, passenger comfort, driving intention and trajectory clearance, as well as the traffic regulations, as further outlined in Section II. In this approach, the prediction of other traffic participants is integrated into the motion planning. As the assumption of cooperative behavior might be violated by some traffic participants, this risk is assessed and the trajectory is only driven if a safe "plan B" [13] trajectory is still pos- sible in case of unexpected behavior. An implementation of this approach is presented in Section III. The proposed algorithm is finally evaluated in Section IV. arXiv:1708.06962v1 [cs.RO] 23 Aug 2017 II. GLOBAL OPTIMUM APPROACH This section introduces the main building blocks of our approach to cooperative motion planning. Central to this approach is the assumption that all traffic par- ticipants are aware of each other and therefore react on each other’s behavior in a closed loop. Subsequently, the trajectories for all relevant traffic participants are considered as one trajectory ensemble, and the quality of the solution depends on the trajectory of every partic- ipant separately as well as on the pairwise relation of the trajectories among each other. This section is structured as follows: First, the repre- sentation of one trajectory in the ensemble is introduced. Subsequently, the cost functional is introduced. Next, before a solution is selected, the limitations to this approach are treated by a "plan B". A. Behavior Policy Cooperative motion planning is aware of the inter- action of traffic participants. Therefore, wrong assump- tions concerning the behavior of other traffic participants might cause undesired behavior. Even though, theoret- ically, any feasible behavior is possible, the authors make the following assumption: Every traffic participant follows the traffic regulations, as long as this compliant behavior is physically feasible. Consequently, assuming perfect perception, a collision involving our vehicle can only be caused by violating the traffic regulation without foreseeable reason while our reaction at the time of violation is insufficient to avoid the collision. Arising from this assumption, we pursue the following policies: • If we have to give way, we can exclude a collision independent of others’ behavior. • If we have the right of way or the situation is not clearly regulated, we can exclude a collision if others behave rule compliant. B. Trajectory Representation For the representation of a single, deterministic tra- jectory, the established method of [14] is chosen: The trajectory x(t) = (x(t), y(t))T is a mapping R →R2, with tangent angle ψ and curvature κ. A trajectory ensemble consists of one trajectory per traffic participant: X = (x1,x2,...), where the superscript describes the partici- pants identifier. C. Cost Functional As proposed in [8], the quality of a solution, given by a trajectory ensemble, is determined by a cost functional. The lowest costs denote the best solution. Costs exceed- ing a certain value represent an infeasible solution. The cost functional is the sum of the costs of every traffic participant i Gtotal = X i Gi. The costs Gi pursue two main goals: They ensure the feasibility of the trajectory but also rate its comfort and effectiveness for a single car. For this reason, the prop- erties of the trajectory, such as velocity and acceleration, are rated with multiple evaluation functionals: The feasibility costs exceed a certain bound if a tra- jectory is physically not feasible. The pleasantness costs reflect the wish of the passenger to travel steady and comfortable, including the perceived safety of the jour- ney. Furthermore, the costs should motivate compliance with the traffic regulations. Not yielding is avoided by upscaling the costs of the vehicle that has the right of way in the pairwise trajectory costs. In this approach, the ability to cooperate is associated with the ability to estimate the cost or quality of a solution for other traffic participants. With the above information, the costs Gi per partic- ipant can be split into costs Gi,0 that only concern the own trajectory and costs Gi,j that consider the relation to other trajectories: Gi = Gi,0 + X j Gi,j 1) Formulation of the trajectory properties: Analog to [14] the properties of the trajectory that are examined by the evaluation functionals are • the velocity v(t) = ˙x(t) • the acceleration a(t) = ¨x(t) • the jerk j(t) = ...x(t) • the distance to the left and right driving corridor bound dleft(x(t)) and dright(x(t)) • the yaw rate ω(t) = ˙ψ(t) and • the curvature κ(t) . Additionally, properties of trajectory pairs describe their distance to each other. The shortest spatial distance is described by dmin(x1(t),x2(t)) = min t ¡ d(x1(t),x2(t),t) ¢ , where d denotes a distance measure between states of different vehicles. To account for the perceived safety, but also to obey the traffic regulations, another property is introduced. Here, we can make use of time-referenced measures, as they equal a velocity-referenced spatial distance measure. In general, a collision is only possible if paths overlap. When determining the criticality, respectively the collision risk, of two trajectories, their closest point in time and space is crucial. Regarding a violation of the right of way, Cooper investigated the post encroachment time (PET) for specific scenarios [15]. Based on the latter, also regarding the potential collision zone, we propose the time of zone clearance (TZC) as a measure for the criticality of two trajectories with overlapping paths: The TZC is the time that elapses between the first vehicle leaving potential collision zone and the second vehicle entering this area, independent of the right of way (cf. Figure 2). (a) blue vehicle drove first (b) black vehicle drove first Fig. 2: The TZC is the time that the second vehicle takes to enter the red potential collision zone, assuming constant velocity. Given the paths are overlapping and given the trajec- tories are not colliding, the TZC is calculated as follows: TZC = TZC(xfirst(t),xsecond(t)) = gap along path velocity of the second vehicle = ssecond(tsecond,in)−ssecond(tfirst,out) vsecond(tfirst,out) with ssecond being the path of the vehicle that passes the collision zone second, vsecond being the scalar velocity along this path, tfirst,out being the time at which the first vehicle clears the collision zone and tsecond,in being the time at which the second vehicle enters the collision zone. Constant velocity is chosen as passengers cannot foresee the planned trajectory and as it reflects possible actions (maximum deceleration or acceleration) best. If the paths do not overlap, the TZC is defined to be infinite, if the trajectories collide, it is less or equal zero. 2) Formulation of the evaluation functionals: As in this work the costs are also calculated for human-driven cars in order to predict their behavioral decisions, they should reflect humans’ understanding of the quality of a trajectory. Therefore, the previously introduced scalar trajectory properties f (X) are investigated. Vectorial properties, such as the acceleration, are therefore split into their longitudinal and lateral part, using a motion model. The costs of a trajectory are subdivided into three zones: • comfort zone Zcomf • discomfort zone Zdisc • infeasibility zone Zinf each for positive (+) and negative (−) deviation from the optimum fopt. The functionals G(f ) expressing the costs induced by a trajectory property f are called evaluation functionals. For the sake of steadiness and piecewise differentia- bility, all costs are starting from zero at their lower bound but do not vanish at the start of the next zone. Accordingly, the total costs G are defined as G(f ) =      Gcomf , f ∈Zcomf Gcomf +Gdisc , f ∈Zdisc Gcomf +Gdisc +Ginf , f ∈Zinf. The comfort component induces only little costs G+ comf (f ) = a+ · ¡ ∆f + comfort ¢2 depending on the distance of f to the optimal value ∆f + comf = ∆f + comf(X) = f (X)−fopt. Consequently, given a comfort threshold Tcomf and as- suming a comfortable deviation ∆f + cmargin, the parameter a+ is to be set to a+ = Tcomf ³ ∆f + cmargin ´2 . The costs G− comf for comfortable negative deviation are calculated correspondingly with the parameter a−. The discomfort costs rise quadratic, but direction- dependent: G+ disc (f ) = b+ · ¡ ∆f + disc ¢2 depending on the distance of f to the upper start of the discomfort zone f + disc,start ∆f + disc = ∆f + disc(X) = f (X)−f + disc. For logical reasons, the parameter b+ should be notably higher than a+. Negative deviations are treated corre- spondingly with the parameter b−. Before the property represents the infeasibility of a trajectory, the infeasibility costs rise exponentially G+ inf (f ) = c+ · ¡ ∆f + inf ¢2 · e|∆f + inf| depending on the distance of f to the upper infeasible value f + inf minus a margin f + margin from which the costs start rising ∆f + inf = ∆f + inf(X) = f (X)− ³ f + inf −∆f + margin ´ . Consequently, given an infeasibility threshold Tinf and assuming a margin ∆f + imargin, the parameter c+ is to be set to c+ = Tinf ³ ∆f + imargin ´2 · e|∆f + imargin| . Further, the infeasibility zone Zinf includes the margin in this notation. Again, negative deviations are treated correspondingly with the parameter c−. Gcomfort Gdiscomfort Ginfeasible G fopt f + disc, start f + inf, start Fig. 3: Composition of the cost function G for a single trajectory property f : Very low costs around the optimum value fopt, increasing rapidly in close vicinity of finf. 3) Formulation of the cost functional: With the evalua- tion functionals, the cost functional for a single property f is composed as follows (cf. Figure 3): G (f ) = G+ comf ·σ(∆f + comf)+G− comf ·σ(−∆f − comf) + G+ disc ·σ(∆f + disc)+G− disc ·σ(−∆f − disc) + G+ inf ·σ(∆f + inf)+G− inf ·σ(−∆f − inf), where σ denotes the step function. The right of way of i over j is acknowledged by adding the comfort-related costs of vehicle i, upscaled with factor u, to the pairwise trajectory costs, if i has the right of way: Gi,j,row = u ¡ Gi,0,comf +Gi,0,disc ¢ . A suitable choice of u ensures that the right of way is heeded, but its violation is still feasible, as stated in Section II-A. The full cost functional is composed as follows: Gtotal(X) = X i à Gi,0(xi)+ X j Gi,j(xi,xj) ! with singleton trajectory costs for vehicle i Gi,0(xi) = Gv +Ga +Gj +Gω +Gκ +Goffset and pairwise trajectory costs for vehicle i due to vehicle j Gi,j(xi,xj) = GTZC +Gdmin +Grow. D. Plan B In order to obey our previously introduced policy, plan B trajectories are to be checked, as proposed in [13]. By doing so, we avoid maneuvering into situations that lead to collisions, if we made wrong assumptions concerning the behavior of other traffic participants. As their execution is unlikely, we accept discomfortable but feasible trajectories. This corresponds to a neglection of the comfort terms in the upper cost functional. As with the previous trajectories, plan B trajectories can be cal- culated via a local continuous method [14], a sampling- based method such as RRT∗[16] or other approaches. E. Selection of Solution As for passenger comfort, the evaluation of the TZC should already cause high discomfort costs at around 2 s, a security margin is induced intrinsically by this approach. Thus, even a very small optimum, represented by a small range of minimal costs, does not equal a physically optimal trajectory, that would pass objects as close as possible in space-time. Rather, it already contains those security margins that are considered comfortable by humans and that consequently should be feasible with measurement uncertainties in the range of human perception errors. Hence, the optimum point can be chosen independent of its wideness, as long as a valid plan B protects the approach against consequences of wrong assumptions. III. IMPLEMENTATION In the following, a first approach for cooperative mo- tion planning in specific situations, based on the previ- ously introduced cost functional, is presented. A. Path-Velocity Decomposition Several potentially cooperative situations have highly constraint driving corridors for the traffic participants, independent of the order and number of traffic partici- pants. Consequently, we make use of the path-velocity decomposition (PVD), as introduced by [17]. The calcu- lation of paths in static environments has already been widely investigated. Hence, valid paths are considered predefined (cf. Figure 4) and the implementation focuses on the velocity profiles along the paths. B. Sampling As the optimization problem is non-convex, but the control variable for the velocity of each vehicle is only one-dimensional, a classical sampling approach is cho- sen. Therefore, the trajectories x(t) are approximated by discretization in equidistant time steps: xi = x(ti), ti = t0 + i∆t. For each car, multiple trajectories are sampled: Start- ing with an initial position and velocity, a random jerk sequence determines the velocity profiles and thus the trajectory. Next, the overall costs of each trajectory en- semble are calculated. For the solutions with the lowest costs, the plan B trajectory is checked until a valid plan B is found. C. Plan B Instead of using a different planning method with the assumption or classical prediction of disadvantageous behavior of others, we again make use of the PVD: Given the paths, a collision is only possible in particular areas that can be determined a priori. Thus, unlike in [13], no trajectory has to be planned. Rather, the plan B- consideration can be seen as a “what could I do if”- consideration. The key questions are: In every time step, what could the other vehicle do that leads to a collision with us? And what could we do to avoid this? This consideration can be split into the following cases: 1) Other vehicle drives first: If the other vehicle drives first, it can only cause a collision by deceleration. In reaction, we can decelerate as well. If we can manage to stop before the collision zone, we have a valid plan B. 2) Ego vehicle drives first: If the ego vehicle drives first, the other vehicle can only cause a collision by acceleration. In reaction, we can also accelerate, to still drive first, or decelerate to stop before the other vehicle collides with us. As the path is regarded as predefined, changing the path is not considered. D. Implications on the cost functional Since in this implementation, trajectories are dis- cretized in time, derivatives are approximated by finite differences. Thus, the functionals of section II-C turn into functions. Consequently, trajectory properties that depend on a single minimum, such as TZC, can be largely affected if this minimum is not sampled. In order to avoid this, either the sampling rate must be sufficiently high, or the point of the exact minimum has to be interpolated. As this implementation is not based on linear optimization but on sampling, we interpolate the crucial points. The jerk is not considered to avoid high order deriva- tives. Also, the curvature itself is not considered as the predefined path guarantees the compliance with the steering geometry. However, it is used to calculate the lateral acceleration values. Furthermore, the shortest spatial distance dmin either lies in the collision zone and is considered by the TZC, or it is not relevant. Hence, it is neglected as well. E. Selection of Solutions As explained in section II-E, criticality protection is ensured via the costs of the TZC and the check for a plan B. Consequently, the solution with the lowest costs and a valid plan B is selected and its ego trajectory is executed, as long as its costs do not exceed the feasibility- threshold. In case no solution has a valid plan B, an emergency braking maneuver is triggered. Note: In in- car applications, the parallel running classical, reactive motion-planner would have to take over control in this case. IV. EVALUATION In this section, the method outlined in section III is evaluated for two scenarios, a left turn at a T-intersection and passing through a narrowing of the road (cf. Figures 1 and 4). (a) (b) Fig. 4: Left turn at T-junction, with and without signposted right of way and predefined paths. A. Simulation For both scenarios, each with and without signposted right of way, but sharing the same paths, velocity profiles were sampled. From the resulting trajectories, ensembles with one trajectory per vehicle were generated. In order to reduce computational cost, trajectories that did not reach the end of the collision zone were excluded from the cost calculation. Furthermore, colliding trajectory ensembles were excluded. The remaining ensembles were analyzed with respect to • comfort costs • discomfort costs • infeasibility costs • traffic regulation costs. B. Analysis As depicted in Figure 5 and 6, the initial states were chosen in a way that the optima of both vehicles overlap in the collision zone. In the T-junction scenario, the right of way is regulated with and without traffic signs. A violation of the right of way causes high costs so the optimal solution is following the rules. The trajectory of the vehicle that has right of way is not interfered (cf. Figure 5 (2) and (3)). In the narrowing scenario, the right of way is not regulated without traffic signs. Here, due to equal cost parameters, the vehicle that is closer to the narrowing passes first. Still, traffic signs can overrule this globally most comfortable solution and shift the optimum (cf. Figure 1 and 6 (3)). If a collision can only be avoided by one of the vehicles, as the other is too close to the collision zone, the optimal solution is the collision avoidance. Even though this violates the traffic regulations, the infeasibility costs overrule discomfort costs and traffic regulation costs. Further, if we do not interfere a vehicle that has the right of way, its costs Grow are constantly high, but not raised by our behavior. In this case, the optimal solution is that we pass first, without violation of traffic rules. V. CONCLUSIONS AND FUTURE WORK In this paper, we presented a new approach to coop- erative motion planning, able to cooperate with human Fig. 5: Minimum cost trajectories in the T-junction scenario with the collision zone marked in grey: (1) for each vehicle solely on the road, (2) when upper vehicle has right of way (Fig. 4a), (3) when lower vehicle has right of way (Fig. 4b). Fig. 6: Minimum cost trajectories in the narrowing scenario with the collision zone marked in grey: (1) for each vehicle solely on the road, (2) when no right of way predefined (Fig. 1a), (3) when left vehicle has right of way (Fig. 1b). drivers and automated vehicles without requiring V2X- communication. While the approach is valid for two- dimensional motion planning, our first implementation covers several scenarios deploying PVD. The preliminary results for the simulated scenarios demonstrate that the method produces safe and com- fortable cooperative trajectories in a narrowing and a typical intersection scenario. Individual trajectory costs have been extended by costs accounting for mutual comfort and safety of any pair of trajectories. Other traffic participants have been taken into account by incorporating their individual costs. The total trajectory costs for each participant have been segmented into three areas representing comfortable driving, uncomfortable driving and collision/infeasibility. Future work includes real time implementation and on-road experiments with our vehicle "BerthaOne". Sev- eral parametrizations will be used for the cost functional, considering different vehicle types and driver behaviors. Furthermore, probabilistic trajectories will be accommo- dated to account for inherent uncertainties in perception and behavior. REFERENCES [1] K. Bengler, K. Dietmayer, B. Färber, M. Maurer et al., “Three Decades of Driver Assistance Systems - Review and Future Perspectives,” IEEE Intell. Transp. Syst. Mag., vol. 6, no. 4, pp. 6–22, 2014. [2] J. Ziegler, P. Bender, M. Schreiber, H. Lategahn et al., “Making Bertha Drive - An Autonomous Journey on a Historic Route,” IEEE Intell. Transp. Syst. Mag., vol. 6, no. 2, pp. 8–20, 2014. [3] C. Englund, L. Chen, J. Ploeg, E. Semsar-Kazerooni et al., “The Grand Cooperative Driving Challenge 2016: boosting the introduction of cooperative automated vehicles,” IEEE Wireless Communications, vol. 23, no. 4, pp. 146–152, August 2016. [4] D. Carlino, S. D. Boyles, and P. Stone, “Auction-based autonomous intersection management,” in IEEE Int. Conf. on Intell. Transp. Syst., Oct 2013, pp. 529–534. [5] M. Elhenawy, A. A. Elbery, A. A. Hassan, and H. A. Rakha, “An Intersection Game-Theory-Based Traffic Control Algorithm in a Connected Vehicle Environment,” in IEEE Int. Conf. on Intell. Transp. Syst., Sept 2015, pp. 343–347. [6] H. Rewald and O. Stursberg, “Cooperation of autonomous vehicles using a hierarchy of auction-based and model-predictive control,” in Proc. of the IEEE Intell. Vehicles Symposium, June 2016, pp. 1078–1084. [7] S. Manzinger, M. Leibold, and M. Althoff, “Driving Strategy Selection for Cooperative Vehicles using Maneuver Templates,” in Proc. of the IEEE Intell. Vehicles Symposium, 2017. [8] M. Naumann, P. Orzechowski, C. Burger, O. S. Tas, and C. Stiller, “Herausforderungen für die Verhaltensplanung kooperativer automatischer Fahrzeuge,” in AAET Automatisiertes und vernetztes Fahren. Braunschweig, Germany: ITS automotive nord e.V., Feb 2017, pp. 287–307. [Online]. Available: http: //www.mrt.kit.edu/z/publ/download/2017/Naumann2017AAET.pdf [9] D. González, J. Pérez, V. Milanés, and F. Nashashibi, “A Review of Motion Planning Techniques for Automated Vehicles,” IEEE Transactions on Intell. Transp. Syst., vol. 17, no. 4, pp. 1135–1145, April 2016. [10] P. Bender, Ö. ¸S. Ta¸s, J. Ziegler, and C. Stiller, “The combinatorial aspect of motion planning: Maneuver variants in structured en- vironments,” in IEEE Intell. Vehicles Symposium (IV), June 2015, pp. 1386–1392. [11] X. Qian, F. Altché, P. Bender, C. Stiller, and A. de La Fortelle, “Optimal trajectory planning for autonomous driving integrating logical constraints: An MIQP perspective,” in IEEE Int. Conf. on Intell. Transp. Syst., Nov 2016, pp. 205–210. [12] A. Benmimoun, D. Neunzig, and C. Maag, “Effizienzsteigerung durch professionnelles/partnerschaftliches Verhalten im Strassen- verkehr,” FAT-Schriftenreihe, no. 181, 2004. [13] F. Damerow and J. Eggert, “Risk-Aversive Behavior Planning under Multiple Situations with Uncertainty,” in IEEE Int. Conf. on Intell. Transp. Syst., Sept 2015, pp. 656–663. [14] J. Ziegler, P. Bender, T. Dang, and C. Stiller, “Trajectory planning for Bertha - A local, continuous method,” in IEEE Intell. Vehicles Symposium Proc., June 2014, pp. 450–457. [15] P. Cooper, “Experience with traffic conflicts in Canada with em- phasis on “post encroachment time” techniques,” in International calibration study of traffic conflict techniques. Springer, 1984, pp. 75–96. [16] S. Karaman and E. Frazzoli, “Sampling-based Algorithms for Optimal Motion Planning,” Int. J. Rob. Res., vol. 30, no. 7, pp. 846–894, Jun. 2011. [17] K. Kant and S. W. Zucker, “Toward efficient trajectory planning: The path-velocity decomposition,” Int. J. Rob. Res., vol. 5, no. 3, pp. 72–89, 1986.