On Automating the Doctrine of Double Effect

Naveen Sundar Govindarajulu and Selmer Bringsjord
Rensselaer Polytechnic Institute, Troy, NY
{naveensundarg, selmer.bringsjord}@gmail.com

Abstract

The doctrine of double effect (DDE) is a long-studied ethical principle that governs when actions that have both positive and negative effects are to be allowed. The goal in this paper is to automate DDE. We briefly present DDE, and use a first-order modal logic, the deontic cognitive event calculus, as our framework to formalize the doctrine. We present formalizations of increasingly stronger versions of the principle, including what is known as the doctrine of triple effect. We then use our framework to successfully simulate scenarios that have been used to test for the presence of the principle in human subjects. Our framework can be used in two different modes: one can use it to build DDE-compliant autonomous systems from scratch, or one can use it to verify that a given AI system is DDE-compliant by applying a DDE layer on an existing system or model. For the latter mode, the underlying AI system can be built using any architecture (planners, deep neural networks, Bayesian networks, knowledge-representation systems, or a hybrid); as long as the system exposes a few parameters in its model, such verification is possible. The role of the DDE layer here is akin to a (dynamic or static) software verifier that examines existing software modules. Finally, we end by sketching initial work on how one can apply our DDE layer to the STRIPS-style planning model and to a modified POMDP model. This is preliminary work to illustrate the feasibility of the second mode, and we hope that our initial sketches can be useful for other researchers in incorporating DDE in their own frameworks.

1 Introduction

The doctrine of double effect (DDE) is a long-studied ethical principle that enables adjudication of ethically "thorny" situations in which actions that have both positive and negative effects appear unavoidable for autonomous agents [McIntyre, 2014]. Such situations are commonly called moral dilemmas. The simple version of DDE states that such actions, performed to "escape" such dilemmas, are allowed, provided that 1) the harmful effects are not intended; 2) the harmful effects are not used to achieve the beneficial effects (harm is merely a side effect); and 3) the benefits outweigh the harm by a significant amount. What distinguishes DDE from, say, naïve forms of consequentialism in ethics (e.g. act utilitarianism, which holds that an action is obligatory for an autonomous agent if and only if it produces the most utility among all competing actions) is that purely mental intentions in and of themselves, independent of consequences, are considered crucial (as condition 2 immediately above conveys). Of course, every major ethical theory, not just consequentialism, has its passionate proponents; cogent surveys of such theories make this plain (e.g. see [Feldman, 1978]). Even in machine ethics, some AI researchers have explored not just consequentialism and the second of the two dominant ethical theories, deontological ethics (marked by an emphasis on fixed and inviolable principles said by their defenders to hold no matter what the consequences of abrogating them), but more exotic ones, for example contractualism (e.g. see [Pereira and Saptawijaya, 2016b]) and even divine-command ethics (e.g. see [Bringsjord and Taylor, 2012]).
DDE in a sense rises above philosophical debates about which ethical theory is preferred. The first reason is that empirical studies have found that DDE plays a prominent role in an ordinary person's ethical decisions and judgments [Cushman et al., 2006]. For example, in [Hauser et al., 2007], a large number of participants were asked to decide between action and inaction on a series of moral dilemmas, and their choices adhered to DDE, irrespective of their ethical persuasions and backgrounds, and regardless of the order in which the dilemmas were presented. In addition, in legal systems, criminality requires the presence of malicious intentions [Fletcher, 1998], and DDE plays a central role in many legal systems [Allsopp, 2011; Huxtable, 2004].¹ Assuming that autonomous systems will be expected to adjudicate moral dilemmas in human-like ways, and to justify such adjudication, it seems desirable to seek science and engineering that allows DDE, indeed even nuanced, robust versions thereof, to be quickly computed.

Footnote 1: On the surface, criminal negligence might seem to require no intentions. While that might be true, even in criminal negligence it seems rational to ask whether the negligence was accidental or something the "suspect" had control over. This suggests a milder form of intention, or something similar, but not exactly intention.

2 Prior Work

We quickly review prior rigorous modeling of DDE. Mikhail [2011] presents one of the first careful treatments of the doctrine. While the presentation of the doctrine makes use of some symbolism, the level of formalization is not amenable to automation. [Bentzen, 2016] presents a model-theoretic formalization of a simple version of the doctrine. While this is an important first step, the calculus presented by Bentzen does not have any computational realization. However, there are two independent strands of research with implementations for DDE: that of Berreby et al. [2015] and Pereira and Saptawijaya [2016a]; both use logic programming. Notably, while Berreby et al. explicitly eschew counterfactuals for modeling DDE, Pereira and Saptawijaya model DDE using counterfactuals. To our knowledge, these two projects present the first formal models of DDE that can be implemented.

It should be noted, however, that both of these formal systems are extensional, and it is well known that when dealing with intensional states such as knowledge, belief, intention, etc., extensional systems can quickly generate inconsistencies [Bringsjord and Govindarajulu, 2012] (see the appendix for more details). The expressivity challenge is both quantificational and intensional; this challenge is acute for the logic-programming paradigm, as opposed to one based, as ours is, on formal languages beyond first-order logic and its variants, and proof theories beyond resolution and its derivatives. In particular, DDE requires elaborate structures for quantification (including, inevitably, first-order numerical quantifiers such as ∃k : k ∈ ℝ, since quantification over utilities is essential), and many intensional operators that range over quantifiers, starting with the epistemic ones. Needless to say, modeling and simulation at the propositional level, while truly excellent in the case of [Pereira and Saptawijaya, 2016a], is insufficiently expressive.

Among the many empirical experiments centered around DDE, the one in [Malle et al., 2015] deserves a mention.
Malle et al. devise an experiment in which they place either a human or a robot as the central actor in a hypothetical DDE scenario, and study an external viewer's moral judgment of action or inaction by the human or robot. This study shows that humans view ethical situations differently when robots participate in such situations, and it demonstrates the need for rigorous modeling of DDE to build well-behaved autonomous systems that function in DDE-relevant scenarios.

3 The Calculus

In this section, we present the deontic cognitive event calculus (DCEC). Dialects of this calculus have been used to formalize and automate highly intensional reasoning processes, such as the false-belief task [Arkoudas and Bringsjord, 2008] and akrasia (succumbing to temptation to violate moral principles) [Bringsjord et al., 2014].² DCEC is a sorted (i.e. typed) quantified modal logic (also known as sorted first-order modal logic). The calculus has a well-defined syntax and proof calculus; see [Bringsjord et al., 2014]. The proof calculus is based on natural deduction [Gentzen, 1935], and includes all the introduction and elimination rules for first-order logic, as well as inference schemata for the modal operators and related structures. A snippet of DCEC is shown in the Appendix.

Footnote 2: Arkoudas and Bringsjord [2008] introduced the general family of cognitive event calculi to which DCEC belongs.

3.1 Syntax

First-order Fragment
The first-order core of DCEC is the event calculus [Mueller, 2006]. Though we use the event calculus, our approach is compatible with other calculi (e.g. the situation calculus) for modeling events and their effects.

Modal Fragment
The modal operators present in the calculus include the standard operators for knowledge K, belief B, desire D, intention I, etc. The general format of an intensional operator is K(a, t, φ), which says that agent a knows at time t the proposition φ. Here φ can in turn be any arbitrary formula. The calculus also includes a dyadic deontic operator O. The unary ought in standard deontic logic is known to lead to contradictions. Our dyadic version of the operator blocks the standard list of such contradictions, and beyond.³

Footnote 3: A nice version of the list is given lucidly in [McNamara, 2010].

3.2 Semantics

First-order Fragment
The semantics for the first-order fragment is the standard first-order semantics. The truth-functional connectives ∧, ∨, →, ¬ and the quantifiers ∀, ∃ for pure first-order formulae all have the standard first-order semantics.

Modal Fragment
The semantics of the modal operators differs from what is available in the so-called Belief-Desire-Intention (BDI) logics [Rao and Georgeff, 1991] in many important ways. For example, DCEC explicitly rejects possible-worlds semantics and model-based reasoning, instead opting for a proof-theoretic semantics and the associated type of reasoning commonly referred to as natural deduction [Gentzen, 1935; Francez and Dyckhoff, 2010]. Briefly, in this approach, the meanings of modal operators are defined via arbitrary computations over proofs, as we will soon see.

Reasoner (Theorem Prover)
Reasoning is performed through a novel first-order modal logic theorem prover, ShadowProver, which uses a technique called shadowing to achieve speed without sacrificing consistency in the system. Extant first-order modal logic theorem provers that can work with arbitrary inference schemata are built upon first-order theorem provers. They achieve the reduction to first-order logic via two methods.
In the first method, modal operators are simply represented by first-order predicates. This approach is the fastest, but it can quickly lead to well-known inconsistencies, as demonstrated in [Bringsjord and Govindarajulu, 2012]. In the second method, the entire proof theory is implemented intricately in first-order logic, and the reasoning is carried out within first-order logic. Here the first-order theorem prover simply functions as a declarative programming system. This approach, while accurate, can be excruciatingly slow. We use a different approach, in which we alternate between calling a first-order theorem prover and applying modal inference schemata. When we call the first-order prover, all modal atoms are converted into propositional atoms (i.e., shadowed), to prevent substitution into modal contexts. This approach achieves speed without sacrificing consistency. The prover also lets us add arbitrary inference schemata to the calculus by using a special-purpose language. While we use the prover in our simulations, describing the prover in more detail is out of scope for the present paper.⁴

Footnote 4: The prover is available in both Java and Common Lisp and can be obtained at https://github.com/naveensundarg/prover. The underlying first-order prover is SNARK, available at http://www.ai.sri.com/~stickel/snark.html.
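To make the shadowing step concrete, here is a minimal Python sketch of the idea (not the actual ShadowProver implementation, which is written in Java and Common Lisp). The formula encoding and the `first_order_prove` callback are illustrative assumptions on our part; the point is only that modal atoms are treated as opaque propositional constants when the first-order prover is called.

```python
# Minimal illustration of "shadowing": every modal atom K(a,t,phi), B(a,t,phi), ...
# is replaced by a fresh propositional constant before handing the query to a
# first-order prover, so no substitution can happen inside a modal context.
# Formulas are nested tuples; `first_order_prove` stands in for an external prover.

MODAL_OPS = {"K", "B", "D", "I", "P", "O", "S", "C"}
TRUTH_FUNCTIONAL = {"and", "or", "not", "implies"}

def shadow(formula, table):
    """Replace modal atoms with propositional constants, recursing only
    through truth-functional connectives (never into modal contexts).
    Quantifiers and other constructs are passed through untouched in this toy version."""
    if isinstance(formula, tuple) and formula and formula[0] in MODAL_OPS:
        if formula not in table:                 # the whole modal atom is opaque
            table[formula] = ("prop", f"shadow_{len(table)}")
        return table[formula]
    if isinstance(formula, tuple) and formula and formula[0] in TRUTH_FUNCTIONAL:
        return (formula[0],) + tuple(shadow(arg, table) for arg in formula[1:])
    return formula                               # ordinary first-order formula

def prove_with_shadowing(assumptions, goal, first_order_prove):
    # One shared table, so identical modal atoms in assumptions and goal
    # map to the same propositional constant.
    table = {}
    shadowed = [shadow(f, table) for f in assumptions]
    return first_order_prove(shadowed, shadow(goal, table))
```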
4 Informal DDE

We now informally but rigorously present DDE. We assume we have at hand an ethical hierarchy of actions as in the deontological case (e.g. forbidden, neutral, obligatory); see [Bringsjord, 2017]. We also assume that we have a utility or goodness function for states of the world or effects, as in the consequentialist case. For an autonomous agent a, an action α in a situation σ at time t is said to be DDE-compliant iff:

C1  the action is not forbidden (where we assume an ethical hierarchy such as the one given by Bringsjord [2017], and require that the action be neutral or above neutral in such a hierarchy);
C2  the net utility or goodness of the action is greater than some positive amount γ;
C3a the agent performing the action intends only the good effects;
C3b the agent does not intend any of the bad effects;
C4  the bad effects are not used as a means to obtain the good effects; and
C5  if there are bad effects, the agent would rather the situation be different and the agent not have to perform the action. That is, the action is unavoidable.

See Clause 6 of Principle III in [Khatchadourian, 1988] for a justification of C5. This clause has not been discussed in any prior rigorous treatments of DDE, but we feel C5 captures an important part of when DDE is normally used, e.g. in unavoidable ethically thorny situations one would rather not be present in. C5 is necessary, as the condition is subjunctive/counterfactual in nature and hence may not always follow from C1-C4, since there is no subjunctive content in those conditions. Note that while [Pereira and Saptawijaya, 2016a] model DDE using counterfactuals, they use counterfactuals to model C4 rather than C5. That said, the formalization of C5 is quite difficult, requiring the use of computationally hard counterfactual and subjunctive reasoning. We leave this aside here, reserved for future work.

5 Formal DDE

The formalization is straightforward given the machinery of DCEC. Let Γ be a set of background axioms, which could include whatever the given autonomous agent under consideration knows about the world; e.g., its understanding of physics, knowledge and beliefs about other agents and itself, etc. The particular situation that might be in play, e.g., "the autonomous agent is driving," is represented by a formula σ. We use ground fluents for effects. We assume that we have a utility function μ that maps fluents and times to real-number utility values; μ needs to be defined only for ground fluents:

  μ : Fluent × Moment → ℝ

Good effects are fluents with positive utility, and bad effects are fluents that have negative utility. Zero-utility fluents could be neutral fluents (which do not have a use at the moment).

5.1 Defining means ⊲

The standard event calculus and DCEC don't have any mechanism to say when an effect is used as a means for another effect. While we could employ a first-order predicate and define axiomatically when an effect is used as a means for another effect, we take a modal approach that does not require any additional axioms beyond what is needed for modeling a given situation. Intuitively, we could say an effect e1 is a mere side effect for achieving another effect e2 if, by removing the entities involved in e1, we can still achieve e2; otherwise we say e1 is a means for e2. Our approach is inspired by Pollock's [1976] treatment, and while similarities can be found with the approach in [Pereira and Saptawijaya, 2016a], we note that our definition requires at least first-order logic.

Given a fluent f, we denote by ⊙(f) the set of all constants and function expressions in f. For example:

  ⊙(hungry(jack)) = {jack}
  ⊙(married(jack, sister(mary))) = {jack, sister(mary), mary}

We need one more definition: the state of the world without a given set of entities. Let ⊗(Γ, θ), where Γ is a set of formulae and θ is a set of ground terms, be defined as below:

  ⊗(Γ, θ) = {ψ ∈ Γ | ψ does not contain any term in θ}

Note that the above definition relies on the Unique Names Assumption commonly used in most formulations of the event calculus. This assumption ensures that every object in the domain has at most one name or expression referring to it. If this assumption does not hold, we can use the following slightly more complicated definition for ⊗:

  ⊗(Γ, θ) = {ψ ∈ Γ | ψ does not contain any term s such that ∃ t ∈ θ : Γ ⊢ s = t}

We introduce a new modal operator ⊲, means, that says when an effect is a means for another effect:

  ⊲ : Formula × Formula → Formula

The meaning of the operator is defined computationally below. The definition states that, given Γ, a fluent f holding true at t1 causes or is used as a means for another fluent g at time t2, with t2 > t1, iff the truth condition for g changes when we remove formulae that contain entities involved in f. While this definition is far from perfect, it suffices as a first cut and lets us simulate experimental scenarios that have been used to test DDE's presence in humans. (Three other similar definitions hold when we look at combinations of fluents holding and not holding.)
The equation below follows (note that ⊢ here is non-monotonic, as it includes the event calculus):

  Γ ⊢ ⊲(Holds(f, t1), Holds(g, t2))
    iff
  Γ ⊢ t2 > t1, and
  [ Γ ⊢ Holds(f, t1) ∧ Γ ⊢ Holds(g, t2) ]  ⇒  [ Γ − ⊗(Γ, ⊙(f)) ⊢ ¬Holds(g, t2) ]

For example, let e1 be "throwing a stone s at a window w" and e2 be "the window w getting broken." We can see that e2 is not just a mere side effect of e1, and the definition works, since, if the stone is removed, e2 wouldn't happen. This definition is not perfect. For instance, when there are common objects in both events, the intuitiveness breaks down (though the definition still works). We might, for example, let e1 be "hitting a window w with a bat b." If the window and bat are not present, e2 would not happen.
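To illustrate how ⊙, ⊗ and the means check could be realized operationally, the following Python sketch follows the informal reading above: drop every formula that mentions an entity of f and ask whether g still follows. The term encoding and the `entails` callback (a stand-in for a prover such as ShadowProver) are our own illustrative assumptions, and the final test uses failure to re-derive Holds(g, t2) rather than the stricter derivation of ¬Holds(g, t2) appearing in the displayed clause.

```python
# Constants are strings ("jack"); compound terms/fluents are tuples whose first
# element is the functor, e.g. ("married", "jack", ("sister", "mary")).

def term_entities(term):
    """Constants and function expressions occurring in a term (term itself included)."""
    if isinstance(term, str):          # constant
        return {term}
    ents = {term}                      # function expression, e.g. ("sister", "mary")
    for arg in term[1:]:
        ents |= term_entities(arg)
    return ents

def fluent_entities(fluent):
    """⊙(f): constants and function expressions appearing as arguments of f."""
    ents = set()
    for arg in fluent[1:]:
        ents |= term_entities(arg)
    return ents

def remove_entities(gamma, theta):
    """⊗(Γ, θ): the formulae of Γ that mention none of the terms in θ."""
    def mentions(formula, ent):
        if formula == ent:
            return True
        return isinstance(formula, tuple) and any(mentions(sub, ent) for sub in formula[1:])
    return [psi for psi in gamma if not any(mentions(psi, e) for e in theta)]

def is_means(gamma, f, t1, g, t2, entails):
    """⊲-style check: f@t1 is a means to g@t2 iff t2 > t1, both hold according to Γ,
    and g@t2 no longer follows once every formula mentioning f's entities is removed.
    `entails(axioms, query)` is a placeholder for a theorem-prover call."""
    holds_f, holds_g = ("Holds", f, t1), ("Holds", g, t2)
    if not (t2 > t1 and entails(gamma, holds_f) and entails(gamma, holds_g)):
        return False
    reduced = remove_entities(gamma, fluent_entities(f))
    return not entails(reduced, holds_g)
```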
5.2 The Formalization

Note that the DDE(Γ, σ, a, α, t, H) predicate defined below, though defined using DCEC, lies outside of the formal language of DCEC. While DDE is not fully formalized in DCEC, the individual clauses F1-F4 are. This is how we can verify the conditions in the simulations described later. It is trivial to define a new symbol and formalize the predicate in DCEC: DDE ⇔ F1 ∧ F2 ∧ F3 ∧ F4. What is not trivial, we concede, is how this works with other modalities. For example, can we efficiently derive K(a, t1, K(b, t2, DDE(Γ, σ, a, α, t, H))) given some other formulae Γ? This could be difficult because the predicate's definition below involves provability, and one has to be careful when including a provability predicate.

That said, for future work, we plan on incorporating this within an extended dialect of DCEC. One immediate drawback is that while we can have a system-level view of whether an action is DDE-sanctioned, agents themselves might not know that. For example, we would like to be able to write down "a knows that b knows that c's action is DDE-sanctioned."

Given the machinery defined above, we now proceed to the formalization. Assume, for any action type α carried out by an agent a at time t, that it initiates the set of fluents α^{a,t}_I and terminates the set of fluents α^{a,t}_T. Then, for any action α taken by an autonomous agent a at time t with background information Γ in situation σ, the action adheres to the doctrine of double effect up to a given time horizon H, that is, DDE(Γ, σ, a, α, t, H), iff the conditions below hold.

Formal Conditions for DDE

F1  α carried out at t is not forbidden. That is:
    Γ ⊬ ¬O(a, t, σ, ¬happens(action(a, α), t))

F2  The net utility is greater than a given positive real γ:
    Γ ⊢ [ Σ_{y=t+1}^{H} ( Σ_{f ∈ α^{a,t}_I} μ(f, y) − Σ_{f ∈ α^{a,t}_T} μ(f, y) ) ] > γ

F3a The agent a intends at least one good effect. (F2 should still hold after removing all other good effects.) There is at least one fluent f_g in α^{a,t}_I with μ(f_g, y) > 0, or f_b in α^{a,t}_T with μ(f_b, y) < 0, and some y with t < y ≤ H, such that the following holds:
    Γ ⊢ [ ∃ f_g ∈ α^{a,t}_I : I(a, t, Holds(f_g, y)) ] ∨ [ ∃ f_b ∈ α^{a,t}_T : I(a, t, ¬Holds(f_b, y)) ]

F3b The agent a does not intend any bad effect. For all fluents f_b in α^{a,t}_I with μ(f_b, y) < 0, or f_g in α^{a,t}_T with μ(f_g, y) > 0, and for all y such that t < y ≤ H, the following holds:
    Γ ⊬ I(a, t, Holds(f_b, y))  and  Γ ⊬ I(a, t, ¬Holds(f_g, y))

F4  The harmful effects don't cause the good effects. Four permutations, paralleling the definition of ⊲ above, hold here; one such permutation is shown below. For any bad fluent f_b holding at t1, and any good fluent f_g holding at some t2, such that t < t1, t2 ≤ H, the following holds:
    Γ ⊢ ¬⊲(Holds(f_b, t1), Holds(f_g, t2))

F5  This clause requires subjunctive reasoning. The current formalization ignores this stronger clause. There has been some work in computational subjunctive reasoning that we hope to use in the future; see [Pollock, 1976].

Doctrine of Triple Effect
The doctrine of triple effect (DTE) was proposed in [Kamm, 2007] to account for scenarios where actions that are viewed as permissible by most philosophers, and deemed as such by empirical studies (e.g. the switch action in the third scenario in [Hauser et al., 2007]), are not sanctioned by DDE, as they involve harm being used as a means to achieve an action. DTE allows such actions as long as the harm is not explicitly intended by the agent. Note that our version of DDE subsumes DTE through condition C4.

6 Scenarios

The trolley problems are quite popular in both philosophical and empirical studies in ethics. Hauser et al. [2007] found empirical support that DDE is used by humans, courtesy of experiments based on a set of trolley problems. They use a set of 19 trolley problems in their experimentation, and describe four of these in detail. We consider the first two of these problems in our study here. The problem scenarios are briefly summarized below; the following setup is common to both: There are two tracks, track1 and track2. There is a trolley loose on track1 heading toward two people P1 and P2 on track1; neither person can move in time. If the trolley hits them, they die. The goal is to save this pair.⁵

Footnote 5: For computational purposes, the exact number of persons is not important as long as it is greater than one.

Scenario 1  There is a switch that can route the trolley to track2. There is a person P3 on track2. Switching the trolley to track2 will kill P3. Is it okay to switch the trolley to track2?

Scenario 2  There is no switch now, but we can push P3 onto the track in front of the trolley. This action will damage the trolley and stop it; it will also kill P3. Is it okay to push P3 onto the track?

DDE-based analysis tells us it is okay to switch the trolley in Scenario 1, as we are killing the person merely as a side effect of saving P1 and P2. In Scenario 2, similar analysis tells us it is not okay to push P3, because we are using that person as a means toward our goal.

7 Simulations

At the core of our simulation is a formalization of the basic trolley scenario based on the event calculus. We use a discrete version of the event calculus, in which time is discrete, but other quantities and measures, such as the utility function, can be continuous. We have the following additional sorts: Trolley and Track. We also declare that the Agent and Trolley sorts are subsorts of a Moveable sort, the instances of which are objects that can be placed on tracks and moved. We use the following additional core symbols:

  position : Moveable × Track × Number → Fluent
  dead : Agent → Fluent
  onrails : Trolley × Track → Fluent
  switch : Trolley × Track × Track → ActionType
  push : Agent × Track × Number → ActionType

The utility function μ is defined as follows:

  μ(f, t) = −1 if f ≡ dead(P), and 0 otherwise

We set the threshold γ at 0.5. The simulation starts at time t = 0 with the only trolley, denoted by trolley, on track1.
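As a concrete illustration of this setup, the sketch below encodes the utility function μ, the threshold γ, and the net-utility condition F2 in Python. The fluent encoding and the crude initiated/terminated sets for the switch action are our own illustrative assumptions; in the actual simulations these quantities come from the event-calculus machinery.

```python
GAMMA_THRESHOLD = 0.5   # γ from the text
HORIZON = 10            # H, an arbitrary illustrative horizon

def utility(fluent, time):
    """μ(f, t): -1 for any dead(P) fluent, 0 otherwise (as in Section 7)."""
    return -1.0 if fluent[0] == "dead" else 0.0

def net_utility(initiated, terminated, t, horizon):
    """Net utility of an action performed at t, summed over t+1..H (condition F2)."""
    total = 0.0
    for y in range(t + 1, horizon + 1):
        total += sum(utility(f, y) for f in initiated)
        total -= sum(utility(f, y) for f in terminated)
    return total

# Crude stand-in for the event-calculus computation in Scenario 1: treat the
# switch action as initiating dead(P3) and as averting (here: "terminating")
# the negative-utility fluents dead(P1) and dead(P2).
initiated = [("dead", "P3")]
terminated = [("dead", "P1"), ("dead", "P2")]

f2_holds = net_utility(initiated, terminated, t=0, horizon=HORIZON) > GAMMA_THRESHOLD
print("F2 satisfied for the switch action:", f2_holds)  # True
```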
We have an event-calculus trajectory axiom, shown below, as part of Γ:

  ∀ t : Trolley, track : Track, s : Moment
    [ Trajectory(onrails(t, track), s, position(t, track, Δ), Δ) ]

The above axiom gives us the trolley's position at different points of time. Γ also includes axioms that account for non-effects. For example, in the absence of any actions, we can derive:

  Γ ⊢ Holds(position(trolley, track1, 23), 23)

We also have in the background Γ a formula stating that in the given trolley scenario the agent ought to save both P1 and P2. While we would ideally like the agent to arrive at this obligation from a more primitive set of premises, this setup is closer to experiments with human subjects, in which they are asked explicitly to save the persons. Note that the agent performing the action is simply denoted by I, and let the time of the test be denoted by now. Writing φ_save for the formula ¬∃ t : Moment Holds(dead(P1), t) ∧ ¬∃ t : Moment Holds(dead(P2), t), the obligation is:

  O(I, now, σ_trolley, φ_save)

Given that the agent knows that it is now in situation σ_trolley, and the agent believes that it has the above obligation, we can derive from DCEC's inference schemata what the agent intends:

  { K(I, now, σ_trolley),  B(I, now, O(I, now, σ_trolley, φ_save)),  O(I, now, σ_trolley, φ_save) }  ⊢  I(I, now, φ_save)

In both simulations, P1 is at position 4 and P2 is at position 5 on track1. In Scenario 1, P3 is at position 3 on track2, and the trolley can be switched from position 3 on track1 to position 0 on track2. In Scenario 2, we push P3 onto position 3 on track1.

The total number of formulae and run times for simulating the two scenarios, with and without the actions, are shown below. Note that these are merely event-calculus simulation times; they are then used in computing DDE(Γ, σ, a, α, t, H). The event-calculus simulation helps us compute F2.

  Scenario     |Γ|   Simulation time (s), no action   Simulation time (s), action performed
  Scenario 1   39    0.591                            1.116
  Scenario 2   38    0.602                            0.801

ShadowProver was then used to verify that F1, F3a, and F3b hold. Both scenarios combined take 0.57 seconds for F1, F3a, and F3b. The scenarios differ only in F4; the pushing action fails to be DDE-compliant due to F4. Verifying that F4 holds in Scenario 1 and does not hold in Scenario 2 takes 0.49 seconds and 0.055 seconds, respectively.⁶

Footnote 6: All the axioms for the two simulations, ShadowProver, and the combined DDE implementation can be obtained here: https://goo.gl/9KU2L9.
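The pipeline just described (event-calculus simulation for F2, prover calls for F1, F3a, F3b and F4) can be pictured as a thin orchestration layer over the underlying reasoners. The sketch below is a schematic rendering only: `prove`, `simulate_net_utility` and `means_check` are placeholders for ShadowProver and the event-calculus computations, F1 is rendered in the simplified form "no prohibition against the action is provable", and the effect sets are assumed to be supplied by the caller.

```python
def dde_compliant(gamma, sigma, agent, action, t, horizon,
                  prove, simulate_net_utility, means_check,
                  good_effects, bad_effects, gamma_threshold=0.5):
    """Schematic check of DDE(Γ, σ, a, α, t, H) as F1 ∧ F2 ∧ F3a ∧ F3b ∧ F4.
    All callbacks and effect sets are assumed to be provided by the host system."""

    # F1 (simplified): no prohibition against performing the action is provable.
    prohibition = ("O", agent, t, sigma,
                   ("not", ("happens", ("action", agent, action), t)))
    if prove(gamma, prohibition):
        return False

    # F2: net utility of the action over the horizon exceeds γ.
    if not simulate_net_utility(gamma, action, t, horizon) > gamma_threshold:
        return False

    # F3a: at least one good effect is intended ...
    if not any(prove(gamma, ("I", agent, t, g)) for g in good_effects):
        return False
    # ... and F3b: no bad effect is intended.
    if any(prove(gamma, ("I", agent, t, b)) for b in bad_effects):
        return False

    # F4: no bad effect is used as a means to a good effect.
    if any(means_check(gamma, b, g) for b in bad_effects for g in good_effects):
        return False

    return True
```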
8 On Operationalizing the Principle

Given the above formalization, it is quite straightforward to build logic-based systems that are DDE-compliant.⁷ But how do we apply the above formalization to existing models and systems that are not explicitly logic-based? We lay down a set of conditions such models must satisfy for us to be able to verify that they are DDE-compliant. We then sketch how we could use DDE in two such modified systems: a STRIPS-like planner and a POMDP-type model. The problem now before us is: given a system and a utility function, can we say that the system is DDE-compliant? No: we need more information from the system. For example, we can have two systems in the same situation, with the same utility functions and the same set of available actions.⁸ One system can be DDE-compliant while the other is not. For example, assume that we have two autonomous driving systems d1 and d2, and that d2 has learned to like killing dogs and intends to do so if possible during its normal course of operation. While driving, both come across a situation where the system has to hit either a human or a dog. In this scenario, d1's action to hit the dog would be DDE-compliant while d2's action would not be. Therefore, the formalization requires that we have access to an agent's intentions at all times.

Footnote 7: For examples of logic-based systems in pure first-order logic, see [Mueller, 2006].

Footnote 8: Where does a utility function come from? The obvious way to get a utility function seems to be to learn such a function. There are good arguments that such learning can be very hard [Arnold et al., 2017]. For now, we are not concerned with how such a utility function is given to us; for exposition and economy, assume that it already exists.

One common objection to requiring that intentions be separate from utilities states that utilities can be used to derive intentions. This is mistaken: it is not always possible to derive intentions from a utility function. For example, there might be a state that has high utility, but the agent might not intend to realize that state, as it could be out of reach for that agent (low perceived probability of success). For instance, winning a million dollars (w) has high utility, but most rational agents might not intend w, as they know this event is (alas) out of their reach. This holds for similar high-utility states. At a minimum, we believe utility and perceived probability of success go into an agent's intentions. This seems to align with the human case when we look at motivations, i.e. expectancy-value theory. How motivations transform into intentions is another open research question.

8.1 Requirements

Practically speaking, there is a spectrum of systems that our techniques will be dealing with. At one end, we will encounter systems that are complete black boxes, taking in percepts from the environment and spitting out actions. Since DDE requires us to look at the intentions of systems, such black-box systems will be impossible to verify. We can of course ask the system to output its intention through language as one of its possible actions, but this means that we are relying on the system's honest reporting of its internal states. At the other end of the spectrum, we have complete white-box systems, for which we can be fully confident that we can get what the system intends, believes, knows, etc. at any point in time. Verifying such systems is possible, in theory at least. While we don't know what kind of shape autonomous systems will take and where they will fall on this spectrum, we can explicitly list the information we need from such systems before we can start the verification process. One such specification follows.

Gray Box Requirement
Given any autonomous system a, at any point of time t, we should at least be able to assert the following, if true, in order to verify that it is DDE-compliant:
1. The system's intentions: (¬)I(a, t, φ)
2. Prohibitions: ¬O(a, t, σ, ¬φ)
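A schematic rendering of the gray-box requirement as a programming interface follows; the class and method names are our own illustrative choices, not part of the formalization.

```python
from abc import ABC, abstractmethod

class GrayBoxAgent(ABC):
    """Minimal interface an autonomous system could expose so that an external
    DDE layer can check conditions F1-F4 (a sketch; names are illustrative)."""

    @abstractmethod
    def intends(self, t, formula) -> bool:
        """Report whether I(a, t, formula) holds for this system at time t."""

    @abstractmethod
    def prohibited(self, t, situation, action) -> bool:
        """Report whether performing `action` is prohibited for the system
        in `situation` at time t (the O-based clause F1 relies on this)."""

    @abstractmethod
    def effects(self, t, action):
        """Return the fluents the action initiates and terminates at time t,
        so the verifier can evaluate net utility (F2) and means (F4)."""
```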
How would we go about applying the formalization to other formal systems? We very briefly sketch two examples.

STRIPS-like Planner
We first look at a STRIPS-style planning system. Briefly, a STRIPS-style planner has a set of actions {a_i} and a set of states {s_i}. The states are nothing but sets of formulae or atoms; the individual formulae would be our effects. Each action a has a set of preconditions pre(a), a set of formulae that should hold in a given state for that action to be executable in that state. After executing an action a in a state s, the new state is given by s ∪ additions(a) − deletions(a). The planner is given an explicit goal φ, which means that we know (¬)I(a, t, φ) trivially. If we also have an ethical hierarchy for the available set of actions, we then satisfy the gray-box requirement. What is then needed is a definition of ⊲, an effect used as a means for another effect, and the formalism gives us one possible way to define it. A plan ρ is nothing but a sequence of actions. Given a plan ρ, we say an effect e1 is used as a means for another effect e2 if e1 ∈ pre(a1), e2 ∈ additions(a2), a1 and a2 are actions in the plan, and a1 comes before a2. A sketch of this check is given below.
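Here is a minimal Python sketch of that plan-level means check. The representation of actions as dataclasses with precondition/addition/deletion sets, and the trolley-flavored example plan, are our own illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class StripsAction:
    name: str
    pre: frozenset = frozenset()        # preconditions
    additions: frozenset = frozenset()  # add effects
    deletions: frozenset = frozenset()  # delete effects

def used_as_means(plan, e1, e2):
    """e1 is used as a means for e2 in `plan` (a list of StripsAction) iff some
    action a1 in the plan has e1 among its preconditions and a strictly later
    action a2 adds e2 -- the plan-level reading of ⊲ sketched above."""
    for i, a1 in enumerate(plan):
        if e1 in a1.pre and any(e2 in a2.additions for a2 in plan[i + 1:]):
            return True
    return False

# Illustrative trolley-style plan: the harmful effect on_track(P3) is a
# precondition of blocking the trolley, which in turn enables saving P1 and P2.
push  = StripsAction("push_P3", additions=frozenset({"on_track(P3)", "dead(P3)"}))
block = StripsAction("block_trolley", pre=frozenset({"on_track(P3)"}),
                     additions=frozenset({"trolley_stopped"}))
save  = StripsAction("reach_safety", pre=frozenset({"trolley_stopped"}),
                     additions=frozenset({"saved(P1)", "saved(P2)"}))

print(used_as_means([push, block, save], "on_track(P3)", "saved(P1)"))  # True
```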
POMDP-derived System
Partially observable Markov decision process (POMDP) models have been quite successful in a large number of domains, and it is highly likely that some of the first autonomous systems will be based on POMDPs. We note that in such models the only goal is to maximize a reward function. Another issue is that states are atomic; in order to discern between good and bad effects, we would need states to be decomposed into smaller components. One possible approach could use factored Markov decision processes, which are MDPs in which states are represented as a mapping m from a set of state variables Θ = {s1, s2, ..., sn} to a set of values V. Here the utility and reward function could be defined on the assignments; i.e., reward(s) = Σ_i μ(s_i ↦ ν), where μ assigns a utility value to a particular assignment of a state variable. Additionally, the formalism could specify one or more goal states that the model seeks to attain while maximizing the reward along the way, giving us (¬)I(a, t, φ).

9 Hierarchies of Doctrines

Our formalization, summarized in the equation below, gives rise to multiple hierarchies of the doctrine. We discuss some of these hierarchies below.

  DDE(Γ, σ, a, α, t, H) ⇔ F1 ∧ F2 ∧ F3 ∧ F4

Horizon  One obvious knob in the above equation is the horizon H. Increasing H gives us stronger versions of the doctrine. Since our formalization is in first-order modal logic, the horizon need not be finite: we could set the horizon to infinity, H = ω, and still obtain a tractable model, as long as we carefully develop our formalization.⁹

Footnote 9: It is a well-known fundamental result that first-order logic can handle infinite models with a finite number of axioms; see e.g. Ch. 12 in [Boolos et al., 2003].

Agent Generality  Instead of just checking whether an action at a given time is DDE-compliant, we could ask whether an autonomous agent a in a given situation σ will be DDE-compliant at all times. This gives us the following condition:

  ∀ α : ActionType, t : Moment.  DDE(Γ, σ, a, α, t, H)

Situation Generality  In the hierarchy above, the quantification was over objects. We could instead ask whether an autonomous agent would be DDE-compliant in all situations. That would correspond to a quantification over formulae (see the formula immediately below), something not supported in the version of DCEC used herein.

  ∀ σ : Formula, α : ActionType, t : Moment.  DDE(Γ, σ, a, α, t, H)

Counterfactual Reasoning  The presence of counterfactual reasoning in F5 would correspond to a very strong version of the doctrine, but one that would also be very hard to automate in the general case. We note that there are hierarchies of counterfactual reasoning (see [Pollock, 1976]) that could correspond to hierarchies of versions of DDE.

10 Conclusion

We now quickly summarize the chief contributions of the foregoing, and end by presenting future lines of work. Our primary contribution is the presentation of a novel computational logic, or cognitive calculus, in which important versions of DDE are formalized. As a part of this calculus, we formalized an effect being used as a means for another effect via the modal operator ⊲. We also supplied an informal but rigorous version, C1-C4, of the doctrine itself, from which we built our formalization F1-F4. Included in this formalization is the clause C5/F5, which requires subjunctive and counterfactual reasoning, an aspect that hitherto has simply not been considered in any systematic treatment of DDE. Our formalization subsumes the doctrine of triple effect, DTE, and we have achieved the first computational simulations of the doctrine. A byproduct of these simulations is an event-calculus formalization of a demanding class of trolley problems (widely used in empirical and philosophical studies of ethics). We noted that our formalization gives rise to hierarchies of doctrines with varying strengths; our readers can choose a doctrine of a particular strength that fits their needs.

Future work includes simulating more intricate "ethically thorny" scenarios. Despite our progress, we note that our formalization is devoid of any mechanisms for handling uncertainty, and we are in the process of extending our work to include reasoning based on probabilistic versions of DCEC.¹⁰ We also note that we have not said much about how our formalization could interact with an autonomous learning agent; we observe that even the possibility that such an intricate principle as DDE/DTE is learnable using existing learning frameworks remains open to question [Arnold et al., 2017]. In the short term, a guaranteed-to-be-fruitful but less ambitious area of development will be the deployment of our mechanization of DDE in existing systems, and adapting existing formal models, as briefly discussed above, to exploit this mechanization. Finally, we note that since we are using first-order (multi) modal logic, we will eventually run into efficiency issues, as even vanilla first-order logic's decision problem, Γ ⊢ γ, is Turing-undecidable. There are a number of techniques to mitigate this issue. One approach is to exploit a library of commonly used proof patterns codified in a denotational proof language; see [Arkoudas and Bringsjord, 2008]. We are cautiously optimistic, as many formal enterprises outside of AI (e.g. software verification [Khasidashvili et al., 2009] and formal physics [Stannett and Németi, 2014]) routinely face such challenges and surmount them.

Acknowledgements
We are grateful to the Office of Naval Research for funding that enabled the research presented in this paper. We also thank Dr. Daniel Thero for reading a draft of the paper and providing valuable feedback.
We are also grateful for the insightful reviews provided by the five anonymous referees.

Footnote 10: There exist probabilistic versions of the event calculus; we will leverage similar work.

A Deontic Cognitive Event Calculus

We provide here a short primer on the deontic cognitive event calculus (DCEC). A calculus is a set of axioms in a formal logic. For example, the event calculus is a set of axioms couched in first-order logic. DCEC is a set of axioms in sorted first-order modal logic (also known as sorted quantified modal logic) that subsumes the event calculus.

While first-order logic is an extensional system, modal logics are intensional systems. Note that there is a profound difference between intension and intention. One can have an intention to bring something about; this is traditionally captured by particular intensional operators. In other words, put concretely, the intention operator I is an intensional operator, but so is D for desire, B for believes, and P for perceives, etc. DCEC is intensional in the sense that it includes intensional operators. Unfortunately, the situation is further confused by the fact that traditionally in philosophy of mind, intentionality means the so-called "aboutness" of some mental states, so that my belief that Melbourne is beautiful is in this sense intentional, while my mental state has nothing to do with intending something. Most logicians working in formal intensional systems believe that at least intensional logic is required to formalize intentional states [Zalta, 1988]. One simple reason is that using plain first-order logic leads to unsound inferences, as shown below. In the inference below, we have an agent a that knows that the killer in a particular situation is the person that owns the knife. Agent a does not know that Moe is the killer, but it is true that Moe is the owner of the knife. If the knowledge operator K is a simple first-order predicate, we get the proof shown below, which produces a contradiction from sound premises. See [Bringsjord and Govindarajulu, 2012] for a sequence of stronger representation schemes in first-order logic for knowledge and belief that still result in inconsistencies.

Modeling Knowledge (or any Intension) in First-order Logic
1. K(a, Killer(owner(knife)))     [given]
2. ¬K(a, Killer(Moe))             [given]
3. Moe = owner(knife)             [given]
4. K(a, Killer(Moe))              [first-order inference from 3 and 1]
5. ⊥                              [first-order inference from 4 and 2]

A.1 Syntax of Deontic Cognitive Event Calculus

DCEC is a sorted calculus. A sorted system can be thought of as analogous to a typed single-inheritance programming language. We show below some of the important sorts used in DCEC. Among these, the Agent, Action and ActionType sorts are not native to the event calculus.

  Sort        Description
  Agent       Human and non-human actors.
  Time        Stands for time in the domain; e.g. simple, such as t_i, or complex, such as birthday(son(jack)).
  Event       Used for events in the domain.
  ActionType  Action types are abstract actions. They are instantiated at particular times by actors. Example: eating.
  Action      A subtype of Event for events that occur as actions by agents.
  Fluent      Used for representing states of the world in the event calculus.

The figures below show the syntax and inference schemata of DCEC. The syntax is quantified modal logic. Commonly used function and relation symbols of the event calculus are included.
In particular, note the following modal operators: P for perceiving a state, K for knowledge, B for belief, C for common knowledge, S for agent-to-agent communication and public announcements, D for desire, I for intention, and finally and crucially, a dyadic deontic operator O that states when an action is obligatory or forbidden for agents. It should be noted that DCEC is one specimen in a family of easily extensible cognitive calculi. Since the semantics of DCEC is proof-theoretic, as long as a new construct has appropriate inference schemata, the extension is sanctioned.

Syntax

  S ::= Object | Agent | ActionType | Action ⊑ Event | Moment | Formula | Fluent

  f ::= action : Agent × ActionType → Action
      | initially : Fluent → Formula
      | holds : Fluent × Moment → Formula
      | happens : Event × Moment → Formula
      | clipped : Moment × Fluent × Moment → Formula
      | initiates : Event × Fluent × Moment → Formula
      | terminates : Event × Fluent × Moment → Formula
      | prior : Moment × Moment → Formula

  t ::= x : S | c : S | f(t1, ..., tn)

  φ ::= t : Formula | ¬φ | φ ∧ ψ | φ ∨ ψ
      | P(a, t, φ) | K(a, t, φ) | C(t, φ) | S(a, b, t, φ) | S(a, t, φ)
      | B(a, t, φ) | D(a, t, Holds(f, t′)) | I(a, t, φ)
      | O(a, t, φ, (¬)happens(action(a∗, α), t′))

The figure below shows the inference schemata for DCEC. RK and RB are inference schemata that let us model idealized agents whose knowledge and belief are closed under the DCEC proof theory. While normal humans are not deductively closed, this lets us more closely model how deliberate agents such as organizations and more strategic actors reason. (Some dialects of cognitive calculi restrict the number of iterations on intensional operators.) R1 and R2 state, respectively, that it is common knowledge that perception leads to knowledge, and that it is common knowledge that knowledge leads to belief. R3 lets us expand out common knowledge as unbounded iterated knowledge. R4 states that knowledge of a proposition implies that the proposition holds. R5 to R10 provide a more restricted form of reasoning for propositions that are common knowledge, unlike propositions that are known or believed. R12 states that if an agent s communicates a proposition φ to h, then h believes that s believes φ. R14 dictates how obligations get translated into intentions.

Inference Schemata (rules with premises are written premises ⟹ conclusion; the rest are axiom schemata)

  [RK]   K(a, t1, Γ),  Γ ⊢ φ,  t1 ≤ t2   ⟹   K(a, t2, φ)
  [RB]   B(a, t1, Γ),  Γ ⊢ φ,  t1 ≤ t2   ⟹   B(a, t2, φ)
  [R1]   C(t, P(a, t, φ) → K(a, t, φ))
  [R2]   C(t, K(a, t, φ) → B(a, t, φ))
  [R3]   C(t, φ),  t ≤ t1, ..., t ≤ tn   ⟹   K(a1, t1, ... K(an, tn, φ) ...)
  [R4]   K(a, t, φ)   ⟹   φ
  [R5]   C(t, K(a, t1, φ1 → φ2)) → K(a, t2, φ1) → K(a, t3, φ2)
  [R6]   C(t, B(a, t1, φ1 → φ2)) → B(a, t2, φ1) → B(a, t3, φ2)
  [R7]   C(t, C(t1, φ1 → φ2)) → C(t2, φ1) → C(t3, φ2)
  [R8]   C(t, ∀x. φ → φ[x ↦ t])
  [R9]   C(t, φ1 ↔ φ2 → ¬φ2 → ¬φ1)
  [R10]  C(t, [φ1 ∧ ... ∧ φn → φ] → [φ1 → ... → φn → ψ])
  [R12]  S(s, h, t, φ)   ⟹   B(h, t, B(s, t, φ))
  [R13]  I(a, t, happens(action(a∗, α), t′))   ⟹   P(a, t, happens(action(a∗, α), t))
  [R14]  B(a, t, φ),  B(a, t, O(a, t, φ, χ)),  O(a, t, φ, χ)   ⟹   K(a, t, I(a, t, χ))

References
[Allsopp, 2011] Michael E.
Allsopp. The Doctrine of Double Effect in US Law: Exploring Neil Gorsuchs Analyses. The National Catholic Bioethics Quarterly , 11(1):31–40, 2011. [Arkoudas and Bringsjord, 2008] Konstantine Arkoudas and Selmer Bringsjord. Toward Formalizing Common-Sense Psy- chology: An Analysis of the False-Belief Task. In T.-B. Ho and Z.-H. Zhou, editors, Proceedings of the Tenth Pacific Rim International Conference on Artificial Intelligence (PRICAI 2008) , number 5351 in Lecture Notes in Artificial Intelligence (LNAI), pages 17–29. Springer-Verlag, 2008. [Arnold et al. , 2017] Thomas Arnold, Daniel Kasenberg, and Matthias Scheutz. Value Alignment or Misalignment–What Will Keep Systems Accountable?, 2017. Presented at the 3rd Interna- tional Workshop on AI, Ethics and Society at AAAI 2017. [Bentzen, 2016] Martin Mose Bentzen. The Principle Of Double Effect Applied to Ethical Dilemmas of Social Robots. Frontiers in Artificial Intelligence and Applications , 2016. [Berreby et al. , 2015] Fiona Berreby, Gauvain Bourgne, and Jean- Gabriel Ganascia. Modelling Moral Reasoning and Ethical Re- sponsibility with Logic Programming. In Logic for Program- ming, Artificial Intelligence, and Reasoning , pages 532–548. Springer, 2015. [Boolos et al. , 2003] George S. Boolos, John P. Burgess, and Richard C. Jeffrey. Computability and Logic (Fifth Edition) . Cambridge University Press, Cambridge, UK, 2003. [Bringsjord and Govindarajulu, 2012] Selmer Bringsjord and Naveen Sundar Govindarajulu. Given the Web, What is Intelligence, Really? Metaphilosophy , 43(4):361–532, 2012. [Bringsjord and Taylor, 2012] Selmer Bringsjord and Joshua Tay- lor. The Divine-Command Approach to Robot Ethics. In P. Lin, G. Bekey, and K. Abney, editors, Robot Ethics: The Ethical and Social Implications of Robotics , pages 85–108. MIT Press, Cam- bridge, MA, 2012. [Bringsjord et al. , 2014] Selmer Bringsjord, Naveen Sundar Govin- darajulu, Daniel Thero, and Mei Si. Akratic Robots and the Com- putational Logic Thereof. In Proceedings of ETHICS • 2014 (2014 IEEE Symposium on Ethics in Engineering, Science, and Technology) , pages 22–29, Chicago, IL, 2014. IEEE Catalog Number: CFP14ETI-POD. [Bringsjord, 2017] Selmer Bringsjord. A 21st-Century Ethical Hi- erarchy for Robots and Persons: EH . In A World with Robots: International Conference on Robot Ethics: ICRE 2015 , vol- ume 84, page 47. Springer, 2017. [Cushman et al. , 2006] Fiery Cushman, Liane Young, and Marc Hauser. The Role of Conscious Reasoning and Intuition in Moral Judgment Testing Three Principles of Harm. Psychological sci- ence , 17(12):1082–1089, 2006. [Feldman, 1978] Fred Feldman. Introductory Ethics . Prentice-Hall, Englewood Cliffs, NJ, 1978. [Fletcher, 1998] George P. Fletcher. Basic Concepts of Criminal Law . Oxford University Press, 1998. [Francez and Dyckhoff, 2010] Nissim Francez and Roy Dyckhoff. Proof-theoretic Semantics for a Natural Language Fragment. Lin- guistics and Philosophy , 33:447–477, 2010. [Gentzen, 1935] Gerhard Gentzen. Investigations into Logical De- duction. In M. E. Szabo, editor, The Collected Papers of Gerhard Gentzen , pages 68–131. North-Holland, Amsterdam, The Nether- lands, 1935. This is an English version of the well-known 1935 German version. [Hauser et al. , 2007] Marc Hauser, Fiery Cushman, Liane Young, R Kang-Xing Jin, and John Mikhail. A Dissociation Between Moral Judgments and Justifications. Mind & Language , 22(1):1– 21, 2007. [Huxtable, 2004] Richard Huxtable. Get Out Of Jail Free? The Doctrine Of Double Effect In English Law. 
Palliative Medicine , 18(1):62–68, 2004. [Kamm, 2007] Frances Myrna Kamm. Intricate Ethics: Rights, Re- sponsibilities, And Permissible Harm . Oxford University Press, New York, New York, 2007. [Khasidashvili et al. , 2009] Zurab Khasidashvili, Mahmoud Ki- nanah, and Andrei Voronkov. Verifying Equivalence of Memo- ries Using a First Order Logic Theorem Prover. In Formal Meth- ods in Computer-Aided Design, 2009. FMCAD 2009 , pages 128– 135. IEEE, 2009. [Khatchadourian, 1988] Haig Khatchadourian. Is the Principle of Double Effect Morally Acceptable? International Philosophical Quarterly , 28(1):21–30, 1988. [Malle et al. , 2015] Bertram F. Malle, Matthias Scheutz, Thomas Arnold, John Voiklis, and Corey Cusimano. Sacrifice One for the Good of Many?: People Apply Different Moral Norms to Human and Robot Agents. In Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-robot Interac- tion , pages 117–124. ACM, 2015. [McIntyre, 2014] Alison McIntyre. Relevance Logic. In Ed- ward Zalta, editor, The Standford Encyclopedia of Philosophy . September, 2014 edition, 2014. [McNamara, 2010] Paul McNamara. Deontic Logic. In Edward Zalta, editor, The Stanford Encyclopedia of Philosophy . 2010. McNamara’s (brief) note on a paradox arising from Kant’s Law is given in an offshoot of the main entry. [Mikhail, 2011] John Mikhail. Elements of Moral Cognition: Rawls’ Linguistic Analogy and the Cognitive Science of Moral and Legal Judgment . Cambridge University Press, Cambridge, UK, 2011. Kindle edition. [Mueller, 2006] Erik T. Mueller. Commonsense Reasoning: An Event Calculus Based Approach . Morgan Kaufmann, San Fran- cisco, CA, 2006. This is the first edition of the book. The second edition was published in 2014. [Pereira and Saptawijaya, 2016a] Lu ́ ıs Moniz Pereira and Ari Saptawijaya. Counterfactuals, Logic Programming and Agent Morality. In Shahid Rahman and Juan Redmond, editors, Logic, Argumentation and Reasoning , pages 85–99. Springer, 2016. [Pereira and Saptawijaya, 2016b] Lu ́ ıs Moniz Pereira and Ari Saptawijaya. Programming Machine Ethics . Springer, Basel, Switzerland, 2016. This book is in Springer’s SAPERE series, Vol. 26. [Pollock, 1976] John Pollock. Subjunctive Reasoning . D. Reidel, Dordrecht, Holland & Boston, USA, 1976. [Rao and Georgeff, 1991] Anand S. Rao and Michael P. Georgeff. Modeling Rational Agents Within a BDI-architecture. In R. Fikes and E. Sandewall, editors, Proceedings of Knowledge Represen- tation and Reasoning (KR&R-91) , pages 473–484, San Mateo, CA, 1991. Morgan Kaufmann. [Stannett and N ́ emeti, 2014] Mike Stannett and Istv ́ an N ́ emeti. Us- ing Isabelle/HOL to Verify First-order Relativity Theory. Journal of Automated Reasoning , 52(4):361–378, 2014. [Zalta, 1988] Edward N. Zalta. Intensional Logic and the Meta- physics of Intentionality . MIT Press, Cambridge, MA, 1988.