The Inverse Task of the Reflexive Game Theory: Theoretical Matters, Practical Applications and Relationship with Other Issues

Sergey Tarasenko
Kyoto University, Yoshida honmachi, Kyoto 606-8501, Japan
infra.core@gmail.com

Abstract. The Reflexive Game Theory (RGT) has been recently proposed by Vladimir Lefebvre to model the behavior of individuals in groups. The goal of this study is to introduce the Inverse task. We consider methods of solution together with practical applications. We present a brief overview of the RGT for easy understanding of the problem. We also develop a schematic representation of the RGT inference algorithms to create the basis for software and hardware solutions of the RGT tasks. We propose a unified hierarchy of schemas to represent humans and robots. This hierarchy is considered as a unified framework to solve the entire spectrum of the RGT tasks. We conclude by illustrating how this framework can be applied to modeling mixed groups of humans and robots. Altogether this provides an exhaustive solution of the Inverse task and clearly illustrates its role and relationships with other issues considered in the RGT.

Key words: Reflexive Game Theory (RGT), group behavior, society behavior, RGT Forward Task, RGT Inverse Task, Asimov's Laws of Robotics, robots in RGT, mixed groups of humans and robots, human-robot societies

1 Introduction

The Reflexive Game Theory (RGT) has been entirely developed by Lefebvre [1, 2] and is based on the principles of anti-selfishness, or egoism forbiddenness [1, 2], and on human reflexion processes [3]. Therefore the RGT is based on human-like decision-making processes. The main goal of the theory is to model the behavior of individuals in groups. The theory makes it possible to predict the choices that are likely to be made by each individual in the group, and to influence each individual's decision-making so that this individual makes a certain choice.
In particular, the RGT can be used to predict terrorists' behavior [4]. In general, the RGT is a simple tool to predict the behavior of individuals and to influence individuals' choices. Therefore it makes it possible to control the individuals in a group by guiding their behavior (decision-making, choices) by means of the corresponding influences.

arXiv:1011.3397v1 [cs.MA] 15 Nov 2010

On the other hand, nowadays robots have become an essential part of our life. One of the purposes robots serve is to substitute for human beings in dangerous situations and environments, such as defusing a bomb or working in radioactive zones.

In contrast, human nature shows strong inclinations towards risky behavior, which can cause not only injuries but can even threaten human life. The reasons for such behavior range from the irresponsible behavior of children to the necessity of finding a solution in a critical situation. In such a situation, a robot should fulfill the function of restraining humans from risky actions and perform the risky action itself, if needed. However, robots are forbidden to force people physically; instead they must convince people on the mental level to refrain from a risky action. This method is more effective than simple physical compulsion, because humans make the decisions (choices) themselves and treat these decisions as their own. Such a technique is called reflexive control [3].

The task of finding an appropriate reflexive control is closely related to the Inverse task, in which we need to find a suitable influence of one subject, or of a group of subjects, on the subject of interest. Therefore, a framework for solving the Inverse task needs to be developed. This is the primary goal of this study.

However, for better understanding of the gist of the Inverse task and its intrinsic relationships with other issues of the RGT, we introduce the entire spectrum of the tasks that can be solved by the RGT.
This forms the scope of inference algorithms used in the RGT. We present the RGT algorithms in the form of schemas of control systems that can be directly applied to the development of software and/or hardware solutions. Based on these control schemas, we develop a hierarchy of control systems for an abstract individual (including a human subject) and a robotic agent (robot). Finally, we illustrate the application of the Inverse task together with other RGT inference algorithms to model a robot's behavior in mixed groups of humans and robots.

2 Brief Overview of the Reflexive Game Theory (RGT)

2.1 Representation of groups: graphs, polynomials and stratification tree

The RGT deals with groups of abstract subjects (individuals, humans, autonomous agents, etc.). Each subject is assigned a unique variable (subject variable). Any group of subjects is represented as a fully connected graph, which is called a relationship graph. Each vertex of the graph corresponds to a single subject, so the number of vertices equals the number of subjects in the group. Each vertex is named after the corresponding subject variable.

The RGT uses set theory and Boolean algebra as the basis for its calculus. Therefore the values of subject variables are elements of a Boolean algebra.

Any two subjects in the group are either in alliance or in conflict. The relationships are identified as a result of group macroanalysis. It is assumed that the established relationships can be changed. The relationships are depicted by the edges (ribs) of the graph: solid edges correspond to alliance, dashed edges to conflict. For mathematical analysis, alliance is treated as the conjunction (multiplication) operation ($\cdot$), and conflict as the disjunction (summation) operation ($+$).

The graph presented in Fig.
1a or any graph containing a sub-graph isomorphic to this graph is not decomposable. In this case, subjects are excluded from the group one by one until the graph becomes decomposable. The exclusion is done according to the importance of the other subjects for a particular one [1, 2]. All other fully connected graphs are decomposable. Any decomposable graph can be presented in the analytical form of a corresponding polynomial. Any relationship graph of three subjects is decomposable (see [1, 2]).

Consider three subjects $a$, $b$ and $c$. Let subject $a$ be in alliance with the other subjects, while subjects $b$ and $c$ are in conflict (Fig. 1b). The polynomial corresponding to this graph is $a(b + c)$.

[Fig. 1. The relationship graphs.]

[Fig. 2. Polynomial Stratification Tree: the root $[a(b + c)]$ splits into $[a] \cdot [b + c]$, and $[b + c]$ splits into $[b] + [c]$. Polynomials $[a]$, $[b]$ and $[c]$ are elementary polynomials.]

With respect to a certain relationship, the polynomial can be stratified (decomposed) into sub-polynomials [1, 2]. Each sub-polynomial belongs to a particular level of stratification. If the stratification with respect to alliance is built first, then the stratification with respect to conflict is implemented at the next step. The stratification procedure terminates when elementary polynomials, each containing a single variable, are obtained after a certain stratification step. The result of stratification is the Polynomial Stratification Tree (PST). It has been proved that each non-elementary polynomial can be stratified in a unique way, i.e., each non-elementary polynomial has only one corresponding PST (see [7] regarding the one-to-one correspondence between graphs and polynomials). Each higher level of the tree contains polynomials simpler than the ones on the lower level. For the purpose of stratification the polynomials are written in square brackets. The PST for the polynomial $a(b + c)$ is presented in Fig. 2.
Next, we omit the branches of the PST and write the sub-polynomials of each non-elementary polynomial in its top right corner. The resulting tree-like structure is called a diagonal form [1, 2, 5, 6]. Consider the diagonal form corresponding to the PST in Fig. 2:

$$[a(b + c)]^{[a]\,[b + c]^{[b] + [c]}}.$$

Hereafter, the diagonal form is considered as a function defined on the set of all subsets of the universal set. The universal set contains the elementary actions, for example, actions $\alpha$ and $\beta$. By definition, the Boolean algebra of this universal set includes four elements: $1 = \{\alpha, \beta\}$, $\{\alpha\}$, $\{\beta\}$ and the empty set $0 = \{\} = \varnothing$. These elements are all the possible subsets of the universal set and are considered as the alternatives that each subject can choose. The alternative $0 = \{\}$ is interpreted as an inactive or idle state. In general, the Boolean algebra consists of $2^n$ alternatives if the universal set contains $n$ actions.

According to the definition given by Lefebvre [5], the exponential operation is defined by the formula

$$P^W = P + \overline{W}, \tag{1}$$

where $\overline{W}$ stands for the negation of $W$ [1, 2, 4]. This exponential operation is used to fold the diagonal form. During the folding, round and square brackets are considered to be interchangeable. The following equalities also hold: $x + \overline{x} = 1$, $x + 0 = x$ and $x + 1 = 1$. Next we fold the diagonal form of the polynomial $a(b + c)$:

$$[a(b + c)]^{[a]\,[b + c]^{[b] + [c]}} = [a(b + c)]^{[a]\left([b + c] + \overline{[b] + [c]}\right)} = a(b + c) + \overline{a}.$$

The last step uses $[b + c] + \overline{[b] + [c]} = 1$, so the exponent reduces to $a$.

It is considered that the levels of the PST represent different processing levels of a natural or artificial cognitive system. Each level is considered as an image. The root of the tree is the input into the cognitive system and can therefore be considered as the image of the world (the environment, including self and others) perceived by the subject. As follows from the PST, there is a hierarchy of images, each corresponding to a particular cognitive level.
During processing along this hierarchy in the bottom-up manner, the image on the lower level undergoes simplification by means of decomposition into simpler parts on the higher level. These parts are considered to be the images of the image on the previous level. Therefore, the images on the second level are different representations of the original image of the world. This procedure repeats until we obtain elementary parts (elementary polynomials) [1, 2].

On the other hand, the PST folding procedure can be regarded as a top-down integration of simpler images from the higher levels. Therefore, the stratification of the original polynomial together with the folding of the diagonal form illustrates the interplay of bottom-up and top-down information processes, which are widely employed in biological [8, 9, 10, 11] and artificial [12, 13, 14] information processing systems. The idea of hierarchical structure is highly coherent with the hierarchical organization of the majority of natural (inanimate objects) and biological (living creatures) entities. Furthermore, it has been shown that hierarchical structure is intrinsic to the relationships in societies of insects [15], animals [16, 17, 18] and human beings.

Therefore the hierarchical representation of a group in the form of the PST corresponds to extraction of the hierarchical structure of the given group, while the fusion of the PST and its diagonal form with the folding procedure closely resembles the way information is processed within a single independent cognitive system, as discussed above. Thus, the RGT employs the fundamental principles of hierarchical organization on both the group level (reflecting the structure of groups) and the individual level (illustrating information processing within the independent cognitive system of a single unit). This makes the RGT a universal tool that smoothly bridges the gap between representation and analysis.
2.2 The Decision Equation: definition and solution

The goal of each subject in a group is to choose an alternative from the set of alternatives under consideration. To obtain the choice of each subject, we consider the decision equations, which contain the subject variable on the left-hand side and the result of folding the diagonal form on the right-hand side:

$$a = (b + c)a + \overline{a}$$
$$b = (b + c)a + \overline{a}$$
$$c = (b + c)a + \overline{a}$$

To find the solutions of the decision equations, we consider the following equation:

$$x = Ax + B\overline{x}, \tag{2}$$

where $x$ is the subject variable, and $A$ and $B$ are some sets. Eq. (2) represents the canonical form of the decision equation. This equation has a solution if and only if the set $B$ is contained in the set $A$: $A \supseteq B$. If this requirement is satisfied, then eq. (2) has at least one solution from the interval $A \supseteq x \supseteq B$ [4]. Otherwise, the decision equation has no solution, and it is considered that the subject cannot make a decision. In such a situation, the subject is in a frustration state.

Therefore, to find the solutions of a decision equation, one should first transform it into the canonical form. Out of the three equations presented above, only the decision equation for subject $a$ is in the canonical form, while the other two should be transformed. We consider the explicit transformation only for the decision equation of subject $b$ [20]:

$$\begin{aligned}
a(b + c) + \overline{a} &= ab + ac + \overline{a} \\
&= ab + (ac + \overline{a})b + (ac + \overline{a})\overline{b} \\
&= (a + \overline{a} + ac)b + (ac + \overline{a})\overline{b} \\
&= (1 + ac)b + (ac + \overline{a})\overline{b} \\
&= b + (ac + \overline{a})\overline{b} \\
&= b + (ac + \overline{a}c + \overline{a})\overline{b} \\
&= b + (c + \overline{a})\overline{b}.
\end{aligned}$$

Therefore,

$$b = b + (c + \overline{a})\overline{b}. \tag{3}$$

The transformation of the equation for subject $c$ can be derived by analogy: $c = c + (b + \overline{a})\overline{c}$.

Next we consider two tasks that can be formulated with respect to the decision equation in the canonical form, and provide methods to solve each task.
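The canonical form and its solution interval can be checked mechanically on the four-element Boolean algebra used in this paper. The following Python sketch is illustrative only: the names `UNIVERSE`, `neg`, `alternatives`, `canonical_coefficients` and `solve_decision` are assumptions of this example, not part of the RGT literature. The coefficients $A$ and $B$ are read off by Boole's expansion $f(x) = f(1)x + f(0)\overline{x}$ rather than by the step-by-step algebraic transformation shown above.

```python
from itertools import combinations

# Universal set of elementary actions; "alpha" and "beta" are placeholder names.
UNIVERSE = frozenset({"alpha", "beta"})
ONE, ZERO = UNIVERSE, frozenset()


def neg(x):
    """Negation: complement with respect to the universal set."""
    return UNIVERSE - x


def alternatives():
    """All subsets of the universal set, i.e. every alternative."""
    items = sorted(UNIVERSE)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def canonical_coefficients(rhs):
    """Return (A, B) with rhs(x) = A x + B ~x, using Boole's expansion."""
    return rhs(ONE), rhs(ZERO)


def solve_decision(rhs):
    """Return all x with A >= x >= B; an empty list models frustration."""
    A, B = canonical_coefficients(rhs)
    if not A >= B:  # B is not contained in A: no solution exists
        return []
    return [x for x in alternatives() if A >= x >= B]


# Hypothetical influences for the group a(b + c): a = {alpha}, c = {beta}.
a, c = frozenset({"alpha"}), frozenset({"beta"})
rhs_b = lambda x: ((x | c) & a) | neg(a)  # b = (b + c)a + ~a
solutions_b = solve_decision(rhs_b)
```

With these assumed influences the sketch yields $A = 1$ and $B = \{\beta\}$ for subject $b$, so `solutions_b` contains the alternatives $\{\beta\}$ and $1$. Every returned value can also be verified directly by substituting it back into the decision equation.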
2.3 The Forward Task

The variable on the left-hand side of the decision equation in canonical form is the variable of the equation, while the other variables are considered as influences on the subject from the other subjects. The Forward task is formulated as the task of finding the possible choices of a subject of interest when the influences on this subject from the other subjects are given.

After transformation of an arbitrary decision equation into its canonical form, the sets $A$ and $B$ are functions of the other subjects' influences. For example, if we consider a group of subjects $a$, $b$, $c$, etc. together with the abstract representation of the decision equation in canonical form for subject $a$, the sets $A$ and $B$ are functions of the subject variables $b$, $c$, etc.:

$$a = A(b, c, \ldots)a + B(b, c, \ldots)\overline{a}. \tag{4}$$

In the case of only three subjects $a$, $b$ and $c$, $A(b, c, \ldots) = A(b, c)$ and $B(b, c, \ldots) = B(b, c)$.

All the influences are presented in the influence matrix (Table 1). The main diagonal of the influence matrix contains the subject variables. The rows of the matrix represent the influences of the given subject on the other subjects, while the columns represent the influences of the other subjects on the given one. The influence values are used in the decision equations.

Table 1. Influence Matrix

       | $a$          | $b$          | $c$
$a$    | $a$          | $\{\alpha\}$ | $\{\beta\}$
$b$    | $\{\beta\}$  | $b$          | $\{\beta\}$
$c$    | $\{\beta\}$  | $\{\beta\}$  | $c$

For subject $a$: $a = (\{\beta\} + \{\beta\})a + \overline{a} \Rightarrow a = \{\beta\}a + \overline{a}$.

For subject $b$: $b = b + (\{\alpha\}\{\beta\} + \overline{\{\alpha\}})\overline{b} \Rightarrow b = b + \{\beta\}\overline{b}$.

For subject $c$: $c = c + (\{\beta\}\{\beta\} + \overline{\{\beta\}})\overline{c} \Rightarrow c = c + (\{\beta\} + \{\alpha\})\overline{c} \Rightarrow c = 1$.

The equation for subject $a$ does not have any solution, since the set $A = A(b, c) = \{\beta\}$ is strictly contained in the set $B = B(b, c) = 1$: $A \subset B$. Thus, subject $a$ cannot make any decision and is therefore considered to be in a frustration state.

The equation for subject $b$ has at least one solution, since $A = A(a, c) = 1 = \{\alpha, \beta\} \supseteq B = B(a, c) = \{\beta\}$. The solution belongs to the interval $1 \supseteq b \supseteq \{\beta\}$.
Therefore subject $b$ can choose any alternative of the Boolean algebra that contains the alternative $\{\beta\}$. These alternatives are $1 = \{\alpha, \beta\}$ and $\{\beta\}$.

The equation for subject $c$ turns into the equality $c = 1$. This is possible only when $A(a, b) \equiv B(a, b)$. Here $A = B = 1$.

2.4 The Inverse Task

In contrast to the Forward task, the Inverse task is formulated as the task of finding all the simultaneous (or joint) influences of all the subjects together on the subject of interest that result in the choice of a particular alternative or subset of alternatives. We call the subject of interest a controlled subject.

Let subject $a$ be a controlled subject and let $a^*$ be a fixed value, representing an alternative or subset of alternatives, which subjects $b$, $c$, etc. want subject $a$ to choose. We call the value $a^*$ a target choice. By substituting the fixed value $a^*$ for the subject variable $a$, we obtain the influence equation.

If we substitute the fixed value $a^*$ for the subject variable $a$ in the canonical form of the decision equation (eq. (4)), we obtain the canonical form of the influence equation:

$$a^* = A(b, c, \ldots)a^* + B(b, c, \ldots)\overline{a^*}. \tag{5}$$

For only three subjects $a$, $b$ and $c$, $A(b, c, \ldots) = A(b, c)$ and $B(b, c, \ldots) = B(b, c)$.

In contrast to the decision equation, which is an equation in a single variable, the influence equation is an equation in multiple variables. However, the number of variables of the influence equation is not a trivial question. In fact, the number of variables in the influence equation can be less than $(n - 1)$, where $n$ is the total number of subjects in the group: there are groups in which the sets $A$ and $B$ are functions of fewer than $(n - 1)$ variables (see Appendix A). Therefore the variables that are present in the influence equation are called effective variables.
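Whether a given variable is effective can be tested by brute force on a small Boolean algebra. The sketch below is only an illustration under stated assumptions: the helper `effective_variables` and the way $A$ and $B$ are passed as Python functions with keyword arguments are choices of this example, and the third function in the usage lines is a hypothetical expression chosen to show a non-effective variable, not one of the groups from Appendix A.

```python
from itertools import combinations, product

# Universal set of elementary actions; "alpha" and "beta" are placeholder names.
UNIVERSE = frozenset({"alpha", "beta"})


def alternatives():
    """All subsets of the universal set."""
    items = sorted(UNIVERSE)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def effective_variables(func, names):
    """Return the variable names whose value can change the result of func.

    func takes one keyword argument per name; a variable is reported as
    effective if, for some fixed values of the other variables, varying it
    changes the value of func.
    """
    alts = alternatives()
    effective = []
    for name in names:
        others = [n for n in names if n != name]
        for values in product(alts, repeat=len(others)):
            fixed = dict(zip(others, values))
            outputs = {func(**fixed, **{name: v}) for v in alts}
            if len(outputs) > 1:  # the result depends on this variable
                effective.append(name)
                break
    return effective


# Coefficients of a = (b + c)a + ~a, i.e. A(b, c) = b + c and B(b, c) = 1.
eff_A = effective_variables(lambda b, c: b | c, ["b", "c"])
eff_B = effective_variables(lambda b, c: UNIVERSE, ["b", "c"])
# Hypothetical expression b + bc, which reduces to b by absorption.
eff_H = effective_variables(lambda b, c: b | (b & c), ["b", "c"])
```

In this sketch `eff_A` lists both `b` and `c`, `eff_B` is empty because $B = 1$ is constant, and `eff_H` lists only `b`.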
The Inverse task is by definition (see footnote 1) formalized as the task of finding all the joint solutions of all subjects in the group, except for the controlled one, when the target choice is represented by the interval $\chi_1 \supseteq a^* \supseteq \chi_2$, where $\chi_1$ and $\chi_2$ are some sets and $\chi_1 \supset \chi_2$. In such a case, to solve the Inverse task, one should solve the system of influence equations:

$$A(b, c, \ldots) = \chi_1 \tag{6}$$
$$B(b, c, \ldots) = \chi_2 \tag{7}$$

Footnote 1: We need a system of influence equations because solutions of the influence equation $a^* = A(b, c, \ldots)a^* + B(b, c, \ldots)\overline{a^*}$ itself only guarantee that the original decision equation $a = A(b, c, \ldots)a + B(b, c, \ldots)\overline{a}$ turns into a true equality, but it is not guaranteed that these solutions are the only ones that turn the decision equation into a true equality.

If the target choice is a single alternative, then $\chi_1 = \chi_2 = a^*$. The solutions of the system (6-7) are considered as reflexive control strategies.

A particular Inverse task is characterized by two points. The first is whether the influence of a single subject or the joint influences of a group of subjects are required. The second is whether the target choice is a single alternative or an interval of alternatives. To illustrate these points, we introduce a particular group of subjects. Let subjects $a$ and $b$ be in alliance with each other and in conflict with subject $c$. The polynomial corresponding to this graph is $ab + c$. The diagonal form corresponding to this polynomial and its folding is

$$[ab + c]^{[ab]^{[a][b]} + [c]} = ab + c.$$

Therefore the decision equation for every subject in the group is

$$x = ab + c, \tag{8}$$

where $x$ can be any subject variable $a$, $b$ or $c$.

Influence of a single subject vs joint influences of a group. First we consider an example in which the influence of a single subject is required. Let subject $b$ make influence $\{\alpha\}$ and let $a^* = \{\alpha\}$.
Then we need to find the influences of the single subject $c$ that result in the solution $a^* = \{\alpha\}$ of the decision equation $a = ab + c$. The canonical form of this influence equation is $a^* = (b + c)a^* + c\overline{a^*}$. Since $a^* = \{\alpha\}$ and $\chi_1 = \chi_2 = \{\alpha\}$, we obtain the system of equations:

$$\{\alpha\} + c = \{\alpha\} \tag{9}$$
$$c = \{\alpha\} \tag{10}$$

Therefore, the straightforward solution of this system is $c = \{\alpha\}$. This simple example illustrates the very gist of the Inverse task: to find the appropriate influences that result in the target choice.

Next, we consider the case in which the influence of subject $b$ is not known. We then obtain the system

$$b + c = \{\alpha\} \tag{11}$$
$$c = \{\alpha\} \tag{12}$$

In this case, we need to find the values of variable $b$ which, together with $c$, result in the solution $a^* = \{\alpha\}$. In other words, we need to find all the pairs $(b, c)$ resulting in the solution $a^* = \{\alpha\}$. These pairs are the solutions of the system (11-12). With $c = \{\alpha\}$ fixed by eq. (12), we run through all the possible values of variable $b$ and check whether eq. (11) turns into a true equality:

$b = 1$: $1 + \{\alpha\} = 1 \Rightarrow 1 \neq \{\alpha\}$;
$b = \{\alpha\}$: $\{\alpha\} + \{\alpha\} = \{\alpha\} \Rightarrow \{\alpha\} = \{\alpha\}$;
$b = \{\beta\}$: $\{\beta\} + \{\alpha\} = 1 \Rightarrow 1 \neq \{\alpha\}$;
$b = 0$: $0 + \{\alpha\} = \{\alpha\} \Rightarrow \{\alpha\} = \{\alpha\}$.

Therefore, out of the four possible values of variable $b$, only the two values $\{\alpha\}$ and $0$ are appropriate. Thus, we obtain two pairs $(b, c)$: $(\{\alpha\}, \{\alpha\})$ and $(0, \{\alpha\})$.

A single target alternative vs interval of alternatives. In the previous examples we considered the target choice to be a single alternative. Here we illustrate the case in which the target choice is an interval. Let $b = \{\beta\}$ and $1 \supseteq a^* \supseteq \{\alpha\}$. To find the corresponding influences of subject $c$, we solve the system of equations:

$$\{\beta\} + c = 1 \tag{13}$$
$$c = \{\alpha\} \tag{14}$$

Again, we immediately obtain the solution of this system: $c = \{\alpha\}$.
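The examples of this section can be reproduced by exhaustive search over the Boolean algebra. The sketch below is a brute-force illustration, not the solution method developed in this paper: the function name `inverse_task`, its `fixed_b` parameter and the encoding of alternatives as Python `frozenset` objects are assumptions of this example. It enumerates all pairs $(b, c)$ that satisfy the system $A(b, c) = \chi_1$, $B(b, c) = \chi_2$ for the group $ab + c$ with controlled subject $a$, where $A(b, c) = b + c$ and $B(b, c) = c$.

```python
from itertools import combinations, product

# Universal set of elementary actions; "alpha" and "beta" are placeholder names.
UNIVERSE = frozenset({"alpha", "beta"})
ALPHA, BETA = frozenset({"alpha"}), frozenset({"beta"})
ONE, ZERO = UNIVERSE, frozenset()


def alternatives():
    """All subsets of the universal set."""
    items = sorted(UNIVERSE)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def inverse_task(A, B, chi1, chi2, fixed_b=None):
    """Return all pairs (b, c) with A(b, c) == chi1 and B(b, c) == chi2.

    If fixed_b is given, the influence of subject b is treated as known
    and only the influence of subject c is searched for.
    """
    b_values = alternatives() if fixed_b is None else [fixed_b]
    return [(b, c) for b, c in product(b_values, alternatives())
            if A(b, c) == chi1 and B(b, c) == chi2]


# Canonical coefficients of a* = (b + c)a* + c ~a* for the group ab + c.
A = lambda b, c: b | c
B = lambda b, c: c

single_subject = inverse_task(A, B, ALPHA, ALPHA, fixed_b=ALPHA)  # eqs. (9-10)
joint = inverse_task(A, B, ALPHA, ALPHA)                          # eqs. (11-12)
interval = inverse_task(A, B, ONE, ALPHA, fixed_b=BETA)           # eqs. (13-14)
```

Here `joint` contains the two pairs $(0, \{\alpha\})$ and $(\{\alpha\}, \{\alpha\})$, while `single_subject` and `interval` each contain one pair with $c = \{\alpha\}$. Exhaustive search is practical only for small universal sets, since the number of alternatives grows as $2^n$.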
In this section, we have formulated the Inverse task in general and considered its particular formalizations depending on the number of influences and on the form of the target choice. However, we do not yet have a method to solve an arbitrary influence equation. We address this problem in the next section.

3 How to Solve an Arbitrary Influence Equation

As an introduction to this section, we consider a fundamental proposition, which will be the cornerstone for solving influence equations.

Proposition 1. Let $P$ and $Q$ be some abstract sets. Then $P\overline{Q} + \overline{P}Q = 0 \Leftrightarrow P = Q$.

Proof. Necessity. Let $P\overline{Q} + \overline{P}Q = 0$, then

$$P\overline{Q} + \overline{P}Q = 0 \Rightarrow P\overline{Q} + \overline{P}Q + P = P \Rightarrow P + \overline{P}Q = P \Rightarrow P(Q + \overline{Q}) + \overline{P}Q = Q + P\overline{Q} + \overline{P}Q = P \Rightarrow Q = P.$$

Therefore if $P\overline{Q} + \overline{P}Q = 0$, then $P = Q$.

Sufficiency. Let $P = Q$, then $P\overline{P} + \overline{P}P = 0$. □

Now let us consider a new type of equation:

$$A_1 x + B_1\overline{x} = 0 \tag{15}$$

A value $x$ is a solution of this equation if and only if $\overline{A_1} \supseteq x \supseteq B_1$.

3.1 Solving Influence Equations

There are three operations defined on the Boolean algebra: conjunction ($\cdot$, or multiplication), disjunction ($+$, or summation) and negation ($\overline{x}$, where $x$ is a subject variable). Negation is a unary operation, while the other two operations are binary. Using combinations of these three operations, we can compose any influence equation. Since it is obvious how to solve an equation that includes only the unary operation, we discuss how to solve influence equations that include a single binary operation. For this purpose, we consider two abstract subject variables $x_1$ and $x_2$ and an abstract alternative $\chi$.

Lemma 1. The solution of the equation

$$x_1 + x_2 = \chi \tag{16}$$

with respect to variable $x_i$, where $i = 1, 2$, is given by the interval $\chi \supseteq x_i \supseteq (\chi\overline{x_j} + x_j\overline{\chi})$, where $j = 1, 2$; $j \neq i$.

Proof. According to Proposition 1, we set $P = x_1 + x_2$ and $Q = \chi$, so that $\overline{P} = \overline{x_1 + x_2} = \overline{x_1}\,\overline{x_2}$ and $\overline{Q} = \overline{\chi}$. Therefore, $P\overline{Q} + \overline{P}Q = (x_1 + x_2)\overline{\chi} + \overline{x_1}\,\overline{x_2}\chi = x_1\overline{\chi} + x_2\overline{\chi} + \overline{x_1}\,\overline{x_2}\chi$.
Consequently, we obtain eq. (17):

$$x_1\overline{\chi} + x_2\overline{\chi} + \overline{x_2}\chi\overline{x_1} = 0 \tag{17}$$

We solve eq. (17) with respect to variable $x_1$. First, we transform eq. (17) into canonical form:

$$\overline{\chi}x_1 + (\chi\overline{x_2} + \overline{\chi}x_2)\overline{x_1} = 0 \tag{18}$$

Therefore, the solution of eq. (18) is given by the interval

$$\chi \supseteq x_1 \supseteq (\chi\overline{x_2} + x_2\overline{\chi}). \tag{19}$$

Since variables $x_1$ and $x_2$ are interchangeable and it is possible to solve eq. (17) with respect to variable $x_2$ as well, the general form of the solution of eq. (16) is the interval

$$\chi \supseteq x_i \supseteq (\chi\overline{x_j} + x_j\overline{\chi}), \tag{20}$$

where $i = 1, 2$ and $j = 1, 2$; $j \neq i$. □

Lemma 2. The solution of the equation

$$x_1 x_2 = \chi \tag{21}$$

with respect to variable $x_i$, where $i = 1, 2$, is given by the interval $(\chi x_j + \overline{\chi}\,\overline{x_j}) \supseteq x_i \supseteq \chi$, where $j = 1, 2$; $j \neq i$.

Proof. According to Proposition 1, we set $P = x_1 x_2$ and $Q = \chi$, so that $\overline{P} = \overline{x_1 x_2} = \overline{x_1} + \overline{x_2}$ and $\overline{Q} = \overline{\chi}$. Therefore, $P\overline{Q} + \overline{P}Q = (x_1 x_2)\overline{\chi} + (\overline{x_1} + \overline{x_2})\chi = x_2\overline{\chi}x_1 + \overline{x_1}\chi + \overline{x_2}\chi$. Thus, we obtain eq. (22):

$$x_2\overline{\chi}x_1 + \overline{x_1}\chi + \overline{x_2}\chi = 0 \tag{22}$$

We solve eq. (22) with respect to variable $x_1$. First, we transform eq. (22) into canonical form:

$$(\overline{\chi}x_2 + \chi\overline{x_2})x_1 + \chi\overline{x_1} = 0 \tag{23}$$

Since $\overline{\overline{\chi}x_2 + \chi\overline{x_2}} = \chi x_2 + \overline{\chi}\,\overline{x_2}$, the solution of eq. (23) is given by the interval

$$(\chi x_2 + \overline{\chi}\,\overline{x_2}) \supseteq x_1 \supseteq \chi. \tag{24}$$

Since variables $x_1$ and $x_2$ are interchangeable and it is possible to solve eq. (22) with respect to variable $x_2$ as well, the general form of the solution of eq. (21) is the interval

$$(\chi x_j + \overline{\chi}\,\overline{x_j}) \supseteq x_i \supseteq \chi, \tag{25}$$

where $i = 1, 2$ and $j = 1, 2$; $j \neq i$. □

Since one bound of the solution intervals for eqs. (16) and (21) is a function of the second variable, we need to run through all the possible values of the second variable in order to obtain all possible solutions of these equations in the form of pairs $(x_1, x_2)$.

Next we consider several examples illustrating the application of Lemmas 1 and 2.

Example 1. For illustration, we solve the equation $a^* = ba^* + c$. Setting $\chi = a^*$, $x_1 = ba^*$ and $x_2 = c$, we obtain the solution interval for variable $x_2 = c$: $\chi \supseteq c \supseteq (\chi\overline{\chi b} + \overline{\chi}\chi b)$.
After simplification, we get interval (26):

$$\chi \supseteq c \supseteq \chi\overline{b} \tag{26}$$

Next we consider an example with a particular alternative. Let $\chi = \{\alpha\}$. The solution interval is then $\{\alpha\} \supseteq c \supseteq \{\alpha\}\overline{b}$. Since the lower bound of this interval is a function of variable $b$, to find all solutions of the equation $a^* = ba^* + c$ we calculate the value of the expression $\{\alpha\}\overline{b}$ for all possible values of variable $b$ (Table 2).

Table 2. Solutions of the influence equation $a^* = ba^* + c$

Values of $b$   | Pairs $(b, c)$
$\{\alpha\}$    | $(\{\alpha\}, \{\alpha\})$, $(\{\alpha\}, 0)$
$\{\beta\}$     | $(\{\beta\}, \{\alpha\})$
$1$             | $(1, \{\alpha\})$, $(1, 0)$
$0$             | $(0, \{\alpha\})$

To make sure that the solutions are correct, we check that the decision equation $a = ba + c$ turns into a true equality for the obtained pairs $(b, c)$:

$(\{\alpha\}, \{\alpha\})$: $\{\alpha\}\{\alpha\} + \{\alpha\} = \{\alpha\} \Rightarrow \{\alpha\} = \{\alpha\}$ is true;
$(\{\alpha\}, 0)$: $\{\alpha\}\{\alpha\} + 0 = \{\alpha\} \Rightarrow \{\alpha\} = \{\alpha\}$ is true;
$(\{\beta\}, \{\alpha\})$: $\{\alpha\}\{\beta\} + \{\alpha\} = \{\alpha\} \Rightarrow \{\alpha\} = \{\alpha\}$ is true;
$(1, \{\alpha\})$: $\{\alpha\}1 + \{\alpha\} = \{\alpha\} \Rightarrow \{\alpha\} = \{\alpha\}$ is true;
$(1, 0)$: $\{\alpha\}1 + 0 = \{\alpha\} \Rightarrow \{\alpha\} = \{\alpha\}$ is true;
$(0, \{\alpha\})$: $\{\alpha\}0 + \{\alpha\} = \{\alpha\} \Rightarrow \{\alpha\} = \{\alpha\}$ is true.

So far, we have illustrated how to solve the influence equation. We have also shown that the pairs $(b, c)$ obtained by solving the equation $a^* = ba^* + c$ in accordance with Proposition 1 and Lemmas 1 and 2 are indeed solutions of this equation.

Example 2. We consider the influence equation for subject $b$ obtained from eq. (3):

$$(c + \overline{a})\overline{\chi} + \chi = \chi \tag{27}$$

First, we transform the left-hand side of eq. (27): $(c + \overline{a})\overline{\chi} + \chi = c\overline{\chi} + \overline{a}\,\overline{\chi} + \chi = c\overline{\chi} + \overline{a}\,\overline{\chi} + (c + \overline{a} + 1)\chi = c + \overline{a} + \chi$. Therefore, eq. (27) can be rewritten as follows:

$$c + \overline{a} + \chi = \chi \tag{28}$$

Setting $x_1 = c$ and $x_2 = \overline{a} + \chi$, we immediately obtain the solution interval of eq. (28): $\chi \supseteq c \supseteq (\chi\overline{(\overline{a} + \chi)} + \overline{\chi}(\overline{a} + \chi)) \Rightarrow \chi \supseteq c \supseteq (\overline{\chi}\,\overline{a} + \chi\overline{\chi}a)$. Finally,

$$\chi \supseteq c \supseteq \overline{\chi}\,\overline{a} \tag{29}$$

Example 3.
Next, we consider the influence equation

$$ab + \chi = \chi \tag{30}$$

Setting $x_1 = ab$ and $x_2 = \chi$, we immediately obtain the solution interval $\chi \supseteq ab \supseteq (\chi\overline{\chi} + \chi\overline{\chi})$, or

$$\chi \supseteq ab \supseteq 0 \tag{31}$$

Therefore, in order to find all solutions of eq. (30), we need to solve the equations

$$ab = y \tag{32}$$

where $y$ is any subset of the set $\chi$ ($\chi \supseteq y$). Each equation can be solved according to Lemma 2.

Example 4. As a final example, we again consider the influence equation $a^* = (b + c)a^* + c\overline{a^*}$ and show how the application of Lemma 1 essentially simplifies its solution. We get the system of influence equations:

$$b + c = \{\alpha\} \tag{33}$$
$$c = \{\alpha\} \tag{34}$$

From this system we obtain a single equation:

$$b + \{\alpha\} = \{\alpha\}. \tag{35}$$

According to Lemma 1, we immediately obtain the solution interval of eq. (35):

$$\{\alpha\} \supseteq b \supseteq 0. \tag{36}$$

Thus, eq. (35) has two solutions: $b = \{\alpha\}$ and $b = 0$. Therefore the solution of system (33-34) consists of the two pairs $(\{\alpha\}, \{\alpha\})$ and $(0, \{\alpha\})$.

To conclude this section, we provide a brief summary. We have shown how to solve the Inverse task by means of influence equations. We have proved two fundamental lemmas, which allow one to solve any influence equation regardless of the number of variables. Finally, we have illustrated with several examples how to apply these lemmas.

3.2 Analysis of Extreme Cases 1: Frustration

In this section we analyze, from the point of view of the Inverse task, the situation in which a subject can end up in a frustration state. Let us consider the polynomial $a(b + c)$ discussed in Section 2.1. The decision equation that corresponds to this polynomial is $x = (b + c)a + \overline{a}$, where $x$ can be any subject variable. Next we try to find all the pairs $(b, c)$ that result in selection of a particular alternative by subject $a$.

The decision equation for subject $a$ is $a = (b + c)a + \overline{a}$. The solution interval of this decision equation is $b + c \supseteq a \supseteq 1$. We need to check which alternatives subject $a$ can be convinced to choose.
To do this, we consider a system of equations for each alternative.

Alternative $\{\alpha\}$:

$$b + c = \{\alpha\} \tag{37}$$
$$1 = \{\alpha\} \tag{38}$$

Alternative $\{\beta\}$:

$$b + c = \{\beta\} \tag{39}$$
$$1 = \{\beta\} \tag{40}$$

Alternative $0 = \{\}$:

$$b + c = 0 \tag{41}$$
$$1 = 0 \tag{42}$$

In each of these systems the second equation is a false equality. Therefore these systems have no solution.

Alternative $1 = \{\alpha, \beta\}$:

$$b + c = 1 \tag{43}$$
$$1 = 1 \tag{44}$$

The second equation is a true equality. Therefore this system has a solution. Thus, out of the four possible alternatives, subject $a$ can actually choose only the alternative $1 = \{\alpha, \beta\}$.

To find the solutions resulting in selection of the alternative $1 = \{\alpha, \beta\}$, we need to solve only eq. (43), since eq. (44) is a true equality. According to Lemma 1, we immediately obtain the solution interval for eq. (43):

$$1 \supseteq b \supseteq \overline{c} \tag{45}$$

We calculate the pairs $(b, c)$ for all possible values of variable $c$ (Table 3).

Table 3. Solutions of the influence equation $b + c = 1$

Values of $c$   | Pairs $(b, c)$
$\{\alpha\}$    | $(\{\beta\}, \{\alpha\})$, $(1, \{\alpha\})$
$\{\beta\}$     | $(\{\alpha\}, \{\beta\})$, $(1, \{\beta\})$
$1$             | $(0, 1)$, $(\{\alpha\}, 1)$, $(\{\beta\}, 1)$, $(1, 1)$
$0$             | $(1, 0)$

Therefore, the influence analysis of the decision equation $a = (b + c)a + \overline{a}$ shows that the only alternative that subject $a$ can choose is the alternative $1 = \{\alpha, \beta\}$. The influence analysis provides us with the set (exhaustive list) of pairs $(b, c)$ of joint influences resulting in selection of the alternative $1 = \{\alpha, \beta\}$. Therefore, if a pair of influences does not match any pair from this list, the decision equation has no solution and this results in a frustration state.

Summarizing this section, we note that in general there are two sets. The set $D$ contains the alternatives that a controlled subject can choose. The set $U$ is the set of alternatives of the target choice. Therefore, the need to put a subject into a frustration state emerges if the target choice of the controlled subject cannot be made by this subject.
In other words, we need to put a subject into a frustration state if $D \cap U = \varnothing$.

3.3 Analysis of Extreme Cases 2: What to do with Super-Active Groups

Among all the possible groups, there are groups in which subjects always choose only the alternative $1 = \{\alpha, \beta\}$ regardless of the influences of the other subjects. Such groups are called super-active groups.

Next we consider one special case of super-active groups: the homogeneous groups. A group is called homogeneous if all the subjects in the group are connected by the same relationship. Here we provide a proof of the lemma about homogeneous groups originally formulated by Lefebvre [1, 2].

Lemma 3. Any homogeneous group is a super-active group.

Proof. We consider separately the homogeneous groups in which all the subjects are connected by the alliance relationship (alliance groups) and by the conflict relationship (conflict groups). Without loss of generality, we assume that there are $n$ subjects $a_1, a_2, \ldots, a_n$.

Alliance groups. The polynomial corresponding to the alliance group of $n$ subjects is $a_1 a_2 \ldots a_n$. Next we construct the diagonal form and apply the folding procedure:

$$[a_1 a_2 \ldots a_n]^{[a_1][a_2] \ldots [a_n]} = [a_1 a_2 \ldots a_n] + \overline{[a_1][a_2] \ldots [a_n]} = 1.$$

Therefore the alliance groups are always super-active.

Conflict groups. The polynomial corresponding to the conflict group of $n$ subjects is $a_1 + a_2 + \ldots + a_n$. Next we construct the diagonal form and apply the folding procedure:

$$[a_1 + a_2 + \ldots + a_n]^{[a_1] + [a_2] + \ldots + [a_n]} = [a_1 + a_2 + \ldots + a_n] + \overline{[a_1] + [a_2] + \ldots + [a_n]} = 1.$$

Therefore the conflict groups are always super-active.

Since both the alliance and the conflict groups are super-active, the lemma is proved. □

However, there are non-homogeneous super-active groups as well (see Appendix B).

Summarizing this section, we note that subjects in super-active groups cannot be controlled in their choices and the entire group is uncontrollable.
Therefore, once a super-active group emerges, the only way to make it controllable is to change the relationships in the group.

4 The Basic Control Schema of an Abstract Subject (BCSAS) in the RGT

We have presented the detailed description of the RGT, including the solution of the Forward and Inverse tasks. We have also considered the extreme cases of decisions, such as putting a subject into frustration state or changing the structure of a super-active group. As a final stroke, we summarize all the presented material in the form of the Basic Control Schema of an Abstract Subject (BCSAS) in the RGT.

Fig. 3. The block schema for extracting sets $D_h$ and $Z_h$. [Flowchart: Start; read Pairs $(\chi, Z_\chi)_x$; $M = 1$; while $M \le N$, test "$Z_\chi == \{\}$?"; if no, $D_h = D_h + \chi$, save $D_h$, $Z_h = Z_h + Z_\chi$, save $Z_h$; $M = M + 1$; End.]

The input comes from the environment and is formalized in the form of external Influences on the subject, the Boolean algebra of Alternatives and the Structure of a Group.

Information about the Influences, the Boolean algebra and the Group Structure is propagated into the Decision Module. The Decision Module implements the solution of the Forward task. Therefore the output set $D$ of the Decision Module is the set of possible alternatives which the subject can choose under the given conditions.

The information about the Boolean algebra and the Group Structure is propagated into the Influence Module. The Influence Module solves the Inverse task. The output set $D_h$ of the Influence Module is the set of the pairs $(\chi, Z_\chi)_x$, where $\chi$ is the target alternative, the set $Z_\chi$ is the set of all the joint influences resulting in selection of the target choice, and $x$ represents a subject variable. Each $(\chi, Z_\chi)_x$ represents a reflexive control strategy.
Therefore, the decision to put a subject into frustration state is justified if it is impossible to make subject $x$ choose the target alternative $\chi$, i.e., if for the pair $(\chi, Z_\chi)_x$ the set $Z_\chi = \{\}$, and subject $x$ should not choose any other alternative except for the target one.

4.1 Schema for Iterative Algorithm to Obtain Output of the Influence Module

The alternatives $\chi$ with corresponding non-empty sets $Z_\chi$ are included into the set $D_h$. Here we introduce the set $Z_h$ to store the non-empty sets $Z_\chi$. The schema of the algorithm for extracting the sets $D_h$ and $Z_h$ is presented in Fig. 3.

Initially the sets $D_h$ and $Z_h$ are empty: $D_h = \{\}$ and $Z_h = \{\}$. The algorithm reads the set of pairs $(\chi, Z_\chi)_x$ and stores it in the array $Pairs(M)$, where $M$ is a counting variable and $N$ is the total number of pairs. Then, for each pair from the array $Pairs$, it is checked whether the set $Z_\chi$ is empty: $Z_\chi == \{\}$? If 'yes', the algorithm increments the counting variable $M$ ($M = M + 1$) and proceeds to the next pair from the array $Pairs$. If 'no', then the alternative $\chi$ is included into the set $D_h$ ($D_h = D_h + \chi$), the set $D_h$ is saved, the set $Z_\chi$ is included into the set $Z_h$ ($Z_h = Z_h + Z_\chi$) and the set $Z_h$ is saved. The process runs while $M \le N$.

In this iterative algorithm, we separately store the alternatives $\chi$ which can be chosen by a certain subject in the set $D_h$, and the joint influences $Z_\chi$ which result in selection of the alternative $\chi$ in the set $Z_h$.

Therefore, we should modify the schema of the Influence Module in the BCSAS as follows. We present an elaborated schema, where the sub-module "Solution: $D_h$" is accompanied by the sub-module "Solution: $Z_h$". Together these sub-modules are included into the "Solutions" sub-module. The BCSAS is the fundamental schema of an abstract subject, which is used throughout the RGT. The BCSAS is presented in Fig. 4.

This concludes the overview of the RGT and the description of tasks within the scope of the general theory.
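The iterative procedure of Fig. 3 can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `extract_sets`, the representation of each pair $(\chi, Z_\chi)_x$ as a tuple of a label and a Python set, and the string labels of the alternatives are assumptions made for this example only.

```python
def extract_sets(pairs):
    """Split pairs (chi, Z_chi) into D_h and Z_h, skipping empty Z_chi.

    `pairs` is a list of tuples (chi, z_chi), where z_chi is a set of
    joint influences (hypothetical encoding chosen for this sketch).
    """
    d_h = []  # alternatives that the subject can be made to choose
    z_h = []  # the corresponding non-empty sets of joint influences
    for chi, z_chi in pairs:      # M runs over the N pairs
        if not z_chi:             # "Z_chi == {}?" answered 'yes': next pair
            continue
        d_h.append(chi)           # D_h = D_h + chi
        z_h.append(z_chi)         # Z_h = Z_h + Z_chi
    return d_h, z_h


# Pairs for the subject a of Section 3.2: only the alternative 1 has a
# non-empty set of joint influences (b, c), taken from Table 3.
pairs_a = [
    ("{α}", set()),
    ("{β}", set()),
    ("0", set()),
    ("1", {("{β}", "{α}"), ("1", "{α}"), ("{α}", "{β}"), ("1", "{β}"),
           ("0", "1"), ("{α}", "1"), ("{β}", "1"), ("1", "1"), ("1", "0")}),
]
d_h, z_h = extract_sets(pairs_a)
print(d_h)          # alternatives with at least one reflexive control strategy
print(len(z_h[0]))  # number of joint influences stored for the first of them
```

Lists are used instead of sets here so that the i-th entry of `z_h` stays aligned with the i-th entry of `d_h`; this is a design choice of the sketch, not something prescribed by the schema.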
Therefore, we continue with the application of the RGT to mixed groups of humans and robots.

Fig. 4. The Basic Control Schema of an Abstract Subject (BCSAS). [Block diagram with the blocks: Environment; Influences; Boolean Algebra of Alternatives; Structure of a Group; Decision Module (decision equation, Solutions: $D$); Influence Module (system of influence equations, Solutions: $D_h$ and $Z_h$); Realization of an alternative; Reflexive control.]

5 Defining Robots in RGT

As we have noted in the Introduction, the goal of the robots in mixed groups of humans and robots is to keep human subjects from choosing risky actions, which might result in injuries or even threaten life.

It is assumed by default that a robot follows a program of behavior. Such a program consists of at least three modules. Module 1 implements the robot's ability of human-like decision-making based on the RGT. Module 2 contains the rules which keep the robot from doing harm to human beings. Module 3 predicts the choice of each human subject and suggests the possible reflexive control strategies.

Modules 1 and 3 are inherited from the BCSAS. They correspond to the Decision Module and the Influence Module of the BCSAS (Fig. 4), respectively. Therefore all the properties and the meaning of the outputs of Modules 1 and 3 are the same as those of the Decision and Influence Modules, respectively.

Module 2 is a new module, which is intrinsic to robotic agents studied in the context of mixed groups of humans and robots. This module is responsible for extracting only the harmless, or non-risky, alternatives for a human subject.
We suggest applying Asimov's Three Laws of Robotics [19], which formulate the basics of Module 2:
1) a robot may not injure a human being or, through inaction, allow a human being to come to harm;
2) a robot must obey any orders given to it by human beings, except where such orders would conflict with the First Law;
3) a robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

We consider these laws to be an intrinsic part of the robot's "mind", which cannot be erased or corrupted by any means.

The interaction of Modules 1 and 2 is performed in Interaction Module 1. The interaction of Modules 3 and 2 is implemented in Interaction Module 2.

The Boolean algebra is filtered according to Asimov's laws in Module 2. The output of Module 2 is the set $U$ of approved alternatives. This data is then propagated into the interaction modules.

The output of Module 1 is the set $D$ of alternatives which the robot has to choose under the given joint influences. In Interaction Module 1, the intersection of the sets $D$ and $U$ is computed: $D \cap U = DU$. If the set $DU$ is not empty, this means that there are approved alternatives among the alternatives that the robot should choose in accordance with the joint influences. Therefore, the robot can implement any alternative from the set $DU$. If the set $DU$ is empty, this means that under the given joint influences the robot cannot choose any approved alternative; therefore the robot will choose an alternative from the set $U$. This is how Interaction Module 1 works.

Fig. 5. The Basic Control Schema of a Robotic Agent (BCSRA). [Block diagram: Module 1 (decision equation of a robot, Solutions: $D$); Module 2 (filter based on Asimov's Laws, set $U$ of approved alternatives); Module 3 (decision equation of a human, Solutions: $D_h$ and $Z_h$); Interaction Module 1 (test "$DU == \{\}$?"); Interaction Module 2 (tests "$D_hU == \{\}$?" and "$Z_U == \{\}$?", Frustration); the Environment supplies the Influences, the Boolean Algebra of Alternatives and the Structure of a Group; the outputs are Realization of an alternative and Reflexive control.]

The output of Module 3 contains the sets $D_h$ and $Z_h$. The goal of the robot is to keep human subjects from choosing a risky alternative. This can be done by convincing human subjects to choose alternatives from the set $U$. First, we check whether $D_h$ contains any approved alternative. We do so by computing the intersection of the sets $D_h$ and $U$: $D_h \cap U = D_hU$. If the set $D_hU$ is not empty, then it is possible to make a human subject choose some non-risky alternative. Therefore, we should choose the corresponding reflexive control strategy from the set $Z_h$. However, if the set $D_hU$ is empty, we have to find a reflexive control strategy that will make the human subject select an approved alternative from the set $U$. For this purpose, we construct the set $Z_U$ by including all the joint influences $Z_\chi$ for the approved alternatives: $Z_\chi \in Z_U \Leftrightarrow \chi \in U$. Next we check whether the set $Z_U$ is empty. If the set $Z_U$ is empty, this means it is impossible to convince the human subject to choose a non-risky alternative. Therefore, the only option of reflexive control in this case is to put this subject into frustration state. However, if the set $Z_U$ is not empty, this means that there exists at least one reflexive control strategy that results in selection of an alternative from the set of the approved (non-risky) ones.

Therefore, the BCSRA inherits the entire structure of the BCSAS and augments it with Module 2 of Asimov's Laws together with Interaction Modules 1 and 2.

The original schema of the robot's control system has been recently presented in [20]. The BCSRA is an extended version of the original schema. The BCSRA provides a comprehensive account of how the Forward and Inverse tasks are solved in the robot's "mind".
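The logic of the two interaction modules can be summarized by the following Python sketch. It is only an illustration under stated assumptions: the function names, the `FRUSTRATION` marker, the string labels of the alternatives and the dictionary that maps each alternative $\chi$ to its set $Z_\chi$ are hypothetical choices, not part of the original schema.

```python
FRUSTRATION = "frustration"  # hypothetical marker for the frustration state


def interaction_module_1(D, U):
    """Robot's own choice: the alternatives of D ∩ U, or of U if D ∩ U is empty."""
    DU = D & U
    return DU if DU else set(U)


def interaction_module_2(D_h, Z, U):
    """Select reflexive control strategies for a human subject.

    D_h: set of alternatives the human can be made to choose.
    Z:   dict mapping an alternative chi to its set Z_chi of joint influences.
    Returns a dict {chi: Z_chi} of usable strategies, or FRUSTRATION.
    """
    DhU = D_h & U
    if DhU:                                         # an approved alternative is reachable
        return {chi: Z[chi] for chi in DhU}
    Z_U = {chi: Z[chi] for chi in U if Z.get(chi)}  # non-empty Z_chi with chi in U
    return Z_U if Z_U else FRUSTRATION


U = {"{β}"}                                        # a single approved alternative
print(interaction_module_1({"1", "{α}"}, U))       # DU is empty: fall back to U
print(interaction_module_1({"1", "{β}"}, U))       # DU is not empty
print(interaction_module_2({"1"}, {"1": {("1", "0")}}, U))
```

In the last call the only reachable alternative is not approved and no joint influence leads to an approved one, so the sketch returns the frustration marker, which corresponds to the "Frustration" branch of Fig. 5.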
Thus, in this section we have presented the formalization of a robotic agent in the RGT. We outlined the specific features of robotic agents which distinguish them from other subjects. Furthermore, we provided a detailed explanation of how the Forward and Inverse tasks are solved in the framework of the control system (BCSRA) of robots.

Next, we proceed with the consideration of sample situations of interaction between humans and robots.

6 Extended Sample Analysis of Mixed Groups

Here we elaborate two examples, presented in the previous study [20], of how robots in mixed groups can make humans refrain from risky actions. We discuss the application of the extended schema of the robot's control system and provide an explicit derivation of the reflexive control strategies which were applied in these examples in the previous study [20].

6.1 Robots Baby-Sitters

Suppose robots have to play the part of baby-sitters by looking after kids. We consider a mixed group of two kids and two robots. Each robot is looking after a particular kid. Having finished a game, the kids are considering what to do next. They choose between "to compete climbing the high tree" (action $\alpha$) and "to play with a ball" (action $\beta$). Together actions $\alpha$ and $\beta$ represent the active state $1 = \{\alpha, \beta\} = \{\alpha\} + \{\beta\}$. Therefore the Boolean algebra of alternatives consists of four elements: 1) the alternative $\{\alpha\}$ is to climb the tree; 2) the alternative $\{\beta\}$ is to play with a ball; 3) the alternative $1 = \{\alpha, \beta\}$ means that a kid is hesitating what to do; and 4) the alternative $0 = \{\}$ means to take a rest.

We assume that each kid considers his robot as an ally and the other kid and his robot as competitors. The kids are subjects $a$ and $c$, while the robots are subjects $b$ and $d$. The relationship graph is presented in Fig. 6.

Fig. 6. The relationship graph for the robot baby-sitters example. [Graph on the vertices $a$, $b$, $c$ and $d$.]
Next we calculate the diagonal form and fold it in order to obtain the decision equation for each subject:

$$[ab + cd]^{[ab]^{[a][b]} + [cd]^{[c][d]}} = ab + cd.$$

Of the two actions $\alpha$ and $\beta$, action $\alpha$ is a risky action, since a kid can fall from the tree, and this is a real threat to his health or even life. Therefore, according to Asimov's laws, the robots cannot allow the kids to start the competition. Thus, the robots have to convince the kids not to choose the alternative $\{\alpha\}$. In terms of alternatives, Asimov's laws serve as filters which filter out the risky alternatives. The remaining alternatives are included into the set $U$. In this case, $U = \{\{\beta\}, \{\}\}$.

Next we solve the Inverse task regarding the alternatives $\{\beta\}$ and $\{\}$. We conduct the analysis regarding kid $a$. This analysis can be extended to kid $c$ in a similar manner.

Solution of the Inverse task for kid $a$ with approved alternatives as the target choice. The decision equation for kid $a$ is $a = ab + cd$. First, we transform it into the canonical form: $a = (b + cd)a + cd\,\overline{a}$. Next we consider the system of influence equations:

$$\begin{cases} b + cd = \chi & (46) \\ cd = \chi, & (47) \end{cases}$$

where the alternative $\chi \in U$. Taking eq. (47) into account, eq. (46) is transformed into the equation

$$b + \chi = \chi \qquad (48)$$

The solution of eq. (48) directly follows from Lemma 1: $\chi \supseteq b \supseteq 0$. Therefore for $\chi = \{\beta\}$ and $\chi = \{\}$ the solutions are $\{\beta\} \supseteq b \supseteq 0$ and $b = 0$, respectively.

Eq. (47) can be instantly solved according to Lemma 2: $\chi d + \overline{\chi}\,\overline{d} \supseteq c \supseteq \chi$.

Consider $\chi = \{\beta\}$ first. Then $\{\beta\} d + \{\alpha\}\overline{d} \supseteq c \supseteq \{\beta\}$. By varying the values of variable $d$, we obtain all the pairs $(c, d)$:
$d = 1$: $\{\beta\} \supseteq c \supseteq \{\beta\} \Rightarrow c = \{\beta\}$. Therefore the solution is the pair $(\{\beta\}, 1)$;
$d = 0$: $\{\alpha\} \supseteq c \supseteq \{\beta\}$. Since $\{\alpha\} \cap \{\beta\} = \{\}$, there is no solution;
$d = \{\alpha\}$: $0 \supseteq c \supseteq \{\beta\}$. Since $\{\beta\} \supset \{\}$, the interval is empty and there is no solution;
$d = \{\beta\}$: $1 \supseteq c \supseteq \{\beta\}$. Therefore there are two solutions, $(1, \{\beta\})$ and $(\{\beta\}, \{\beta\})$.
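The case analysis above can be cross-checked by direct enumeration. The Python sketch below is an illustration only: it assumes a hypothetical encoding of the four alternatives as 2-bit masks (sum as bitwise OR, product as bitwise AND), and it reads the interval of Lemma 2 as $\chi d + \overline{\chi}\,\overline{d} \supseteq c \supseteq \chi$; the function names are invented for this example.

```python
# Alternatives of the Boolean algebra over {α, β}, encoded as 2-bit masks
# (an assumption of this sketch): 0 = {}, {α}, {β}, 1 = {α, β}.
ZERO, ALPHA, BETA, ONE = 0b00, 0b01, 0b10, 0b11
ALL = (ZERO, ALPHA, BETA, ONE)
NAME = {ZERO: "0", ALPHA: "{α}", BETA: "{β}", ONE: "1"}


def neg(x):
    """Complement of an alternative."""
    return x ^ ONE


def solve_by_enumeration(chi):
    """All pairs (c, d) with c·d = chi, found by trying every pair."""
    return {(c, d) for c in ALL for d in ALL if c & d == chi}


def solve_by_interval(chi):
    """All pairs (c, d) with chi·d + ~chi·~d ⊇ c ⊇ chi."""
    pairs = set()
    for d in ALL:
        upper = (chi & d) | (neg(chi) & neg(d))
        for c in ALL:
            if c & chi == chi and c | upper == upper:  # chi ⊆ c ⊆ upper
                pairs.add((c, d))
    return pairs


for chi in (BETA, ZERO):
    solutions = solve_by_enumeration(chi)
    print(NAME[chi], len(solutions), solutions == solve_by_interval(chi))
```

For each target alternative the loop prints the number of pairs $(c, d)$ and whether the interval formula yields exactly the same pairs as the brute-force search.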
Therefore the equation $cd = \{\beta\}$ has three solutions: $(\{\beta\}, 1)$, $(1, \{\beta\})$ and $(\{\beta\}, \{\beta\})$.

Thus, we have solved both equations of the system (46-47). The solutions of this system are the triplets $(b, c, d)$ of joint influences, which are all possible combinations of the solutions of both equations. Since there are two solutions of eq. (48), $b = 0$ and $b = \{\beta\}$, and three solutions of eq. (47), there are six triplets $(b, c, d)$ in total:
$(0, \{\beta\}, 1)$ and $(\{\beta\}, \{\beta\}, 1)$;
$(0, 1, \{\beta\})$ and $(\{\beta\}, 1, \{\beta\})$;
$(0, \{\beta\}, \{\beta\})$ and $(\{\beta\}, \{\beta\}, \{\beta\})$.

Now we consider the case when $\chi = 0 = \{\}$. Then $\overline{d} \supseteq c \supseteq 0$. We obtain the pairs $(c, d)$ for all values of variable $d$:
$d = 1$: $\overline{1} \supseteq c \supseteq 0 \Rightarrow c = 0$. Thus, there is only one solution, $(0, 1)$;
$d = 0$: $1 \supseteq c \supseteq 0$. Thus, there are four solutions: $(1, 0)$, $(\{\alpha\}, 0)$, $(\{\beta\}, 0)$ and $(0, 0)$;
$d = \{\alpha\}$: $\{\beta\} \supseteq c \supseteq 0$. Thus, there are two solutions: $(\{\beta\}, \{\alpha\})$ and $(0, \{\alpha\})$;
$d = \{\beta\}$: $\{\alpha\} \supseteq c \supseteq 0$. Thus, there are two solutions: $(\{\alpha\}, \{\beta\})$ and $(0, \{\beta\})$.

In total, the equation $cd = 0$ has 9 solutions. Since $b = 0$ in this case, the system (46-47) also has 9 solutions as triplets $(b, c, d)$: $(0, 1, 0)$, $(0, 0, 0)$, $(0, 0, \{\alpha\})$, $(0, 0, \{\beta\})$, $(0, 0, 1)$, $(0, \{\alpha\}, \{\beta\})$, $(0, \{\alpha\}, 0)$, $(0, \{\beta\}, \{\alpha\})$ and $(0, \{\beta\}, 0)$.

We have considered two cases in which both the upper and the lower bound of the solution interval of the decision equation are equal to the same alternative. Now we discuss a new situation, in which variable $a$ may take not a single value, but several values. In this case, we should find the joint influences $(b, c, d)$ that result in selection of either alternative $\{\beta\}$ or $\{\}$. Since $\{\beta\} \supseteq \{\}$, we need to find all the triplets $(b, c, d)$ for which the solution of the decision equation is the interval $\{\beta\} \supseteq a \supseteq \{\}$. Thus, $\{\beta\} \supseteq a^{*} \supseteq \{\}$. Therefore, we need to solve the following system of equations:

$$\begin{cases} b + cd = \{\beta\} & (49) \\ cd = 0. & (50) \end{cases}$$

Given eq. (50), eq. (49) turns into the equality $b = \{\beta\}$, and we need to solve only eq. (50). However, this equation has already been solved in the previous case. Therefore we obtain the solutions of the system (49-50): $(\{\beta\}, 1, 0)$, $(\{\beta\}, 0, 0)$, $(\{\beta\}, 0, \{\alpha\})$, $(\{\beta\}, 0, \{\beta\})$, $(\{\beta\}, 0, 1)$, $(\{\beta\}, \{\alpha\}, \{\beta\})$, $(\{\beta\}, \{\alpha\}, 0)$, $(\{\beta\}, \{\beta\}, \{\alpha\})$ and $(\{\beta\}, \{\beta\}, 0)$.

Comparing the solutions of all three systems of influence equations, we can see that there are four remarkable solutions: $(\{\beta\}, \{\beta\}, \{\beta\})$ and $(\{\beta\}, 1, \{\beta\})$; $(\{\beta\}, \{\}, \{\beta\})$ and $(\{\beta\}, \{\alpha\}, \{\beta\})$. The first pair of solutions, for which $cd = \{\beta\}$, results in the choice of the alternative $\{\beta\}$ only, while the second pair, for which $cd = 0$, results in selection of either alternative $\{\beta\}$ or alternative $\{\}$. Together these four solutions illustrate that if $b = d = \{\beta\}$, it is guaranteed that, regardless of the influence of kid $c$, kid $a$ will choose one of the approved alternatives.

By analogy, we can see that among the solutions of the system (46-47) with $\chi = \{\}$ there are four solutions $(0, 1, 0)$, $(0, 0, 0)$, $(0, \{\alpha\}, 0)$ and $(0, \{\beta\}, 0)$. Therefore, if $b = d = 0$, kid $a$ will choose the alternative $0 = \{\}$ regardless of the influence of kid $c$.

These two examples of binding the variables $b$ and $d$ were considered in Scenario 1 and Scenario 2 of the sample situation with robot baby-sitters, originally presented in [20].

Summarizing the results of this section, we have shown that robots can successfully control the kids' behavior by keeping them from risky actions. This control is based entirely on the proposed schema of the robot's control system. We have analyzed all the possible reflexive control strategies by solving three systems of influence equations: two systems regarding a single alternative and one system regarding an interval of alternatives.
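The three systems of influence equations for kid $a$ can also be solved by exhaustive search over all 64 triplets $(b, c, d)$. The following Python sketch does this under the same hypothetical 2-bit encoding of the alternatives (sum as bitwise OR, product as bitwise AND); it assumes the canonical form $a = (b + cd)a + cd\,\overline{a}$ with solution interval $b + cd \supseteq a \supseteq cd$, and all function names are illustrative.

```python
from itertools import product

# Hypothetical 2-bit encoding of the alternatives: 0 = {}, {α}, {β}, 1 = {α, β}.
ZERO, ALPHA, BETA, ONE = 0b00, 0b01, 0b10, 0b11
ALL = (ZERO, ALPHA, BETA, ONE)


def bounds(b, c, d):
    """Upper and lower bounds (A, B) of the interval A ⊇ a ⊇ B for kid a."""
    cd = c & d
    return b | cd, cd


def choices(b, c, d):
    """Alternatives kid a can choose under the joint influence (b, c, d)."""
    A, B = bounds(b, c, d)
    return {a for a in ALL if a & B == B and a | A == A}


def strategies(A, B):
    """All triplets (b, c, d) whose solution interval is exactly A ⊇ a ⊇ B."""
    return [t for t in product(ALL, repeat=3) if bounds(*t) == (A, B)]


U = {BETA, ZERO}  # approved alternatives: play with a ball, or take a rest

print(len(strategies(BETA, BETA)))  # triplets leading to {β} only
print(len(strategies(ZERO, ZERO)))  # triplets leading to 0 only
print(len(strategies(BETA, ZERO)))  # triplets leading to {β} or 0

# Binding b = d = {β} keeps kid a inside U whatever kid c does.
print(all(choices(BETA, c, BETA) <= U for c in ALL))
# Binding b = d = 0 forces the alternative 0 whatever kid c does.
print(all(choices(ZERO, c, ZERO) == {ZERO} for c in ALL))
```

The three counts can be compared with the 6, 9 and 9 triplets listed in the text, and the last two lines check the two bindings of the robots' influences discussed above.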
Therefore, we have shown how the Inverse task can be effectively solved by our proposed algorithm in a situation similar to real conditions.

6.2 Mountain-Climbers and Rescue Robot

We consider two climbers in the mountains and a rescue robot. The climbers and the robot communicate via radio. One of the climbers (subject $b$) has got into a difficult situation and needs help. Suppose he fell into a rift because the edge of the rift was covered with ice. The rift is not too deep and there is a thick layer of snow on the bottom, therefore the climber is not hurt, but he cannot get out of the rift himself. The second climber (subject $a$) wants to rescue his friend himself (action $\alpha$), which is a risky action. The second option is that the robot performs the rescue mission (action $\beta$). Since inaction is an inappropriate solution according to the First Law, the set $U$ of approved alternatives for the robot includes only the alternative $\{\beta\}$. The goal of the robot is to keep climber $a$ from choosing the alternative $\{\alpha\}$ and to perform the rescue mission itself.

We assume that from the beginning all subjects are in alliance. The corresponding graph is presented in Fig. 1c and its polynomial is $abc$. Therefore, by definition, it is a homogenous group and, consequently, it is a super-active group according to Lemma 3.

Thus, any subject in the group is in the active state. Therefore, the group is uncontrollable (see Section 3.3). In this case, the robot makes the decision to change its relationship with climber $b$ from alliance to conflict. The robot can do that, for instance, by not responding to the climber's orders.

Which reflexive control leads to frustration state? After this change, the polynomial corresponding to the new group is $a(b + c)$. This polynomial has already been discussed in detail in Section 3.2. Therefore, we know the decision equation for subject $a$: $a = (b + c)a + \overline{a}$.
We have shown as well that subject $a$ can choose only the alternative $1 = \{\alpha, \beta\}$ if appropriate joint influences are applied (see Section 3.2); otherwise subject $a$ is in frustration state and cannot make any choice. Therefore, in order to put subject $a$ into frustration state, the reflexive control strategy should NOT be selected from the list of solutions (Section 3.2): $(\{\beta\}, \{\alpha\})$; $(1, \{\alpha\})$; $(\{\alpha\}, \{\beta\})$; $(1, \{\beta\})$; $(0, 1)$; $(\{\alpha\}, 1)$; $(\{\beta\}, 1)$; $(1, 1)$ and $(1, 0)$. Here we provide two examples of such joint influences $(b, c)$: $(\{\alpha\}, \{\alpha\}) \Rightarrow (\{\alpha\} + \{\alpha\}) = \{\alpha\} \subset 1$ and $(\{\beta\}, \{\}) \Rightarrow (\{\beta\} + \{\}) = \{\beta\} \subset 1$.

Can the robot complete the mission regardless of the joint influences of the other subjects? The decision equation for robot $c$ is $c = c + (b + \overline{a})\overline{c}$. The corresponding solution interval is $1 \supseteq c \supseteq (b + \overline{a})$. Here we analyze all 16 possible reflexive control strategies $(a, b)$ that the climbers can apply to robot $c$.

Examples with empty set $DU$.
For $(0, b)$, the situation is the same regardless of the value of variable $b$: $1 \supseteq c \supseteq (b + \overline{0}) \Rightarrow 1 \supseteq c \supseteq (b + 1) \Rightarrow c = 1$.
For $(a, 1)$, the situation is the same regardless of the value of variable $a$: $1 \supseteq c \supseteq (1 + \overline{a}) \Rightarrow c = 1$.
For $(\{\alpha\}, \{\alpha\})$: $1 \supseteq c \supseteq (\{\alpha\} + \overline{\{\alpha\}}) \Rightarrow 1 \supseteq c \supseteq (\{\alpha\} + \{\beta\}) \Rightarrow c = 1$.
For $(\{\beta\}, \{\beta\})$: $1 \supseteq c \supseteq (\{\beta\} + \overline{\{\beta\}}) \Rightarrow 1 \supseteq c \supseteq (\{\beta\} + \{\alpha\}) \Rightarrow c = 1$.
Therefore in these cases the set $D = \{\{\alpha, \beta\}\}$.

Next we consider other pairs $(a, b)$:
$(1, \{\alpha\})$: $1 \supseteq c \supseteq (\{\alpha\} + \overline{1}) \Rightarrow 1 \supseteq c \supseteq \{\alpha\}$. Here the set $D = \{\{\alpha, \beta\}, \{\alpha\}\}$.
$(\{\beta\}, \{\alpha\})$: $1 \supseteq c \supseteq (\{\alpha\} + \overline{\{\beta\}}) \Rightarrow 1 \supseteq c \supseteq \{\alpha\}$. Here the set $D = \{\{\alpha, \beta\}, \{\alpha\}\}$.
$(\{\beta\}, 0)$: $1 \supseteq c \supseteq (0 + \overline{\{\beta\}}) \Rightarrow 1 \supseteq c \supseteq \{\alpha\}$. Therefore, the set $D = \{\{\alpha, \beta\}, \{\alpha\}\}$.

Since $U = \{\{\beta\}\}$, $DU = \{\}$ for all the cases considered above, and the robot will choose the alternative $\{\beta\}$ from the set $U$.

Examples with non-empty set $DU$.
Consider the following pairs $(a, b)$:
$(1, \{\beta\})$: $1 \supseteq c \supseteq (\{\beta\} + \overline{1}) \Rightarrow 1 \supseteq c \supseteq \{\beta\}$. Therefore, the set $D = \{\{\alpha, \beta\}, \{\beta\}\}$.
$(1, 0)$: $1 \supseteq c \supseteq (0 + \overline{1}) \Rightarrow 1 \supseteq c \supseteq 0$. Thus, the set $D = \{\{\alpha, \beta\}, \{\alpha\}, \{\beta\}, \{\}\}$.
$(\{\alpha\}, \{\beta\})$: $1 \supseteq c \supseteq (\{\beta\} + \overline{\{\alpha\}}) \Rightarrow 1 \supseteq c \supseteq \{\beta\}$. Thus, the set $D = \{\{\alpha, \beta\}, \{\beta\}\}$.
$(\{\alpha\}, 0)$: $1 \supseteq c \supseteq (0 + \overline{\{\alpha\}}) \Rightarrow 1 \supseteq c \supseteq \{\beta\}$. Thus, the set $D = \{\{\alpha, \beta\}, \{\beta\}\}$.

Since $U = \{\{\beta\}\}$, $DU = \{\{\beta\}\}$ for all the cases considered above, and the robot will choose the alternative $\{\beta\}$ from the set $DU$.

Thus, we have shown that under all 16 reflexive control strategies $(a, b)$, robot $c$ can choose the alternative $\{\beta\}$, which is to perform the rescue mission itself. Therefore the robot will choose the alternative $\{\beta\}$ regardless of the joint influences $(a, b)$ of the climbers.

The discussed example illustrates how a robot can transform an uncontrollable group into a controllable one by manipulating the relationships in the group. In the controllable group, by its influence on the human subjects, the robot can keep climber $a$ from the risky action of rescuing climber $b$. The robot achieves its goal by putting climber $a$ into frustration state, in which climber $a$ cannot make any decision. On the other hand, the set $U$ of approved alternatives guarantees that the robot itself will choose the option with no risk for humans and implement it regardless of the climbers' influence.

Therefore, in this section we have illustrated the robot's ability to keep human beings from risky actions and to perform these risky actions itself. This shows that our approach achieves both goals of a robotic agent: 1) to keep people from risky actions and 2) to perform the risky actions itself regardless of the humans' influences.

7 Discussion and Conclusion

Summarizing the results of this paper, we outline the most important of them.
First of all, we have introduced the Inverse task and developed methods to solve it. We have provided a comprehensive tutorial to the Reflexive Game Theory recently formulated and proposed by Vladimir Lefebvre [1, 2, 3, 4]. The tutorial contains a detailed description of the Forward and Inverse tasks together with methods to solve them.

We propose control schemas for both an abstract subject (BCSAS) and a robotic agent (BCSRA). These schemas were specially designed to incorporate the solution of the Forward and Inverse tasks, thus providing us with autonomous units (individuals, subjects, agents) capable of making decisions in a human-like manner. We have shown that robotic agents based on the BCSRA can be easily included into mixed groups of humans and robots and effectively serve their fundamental goals (keeping humans from risky actions and, if needed, performing the risky actions themselves).

Therefore, we consider that the present study provides a comprehensive overview of the classic RGT proposed by Vladimir Lefebvre [1, 2, 3, 4] and a newly developed self-consistent framework for the analysis of different kinds of groups and societies, including human social groups and mixed groups of humans and robots, together with an application tutorial of this new framework. This framework is entirely based on the principles of the RGT and brings together all its elements. The solution of the Inverse task presented in this paper plays a crucial role in the formation of this framework. Therefore, by having the Inverse task as one of its fundamentals, this framework illustrates the role of the Inverse task and its relationship with other issues considered in the RGT.

References

1. Lefebvre, V.A.: Lectures on Reflexive Game Theory. Leaf & Oaks, Los Angeles (2010).
2. Lefebvre, V.A.: Lectures on Reflexive Game Theory. Cogito-Center, Moscow (2009) [in Russian].
3. Lefebvre, V.A.: The basic ideas of reflexive game’s logic.
Problems of research of systems and structures. pp. 73-79 (1965) [in Russian].
4. Lefebvre, V.A.: Reflexive analysis of groups. In: Argamon, S. and Howard, N. (eds.) Computational models for counterterrorism. pp. 173-210. Springer, Heidelberg (2009).
5. Lefebvre, V.A.: Algebra of Conscience. D. Reidel, Holland (1982).
6. Lefebvre, V.A.: Algebra of Conscience. 2nd Edition. Kluwer, Holland (2001).
7. Batchelder, W.H., Lefebvre, V.A.: A mathematical analysis of a natural class of partitions of a graph. J. Math. Psy. 26, pp. 124-148 (1982).
8. Kobatake, E., and Tanaka, K.: Neuronal Selectivities to Complex Object Features in the Ventral Pathway of the Macaque Monkey. Journal of Neurophysiology, 71, 3, pp. 856-867 (1994).
9. Koerner, E., Gewaltig, M.-O., Koerner, U., Richter, A., and Rodemann, T.: A model of computation in neocortical architecture. Neural Networks, 12, pp. 989-1005 (1999).
10. Lücke, J., and von der Malsburg, C.: Rapid processing and unsupervised learning in a model of the cortical macrocolumn. Neural Computation, 16, pp. 501-533 (2003).
11. Schrander, S., Gewaltig, M.-O., Körner, U. and Körner, E.: Cortext: A columnar model of bottom-up and top-down processing in the neocortex. Neural Networks, 22, pp. 1055-1070 (2009).
12. Fukushima, K.: Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift and position. Biological Cybernetics, 36, pp. 193-201 (1980).
13. Riesenhuber, M. and Poggio, T.: Hierarchical models of object recognition in cortex. Nature Neuroscience, 2, 11, pp. 109-125 (1999).
14. Serre, T., Wolf, L., Bileschi, S., Riesenhuber, M., and Poggio, T.: Robust Object Recognition with Cortex-like Mechanisms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 3, pp. 411-426 (2007).
15. Hienze, J.: Hierarchy length in orphaned colonies of the ant Temnothorax nylanderi. Naturwissenschaften, 95, 8, pp. 757-760 (2008).
16.
Chase, I., D.: Models of hierarchy formation in animal societies. Behavioral Science, 19, 6, pp. 374-382 (2007).
17. Chase I., Tovey C., Spangler-Martin D., Manfredonia M.: Individual differences versus social dynamics in the formation of animal dominance hierarchies. PNAS, 99, 9, pp. 5744-5749 (2002).
18. Buston P.: Social hierarchies: size and growth modification in clownfish. Nature, 424, pp. 145-146 (2003).
19. Asimov, I.: Runaround. Astounding Science Fiction, March, pp. 94-103 (1942).
20. Tarasenko, S.: Modeling mixed groups of humans and robots with Reflexive Game Theory. In Lamers, M.H., and Verbeek, F.J. (eds.): HRPR 2010, LINCST 59, pp. 108-117 (2011).

Appendix

A When the sets A and B are functions of fewer variables than the total number of subjects minus one

Consider a group of four subjects $a$, $b$, $c$ and $d$. Suppose the polynomial corresponding to this group is $b(a + d) + c$. Next we construct the diagonal form and perform the folding operation:

$$[b(a+d)+c]^{[b(a+d)]^{[b][a+d]^{[a]+[d]}} + [c]} = [b(a+d)+c]^{[b(a+d)]^{[b]([a+d] + \overline{[a]+[d]})} + [c]} = [b(a+d)+c]^{[b(a+d)]^{[b]} + [c]} = b(a+d) + c + \overline{b(a+d) + \overline{b} + c}$$

Next we simplify the resulting expression of the diagonal form folding:

$$\begin{aligned}
& b(a+d) + c + \overline{b(a+d) + \overline{b} + c} \\
&= b(a+d) + c + \overline{b(a+d)}\,\overline{c}\,b \\
&= b(a+d) + cb + c\overline{b} + \overline{b(a+d)}\,\overline{c}\,b \\
&= b\big((a+d) + c + \overline{b(a+d)}\,\overline{c}\big) + c\overline{b} \\
&= b\big((a+d)c + (a+d)\overline{c} + c + (\overline{b} + \overline{(a+d)})\,\overline{c}\big) + c\overline{b} \\
&= b\big((a+d)c + (a+d)\overline{c} + c + \overline{b}\,\overline{c} + \overline{(a+d)}\,\overline{c}\big) + c\overline{b} \\
&= b\big(c + (a+d)c + ((a+d) + \overline{(a+d)})\,\overline{c}\big) + c\overline{b} \\
&= b\big((a+d)c + c + \overline{c}\big) + c\overline{b} \\
&= b\big((a+d)c + 1\big) + c\overline{b} = b + c\overline{b} = b + c
\end{aligned}$$

Consequently,

$$[b(a+d)+c]^{[b(a+d)] + \overline{[b]} + [c]} = b + c$$

Therefore, the decision equation includes only two subject variables instead of four.
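The simplification above can be double-checked mechanically by evaluating both sides for every assignment of alternatives. The Python sketch below is a hedged illustration: it assumes the folding rule $P^{W} = P + \overline{W}$ (as used in the proof of Lemma 3), encodes the four alternatives as 2-bit masks, and uses the hypothetical helper names `fold` and `folded_diagonal_form`.

```python
from itertools import product

ONE = 0b11        # the alternative 1 = {α, β} as a 2-bit mask (assumed encoding)
ALL = range(4)    # 0, {α}, {β}, 1


def fold(base, exponent):
    """Folding rule P^W = P + ~W on 2-bit masks."""
    return base | (exponent ^ ONE)


def folded_diagonal_form(a, b, c, d):
    """Fold the diagonal form of b(a + d) + c, innermost level first."""
    level1 = fold(a | d, a | d)                # [a+d]^([a]+[d])
    level2 = fold(b & (a | d), b & level1)     # [b(a+d)]^([b][a+d]^...)
    return fold(b & (a | d) | c, level2 | c)   # [b(a+d)+c]^(... + [c])


# Compare with b + c for all 4^4 = 256 assignments.
print(all(folded_diagonal_form(a, b, c, d) == b | c
          for a, b, c, d in product(ALL, repeat=4)))
```

Because the check runs over the whole four-element algebra rather than only over 0 and 1, it covers every joint influence that can occur in the examples of this paper.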
Consequently, for subjects $a$ and $d$ the decision equations in canonical form are

$$a = (b + c)a + (b + c)\overline{a} \qquad (51)$$
$$d = (b + c)d + (b + c)\overline{d} \qquad (52)$$

Thus, the sets $A$ and $B$ for subjects $a$ and $d$ are equal. The sets $A$ and $B$ are functions of the variables $b$ and $c$ only: $A = A(b, c) = b + c\overline{b}$ and $B = B(b, c) = b + c\overline{b}$.

The canonical forms of the decision equations for subjects $b$ and $c$ are:

$$b = b + c\overline{b} \qquad (53)$$
$$c = c + b\overline{c} \qquad (54)$$

Therefore, the set $A = 1$ for both subjects. The set $B$ is a function of a single variable: $B(c) = c$ and $B(b) = b$ for subjects $b$ and $c$, respectively.

B Example of non-homogenous super-active groups

Here we provide an example of a non-homogenous super-active group.

Consider the group of four subjects $a$, $b$, $c$ and $d$ which is described by the polynomial $c(ab + d)$. Let us build the diagonal form and perform its folding:

$$[c(ab+d)]^{[c][ab+d]^{[ab]^{[a][b]} + [d]}} = [c(ab+d)]^{[c][ab+d]^{([ab] + \overline{[a][b]}) + [d]}} = [c(ab+d)]^{[c][ab+d]^{1}} = [c(ab+d)]^{[c][ab+d]} = [c(ab+d)] + \overline{[c][ab+d]} = 1$$
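As an informal check of Lemma 3 and of the example in Appendix B, the folded diagonal forms can be evaluated for every assignment of alternatives. The Python sketch below assumes the folding rule $P^{W} = P + \overline{W}$, a 2-bit encoding of the alternatives, and takes "the folded form is identically 1" as the working criterion of super-activity used in the proofs; the polynomial of Appendix B is read as $c(ab + d)$, as in its diagonal form. All function names are hypothetical.

```python
from itertools import product

ONE = 0b11        # the alternative 1 = {α, β} as a 2-bit mask (assumed encoding)
ALL = range(4)    # 0, {α}, {β}, 1


def fold(base, exponent):
    """Folding rule P^W = P + ~W on 2-bit masks."""
    return base | (exponent ^ ONE)


def alliance(a, b, c):
    """Homogenous alliance group abc: [abc]^([a][b][c])."""
    return fold(a & b & c, a & b & c)


def conflict(a, b, c):
    """Homogenous conflict group a+b+c: [a+b+c]^([a]+[b]+[c])."""
    return fold(a | b | c, a | b | c)


def appendix_b(a, b, c, d):
    """Non-homogenous group c(ab + d), folded innermost level first."""
    ab = fold(a & b, a & b)                   # [ab]^([a][b])
    ab_d = fold(a & b | d, ab | d)            # [ab+d]^([ab]^... + [d])
    return fold(c & (a & b | d), c & ab_d)    # [c(ab+d)]^([c][ab+d]^...)


def alliance_with_conflict(a, b, c):
    """Group a(b + c) from Section 3.2, included for contrast."""
    inner = fold(b | c, b | c)                # [b+c]^([b]+[c])
    return fold(a & (b | c), a & inner)       # [a(b+c)]^([a][b+c]^...)


def is_super_active(group, n):
    """True if the folded form equals 1 for every assignment of n subjects."""
    return all(group(*values) == ONE for values in product(ALL, repeat=n))


for group, n in ((alliance, 3), (conflict, 3), (appendix_b, 4),
                 (alliance_with_conflict, 3)):
    print(group.__name__, is_super_active(group, n))
```

The last group is included only to show that the check is not trivially true: its folded form $a(b + c) + \overline{a}$ depends on the influences, which is what makes the frustration-based control of Section 6.2 possible.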