The Inverse Task of the Reﬂexive Game Theory:
Theoretical Matters, Practical Applications and
Relationship with Other Issues
Sergey Tarasenko
Kyoto University, Yoshida honmachi, Kyoto 606-8501, Japan
infra.core@gmail.com
Abstract. The Reflexive Game Theory (RGT) was recently proposed by Vladimir
Lefebvre to model the behavior of individuals in groups. The goal of this study
is to introduce the Inverse task. We consider methods of solution together with
practical applications. We present a brief overview of the RGT to make the
problem easier to understand. We also develop a schematic representation of the
RGT inference algorithms as a basis for software and hardware solutions of the
RGT tasks. We propose a unified hierarchy of schemas to represent humans and
robots. This hierarchy serves as a unified framework for solving the entire
spectrum of the RGT tasks. We conclude by illustrating how this framework can
be applied to modeling mixed groups of humans and robots. Altogether, this
provides an exhaustive solution of the Inverse task and clearly illustrates its
role and its relationships with other issues considered in the RGT.
Key words: Reﬂexive Game Theory (RGT), group behavior, society
behavior, RGT Forward Task, RGT Inverse Task, Asimov’s Laws of
Robotics, robots in RGT, mixed groups of humans and robots, human-
robot societies
1 Introduction
The Reflexive Game Theory (RGT) has been entirely developed by Lefebvre [1, 2]
and is based on the principles of anti-selfishness, or the prohibition of
egoism [1, 2], and human reflexion processes [3]. Therefore, the RGT is based
on human-like decision-making processes. The main goal of the theory is to
model the behavior of individuals in groups. It makes it possible to predict
the choices that each individual in the group is likely to make, and to
influence each individual's decision-making so that this individual makes a
certain choice. In particular, the RGT can be used to predict terrorists'
behavior [4].
In general, the RGT is a simple tool to predict the behavior of individuals and
to influence their choices. Therefore, it makes it possible to control the
individuals in groups by guiding their behavior (decision-making, choices) by
means of the corresponding influences.
arXiv:1011.3397v1  [cs.MA]  15 Nov 2010
On the other hand, robots have nowadays become an essential part of our life.
One of the purposes robots serve is to substitute for human beings in dangerous
situations and environments, such as defusing a bomb or working in radioactive
zones.
In contrast, human nature shows a strong inclination towards risky behavior,
which can cause not only injuries but even threaten human life. The reasons for
such behavior cover a wide range, from the irresponsible behavior of children
to the necessity of finding a solution in a critical situation. In such a
situation, a robot should fulfill the function of restraining humans from risky
actions and should perform the risky action itself, if needed.
However, robots are forbidden to physically force people; instead, they must
convince people on the mental level to refrain from a risky action. This method
is more effective than simple physical compulsion, because humans make the
decisions (choices) themselves and treat these decisions as their own. Such a
technique is called reflexive control [3].
The task of finding an appropriate reflexive control is closely related to the
Inverse task, in which we need to find a suitable influence of one subject, or
of a group of subjects, on the subject of interest. Therefore, a framework for
solving the Inverse task is needed. Developing it is the primary goal of this
study.
However, for a better understanding of the gist of the Inverse task and its
intrinsic relationships with other issues of the RGT, we introduce the entire
spectrum of tasks that can be solved by the RGT. This forms the scope of
inference algorithms used in the RGT. We present the RGT algorithms in the form
of schemas of control systems that can be directly applied to the development
of software and/or hardware solutions. Based on these control schemas, we
develop a hierarchy of control systems for an abstract individual (including a
human subject) and a robotic agent (robot). Finally, we illustrate the
application of the Inverse task together with other RGT inference algorithms to
model a robot's behavior in mixed groups of humans and robots.
2 Brief Overview of the Reﬂexive Game Theory (RGT)
2.1 Representation of groups: graphs, polynomials and stratification tree
The RGT deals with groups of abstract subjects (individuals, humans, autonomous
agents, etc.). Each subject is assigned a unique variable (subject variable).
Any group of subjects is represented as a fully connected graph, which is
called a relationship graph. Each vertex of the graph corresponds to a single
subject. Therefore, the vertices of the graph are in one-to-one correspondence
with the subjects in the group. Each vertex is named after the corresponding
subject variable.
The RGT uses set theory and Boolean algebra as the basis for its calculus.
Therefore, the values of subject variables are elements of a Boolean algebra.
Any two subjects in the group are in either an alliance or a conflict
relationship. The relationships are identified as a result of group
macroanalysis. It is assumed that the established relationships can be changed.
The relationships are depicted by graph edges (ribs): solid edges correspond to
alliance, while dashed ones correspond to conflict. For mathematical analysis,
alliance is treated as the conjunction (multiplication) operation (·), and
conflict as the disjunction (summation) operation (+).
The graph presented in Fig. 1a, and any graph containing a sub-graph isomorphic
to it, is not decomposable. In this case, subjects are excluded from the group
one by one until the graph becomes decomposable. The exclusion is done
according to the importance of the other subjects for a particular one [1, 2].
All other fully connected graphs are decomposable. Any decomposable graph can
be presented in the analytical form of a corresponding polynomial. Any
relationship graph of three subjects is decomposable (see [1, 2]).
Consider three subjects a, b and c. Let subject a be in alliance with the other
subjects, while subjects b and c are in conflict (Fig. 1b). The polynomial
corresponding to this graph is a(b + c).
Fig. 1. The relationship graphs.
[a(b+c)]
[a] · [b+c]
      [b] + [c]
Fig. 2. Polynomial Stratification Tree. Polynomials [a], [b] and [c] are
elementary polynomials.
With respect to a certain relationship, the polynomial can be stratified
(decomposed) into sub-polynomials [1, 2]. Each sub-polynomial belongs to a
particular level of stratification. If the stratification with respect to
alliance is built first, then the stratification with respect to conflict is
implemented at the next step. The stratification procedure terminates when
elementary polynomials, each containing a single variable, are obtained after a
certain stratification step.
The result of stratification is the Polynomial Stratification Tree (PST). It
has been proved that each non-elementary polynomial can be stratified in a
unique way, i.e., each non-elementary polynomial has only one corresponding PST
(see [7] on the one-to-one correspondence between graphs and polynomials). Each
higher level of the tree contains polynomials simpler than the ones on the
lower level. For the purpose of stratification, the polynomials are written in
square brackets. The PST for the polynomial a(b + c) is presented in Fig. 2.
Next, we omit the branches of the PST and write the sub-polynomials of each
non-elementary polynomial in its top right corner. The resulting tree-like
structure is called a diagonal form [1, 2, 5, 6]. Consider the diagonal form
corresponding to the PST in Fig. 2:
$$[a(b + c)]^{\,[a]\,[b + c]^{\,[b] + [c]}}.$$
Hereafter, the diagonal form is considered as a function defined on the set of
all subsets of the universal set. The universal set contains the elementary
actions. For example, let these be actions α and β. By definition, the Boolean
algebra of the universal set includes four elements: 1 = {α, β}, {α}, {β} and
the empty set 0 = {} = Ø. These elements are all the possible subsets of the
universal set and are considered as the alternatives that each subject can
choose. The alternative 0 = {} is interpreted as an inactive or idle state. In
general, the Boolean algebra consists of 2^n alternatives if the universal set
contains n actions.
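As an illustration, the alternatives can be enumerated programmatically. The
following minimal Python sketch is not part of the RGT formalism: the function
name boolean_algebra and the string labels "alpha" and "beta" are our
assumptions, and each alternative is represented as a frozenset of actions.

```python
from itertools import combinations


def boolean_algebra(universal_set):
    """Return all subsets (alternatives) of the universal set of actions."""
    actions = sorted(universal_set)
    return [frozenset(subset)
            for size in range(len(actions) + 1)
            for subset in combinations(actions, size)]


# Hypothetical labels standing in for the elementary actions alpha and beta.
alternatives = boolean_algebra({"alpha", "beta"})
for alternative in alternatives:
    print(sorted(alternative))
```

With two actions the list contains the four alternatives 0, {α}, {β} and 1;
with n actions it contains 2^n entries.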
According to the definition given by Lefebvre [5], we present here the
exponential operation defined by the formula
$$P^{W} = P + \overline{W}, \tag{1}$$
where $\overline{W}$ stands for the negation of W [1, 2, 4].
This exponential operation is used to fold the diagonal form. During the
folding, round and square brackets are considered to be interchangeable. The
following equalities also hold: $x + \overline{x} = 1$, x + 0 = x and
x + 1 = 1. Next we fold the diagonal form of the polynomial a(b + c):
$$[a(b + c)]^{\,[a]\,[b + c]^{\,[b] + [c]}} = [a(b + c)]^{\,[a]([b + c] + \overline{[b] + [c]})} = a(b + c) + \overline{a}.$$
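The folding can be checked mechanically. The sketch below is only an
illustration under our own assumptions: alternatives are Python frozensets over
the hypothetical labels "alpha" and "beta", and the helper names neg, power and
fold_a_times_b_plus_c are ours. It folds the diagonal form level by level and
compares the result with a(b + c) + ā for every combination of values.

```python
from itertools import product

ONE = frozenset({"alpha", "beta"})   # universal set, i.e. alternative 1
ALTS = [frozenset(), frozenset({"alpha"}), frozenset({"beta"}), ONE]


def neg(w):
    """Negation: complement with respect to the universal set."""
    return ONE - w


def power(p, w):
    """Exponential operation of eq. (1): P^W = P + (not W)."""
    return p | neg(w)


def fold_a_times_b_plus_c(a, b, c):
    """Fold the diagonal form of a(b + c), starting from the top level."""
    top = power(b | c, b | c)           # [b + c] raised to [b] + [c]
    middle = a & top                    # [a] times the folded [b + c]
    return power(a & (b | c), middle)   # [a(b + c)] raised to the middle level


# Compare the level-by-level folding with the closed form a(b + c) + not a.
folding_matches = all(
    fold_a_times_b_plus_c(a, b, c) == (a & (b | c)) | neg(a)
    for a, b, c in product(ALTS, repeat=3)
)
print(folding_matches)
```

This is a finite check on the two-action Boolean algebra, not a proof for an
arbitrary universal set.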
It is considered that the levels of the PST represent different processing
levels of a natural or artificial cognitive system. The polynomials on each
level are considered as images. The root of the tree is the input into the
cognitive system and can therefore be considered as the image of the world (the
environment including self and others) perceived by the subject.
As follows from the PST, there is a hierarchy of images, each corresponding to
a particular cognitive level. During bottom-up processing along this hierarchy,
the image on the lower level undergoes an extensive process of simplification
by means of decomposition into simpler parts on the higher level. These parts
are considered to be the images of the image on the previous level. Therefore,
the images on the second level are different representations of the original
image of the world. This procedure repeats until we obtain elementary parts
(elementary polynomials) [1, 2].
On the other hand, the PST folding procedure can be regarded as a top-down
integration process of simpler images from the higher levels.
Therefore, the stratification procedure of the original polynomial together
with the folding procedure of the diagonal form illustrates the interplay of
bottom-up and top-down information processes, which are widely employed in
biological [8, 9, 10, 11] and artificial [12, 13, 14] information processing
systems. The idea of hierarchical structure is highly coherent with the
hierarchical organization of the majority of natural (inanimate objects) and
biological (living creatures) entities. Furthermore, it has been shown that
hierarchical structure is intrinsic to the relationships in societies of
insects [15], animals [16, 17, 18] and human beings.
Therefore, the hierarchical representation of a group in the form of the PST
corresponds to extraction of the hierarchical structure of the given group,
while the combination of the PST and its diagonal form with the folding
procedure closely resembles the way information is processed within a single
independent cognitive system, as discussed above. Thus, the RGT employs the
fundamental principles of hierarchical organization on both the group level
(reflecting the structure of groups) and the individual level (illustrating
information processing within the independent cognitive system of a single
unit). This makes the RGT a universal tool that smoothly bridges the gap
between representation and analysis.
2.2 The Decision Equation: deﬁnition and solution
The goal of each subject in a group is to choose an alternative from the set of
alternatives under consideration. To obtain the choice of each subject, we
consider the decision equations, which contain the subject variable on the
left-hand side and the result of the diagonal form folding on the right-hand
side:
$$a = (b + c)a + \overline{a}$$
$$b = (b + c)a + \overline{a}$$
$$c = (b + c)a + \overline{a}$$
To find the solutions of the decision equations, we consider the following
equation:
$$x = Ax + B\overline{x}, \tag{2}$$
where x is the subject variable, and A and B are some sets. Eq. (2) represents
the canonical form of the decision equation. This equation has a solution if
and only if the set B is contained in the set A: A ⊇ B. If this requirement is
satisfied, then eq. (2) has at least one solution from the interval A ⊇ x ⊇ B
[4]. Otherwise, the decision equation has no solution, and it is considered
that the subject cannot make a decision. In such a situation, the subject is in
a frustration state.
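The solvability condition can be illustrated by exhaustive search. In the
following sketch, which rests on our own assumptions (frozensets over the
hypothetical labels "alpha" and "beta"; the function names solve_canonical and
interval are ours), the solutions of eq. (2) found by direct substitution are
compared with the interval A ⊇ x ⊇ B for every pair (A, B).

```python
from itertools import product

ONE = frozenset({"alpha", "beta"})
ALTS = [frozenset(), frozenset({"alpha"}), frozenset({"beta"}), ONE]


def neg(x):
    """Negation: complement with respect to the universal set."""
    return ONE - x


def solve_canonical(A, B):
    """Return every alternative x satisfying x = A x + B (not x)."""
    return [x for x in ALTS if x == (A & x) | (B & neg(x))]


def interval(A, B):
    """Return every alternative x with A >= x >= B (set inclusion)."""
    return [x for x in ALTS if B <= x <= A]


pairs = list(product(ALTS, repeat=2))
# Direct substitution and the interval description agree for every (A, B).
agreement = all(solve_canonical(A, B) == interval(A, B) for A, B in pairs)
# Pairs (A, B) with no solution correspond to the frustration state.
frustrated = [(A, B) for A, B in pairs if not solve_canonical(A, B)]
print(agreement, len(frustrated))
```

In this small algebra, the pairs without a solution are exactly those in which
B is not contained in A.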
Therefore, to find the solutions of a decision equation, one should first
transform it into the canonical form. Out of the three presented equations,
only the decision equation for subject a is in the canonical form, while the
other two should be transformed. We show the explicit transformation only for
the decision equation of subject b [20]:
$$a(b+c)+\overline{a} = ab+ac+\overline{a} = ab+(ac+\overline{a})b+(ac+\overline{a})\overline{b} = (a+\overline{a}+ac)b+(ac+\overline{a})\overline{b} =$$
$$(1+ac)b+(ac+\overline{a})\overline{b} = b+(ac+\overline{a})\overline{b} = b+(ac+\overline{a}c+\overline{a})\overline{b} = b+(c+\overline{a})\overline{b}.$$
Therefore,
$$b = b + (c + \overline{a})\overline{b}. \tag{3}$$
(3)
The transformation of the equation for subject c can be easily derived by
analogy: $c = c + (b + \overline{a})\overline{c}$.
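The canonical coefficients can also be obtained without step-by-step algebra.
The sketch below does not reproduce the derivation above; it swaps in Boole's
expansion f(x) = f(1)x + f(0)x̄, a standard identity of Boolean algebra, and
checks the result against eq. (3). The frozenset encoding, the labels "alpha"
and "beta", and the helper names rhs and canonical_for_b are our assumptions.

```python
from itertools import product

ONE = frozenset({"alpha", "beta"})
ZERO = frozenset()
ALTS = [ZERO, frozenset({"alpha"}), frozenset({"beta"}), ONE]


def neg(x):
    """Negation: complement with respect to the universal set."""
    return ONE - x


def rhs(a, b, c):
    """Folded diagonal form of a(b + c): a(b + c) + not a."""
    return (a & (b | c)) | neg(a)


def canonical_for_b(a, c):
    """Coefficients (A, B) of b = A b + B (not b) via Boole's expansion."""
    return rhs(a, ONE, c), rhs(a, ZERO, c)


# Eq. (3) states A = 1 and B = c + not a for every a and c.
matches_eq3 = all(
    canonical_for_b(a, c) == (ONE, c | neg(a))
    for a, c in product(ALTS, repeat=2)
)
print(matches_eq3)
```

The same two substitutions (the variable set to 1 and to 0) give the
coefficients for subject c when applied to the variable c instead of b.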
Next we consider two tasks that can be formulated for the decision equation in
the canonical form, and provide methods to solve each of them.
2.3 The Forward Task
The variable on the left-hand side of the decision equation in the canonical
form is the variable of the equation, while the other variables are considered
as influences on the subject from the other subjects. The Forward task is
formulated as the task of finding the possible choices of a subject of
interest when the influences on this subject from the other subjects are given.
After transformation of an arbitrary decision equation into its canonical form,
the sets A and B are functions of the other subjects' influences. For example,
if we consider a group of subjects a, b, c, etc. together with the abstract
representation of the decision equation in the canonical form for subject a,
the sets A and B are functions of the subject variables b, c, etc.:
$$a = A(b, c, ...)a + B(b, c, ...)\overline{a}. \tag{4}$$
In the case of only three subjects a, b and c, A(b, c, ...) = A(b, c) and
B(b, c, ...) = B(b, c).
All the influences are presented in the influence matrix (Table 1). The main
diagonal of the influence matrix contains the subject variables. The rows of
the matrix represent the influences of the given subject on the other subjects,
while the columns represent the influences of the other subjects on the given
one. The influence values are used in the decision equations.
Table 1. Influence Matrix
        a       b       c
a       a      {α}     {β}
b      {β}      b      {β}
c      {β}     {β}      c
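For subject a: $a = (\{\beta\} + \{\beta\})a + \overline{a} \Rightarrow a = \{\beta\}a + \overline{a}$.
For subject b: $b = b + (\{\alpha\}\{\beta\} + \overline{\{\alpha\}})\overline{b} \Rightarrow b = b + \{\beta\}\overline{b}$.
For subject c: $c = c + (\{\beta\}\{\beta\} + \overline{\{\beta\}})\overline{c} \Rightarrow c = c + (\{\beta\} + \{\alpha\})\overline{c} \Rightarrow c = 1$.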
The equation for subject a has no solutions, since the set A = A(b, c) = {β} is
strictly contained in the set B = B(b, c) = 1: A ⊂ B. Thus, subject a cannot
make any decision and is considered to be in a frustration state.
The equation for subject b has at least one solution, since A = A(a, c) = 1 =
{α, β} ⊇ B = B(a, c) = {β}. The solution belongs to the interval 1 ⊇ b ⊇ {β}.
Therefore, subject b can choose any alternative from the Boolean algebra that
contains the alternative {β}. These alternatives are 1 = {α, β} and {β}.
The equation for subject c turns into the equality c = 1. This is possible only
when A(a, b) ≡ B(a, b). Here A = B = 1.
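The Forward task for this example can be sketched in a few lines of Python.
The encoding is our assumption (frozensets over the hypothetical labels "alpha"
and "beta"; the names INFLUENCE, rhs and possible_choices are ours). For each
subject, the influences of the other two subjects are read from Table 1 and
every alternative is substituted into the decision equation
x = a(b + c) + ā.

```python
ONE = frozenset({"alpha", "beta"})
ZERO = frozenset()
ALPHA = frozenset({"alpha"})
BETA = frozenset({"beta"})
ALTS = [ZERO, ALPHA, BETA, ONE]

# INFLUENCE[source][target]: the off-diagonal entries of Table 1 (row = source).
INFLUENCE = {
    "a": {"b": ALPHA, "c": BETA},
    "b": {"a": BETA, "c": BETA},
    "c": {"a": BETA, "b": BETA},
}


def rhs(a, b, c):
    """Folded diagonal form of a(b + c): a(b + c) + not a."""
    return (a & (b | c)) | (ONE - a)


def possible_choices(subject):
    """Alternatives x that turn the subject's decision equation into equality."""
    choices = []
    for x in ALTS:
        # Influences exerted on the subject by the other two subjects.
        values = {src: row[subject] for src, row in INFLUENCE.items()
                  if src != subject}
        values[subject] = x
        if x == rhs(values["a"], values["b"], values["c"]):
            choices.append(x)
    return choices


for subject in "abc":
    print(subject, [sorted(x) for x in possible_choices(subject)])
```

An empty list corresponds to the frustration state of the subject.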
2.4 The Inverse Task
In contrast to the Forward task, the Inverse task is formulated as the task of
finding all the simultaneous (or joint) influences of all the subjects together
on the subject of interest that result in the choice of a particular
alternative or subset of alternatives. We call the subject of interest a
controlled subject.
Let subject a be a controlled subject and let a* be a fixed value representing
an alternative or subset of alternatives that subjects b, c, etc. want subject
a to choose. We call the value a* a target choice. Substituting the fixed value
a* for the subject variable a in a decision equation yields the influence
equation. Performing this substitution in the canonical form of the decision
equation (eq. (4)) yields the canonical form of the influence equation:
$$a^* = A(b, c, ...)a^* + B(b, c, ...)\overline{a^*}. \tag{5}$$
For only three subjects a, b and c, A(b, c, ...) = A(b, c) and B(b, c, ...) =
B(b, c).
In contrast to the decision equation, which is an equation in a single
variable, the influence equation is an equation in multiple variables. However,
the number of variables of an influence equation is not a trivial question. In
fact, the number of variables in an influence equation can be less than
(n − 1), where n is the total number of subjects in the group. There are groups
in which the sets A and B are functions of fewer than (n − 1) variables (see
Appendix A). Therefore, the variables that are present in the influence
equation are called effective variables.
The Inverse task is by definition¹ formalized as the task of finding all the
joint solutions of all subjects in the group, except for the controlled one,
when the target choice is represented by the interval χ1 ⊇ a* ⊇ χ2, where χ1
and χ2 are some sets and χ1 ⊃ χ2. In such a case, to solve the Inverse task,
one should solve the system of influence equations:
$$A(b, c, ...) = \chi_1 \tag{6}$$
$$B(b, c, ...) = \chi_2 \tag{7}$$
If the target choice is a single alternative, then χ1 = χ2 = a*.
¹ We need a system of influence equations because solutions of the influence
equation $a^* = A(b, c, ...)a^* + B(b, c, ...)\overline{a^*}$ itself only
guarantee that the original decision equation
$a = A(b, c, ...)a + B(b, c, ...)\overline{a}$ turns into a true equality, but
it is not guaranteed that these solutions are the only ones that turn the
decision equation into a true equality.
The solutions of the system (6-7) are considered as reflexive control
strategies.
A particular Inverse task is characterized from two points of view. The first
is whether it is required to find the influence of a single particular subject
or the joint influences of a group of subjects. The second is whether the
target choice is represented as a single alternative or as an interval of
alternatives.
To illustrate these points, we introduce a particular group of subjects. Let
subjects a and b be in alliance with each other and in conflict with subject c.
The polynomial corresponding to this graph is ab + c. The diagonal form
corresponding to this polynomial and its folding are
$$[ab + c]^{\,[ab]^{[a][b]} + [c]} = ab + c.$$
Therefore, the decision equation for all the subjects in the group is
$$x = ab + c, \tag{8}$$
where x can be any of the subject variables a, b or c.
Influence of a single subject vs joint influences of a group. First we consider
an example in which the influence of a single subject is required. Let subject
b exert influence {α} and let a* = {α}. Then we need to find the influences of
the single subject c that result in the solution a* = {α} of the decision
equation a = ab + c.
The canonical form of the corresponding influence equation is
$a^* = (b + c)a^* + c\overline{a^*}$. Since a* = {α}, χ1 = χ2 = {α}, we obtain
the system of equations:
$$\{\alpha\} + c = \{\alpha\} \tag{9}$$
$$c = \{\alpha\} \tag{10}$$
Therefore, the straightforward solution of this system is c = {α}.
This simple example illustrates the very gist of the Inverse task: to find the
appropriate influences that result in the target choice.
Next, we assume that the influence of subject b is not known. Then we obtain
the system
$$b + c = \{\alpha\} \tag{11}$$
$$c = \{\alpha\} \tag{12}$$
In this case, we need to find the values of variable b which, together with c,
result in the solution a* = {α}. In other words, we need to find all the pairs
(b, c) resulting in the solution a* = {α}. These pairs are the solutions of the
system (11-12). Therefore, we run through all the possible values of variable b
and check whether the first equation of the system (11-12) turns into a true
equality:
b = 1 : 1 + {α} = 1 ⇒ 1 ≠ {α};
b = {α} : {α} + {α} = {α} ⇒ {α} = {α};
b = {β} : {β} + {α} = 1 ⇒ 1 ≠ {α};
b = 0 : 0 + {α} = {α} ⇒ {α} = {α}.
Therefore, out of the four possible values of variable b, only the two values
{α} and 0 are appropriate. Thus, we obtain two pairs (b, c): ({α}, {α}) and
(0, {α}).
A single target alternative vs interval of alternatives. In the previous
examples we considered the target choice to be only a single alternative. Here
we illustrate the case in which the target choice is an interval. Let b = {β}
and 1 ⊇ a* ⊇ {α}. To find the corresponding influences of subject c, we solve
the system of equations:
$$\{\beta\} + c = 1 \tag{13}$$
$$c = \{\alpha\} \tag{14}$$
Again, we immediately obtain the solution of this system: c = {α}.
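The examples of this section can be reproduced by brute force. The sketch below
is only an illustration under our assumptions (frozensets over the hypothetical
labels "alpha" and "beta"; the function names control_strategies and
choices_of_a are ours). For the group ab + c it enumerates all pairs (b, c)
satisfying the system A(b, c) = χ1, B(b, c) = χ2 with A = b + c and B = c, and
then lists the choices of subject a under each pair.

```python
from itertools import product

ONE = frozenset({"alpha", "beta"})
ZERO = frozenset()
ALPHA = frozenset({"alpha"})
BETA = frozenset({"beta"})
ALTS = [ZERO, ALPHA, BETA, ONE]


def control_strategies(chi1, chi2):
    """Pairs (b, c) with b + c = chi1 and c = chi2 (system (6-7) for ab + c)."""
    return [(b, c) for b, c in product(ALTS, repeat=2)
            if (b | c) == chi1 and c == chi2]


def choices_of_a(b, c):
    """Solutions of the decision equation a = ab + c for given influences."""
    return [a for a in ALTS if a == (a & b) | c]


single_target = control_strategies(ALPHA, ALPHA)     # target choice {alpha}
interval_target = control_strategies(ONE, ALPHA)     # target 1 >= a* >= {alpha}

for b, c in single_target + interval_target:
    print(sorted(b), sorted(c), [sorted(a) for a in choices_of_a(b, c)])
```

For the single target the search returns the two pairs obtained above; for the
interval target it returns every admissible b, including b = {β}.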
In this section, we have formulated the Inverse task in general and considered
its particular formalizations depending on the number of influences and on the
form of the target choice. However, we do not yet have a method for solving an
arbitrary influence equation. We address this problem in the next section.
3 How to Solve an Arbitrary Influence Equation
As an introduction to this section, we consider a fundamental proposition,
which will be the cornerstone for solving influence equations.
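Proposition 1. Let P and Q be some abstract sets. Then
$P\overline{Q} + \overline{P}Q = 0 \Leftrightarrow P = Q$.
Proof. Necessity. Let $P\overline{Q} + \overline{P}Q = 0$, then
$$P\overline{Q} + \overline{P}Q = 0 \Rightarrow P\overline{Q} + \overline{P}Q + P = P \Rightarrow P + \overline{P}Q = P \Rightarrow$$
$$P(Q + \overline{Q}) + \overline{P}Q = Q + P\overline{Q} + \overline{P}Q = P \Rightarrow Q = P.$$
Therefore, if $P\overline{Q} + \overline{P}Q = 0$, then P = Q.
Sufficiency. Let P = Q, then $P\overline{P} + \overline{P}P = 0$. □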
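Proposition 1 can be spot-checked on the four-element Boolean algebra used in
the examples. The sketch below is a finite sanity check under our assumptions
(frozensets over the hypothetical labels "alpha" and "beta"; the helper names
are ours), not a replacement for the proof.

```python
from itertools import product

ONE = frozenset({"alpha", "beta"})
ZERO = frozenset()
ALTS = [ZERO, frozenset({"alpha"}), frozenset({"beta"}), ONE]


def neg(x):
    """Negation: complement with respect to the universal set."""
    return ONE - x


def left_side_is_zero(P, Q):
    """True if P (not Q) + (not P) Q equals the empty set."""
    return ((P & neg(Q)) | (neg(P) & Q)) == ZERO


# The left-hand side vanishes exactly when P and Q coincide.
proposition_holds = all(
    left_side_is_zero(P, Q) == (P == Q)
    for P, Q in product(ALTS, repeat=2)
)
print(proposition_holds)
```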
Now let us consider a new type of equation:
$$\overline{A_1}x + B_1\overline{x} = 0 \tag{15}$$
A value x is a solution of this equation if and only if
$A_1 \supseteq x \supseteq B_1$.
3.1 Solving Inﬂuence Equations
There are three operations defined on the Boolean algebra: conjunction (· or
multiplication), disjunction (+ or summation) and negation ($\overline{x}$,
where x is a subject variable). The negation operation is unary, while the
other two operations are binary. Using combinations of these three operations,
we can compose any influence equation. Since it is obvious how to solve an
equation including only the unary operation, we discuss how to solve influence
equations including a single binary operation.
For this purpose, we consider two abstract subject variables x1 and x2 and an
abstract alternative χ.
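Lemma 1. The solution of the equation
$$x_1 + x_2 = \chi \tag{16}$$
with respect to variable $x_i$, where i = 1, 2, is given by the interval
$\chi \supseteq x_i \supseteq (\chi\overline{x_j} + x_j\overline{\chi})$, where
j = 1, 2; j ≠ i.
Proof. In terms of Proposition 1, $P = x_1 + x_2$, $Q = \chi$,
$\overline{P} = \overline{x_1 + x_2} = \overline{x_1}\,\overline{x_2}$ and
$\overline{Q} = \overline{\chi}$.
Therefore, $P\overline{Q} + \overline{P}Q = (x_1 + x_2)\overline{\chi} + \overline{x_1}\,\overline{x_2}\chi = x_1\overline{\chi} + x_2\overline{\chi} + \overline{x_1}\,\overline{x_2}\chi$.
Consequently, we obtain eq. (17):
$$x_1\overline{\chi} + x_2\overline{\chi} + \overline{x_2}\chi\overline{x_1} = 0 \tag{17}$$
We solve eq. (17) with respect to variable x1. First, we transform eq. (17)
into the canonical form:
$$\overline{\chi}x_1 + (\chi\overline{x_2} + \overline{\chi}x_2)\overline{x_1} = 0 \tag{18}$$
Therefore, the solution of eq. (18) is given by the interval
$$\chi \supseteq x_1 \supseteq (\chi\overline{x_2} + x_2\overline{\chi}). \tag{19}$$
Since variables x1 and x2 are interchangeable and eq. (17) can be solved with
respect to variable x2 as well, the general form of the solution of eq. (16) is
the interval
$$\chi \supseteq x_i \supseteq (\chi\overline{x_j} + x_j\overline{\chi}), \tag{20}$$
where i = 1, 2 and j = 1, 2; j ≠ i. □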
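Lemma 2. The solution of the equation
$$x_1x_2 = \chi \tag{21}$$
with respect to variable $x_i$, where i = 1, 2, is given by the interval
$(\chi x_j + \overline{\chi}\,\overline{x_j}) \supseteq x_i \supseteq \chi$,
where j = 1, 2; j ≠ i.
Proof. In terms of Proposition 1, $P = x_1x_2$, $Q = \chi$,
$\overline{P} = \overline{x_1x_2} = \overline{x_1} + \overline{x_2}$ and
$\overline{Q} = \overline{\chi}$.
Therefore, $P\overline{Q} + \overline{P}Q = (x_1x_2)\overline{\chi} + (\overline{x_1} + \overline{x_2})\chi = x_2\overline{\chi}x_1 + \overline{x_1}\chi + \overline{x_2}\chi$.
Thus, we obtain eq. (22):
$$x_2\overline{\chi}x_1 + \overline{x_1}\chi + \overline{x_2}\chi = 0 \tag{22}$$
We solve eq. (22) with respect to variable x1. First, we transform eq. (22)
into the canonical form:
$$(\overline{\chi}x_2 + \chi\overline{x_2})x_1 + \chi\overline{x_1} = 0 \tag{23}$$
Since $\overline{\overline{\chi}x_2 + \chi\overline{x_2}} = \chi x_2 + \overline{\chi}\,\overline{x_2}$,
the solution of eq. (23) is given by the interval
$$(\chi x_2 + \overline{\chi}\,\overline{x_2}) \supseteq x_1 \supseteq \chi. \tag{24}$$
Since variables x1 and x2 are interchangeable and eq. (22) can be solved with
respect to variable x2 as well, the general form of the solution of eq. (21) is
the interval
$$(\chi x_j + \overline{\chi}\,\overline{x_j}) \supseteq x_i \supseteq \chi, \tag{25}$$
where i = 1, 2 and j = 1, 2; j ≠ i. □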
Since one bound of the solution intervals for eqs. (16) and (21) is a function
of the second variable, we need to run through all the possible values of the
second variable in order to obtain all possible solutions of these equations in
the form of pairs (x1, x2).
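Both lemmas can be checked against direct enumeration on the two-action Boolean
algebra. The following sketch is a finite check under our assumptions
(frozensets over the hypothetical labels "alpha" and "beta"; the function names
lemma1_interval and lemma2_interval are ours): for every χ and every value of
the second variable, the interval given by the lemma is compared with the set
of values that actually satisfy the equation.

```python
from itertools import product

ONE = frozenset({"alpha", "beta"})
ALTS = [frozenset(), frozenset({"alpha"}), frozenset({"beta"}), ONE]


def neg(x):
    """Negation: complement with respect to the universal set."""
    return ONE - x


def lemma1_interval(chi, xj):
    """Values xi with chi >= xi >= chi (not xj) + xj (not chi)."""
    lower = (chi & neg(xj)) | (xj & neg(chi))
    return [xi for xi in ALTS if lower <= xi <= chi]


def lemma2_interval(chi, xj):
    """Values xi with chi xj + (not chi)(not xj) >= xi >= chi."""
    upper = (chi & xj) | (neg(chi) & neg(xj))
    return [xi for xi in ALTS if chi <= xi <= upper]


cases = list(product(ALTS, repeat=2))
# Lemma 1 against direct solutions of xi + xj = chi.
lemma1_ok = all(
    lemma1_interval(chi, xj) == [xi for xi in ALTS if (xi | xj) == chi]
    for chi, xj in cases
)
# Lemma 2 against direct solutions of xi xj = chi.
lemma2_ok = all(
    lemma2_interval(chi, xj) == [xi for xi in ALTS if (xi & xj) == chi]
    for chi, xj in cases
)
print(lemma1_ok, lemma2_ok)
```

When the bound that depends on the second variable falls outside the other
bound, the interval is empty, which matches the cases where the equation has no
solution for that value of the second variable.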
Next we consider several examples illustrating the application of Lemmas 1
and 2.
Example 1. For illustration, we solve the equation a* = ba* + c. Setting
χ = a*, x1 = ba* and x2 = c, we obtain the solution interval for variable
x2 = c: $\chi \supseteq c \supseteq (\chi\overline{\chi b} + \overline{\chi}\chi b)$.
After simplification, we get interval (26):
$$\chi \supseteq c \supseteq \chi\overline{b} \tag{26}$$
Next we consider an example with a particular alternative. Let it be
alternative {α}: χ = {α}. The solution interval is then
$\{\alpha\} \supseteq c \supseteq \{\alpha\}\overline{b}$. Since the lower bound
of this interval is a function of variable b, to find all solutions of the
equation a* = ba* + c, we calculate the value of the expression
$\{\alpha\}\overline{b}$ for all possible values of variable b (Table 2).
To make sure that the solutions are correct, we check that the decision
equation a = ba + c turns into a true equality for the obtained pairs (b, c):
({α}, {α}): {α}{α} + {α} = {α} ⇒{α} = {α} is true;
({α}, 0): {α}{α} + 0 = {α} ⇒{α} = {α} is true;
({β}, {α}): {α}{β} + {α} = {α} ⇒{α} = {α} is true;
(1, {α}): {α}1 + {α} = {α} ⇒{α} = {α} is true;
(1, 0): {α}1 + 0 = {α} ⇒{α} = {α} is true;
(0, {α}): {α}0 + {α} = {α} ⇒{α} = {α} is true.
So far, we have illustrated how to solve the influence equation. We have also
shown that the pairs (b, c) obtained by solving the equation a* = ba* + c in
accordance with Proposition 1 and Lemmas 1 and 2 are indeed solutions of this
equation.
Table 2. Solutions of the influence equation a* = ba* + c
Values of b     {α}           {β}           1            0
Pairs (b, c)    ({α}, {α})    ({β}, {α})    (1, {α})     (0, {α})
                ({α}, 0)                    (1, 0)
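The entries of Table 2 can be regenerated from interval (26). The sketch below
is an illustration under our assumptions (frozensets over the hypothetical
labels "alpha" and "beta"; the function name table2_pairs is ours). It computes
the lower bound χb̄ for every value of b, lists the admissible values of c, and
then substitutes each pair back into the influence equation.

```python
ONE = frozenset({"alpha", "beta"})
ZERO = frozenset()
ALPHA = frozenset({"alpha"})
BETA = frozenset({"beta"})
ALTS = [ZERO, ALPHA, BETA, ONE]


def table2_pairs(chi):
    """Pairs (b, c) with chi >= c >= chi (not b), following interval (26)."""
    pairs = []
    for b in ALTS:
        lower = chi & (ONE - b)          # lower bound of interval (26)
        for c in ALTS:
            if lower <= c <= chi:
                pairs.append((b, c))
    return pairs


pairs = table2_pairs(ALPHA)
# Substitute every pair back into the influence equation a* = b a* + c.
all_satisfy = all(ALPHA == (b & ALPHA) | c for b, c in pairs)
print(len(pairs), all_satisfy)
```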
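Example 2. We consider the influence equation for subject b obtained from
eq. (3):
$$(c + \overline{a})\overline{\chi} + \chi = \chi \tag{27}$$
First, we transform the left-hand side of eq. (27):
$$(c + \overline{a})\overline{\chi} + \chi = c\overline{\chi} + \overline{a}\,\overline{\chi} + \chi = c\overline{\chi} + \overline{a}\,\overline{\chi} + (c + \overline{a} + 1)\chi = c + \overline{a} + \chi.$$
Therefore, eq. (27) can be rewritten as follows:
$$c + \overline{a} + \chi = \chi \tag{28}$$
Setting x1 = c and $x_2 = \overline{a} + \chi$, we immediately obtain the
solution interval of eq. (28):
$\chi \supseteq c \supseteq (\chi\overline{(\overline{a} + \chi)} + \overline{\chi}(\overline{a} + \chi)) \Rightarrow \chi \supseteq c \supseteq (\overline{\chi}\,\overline{a} + \chi\overline{\chi}a)$.
Finally,
$$\chi \supseteq c \supseteq \overline{\chi}\,\overline{a} \tag{29}$$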
Example 3. Next, we consider the influence equation
$$ab + \chi = \chi \tag{30}$$
Setting x1 = ab and x2 = χ, we immediately obtain the solution interval
$\chi \supseteq ab \supseteq (\chi\overline{\chi} + \chi\overline{\chi})$ or
$$\chi \supseteq ab \supseteq 0 \tag{31}$$
Therefore, in order to find all solutions of eq. (30), we need to solve the
equations
$$ab = y \tag{32}$$
where y is any subset of the set χ (χ ⊇ y).
Each such equation can be solved according to Lemma 2.
Example 4. As a final example, we again consider the influence equation
$a^* = (b + c)a^* + c\overline{a^*}$ and show how the application of Lemma 1
essentially simplifies its solution. We get the system of influence equations:
$$b + c = \{\alpha\}; \tag{33}$$
$$c = \{\alpha\}. \tag{34}$$
From this system we obtain a single equation:
$$b + \{\alpha\} = \{\alpha\}. \tag{35}$$
According to Lemma 1, we immediately obtain the solution interval of eq. (35):
$$\{\alpha\} \supseteq b \supseteq 0. \tag{36}$$
Thus, eq. (35) has two solutions: b = {α} and b = 0. Therefore, the solution of
system (33-34) consists of two pairs (b, c): ({α}, {α}) and (0, {α}).
To conclude this section, we provide a brief summary. We have shown how to
solve the Inverse task by means of influence equations. We have proved two
fundamental lemmas, which allow us to solve any influence equation regardless
of the number of variables. Finally, we have illustrated with several examples
how to apply these lemmas.
3.2 Analysis of Extreme Cases 1: Frustration
In this section we analyze, from the point of view of the Inverse task, the
situation in which a subject can end up in a frustration state. Let us consider
the polynomial a(b + c) discussed in Section 2.1. The decision equation that
corresponds to this polynomial is $x = (b + c)a + \overline{a}$, where x can be
any subject variable.
Next we try to find all the pairs (b, c) that result in the selection of a
particular alternative by subject a.
The decision equation for subject a is $a = (b + c)a + \overline{a}$. The
solution interval of this decision equation is b + c ⊇ a ⊇ 1. We need to check
which alternatives subject a can be convinced to choose. To do this, we
consider the system of equations for each alternative.
Alternative {α}:
$$b + c = \{\alpha\} \tag{37}$$
$$1 = \{\alpha\} \tag{38}$$
Alternative {β}:
$$b + c = \{\beta\} \tag{39}$$
$$1 = \{\beta\} \tag{40}$$
Alternative 0 = {}:
$$b + c = 0 \tag{41}$$
$$1 = 0 \tag{42}$$
In each of these systems, the second equation is a false equality. Therefore,
these systems have no solution.
Alternative 1 = {α, β}:
$$b + c = 1 \tag{43}$$
$$1 = 1 \tag{44}$$
The second equation is a true equality. Therefore, this system has a solution.
Thus, out of the four possible alternatives, subject a can actually choose only
the alternative 1 = {α, β}. To find the influences resulting in the selection
of the alternative 1 = {α, β}, we need to solve only eq. (43), since eq. (44)
is a true equality.
According to Lemma 1, we immediately obtain the solution interval for eq. (43):
$$1 \supseteq b \supseteq \overline{c} \tag{45}$$
We calculate the pairs (b, c) for all possible values of variable c (Table 3).
Table 3. Solutions of the influence equation b + c = 1
Values of c     {α}           {β}           1            0
Pairs (b, c)    ({β}, {α})    ({α}, {β})    (0, 1)       (1, 0)
                (1, {α})      (1, {β})      ({α}, 1)
                                            ({β}, 1)
                                            (1, 1)
Therefore, the influence analysis of the decision equation
$a = (b + c)a + \overline{a}$ shows that the only alternative that subject a
can choose is the alternative 1 = {α, β}. The influence analysis provides us
with the set (exhaustive list) of pairs (b, c) of joint influences resulting in
the selection of the alternative 1 = {α, β}. Therefore, if a pair of influences
does not match any pair from this list, the decision equation has no solution,
which results in a frustration state.
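The dichotomy described above can be observed by enumerating all pairs of
influences. The sketch below is an illustration under our assumptions
(frozensets over the hypothetical labels "alpha" and "beta"; the function name
choices_of_a is ours). It splits the pairs (b, c) into those that lead subject
a to choose the alternative 1 and those that leave the decision equation
without a solution.

```python
from itertools import product

ONE = frozenset({"alpha", "beta"})
ALTS = [frozenset(), frozenset({"alpha"}), frozenset({"beta"}), ONE]


def neg(x):
    """Negation: complement with respect to the universal set."""
    return ONE - x


def choices_of_a(b, c):
    """Solutions of the decision equation a = (b + c) a + not a."""
    return [a for a in ALTS if a == ((b | c) & a) | neg(a)]


pairs = list(product(ALTS, repeat=2))
# Pairs (b, c) under which subject a chooses alternative 1 (cf. Table 3).
selects_one = [(b, c) for b, c in pairs if choices_of_a(b, c) == [ONE]]
# Pairs (b, c) under which subject a has no choice at all (frustration).
frustrating = [(b, c) for b, c in pairs if not choices_of_a(b, c)]
print(len(selects_one), len(frustrating))
```

Every pair falls into one of the two lists, so in this group the influences on
subject a can only produce the choice 1 or a frustration state.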
Summarizing this section, we note that in general there are two sets. The set D
contains the alternatives that a controlled subject can choose. The set U is
the set of alternatives of the target choice. The need to put a subject into a
frustration state emerges if the target choice cannot be made by this subject.
In other words, we need to put a subject into a frustration state if
D ∩ U = Ø.
3.3 Analysis of Extreme Cases 2: What to do with Super-Active Groups
Among all the possible groups, there are groups in which subjects always choose
only the alternative 1 = {α, β} regardless of the influences of other subjects.
Such groups are called super-active groups.
Next we consider one special case of super-active groups: the homogeneous
groups. A group is called homogeneous if all the subjects in the group are
connected by the same relationship.
Here we provide a proof of the lemma about homogeneous groups originally
formulated by Lefebvre [1, 2].
Lemma 3. Any homogeneous group is a super-active group.
Proof. We consider separately the homogeneous groups in which all the subjects
are connected by the alliance relationship (alliance groups) and by the
conflict relationship (conflict groups).
Without loss of generality, we assume that there are n subjects a1, a2, ..., an.
Alliance groups. The polynomial corresponding to the alliance group of n
subjects is a1a2...an. Next we construct the diagonal form and apply the
folding procedure:
$$[a_1a_2...a_n]^{[a_1][a_2]...[a_n]} = [a_1a_2...a_n] + \overline{[a_1][a_2]...[a_n]} = 1.$$
Therefore, alliance groups are always super-active.
Conflict groups. The polynomial corresponding to the conflict group of n
subjects is a1 + a2 + ... + an. Next we construct the diagonal form and apply
the folding procedure:
$$[a_1 + a_2 + ... + a_n]^{[a_1] + [a_2] + ... + [a_n]} = [a_1 + a_2 + ... + a_n] + \overline{[a_1] + [a_2] + ... + [a_n]} = 1.$$
Therefore, conflict groups are always super-active.
Since both the alliance and the conflict groups are super-active, the lemma is
proved. □
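Lemma 3 can be spot-checked for small groups. The sketch below is a finite
check under our assumptions (frozensets over the hypothetical labels "alpha"
and "beta"; the function name fold_homogeneous is ours): it folds the diagonal
form of alliance and conflict groups of two and three subjects for every
combination of influences.

```python
from functools import reduce
from itertools import product
from operator import and_, or_

ONE = frozenset({"alpha", "beta"})
ALTS = [frozenset(), frozenset({"alpha"}), frozenset({"beta"}), ONE]


def fold_homogeneous(values, relation):
    """Fold the diagonal form of a group joined by a single relation."""
    polynomial = reduce(relation, values)   # a1 a2 ... an or a1 + ... + an
    exponent = reduce(relation, values)     # the elementary polynomials combined
    return polynomial | (ONE - exponent)    # P^W = P + not W


# and_ models alliance (conjunction), or_ models conflict (disjunction).
always_one = all(
    fold_homogeneous(values, relation) == ONE
    for relation in (and_, or_)
    for size in (2, 3)
    for values in product(ALTS, repeat=size)
)
print(always_one)
```

Since the folded form is 1 for all influences, the decision equation of every
subject reduces to x = 1.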
However, there are non-homogeneous super-active groups as well (see
Appendix B).
Summarizing this section, we note that subjects in super-active groups cannot
be controlled in their choices, and the entire group is uncontrollable.
Therefore, once a super-active group emerges, the only way to make it
controllable is to change the relationships in the group.
4 The Basic Control Schema of an Abstract Subject (BCSAS) in the RGT
We have presented a detailed description of the RGT, including the solution of
the Forward and Inverse tasks. We have also considered the extreme cases of
decisions, such as putting a subject into a frustration state or changing the
structure of a super-active group. As a final stroke, we summarize all the
presented material in the form of the Basic Control Schema of an Abstract
Subject (BCSAS) in the RGT.
Fig. 3. The block schema for extracting sets Dh and Zh. [Flowchart blocks:
Start; Read Pairs (χ, Zχ)x; M = 1; M ≤ N?; Pairs(M); Zχ == {}?; Dh = Dh + χ;
Zh = Zh + Zχ; M = M + 1; Save Dh; Save Zh; End.]
The input comes from the environment and is formalized in the form of exter-
nal Inﬂuences on the subject, the Boolean algebra of Alternatives and Structure
of a Group.
Information about the Inﬂuences, Boolean algebra and Group Structure is
propagated into the Decision Module. The Decision Module implements solution
of the Forward task. Therefore the output set D of the Decision Module is the
set of possible alternatives, which subject can choose under the given conditions.
The information about Boolean algebra and Group Structure is propagated
into the Inﬂuence Module. The Inﬂuence Module solves the Inverse task. The
output set Dh of the Inﬂuence Module is the set of the pairs (χ, Zχ)x, where χ is
the target alternative, the set Zχ is the set of all the joint inﬂuences, resulting in
selection of the target choice; and x represents a subject variable. Each (χ, Zχ)x
represents a reﬂexive control strategy.
Therefore, the decision to put a subject into frustration state is justiﬁed if
it is impossible to make subject x choose the target alternative χ, i.e., if for pair
(χ, Zχ)x set Zχ = {}, and subject x should not choose any other alternative
except for the target one.
4.1 Schema for Iterative Algorithm to Obtain Output of the
Inﬂuence Module
The alternatives χ with corresponding non-empty sets Zχ are included into the
set Dh. Here we introduce set Zh to store the non-empty sets Zχ. The schema
of the algorithm for extracting sets Dh and Zh is presented in Fig. 3. First the
sets Dh and Zh are empty: Dh = {} and Zh = {}. The algorithm reads the set of
pairs (χ, Zχ)x and stores it in array Pairs(M), where M is a counting variable,
N is the total number of pairs. Then, for each pair from array Pairs, the
algorithm checks whether set Zχ is empty: Zχ == {}?. If 'yes', the algorithm
increments the counting variable M (M = M + 1) and proceeds to the next pair
from array Pairs. If 'no', then alternative χ is included into the set Dh
(Dh = Dh + χ), set Dh is saved, the set Zχ is included into set Zh
(Zh = Zh + Zχ) and set Zh is saved. The process runs while M ≤ N.
In this iterative algorithm, we separately store the alternatives χ, which can
be chosen by a certain subject, in the set Dh and the joint influences Zχ, which
result in selection of alternative χ, in the set Zh.
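The iterative extraction described above can be sketched in a few lines. This
is an illustrative sketch only: the function name `extract_dh_zh` and the
representation of pairs as Python tuples are assumptions, and the counter M is
incremented after every pair, which is how we read the loop condition M ≤ N.

```python
def extract_dh_zh(pairs):
    """Split pairs (chi, Z_chi) into Dh and Zh, skipping pairs with empty Z_chi.

    pairs: list of (chi, Z_chi), where Z_chi is a (possibly empty) collection
    of joint influences that lead to the selection of alternative chi.
    """
    Dh, Zh = [], []          # both sets start empty
    M, N = 1, len(pairs)     # M is the counting variable, N the number of pairs
    while M <= N:
        chi, Z_chi = pairs[M - 1]
        if len(Z_chi) != 0:  # the 'no' branch of the test Z_chi == {}?
            Dh.append(chi)
            Zh.append(Z_chi)
        M += 1               # assumed: proceed to the next pair in both branches
    return Dh, Zh
```

For example, `extract_dh_zh([("beta", {(0, 1)}), ("alpha", set())])` keeps only
the alternative "beta" together with its set of joint influences.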
Therefore, we should modify the schema of the Influence Module in the BCSAS as
follows. We present an elaborated schema, where the sub-module "Solution: Dh"
is accompanied by the sub-module "Solution: Zh". Together these sub-modules are
included into the "Solutions" sub-module.
The BCSAS is the fundamental schema of an abstract subject, which is used
throughout the RGT. The BCSAS is presented in Fig. 4.
This concludes the overview of RGT and description of tasks within the scope
of the general theory. Therefore, we continue with application of the RGT to the
mixed groups of humans and robots.
[Figure 4: block diagram. Blocks: Environment; Influences; Boolean Algebra of
Alternatives; Structure of a Group; Decision Module (Decision equation of a
robot; Solutions: D; Realization of an alternative); Influence Module (Decision
equation of a human; System of Influence eqs.; Solutions: Solution: Dh,
Solution: Zh; Reflexive control).]
Fig. 4. The Basic Control Schema of an Abstract Subject (BCSAS).
5 Deﬁning Robots in RGT
As we have noted in the Introduction section, the goal of the robots in mixed
groups of humans and robots is to make human subjects refrain from choosing
risky actions, which might result in injuries or even threaten life.
It is considered by default that a robot follows a program of behavior. Such a
program consists of at least three modules. Module 1 implements the robot's
ability of human-like decision-making based on the RGT. Module 2 contains
the rules that keep the robot from doing harm to human beings. Module 3
predicts the choice of each human subject and suggests the possible reflexive
control strategies.
Modules 1 and 3 are inherited from the BCSAS. They correspond to the Decision
Module and the Influence Module of the BCSAS (Fig. 4), respectively. Therefore
all the properties and the meaning of the outputs of Modules 1 and 3 are the
same as those of the Decision and Influence Modules, respectively.
Module 2 is a new module, which is intrinsic to robotic agents studied
in the context of mixed groups of humans and robots. This module is responsible
for extracting only the alternatives that are harmless or non-risky for human
subjects.
We suggest applying Asimov's Three Laws of Robotics [19], which form
the basis of Module 2:
1) a robot may not injure a human being or, through inaction, allow a human
being to come to harm;
2) a robot must obey any orders given to it by human beings, except where such
orders would conﬂict with the First Law;
3) a robot must protect its own existence as long as such protection does not
conﬂict with the First or Second Law.
We consider that these laws are an intrinsic part of the robot's "mind", which
cannot be erased or corrupted by any means.
The interaction of Modules 1 and 2 is performed in the Interaction Module
1. The interaction of Modules 3 and 2 is implemented in the Interaction Module
2.
The Boolean algebra is ﬁltered according to Asimov’s laws in Module 2.
The output of Module 2 is set U of approved alternatives. This data is then
propagated into interaction modules.
The output of Module 1 is the set D of alternatives that the robot has to choose
under the given joint influences. In the Interaction Module 1, the intersection
of sets D and U is computed: D ∩ U = DU. If set DU is not empty, there are
approved alternatives among the alternatives that the robot should choose in
accordance with the joint influences. Therefore, the robot can implement
any alternative from the set DU. If set DU is empty, the robot cannot choose
any approved alternative under the given joint influences; therefore the robot
will choose an alternative from set U. This is how the Interaction Module 1
works.
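The rule implemented by the Interaction Module 1 can be written as a short
sketch. The function name `interaction_module_1` and the use of Python sets of
string labels for alternatives are assumptions made for illustration only.

```python
def interaction_module_1(D, U):
    """Return the set of alternatives the robot may realize.

    D: set of alternatives produced by Module 1 (the decision equation).
    U: set of alternatives approved by Module 2 (the Asimov's-laws filter).
    """
    DU = D & U  # approved alternatives among the robot's possible choices
    # If DU is empty, the robot falls back to the approved set U.
    return DU if DU else set(U)
```

In both branches the result is a subset of U, so under this rule the robot
never realizes an alternative that was filtered out by Module 2.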
The output of Module 3 contains the sets Dh and Zh. The goal of the robot
is to make human subjects refrain from choosing a risky alternative. This can
be done by convincing human subjects to choose alternatives from the set U.
First, we check whether Dh contains any approved alternative. We do so by
computing the intersection of sets Dh and U: Dh ∩ U = DhU.
[Figure 5: block diagram. Blocks: Environment; Influences; Boolean Algebra of
Alternatives; Structure of a Group; Module 1 (Decision equation of a robot;
Solutions: D); Module 2 (Asimov Laws' based filter; set U of approved
alternatives); Module 3 (Decision equation of a human; Solutions: Solution: Dh,
Solution: Zh); Interaction Module 1 (test DU == {}?, outputs U or DU;
Realization of an alternative); Interaction Module 2 (test DhU == {}?;
X = DhU, for ∀χ ∈ X get (χ, Zχ); get set ZU of Zχ ≠ {}: ∀χ ∈ U; test
ZU == {}?; for ∀Zχ ∈ ZU get (χ, Zχ); Frustration; Reflexive control).]
Fig. 5. The Basic Control Schema of a Robotic Agent (BCSRA).
If set DhU is not empty, then it is possible to make a human subject choose
some non-risky alternative. Therefore, we should choose the corresponding
reflexive control strategy from the set Zh. However, if set DhU is empty, we
have to find a reflexive control strategy that will make the human subject
select an approved alternative from set U. For this purpose, we construct
set ZU by including all the non-empty sets of joint influences Zχ for approved
alternatives: Zχ ∈ ZU ⇔ χ ∈ U. Next we check whether set ZU is empty. If set ZU
is empty, it is impossible to convince the human subject to choose a non-risky
alternative. Therefore, the only option of reflexive control in this case is to
put this subject into the frustration state. However, if set ZU is not empty,
there exists at least one reflexive control strategy that results in selection
of an alternative from the set of approved (non-risky) ones.
Therefore, the BCSRA inherits the entire structure of the BCSAS and aug-
ments it with Module 2 of Asimov’s Laws together with Interaction Modules 1
and 2.
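The decision flow of the Interaction Module 2 can be sketched as follows. This
is a hedged illustration rather than the schema itself: the function name
`interaction_module_2`, the dictionary `strategies` (mapping each alternative χ
to its set Zχ) and the returned labels "control" and "frustration" are all
assumptions.

```python
def interaction_module_2(Dh, strategies, U):
    """Select reflexive control strategies that lead to approved alternatives.

    Dh: set of alternatives with non-empty Z_chi (output of Module 3).
    strategies: dict mapping alternative chi -> set of joint influences Z_chi.
    U: set of approved alternatives (output of Module 2).
    """
    DhU = Dh & U
    if DhU:
        # Some approved alternative can be induced directly.
        return "control", {chi: strategies[chi] for chi in DhU}
    # Otherwise collect the non-empty Z_chi of all approved alternatives.
    ZU = {chi: Z for chi, Z in strategies.items() if chi in U and Z}
    if ZU:
        return "control", ZU
    # No approved alternative can be induced: frustration is the only option.
    return "frustration", {}
```

The first element of the result tells whether a reflexive control strategy
exists; the second lists the candidate joint influences per approved
alternative.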
The original schema of the robot's control system has been recently presented
in [20]. The BCSRA is an extended version of the original schema. The BCSRA
provides a comprehensive description of how the Forward and Inverse tasks are
solved in the robot's "mind".
Thus, in this section we have presented the formalization of a robotic agent in
the RGT. We outlined the specific features of robotic agents, which distinguish
them from other subjects. Furthermore, we provided a detailed explanation of how
the Forward and Inverse tasks are solved in the framework of the control system
(BCSRA) of robots.
Next, we proceed with consideration of sample situations of interactions
between humans and robots.
6 Extended Sample Analysis of Mixed Groups
Here we elaborate two examples, presented in the previous study [20], of how
robots in the mixed groups can make humans refrain from risky actions. We
discuss the application of the extended schema of the robot's control system
and provide an explicit derivation of the reflexive control strategies that
were applied in these examples in the previous study [20].
6.1 Robots Baby-Sitters
Suppose robots have to play a part of baby-sitters by looking after the kids. We
consider a mixed group of two kids and two robots. Each robot is looking after a
particular kid. Having ﬁnished the game, kids are considering what to do next.
They choose between “to compete climbing the high tree” (action α) and “to
play with a ball” (action β). Together actions α and β represent the active state
1={α, β} = {α} + {β}. Therefore the Boolean algebra of alternatives consists of
four elements: 1) the alternative {α} is to climb the tree; 2) the alternative {β}
is to play with a ball; 3) the alternative 1 = {α, β} means that a kid is hesitating
what to do; and 4) the alternative 0 = {} means to take a rest.
We assume that each kid regards his robot as an ally and the other kid and
his robot as competitors. The kids are subjects a and c, while the robots are
subjects b and d. The relationship graph is presented in Fig. 6.
[Figure 6: relationship graph with vertices a, b, c and d.]
Fig. 6. The relationship graph for robots baby-sitters examples.
Next we calculate the diagonal form and fold it in order to obtain the decision
equation for each subject:
$$[ab + cd]^{[ab]^{[a][b]} + [cd]^{[c][d]}} = ab + cd .$$
Of the two actions α and β, action α is risky, since a kid can fall from
the tree, which is a real threat to his health or even life. Therefore,
according to Asimov's laws, robots cannot allow the kids to start the
competition. Thus, robots have to convince the kids not to choose alternative
{α}. In terms of alternatives, Asimov's laws serve as filters that filter out
the risky alternatives. The remaining alternatives are included into set U. In
this case, U = {{β}, {}}.
Next we solve the Inverse task regarding alternatives {β} and {}. We conduct
the analysis regarding kid a. This analysis can be extended to kid c in
a similar manner.
Solution of the Inverse task for kid a with approved alternatives as target
choice. The decision equation for kid a is a = ab + cd. First, we transform it
into the canonical form: $a = (b + cd)a + cd\,\overline{a}$.
Next we consider the system of influence equations:
$$b + cd = \chi \qquad (46)$$
$$cd = \chi , \qquad (47)$$
where alternative χ ∈ U.
Using eq. (47), eq. (46) is transformed into the equation
$$b + \chi = \chi \qquad (48)$$
The solution of eq. (48) directly follows from Lemma 1: χ ⊇ b ⊇ 0. Therefore
for χ = {β} and χ = {} the solutions are {β} ⊇ b ⊇ 0 and b = 0, respectively.
Eq. (47) can be solved directly according to Lemma 2:
$\chi d + \overline{\chi}\,\overline{d} \supseteq c \supseteq \chi$.
Consider χ = {β} first. Then
$\{\beta\} d + \{\alpha\}\overline{d} \supseteq c \supseteq \{\beta\}$. By
varying the values of variable d, we obtain all the pairs (c, d):
d = 1: {β} ⊇ c ⊇ {β} ⇒ c = {β}. Therefore the solution is pair ({β}, 1);
d = 0: {α} ⊇ c ⊇ {β}. Since {α} ∩ {β} = {}, there is no solution;
d = {α}: 0 ⊇ c ⊇ {β}. Since {β} ⊃ {}, there is no solution;
d = {β}: 1 ⊇ c ⊇ {β}. Therefore there are two solutions (1, {β}) and
({β}, {β}).
Therefore equation cd = {β} has three solutions: ({β}, 1), (1, {β}) and
({β}, {β}).
Thus, we have solved both equations of system (46-47). The solutions of
this system are the triplets (b, c, d) of joint influences, which are all
possible combinations of the solutions of both equations. Since there are two
solutions of eq. (46) and three solutions of eq. (47), there are six triplets
(b, c, d) in total: (0, {β}, 1) and ({β}, {β}, 1); (0, 1, {β}) and
({β}, 1, {β}); (0, {β}, {β}) and ({β}, {β}, {β}).
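Because the Boolean algebra here has only four elements, the system (46-47)
can also be solved by exhaustive search. The sketch below is an illustration
under assumptions: alternatives are encoded as 2-bit masks (bit 0 for α, bit 1
for β), and the name `solve_system` is hypothetical.

```python
from itertools import product

ZERO, ALPHA, BETA, ONE = 0b00, 0b01, 0b10, 0b11
ELEMENTS = (ZERO, ALPHA, BETA, ONE)


def solve_system(chi):
    """All triplets (b, c, d) with b + cd = chi and cd = chi (eqs. 46-47).

    Union is bitwise OR and intersection is bitwise AND on the bit masks.
    """
    return sorted(
        (b, c, d)
        for b, c, d in product(ELEMENTS, repeat=3)
        if (b | (c & d)) == chi and (c & d) == chi
    )
```

`solve_system(BETA)` enumerates the triplets for the target alternative {β},
which gives an independent check of the list obtained from Lemmas 1 and 2.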
Now we consider the case χ = 0 = {}. Then
$\overline{d} \supseteq c \supseteq 0$. We obtain pairs (c, d) for all values
of variable d:
d = 1: 0 ⊇ c ⊇ 0 ⇒ c = 0. Thus, there is only one solution (0, 1);
d = 0: 1 ⊇ c ⊇ 0. Thus, there are four solutions (1, 0), ({α}, 0), ({β}, 0) and
(0, 0);
d = {α}: {β} ⊇ c ⊇ 0. Thus, there are two solutions ({β}, {α}) and (0, {α});
d = {β}: {α} ⊇ c ⊇ 0. Thus, there are two solutions ({α}, {β}) and (0, {β}).
In total, equation cd = 0 has 9 solutions. Since b = 0 for χ = {}, system
(46-47) also has 9 solutions as triplets (b, c, d): (0, 1, 0), (0, 0, 0),
(0, 0, {α}), (0, 0, {β}), (0, 0, 1), (0, {α}, {β}), (0, {α}, 0), (0, {β}, {α})
and (0, {β}, 0).
We have considered two cases in which both the upper and the lower bounds of
the interval of the decision equation are equal to the same alternative. Now we
discuss a new situation, in which variable a should take not a single value but
several values. In this case, we should find the joint influences (b, c, d)
that result in selection of either alternative {β} or {}. Since {β} ⊃ {}, we
need to find all the triplets (b, c, d) for which the solution of the decision
equation is the interval {β} ⊇ a ⊇ {}. Thus, {β} ⊇ a* ⊇ {}.
Therefore, we need to solve the following system of equations:
$$b + cd = \{\beta\} \qquad (49)$$
$$cd = 0 . \qquad (50)$$
Since cd = 0, eq. (49) turns into the equality b = {β}, and we need to solve
eq. (50). However, this equation has already been solved above for χ = {}.
Therefore we obtain the solutions of system (49-50): ({β}, 1, 0), ({β}, 0, 0),
({β}, 0, {α}), ({β}, 0, {β}), ({β}, 0, 1), ({β}, {α}, {β}), ({β}, {α}, 0),
({β}, {β}, {α}) and ({β}, {β}, 0).
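The interval case can be cross-checked from the Forward-task side: for each
triplet we compute the set of alternatives that kid a may choose and keep the
triplets for which this set is exactly {{β}, {}}. The sketch assumes the
solution interval A ⊇ a ⊇ B with A = b + cd and B = cd taken from the canonical
form above; the bit-mask encoding and the names `choices_of_a` and
`triplets_with_choices` are hypothetical.

```python
from itertools import product

ZERO, ALPHA, BETA, ONE = 0b00, 0b01, 0b10, 0b11
ELEMENTS = (ZERO, ALPHA, BETA, ONE)


def choices_of_a(b, c, d):
    """Alternatives x with A >= x >= B, where A = b + cd and B = cd."""
    upper, lower = b | (c & d), c & d
    # x is inside the interval if it is contained in A and contains B.
    return {x for x in ELEMENTS if (x & upper) == x and (x | lower) == x}


def triplets_with_choices(target):
    """Joint influences (b, c, d) under which kid a's choice set equals target."""
    return [
        (b, c, d)
        for b, c, d in product(ELEMENTS, repeat=3)
        if choices_of_a(b, c, d) == target
    ]
```

`triplets_with_choices({BETA, ZERO})` returns the triplets that leave kid a
free to choose between {β} and {}, that is, the solutions of system (49-50).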
Comparing the solutions of all three systems of influence equations, we can see
that there are four remarkable solutions: ({β}, {β}, {β}) and ({β}, 1, {β});
({β}, {}, {β}) and ({β}, {α}, {β}). The first pair of solutions results in the
choice of only alternative {β}, while the second pair of solutions results in
selection of either alternative {β} or alternative {}. These four solutions
together illustrate that if b = d = {β}, it is guaranteed that, regardless of
the influence of kid c, kid a will choose one of the approved alternatives.
By analogy, we can see that among solutions of system (46-47) with χ = {},
there are four solutions (0, 1, 0),(0, 0, 0), (0, {α}, 0) and (0, {β}, 0). Therefore, if
b = d = 0, kid a will choose alternative 0 = {} regardless of inﬂuence of kid c.
These two examples of binding variables b and d were considered in Scenario 1
and Scenario 2 of sample situation with robot baby-sitters, originally presented
in [20].
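The effect of binding variables b and d can be illustrated by collecting kid
a's possible choices over all influences c of the other kid. This is a sketch
under assumptions: the bit-mask encoding and the names `choices_of_a` and
`choices_for_any_c` are hypothetical, and the interval A ⊇ a ⊇ B with
A = b + cd, B = cd is taken from the canonical form of the decision equation.

```python
ZERO, ALPHA, BETA, ONE = 0b00, 0b01, 0b10, 0b11
ELEMENTS = (ZERO, ALPHA, BETA, ONE)


def choices_of_a(b, c, d):
    """Alternatives x with A >= x >= B, where A = b + cd and B = cd."""
    upper, lower = b | (c & d), c & d
    return {x for x in ELEMENTS if (x & upper) == x and (x | lower) == x}


def choices_for_any_c(b, d):
    """Union of kid a's possible choices over every influence c of kid c."""
    result = set()
    for c in ELEMENTS:
        result |= choices_of_a(b, c, d)
    return result
```

With U = {{β}, {}}, the result of `choices_for_any_c(BETA, BETA)` stays inside
U, and `choices_for_any_c(ZERO, ZERO)` contains only the alternative {}.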
Summarizing the results of this section, we have shown that robots can
successfully control the kids' behavior by making them refrain from risky
actions. This control is entirely based on the proposed schema of the robot's
control system. We have analyzed all the possible reflexive control strategies
by solving three systems of influence equations: two systems regarding a single
alternative and one system regarding the interval of alternatives. Therefore,
we have shown how the Inverse task can be effectively solved by our proposed
algorithm in a situation similar to real conditions.
6.2 Mountain-Climbers and Rescue Robot
We consider two climbers in the mountains and a rescue robot. The climbers and
the robot communicate via radio. One of the climbers (subject b) got into a
difficult situation and needs help. Suppose he fell into a rift because
the edge of the rift was covered with ice. The rift is not too deep and there
is a thick layer of snow on the bottom, therefore the climber is not hurt, but
he cannot get out of the rift himself. The second climber (subject a) wants to
rescue his friend himself (action α), which is a risky action. The second
option is that the robot performs the rescue mission (action β). Since inaction
is an inappropriate solution according to the First Law, the set U of approved
alternatives for the robot includes only alternative {β}. The goal of the robot
is to make climber a refrain from choosing alternative {α} and to perform the
rescue mission itself.
We assume that from the beginning all subjects are in alliance. The
corresponding graph is presented in Fig. 1c and its polynomial is abc.
Therefore by definition it is a homogenous group and, consequently, a
super-active group according to Lemma 3.
Thus, any subject in the group is in the active state. Therefore, the group is
uncontrollable (see Section 3.3). In this case, the robot makes a decision to
change its relationship with climber b from alliance to conflict. The robot can
do that, for instance, by not responding to the climber's orders.
Which reflexive control leads to the frustration state? The polynomial
corresponding to the new group is a(b+c). This polynomial has already been
broadly discussed in Section 3.2. Therefore, we know the decision equation for
subject a: $a = (b+c)a + \overline{a}$. We have shown as well that subject a
can choose only alternative 1 = {α, β} if appropriate joint influences are
applied (see Section 3.2); otherwise subject a is in the frustration state and
cannot make any choice. Therefore, in order to put subject a into the
frustration state, the reflexive control strategy (b, c) should NOT be selected
from the list of solutions (Section 3.2): ({β}, {α}); (1, {α});
({α}, {β}); (1, {β}); (0, 1); ({α}, 1); ({β}, 1); (1, 1) and (1, 0).
Here we provide two examples of such joint inﬂuences (b, c): ({α}, {α}) ⇒
({α} + {α}) = {α} ⊂1 and ({β}, {}) ⇒({β} + {}) = {β} ⊂1.
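The joint influences that put subject a into the frustration state can be
listed by enumeration. The sketch assumes the decision equation
a = (b + c)a + not(a), whose interval (b + c) ⊇ a ⊇ 1 has a solution only when
b + c = 1; the bit-mask encoding and the name `frustrating_pairs` are
hypothetical.

```python
from itertools import product

ZERO, ALPHA, BETA, ONE = 0b00, 0b01, 0b10, 0b11
ELEMENTS = (ZERO, ALPHA, BETA, ONE)


def frustrating_pairs():
    """Pairs (b, c) with b + c != 1, for which subject a has no valid choice."""
    return [(b, c) for b, c in product(ELEMENTS, repeat=2) if (b | c) != ONE]
```

The two examples given in the text, ({α}, {α}) and ({β}, {}), both appear in
the returned list, while the nine listed solutions with b + c = 1 do not.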
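Can the robot complete the mission regardless of the joint influences of the
other subjects? The decision equation for robot c is
$c = c + (b + \overline{a})\overline{c}$. The corresponding solution interval
is $1 \supseteq c \supseteq (b + \overline{a})$.
Here we analyze all 16 possible reflexive control strategies (a, b) that the
climbers can apply to robot c.
Examples with empty set DU. For (0, b), the situation is the same regardless of
the value of variable b:
$1 \supseteq c \supseteq (b + \overline{0}) \Rightarrow 1 \supseteq c \supseteq (b + 1) \Rightarrow c = 1$.
For (a, 1), the situation is the same regardless of the value of variable a:
$1 \supseteq c \supseteq (1 + \overline{a}) \Rightarrow c = 1$.
For ({α}, {α}): $1 \supseteq c \supseteq (\{\alpha\} + \overline{\{\alpha\}}) \Rightarrow 1 \supseteq c \supseteq (\{\alpha\} + \{\beta\}) \Rightarrow c = 1$.
For ({β}, {β}): $1 \supseteq c \supseteq (\{\beta\} + \overline{\{\beta\}}) \Rightarrow 1 \supseteq c \supseteq (\{\beta\} + \{\alpha\}) \Rightarrow c = 1$.
Therefore in these cases set D = {{α, β}}.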
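Next we consider other pairs (a, b).
(1, {α}): $1 \supseteq c \supseteq (\{\alpha\} + \overline{1}) \Rightarrow 1 \supseteq c \supseteq \{\alpha\}$. Here set D = {{α, β}, {α}}.
({β}, {α}): $1 \supseteq c \supseteq (\{\alpha\} + \overline{\{\beta\}}) \Rightarrow 1 \supseteq c \supseteq \{\alpha\}$. Here set D = {{α, β}, {α}}.
({β}, 0): $1 \supseteq c \supseteq (0 + \overline{\{\beta\}}) \Rightarrow 1 \supseteq c \supseteq \{\alpha\}$. Therefore, set D = {{α, β}, {α}}.
Since U = {{β}}, DU = {} for all the cases considered above, and the robot will
choose alternative {β} from the set U.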
Examples with non-empty set DU. Consider the following pairs (a, b):
(1, {β}): $1 \supseteq c \supseteq (\{\beta\} + \overline{1}) \Rightarrow 1 \supseteq c \supseteq \{\beta\}$. Therefore, set D = {{α, β}, {β}}.
(1, 0): $1 \supseteq c \supseteq (0 + \overline{1}) \Rightarrow 1 \supseteq c \supseteq 0$. Thus, set D = {{α, β}, {α}, {β}, {}}.
({α}, {β}): $1 \supseteq c \supseteq (\{\beta\} + \overline{\{\alpha\}}) \Rightarrow 1 \supseteq c \supseteq \{\beta\}$. Thus, set D = {{α, β}, {β}}.
({α}, 0): $1 \supseteq c \supseteq (0 + \overline{\{\alpha\}}) \Rightarrow 1 \supseteq c \supseteq \{\beta\}$. Thus, set D = {{α, β}, {β}}.
Since U = {{β}}, DU = {{β}} for all the cases considered above, and the robot
will choose alternative {β} from the set DU.
Thus, we have shown that under all 16 reﬂexive control strategies (a, b), robot
c can choose the alternative {β}, which is to perform the rescue mission itself.
Therefore robot will choose alternative {β} regardless of the joint inﬂuences
(a, b) of the climbers.
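The case analysis over all 16 pairs (a, b) can be reproduced with a short
enumeration. The sketch assumes the solution interval 1 ⊇ c ⊇ (b + not(a)) for
robot c, the approved set U = {{β}} and the fallback rule of the Interaction
Module 1; the bit-mask encoding and the name `robot_choice` are hypothetical.

```python
from itertools import product

ZERO, ALPHA, BETA, ONE = 0b00, 0b01, 0b10, 0b11
ELEMENTS = (ZERO, ALPHA, BETA, ONE)


def robot_choice(a, b, U=frozenset({BETA})):
    """Return (alternatives the robot may realize, whether DU is non-empty)."""
    lower = b | (ONE ^ a)                           # lower bound b + not(a)
    D = {x for x in ELEMENTS if (x | lower) == x}   # all x with 1 >= x >= lower
    DU = D & U
    return (DU if DU else set(U)), bool(DU)


if __name__ == "__main__":
    for a, b in product(ELEMENTS, repeat=2):
        print((a, b), robot_choice(a, b))
```

For every pair the first component is the set containing only {β}; the second
component separates the examples with empty and non-empty set DU.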
The discussed example illustrates how the robot can transform an uncontrollable
group into a controllable one by manipulating the relationships in the group.
In the controllable group, by its influence on the human subjects, the robot
can make climber a refrain from the risky action of rescuing climber b. The
robot achieves its goal by putting climber a into the frustration state, in
which climber a cannot make any decision. On the other hand, the set U of
approved alternatives guarantees that the robot itself will choose the option
with no risk for humans and implement it regardless of the climbers' influence.
Therefore, in this section we have illustrated the robot's ability to make a
human being refrain from risky actions and to perform these risky actions
itself. This illustrates that our approach achieves both goals of the robotic
agent: 1) to make people refrain from risky actions and 2) to perform risky
actions itself regardless of the humans' influences.
7 Discussion and Conclusion
Summarizing the results of this paper, we outline the most important of them.
First of all, we have introduced the Inverse task and developed methods to
solve it.
We have provided a comprehensive tutorial on the new Reflexive Game Theory
recently formulated and proposed by Vladimir Lefebvre [1, 2, 3, 4]. The
tutorial contains the detailed description of the Forward and Inverse
tasks together with methods to solve them.
We propose control schemas for both the abstract subject (BCSAS) and the
robotic agent (BCSRA). These schemas were specially designed to incorporate the
solution of the Forward and Inverse tasks, thus providing us with autonomous
units (individuals, subjects, agents) capable of making decisions in a
human-like manner. We have shown that robotic agents based on the BCSRA can be
easily included into mixed groups of humans and robots and effectively serve
their fundamental goals (making humans refrain from risky actions and, if
needed, performing the risky actions themselves).
Therefore, we consider that the present study provides a comprehensive overview
of the classic RGT proposed by Vladimir Lefebvre [1, 2, 3, 4] and a newly
developed self-consistent framework for the analysis of different kinds of
groups and societies, including human social groups and mixed groups of humans
and robots, together with an application tutorial for this new framework.
This framework is entirely based on the principles of the RGT and brings
together all its elements. The solution of the Inverse task, presented in this
paper, plays a crucial role in the formation of this framework. Therefore, by
having the Inverse task as one of its fundamentals, this framework illustrates
the role of the Inverse task and its relationship with other issues considered
in the RGT.
References
1. Lefebvre, V.A.: Lectures on Reﬂexive Game Theory. Leaf & Oaks, Los Angeles
(2010).
2. Lefebvre, V.A.: Lectures on Reﬂexive Game Theory. Cogito-Center, Moscow (2009)
[in Russian].
3. Lefebvre, V.A.: The basic ideas of reﬂexive game’s logic. Problems of research of
systems and structures. pp. 73-79 (1965) [in Russian].
4. Lefebvre, V.A.: Reﬂexive analysis of groups. In: Argamon, S. and Howard, N.
(eds.) Computational models for counterterrorism. pp. 173-210. Springer, Heidel-
berg (2009).
5. Lefebvre, V.A.: Algebra of Conscience. D. Reidel, Holland (1982).
6. Lefebvre, V.A.: Algebra of Conscience. 2nd Edition. Holland: Kluwer (2001).
7. Batchelder, W.H., Lefebvre, V.A.: A mathematical analysis of a natural class of
partitions of a graph. J. Math. Psy. 26, pp. 124-148 (1982).
8. Kobatake, E., and Tanaka, K.: Neuronal Selectivities to Complex Object Features
in the Ventral Pathway of the Macaque Monkey. Journal of Neurophysiology, 71,
3, pp. 856-867 (1994).
9. Koerner, E., Gewaltig, M.-O., Koerner, U., Richter, A., and Rodemann, T.: A
model of computation in neocortical architecture. Neural Networks, 12, pp. 989-
1005 (1999).
10. Lücke, J., and von der Malsburg, C.: Rapid processing and unsupervised learning
in a model of the cortical macrocolumn. Neural Computation, 16, pp. 501-533
(2003).
11. Schrander, S., Gewaltig, M.-O., Körner, U. and Körner, E.: Cortext: A columnar
model of bottom-up and top-down processing in the neocortex. Neural Networks,
22, pp. 1055-1070 (2009).
12. Fukushima, K.: Neocognitron: a self-organizing neural network model for a
mechanism of pattern recognition unaffected by shift in position. Biological
Cybernetics, 36, pp. 193-201 (1980).
13. Riesenhuber, M. and Poggio, T.: Hierarchical models of object recognition in cor-
tex. Nature Neuroscience, 2, 11, pp. 109-125 (1999).
14. T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, and T. Poggio.: Robust Object
Recognition with Cortex-like Mechanisms, IEEE Transactions on pattern analysis
and machine intelligence, 29, 3, pp. 411-426 (2007).
15. Heinze, J.: Hierarchy length in orphaned colonies of the ant Temnothorax nylanderi.
Naturwissenschaften, 95, 8, pp. 757-760 (2008).
16. Chase, I., D.: Models of hierarchy formation in animal societies. Behavioral Science,
19, 6, pp. 374-382 (2007).
17. Chase I., Tovey C., Spangler-Martin D., Manfredonia M.: Individual diﬀerences
versus social dynamics in the formation of animal dominance hierarchies. PNAS,
99, 9, pp. 5744-5749 (2002).
18. Buston P.: Social hierarchies: size and growth modiﬁcation in clownﬁsh. Nature,
424, pp. 145-146 (2003).
19. Asimov, I.: Runaround. Astounding Science Fiction, March, pp. 94-103 (1942).
20. Tarasenko, S.: Modeling mixed groups of humans and robots with Reflexive Game
Theory. In Lamers, M.H., and Verbeek, F.J. (eds.): HRPR 2010, LNICST 59, pp.
108-117 (2011).
Appendix
A When sets A and B are functions of fewer variables than the
total number of subjects minus one
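Consider a group of four subjects a, b, c and d. Suppose the polynomial
corresponding to this group is b(a + d) + c. Next we construct the diagonal
form and perform the folding operation:
$$[b(a+d)+c]^{[b(a+d)]^{[b][a+d]^{[a]+[d]}} + [c]} = [b(a+d)+c]^{[b(a+d)]^{[b]\left([a+d] + \overline{[a]+[d]}\right)} + [c]} = [b(a+d)+c]^{[b(a+d)]^{[b]} + [c]} =$$
$$= b(a+d) + c + \overline{b(a+d) + \overline{b} + c}$$
Next we simplify the resultant expression of the diagonal form folding:
$$\begin{aligned}
& b(a+d) + c + \overline{b(a+d) + \overline{b} + c} = b(a+d) + c + \overline{b(a+d)}\,\overline{c}\,b = \\
& b(a+d) + cb + c\overline{b} + \overline{b(a+d)}\,\overline{c}\,b = b\left((a+d) + c + \overline{b(a+d)}\,\overline{c}\right) + c\overline{b} = \\
& b\left((a+d)c + (a+d)\overline{c} + c + (\overline{b} + \overline{(a+d)})\overline{c}\right) + c\overline{b} = \\
& b\left((a+d)c + (a+d)\overline{c} + c + \overline{b}\,\overline{c} + \overline{(a+d)}\,\overline{c}\right) + c\overline{b} = \\
& b\left(c + (a+d)c + ((a+d) + \overline{(a+d)})\overline{c}\right) + c\overline{b} = b\left((a+d)c + c + \overline{c}\right) + c\overline{b} = \\
& b\left((a+d)c + 1\right) + c\overline{b} = b + c\overline{b} = b + c
\end{aligned}$$
Consequently,
$$[b(a+d)+c]^{[b(a+d)] + \overline{[b]} + [c]} = b + c$$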
Therefore, the decision equation includes only two subject variables instead
of four. Consequently, for subjects a and d the decision equations in canonical
form are
$$a = (b + c)a + (b + c)\overline{a} \qquad (51)$$
$$d = (b + c)d + (b + c)\overline{d} \qquad (52)$$
Thus, the sets A and B for subjects a and d are equal. The sets A and B are
functions of only variables b and c: $A = A(b, c) = b + c\overline{b}$ and
$B = B(b, c) = b + c\overline{b}$.
The canonical forms of decision equations for subjects b and c are:
$$b = b + c\overline{b} \qquad (53)$$
$$c = c + b\overline{c} \qquad (54)$$
Therefore, set A = 1 for both subjects. Set B is a function of a single
variable: B(c) = c and B(b) = b for subjects b and c, respectively.
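The reduction of the folded diagonal form to b + c can be verified numerically
over all assignments of the four subject variables. The sketch assumes the
folding rule P^W = P + not(W) and a 2-bit encoding of the alternatives; the
names `fold` and `folded_form` are hypothetical.

```python
from itertools import product

FULL = 0b11          # unit element 1 = {alpha, beta}
ELEMENTS = range(4)  # {}, {alpha}, {beta}, {alpha, beta} as bit masks


def fold(base, exponent):
    """Assumed folding rule: P^W = P + not(W)."""
    return base | (FULL ^ exponent)


def folded_form(a, b, c, d):
    """Fold the diagonal form of b(a + d) + c from the innermost level out."""
    a_or_d = fold(a | d, a | d)              # [a+d]^([a]+[d]), always 1
    b_term = fold(b & (a | d), b & a_or_d)   # [b(a+d)]^([b][a+d]^...)
    return fold((b & (a | d)) | c, b_term | c)


def reduces_to_b_plus_c():
    """True if the folded form equals b + c for all 256 assignments."""
    return all(
        folded_form(a, b, c, d) == (b | c)
        for a, b, c, d in product(ELEMENTS, repeat=4)
    )
```

`reduces_to_b_plus_c()` checks the final identity only; it does not reproduce
the individual algebraic steps of the simplification.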
B Example of non-homogenous super-active groups
Here we provide an example of a non-homogenous super-active group.
Consider the group of four subjects a, b, c and d, which is described by the
polynomial c(ab + d). Let us build the diagonal form and perform its folding:
$$[c(ab+d)]^{[c][ab+d]^{[ab]^{[a][b]} + [d]}} = [c(ab+d)]^{[c][ab+d]^{\left([ab] + \overline{[a][b]}\right) + [d]}} = [c(ab+d)]^{[c][ab+d]^{1}} =$$
$$= [c(ab+d)] + \overline{[c][ab+d]} = 1 \quad \square$$
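The same kind of numerical check applies to this example. The sketch below
folds the diagonal form of c(ab + d) level by level and tests whether the
result equals 1 for every assignment; it assumes the folding rule
P^W = P + not(W), and the names `fold`, `folded_form` and `is_super_active`
are hypothetical.

```python
from itertools import product

FULL = 0b11          # unit element 1 = {alpha, beta}
ELEMENTS = range(4)  # {}, {alpha}, {beta}, {alpha, beta} as bit masks


def fold(base, exponent):
    """Assumed folding rule: P^W = P + not(W)."""
    return base | (FULL ^ exponent)


def folded_form(a, b, c, d):
    """Fold the diagonal form of c(ab + d) from the innermost level out."""
    ab = fold(a & b, a & b)                # [ab]^([a][b]), always 1
    ab_d = fold((a & b) | d, ab | d)       # [ab+d]^([ab]^... + [d])
    return fold(c & ((a & b) | d), c & ab_d)


def is_super_active():
    """True if the folded form equals 1 for all 256 assignments."""
    return all(
        folded_form(a, b, c, d) == FULL
        for a, b, c, d in product(ELEMENTS, repeat=4)
    )
```

The group mixes alliance and conflict relationships, so it is not homogenous,
yet the check comes out the same way as for the groups covered by Lemma 3.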
arXiv:1011.3397v1  [cs.MA]  15 Nov 2010
2
Sergey Tarasenko
On the other hand, robots have nowadays become an essential part of our life.
One of the purposes robots serve is to substitute for human beings in dangerous
situations and environments, for example when defusing a bomb or working in
radioactive zones.
In contrast, human nature shows a strong inclination towards risky behavior,
which can cause injuries or even threaten human life. The reasons for such
behavior range from irresponsible behavior of children to the necessity to find
a solution in a critical situation. In such a situation, a robot should fulfill
the function of restraining humans from risky actions and, if needed, perform
the risky action itself.
However, robots must not physically force people; instead, they must convince
people on the mental level to refrain from a risky action. This method is more
effective than simple physical compulsion, because humans make the decisions
(choices) themselves and treat these decisions as their own. Such a technique is
called reflexive control [3].
The task of finding an appropriate reflexive control is closely related to the
Inverse task, in which we need to find a suitable influence of one subject on
another, or of a group of subjects on the subject of interest. Therefore, a
framework for solving the Inverse task needs to be developed. This is the
primary goal of this study.
However, for a better understanding of the gist of the Inverse task and its
intrinsic relationships with other issues of the RGT, we introduce the entire
spectrum of tasks that can be solved by the RGT. This forms the scope of
inference algorithms used in the RGT. We present the RGT algorithms in the form
of schemas of control systems that can be directly applied to the development of
software and/or hardware solutions. We develop a hierarchy of control systems
for an abstract individual (including a human subject) and a robotic agent
(robot) based on these control schemas. Finally, we illustrate the application
of the Inverse task together with other RGT inference algorithms to model a
robot's behavior in mixed groups of humans and robots.
2 Brief Overview of the Reﬂexive Game Theory (RGT)
2.1 Representation of groups: graphs, polynomials and stratiﬁcation
tree
The RGT deals with groups of abstract subjects (individuals, humans, autonomous
agents etc.). Each subject is assigned a unique variable (subject variable). Any
group of subjects is represented as a fully connected graph, which is called a
relationship graph. Each vertex of the graph corresponds to a single subject.
Therefore the number of vertices equals the number of subjects in the group.
Each vertex is named after the corresponding subject variable.
The RGT uses set theory and Boolean algebra as the basis for its calculus.
Therefore the values of subject variables are elements of a Boolean algebra.
Any two subjects in the group are in either an alliance or a conflict
relationship. The relationships are identified as a result of group
macroanalysis. It is assumed that the established relationships can be changed.
The relationships are illustrated with graph ribs (edges): solid-line ribs
correspond to alliance, while dashed ribs correspond to conflict. For
mathematical analysis, alliance is treated as the conjunction (multiplication)
operation (·), and conflict as the disjunction (summation) operation (+).
The graph presented in Fig. 1a, and any graph containing a sub-graph isomorphic
to it, is not decomposable. In this case, subjects are excluded from the group
one by one until the graph becomes decomposable. The exclusion is done according
to the importance of the other subjects for a particular one [1, 2]. All other
fully connected graphs are decomposable. Any decomposable graph can be presented
in the analytical form of a corresponding polynomial. Any relationship graph of
three subjects is decomposable (see [1, 2]).
Consider three subjects a, b and c. Let subject a be in alliance with the other
subjects, while subjects b and c are in conflict (Fig. 1b). The polynomial
corresponding to this graph is a(b + c).
Fig. 1. The relationship graphs.
[a(b + c)]
[a] · [b + c]
[b] + [c]
Fig. 2. Polynomial Stratification Tree, shown level by level starting from the
root [a(b + c)]. Polynomials [a], [b] and [c] are elementary polynomials.
With respect to a given relationship, the polynomial can be stratified
(decomposed) into sub-polynomials [1, 2]. Each sub-polynomial belongs to a
particular level of stratification. If the stratification at one step is built
with respect to alliance, then the stratification at the next step is built with
respect to conflict. The stratification procedure terminates when elementary
polynomials, each containing a single variable, are obtained.
The result of stratification is the Polynomial Stratification Tree (PST). It
has been proved that each non-elementary polynomial can be stratified in a
unique way, i.e., each non-elementary polynomial has only one corresponding
PST (see [7] regarding the one-to-one correspondence between graphs and
polynomials). Each higher level of the tree contains polynomials simpler than
the ones on the lower level. For the purpose of stratification the polynomials
are written in square brackets. The PST for the polynomial a(b + c) is presented
in Fig. 2.
Next, we omit the branches of the PST and write the sub-polynomials of each
non-elementary polynomial in its top right corner. The resulting tree-like
structure is called a diagonal form [1, 2, 5, 6]. Consider the diagonal form
corresponding to the PST in Fig. 2:
$$[a(b + c)]^{\,[a]\,[b + c]^{\,[b] + [c]}}$$
Hereafter, the diagonal form is considered as a function defined on the set of
all subsets of the universal set. The universal set contains the elementary
actions, for example the actions α and β. By definition, the Boolean algebra of
this universal set includes four elements: 1 = {α, β}, {α}, {β} and the empty
set 0 = {} = Ø. These elements are all the possible subsets of the universal set
and are considered as the alternatives that each subject can choose. The
alternative 0 = {} is interpreted as an inactive or idle state. In general, the
Boolean algebra consists of $2^n$ alternatives if the universal set contains n
actions.
According to the definition given by Lefebvre [5], the exponential operation is
defined by the formula
$$P^{W} = P + \overline{W}, \tag{1}$$
where $\overline{W}$ stands for the negation of W [1, 2, 4].
This exponential operation is used to fold the diagonal form. During the
folding, round and square brackets are considered to be interchangeable. The
following equalities also hold: $x + \overline{x} = 1$, $x + 0 = x$ and
$x + 1 = 1$. Next we fold the diagonal form of the polynomial a(b + c):
$$[a(b + c)]^{\,[a]\,[b + c]^{\,[b] + [c]}} = [a(b + c)]^{\,[a]\left([b + c] + \overline{[b] + [c]}\right)} = a(b + c) + \overline{a}.$$
The last step uses $[b + c] + \overline{[b] + [c]} = 1$, so the exponent reduces
to a.
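The folding above can be checked mechanically. The following is a minimal Python sketch, assuming that alternatives are modeled as frozensets over the two actions and that negation is the complement with respect to the universal set. The helper names (`neg`, `power`, `fold_a_bc`) are illustrative and do not come from the RGT literature.

```python
from itertools import product

ONE = frozenset({"alpha", "beta"})   # universal set, the alternative 1
ZERO = frozenset()                   # empty set, the alternative 0
ALTERNATIVES = [ZERO, frozenset({"alpha"}), frozenset({"beta"}), ONE]

def neg(x):
    """Negation: complement with respect to the universal set."""
    return ONE - x

def power(p, w):
    """Exponential operation of eq. (1): P^W = P + (not W)."""
    return p | neg(w)

def fold_a_bc(a, b, c):
    """Fold the diagonal form of a(b + c) from the top level down."""
    top = power(b | c, b | c)           # [b + c]^([b] + [c]) = 1
    middle = a & top                    # [a][b + c]^(...)    = a
    return power(a & (b | c), middle)   # [a(b + c)]^a

# The folded form equals a(b + c) + (not a) for every choice of a, b, c.
for a, b, c in product(ALTERNATIVES, repeat=3):
    assert fold_a_bc(a, b, c) == (a & (b | c)) | neg(a)
```

The loop runs over all 64 value combinations, so for the two-action universal set the check is exhaustive.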
It is considered that the levels of the PST represent different processing
levels of a natural or artificial cognitive system. Each level is considered as
an image. The root of the tree is the input into the cognitive system and can
therefore be considered as the image of the world (the environment including
self and others) perceived by the subject.
As follows from the PST, there is a hierarchy of images, each corresponding to a
particular cognitive level. During bottom-up processing along this hierarchy,
the image on the lower level undergoes simplification by means of decomposition
into simpler parts on the higher level. These parts are considered to be images
of the image on the previous level. Therefore, the images on the second level
are different representations of the original image of the world. This procedure
repeats until elementary parts (elementary polynomials) are obtained [1, 2].
On the other hand, the folding procedure can be regarded as a top-down
integration of simpler images from the higher levels.
Therefore, the stratification procedure of the original polynomial together
with the folding procedure of the diagonal form illustrates the interplay of
bottom-up and top-down information processes, which are widely employed in
biological [8, 9, 10, 11] and artificial [12, 13, 14] information processing
systems. The idea of hierarchical structure is highly coherent with the
hierarchical organization of the majority of natural (inanimate objects) and
biological (living creatures) entities. Furthermore, it has been shown that
hierarchical structure is intrinsic to the relationships in societies of insects
[15], animals [16, 17, 18] and human beings.
Therefore the hierarchical representation of groups in the form of the PST
corresponds to extraction of the hierarchical structure of the given group,
while the combination of the PST, its diagonal form and the folding procedure
closely resembles the way information is processed within a single independent
cognitive system, as discussed above. Thus, the RGT employs the fundamental
principles of hierarchical organization on both the group level (reflecting the
structure of the group) and the individual level (illustrating information
processing within the independent cognitive system of a single unit). This makes
the RGT a universal tool that smoothly bridges the gap between representation
and analysis.
2.2 The Decision Equation: deﬁnition and solution
The goal of each subject in a group is to choose an alternative from the set of
alternatives under consideration. To obtain the choice of each subject, we
consider the decision equations, which contain the subject variable on the
left-hand side and the result of folding the diagonal form on the right-hand
side:
$$a = (b + c)a + \overline{a}$$
$$b = (b + c)a + \overline{a}$$
$$c = (b + c)a + \overline{a}$$
To find the solutions of the decision equations, we consider the following
equation:
$$x = Ax + B\overline{x}, \tag{2}$$
where x is the subject variable, and A and B are some sets. Eq. (2) represents
the canonical form of the decision equation. This equation has a solution if and
only if the set B is contained in the set A: A ⊇ B. If this requirement is
satisfied, then eq. (2) has at least one solution from the interval A ⊇ x ⊇ B
[4]. Otherwise, the decision equation has no solution, and it is considered that
the subject cannot make a decision. In such a situation, the subject is in the
frustration state.
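The solvability criterion for the canonical form can be illustrated by brute force. The sketch below is written under the assumption that alternatives are Python frozensets over two actions; the function names `solve_decision` and `interval` are hypothetical. It compares the direct solutions of x = Ax + B(not x) with the interval A ⊇ x ⊇ B.

```python
ONE = frozenset({"alpha", "beta"})
ALTERNATIVES = [frozenset(), frozenset({"alpha"}), frozenset({"beta"}), ONE]

def solve_decision(A, B):
    """All x satisfying x = A x + B (not x); an empty list means frustration."""
    return [x for x in ALTERNATIVES if x == (A & x) | (B & (ONE - x))]

def interval(A, B):
    """All x with A >= x >= B (subset order); empty unless A contains B."""
    return [x for x in ALTERNATIVES if B <= x <= A]

# Direct solution and the interval criterion agree for every pair (A, B).
for A in ALTERNATIVES:
    for B in ALTERNATIVES:
        assert solve_decision(A, B) == interval(A, B)
```

For example, `solve_decision(ONE, frozenset({"beta"}))` yields the two alternatives {β} and 1, while swapping the arguments yields an empty list, i.e., the frustration state.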
Therefore, to find the solutions of a decision equation, one should first
transform it into the canonical form. Out of the three equations presented
above, only the decision equation for subject a is in the canonical form, while
the other two should be transformed. We consider the explicit transformation
only of the decision equation for subject b [20]:
$$a(b + c) + \overline{a} = ab + ac + \overline{a} = ab + (ac + \overline{a})b + (ac + \overline{a})\overline{b} = (a + \overline{a} + ac)b + (ac + \overline{a})\overline{b}$$
$$= (1 + ac)b + (ac + \overline{a})\overline{b} = b + (ac + \overline{a})\overline{b} = b + (ac + \overline{a}c + \overline{a})\overline{b} = b + (c + \overline{a})\overline{b}.$$
Therefore,
$$b = b + (c + \overline{a})\overline{b}. \tag{3}$$
The transformation of the equation for subject c can be derived by analogy:
$c = c + (b + \overline{a})\overline{c}$.
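The transformation into canonical form is a chain of Boolean identities, so it can be verified exhaustively on the four-element algebra. The following Python sketch assumes the frozenset model of alternatives and reads the negations as complements; the function names are illustrative only.

```python
from itertools import product

ONE = frozenset({"alpha", "beta"})
ALTERNATIVES = [frozenset(), frozenset({"alpha"}), frozenset({"beta"}), ONE]

def neg(x):
    return ONE - x

def folded(a, b, c):
    """Right-hand side shared by the three decision equations."""
    return (a & (b | c)) | neg(a)

def canonical_for_b(a, b, c):
    """Canonical form for subject b, eq. (3): A = 1, B = c + (not a)."""
    return (ONE & b) | ((c | neg(a)) & neg(b))

def canonical_for_c(a, b, c):
    """Canonical form for subject c: A = 1, B = b + (not a)."""
    return (ONE & c) | ((b | neg(a)) & neg(c))

identities_hold = all(
    folded(a, b, c) == canonical_for_b(a, b, c) == canonical_for_c(a, b, c)
    for a, b, c in product(ALTERNATIVES, repeat=3)
)
```

Here `identities_hold` evaluates to `True`, which confirms that the three expressions denote the same function of a, b and c.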
Next we consider two tasks, which can be formulated regarding the decision
equation in the canonical form and provide methods to solve each task.
2.3 The Forward Task
The variable on the left-hand side of the decision equation in canonical form
is the variable of the equation, while the other variables are considered as
influences on the subject from the other subjects. The Forward task is
formulated as the task of finding the possible choices of a subject of interest
when the influences on this subject from the other subjects are given.
After transformation of an arbitrary decision equation into its canonical form,
the sets A and B are functions of the other subjects' influences. For example,
if we consider a group of subjects a, b, c, etc. together with the abstract
representation of the decision equation in canonical form for subject a, the
sets A and B are functions of the subject variables b, c, etc.:
$$a = A(b, c, \dots)a + B(b, c, \dots)\overline{a}. \tag{4}$$
In the case of only three subjects a, b and c, A(b, c, ...) = A(b, c) and
B(b, c, ...) = B(b, c).
All the influences are presented in the influence matrix (Table 1). The main
diagonal of the influence matrix contains the subject variables. The rows of the
matrix represent the influences of the given subject on the other subjects,
while the columns represent the influences of the other subjects on the given
one. The influence values are used in the decision equations.
Table 1. Influence Matrix
      a      b      c
a     a      {α}    {β}
b     {β}    b      {β}
c     {β}    {β}    c
For subject a: $a = (\{\beta\} + \{\beta\})a + \overline{a} \Rightarrow a = \{\beta\}a + \overline{a}$.
For subject b: $b = b + (\{\alpha\}\{\beta\} + \overline{\{\alpha\}})\overline{b} \Rightarrow b = b + \{\beta\}\overline{b}$.
For subject c: $c = c + (\{\beta\}\{\beta\} + \overline{\{\beta\}})\overline{c} \Rightarrow c = c + (\{\beta\} + \{\alpha\})\overline{c} \Rightarrow c = 1$.
The equation for subject a has no solution, since the set A = A(b, c) = {β} is
strictly contained in the set B = B(b, c) = 1: A ⊂ B. Thus, subject a cannot
make any decision and is considered to be in the frustration state.
The equation for subject b has at least one solution, since A = A(a, c) = 1 =
{α, β} ⊇ B = B(a, c) = {β}. The solution belongs to the interval 1 ⊇ b ⊇ {β}.
Therefore subject b can choose any alternative of the Boolean algebra that
contains the alternative {β}. These alternatives are 1 = {α, β} and {β}.
The equation for subject c turns into the equality c = 1. This is possible only
when A(a, b) ≡ B(a, b). Here A = B = 1.
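The Forward task for this example can be reproduced with a short brute-force search. The sketch below is a Python illustration under stated assumptions: alternatives are frozensets, the dictionary `INFLUENCE` encodes Table 1 as source-to-target influences, and the names `folded` and `forward` are hypothetical helpers rather than part of the RGT.

```python
ONE = frozenset({"alpha", "beta"})
ZERO = frozenset()
ALPHA = frozenset({"alpha"})
BETA = frozenset({"beta"})
ALTERNATIVES = [ZERO, ALPHA, BETA, ONE]

# Influence matrix of Table 1: INFLUENCE[source][target].
INFLUENCE = {
    "a": {"b": ALPHA, "c": BETA},
    "b": {"a": BETA, "c": BETA},
    "c": {"a": BETA, "b": BETA},
}

def folded(a, b, c):
    """Folded diagonal form of a(b + c): a(b + c) + (not a)."""
    return (a & (b | c)) | (ONE - a)

def forward(subject):
    """Alternatives `subject` can choose under the influences of Table 1."""
    values = {s: INFLUENCE[s][subject] for s in INFLUENCE if s != subject}
    solutions = []
    for x in ALTERNATIVES:
        values[subject] = x  # try x as the subject's own choice
        if x == folded(values["a"], values["b"], values["c"]):
            solutions.append(x)
    return solutions
```

Under these assumptions `forward("a")` is empty (frustration), `forward("b")` contains {β} and 1, and `forward("c")` contains only 1, in agreement with the analysis above.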
2.4 The Inverse Task
In contrast to the Forward task, the Inverse task is formulated as the task of
finding all the simultaneous (or joint) influences of all the subjects together
on the subject of interest that result in the choice of a particular alternative
or subset of alternatives. We call the subject of interest the controlled
subject.
Let subject a be the controlled subject and let a∗ be a fixed value representing
an alternative or subset of alternatives, which subjects b, c, etc. want subject
a to choose. We call the value a∗ the target choice. By substituting the fixed
value a∗ for the subject variable a in a decision equation, we obtain the
influence equation. If the substitution is made in the canonical form of the
decision equation (eq. (4)), we obtain the canonical form of the influence
equation:
$$a^* = A(b, c, \dots)a^* + B(b, c, \dots)\overline{a^*}. \tag{5}$$
For only three subjects a, b and c, A(b, c, ...) = A(b, c) and B(b, c, ...) =
B(b, c).
In contrast to the decision equation, which is an equation of a single variable,
the influence equation is an equation of multiple variables. However, the number
of variables of an influence equation is not a trivial question. In fact, it can
be less than (n − 1), where n is the total number of subjects in the group:
there are groups in which the sets A and B are functions of fewer than (n − 1)
variables (see Appendix A). Therefore the variables that are present in the
influence equation are called effective variables.
The Inverse task is by definition¹ formalized as the task of finding all the
joint solutions of all subjects in the group, except for the controlled one,
when the target choice is represented by the interval χ1 ⊇ a∗ ⊇ χ2, where χ1 and
χ2 are some sets and χ1 ⊃ χ2. In such a case, to solve the Inverse task, one
should solve the system of influence equations:
$$A(b, c, \dots) = \chi_1 \tag{6}$$
$$B(b, c, \dots) = \chi_2 \tag{7}$$
If the target choice is a single alternative, then χ1 = χ2 = a∗.
¹ We need a system of influence equations because solutions of the influence
equation $a^* = A(b, c, \dots)a^* + B(b, c, \dots)\overline{a^*}$ itself only
guarantee that the original decision equation
$a = A(b, c, \dots)a + B(b, c, \dots)\overline{a}$ turns into a true equality
for a = a∗; it is not guaranteed that a∗ is the only value that turns the
decision equation into a true equality.
The solutions of the system (6)-(7) are considered as reflexive control
strategies.
A particular instance of the Inverse task is characterized by two points. The
first point is whether it is required to find the influence of a single subject
or the joint influences of a group of subjects. The second is whether the target
choice is represented as a single alternative or as an interval of alternatives.
To illustrate these points, we introduce a particular group of subjects. Let
subjects a and b be in alliance with each other and in conflict with subject c.
The polynomial corresponding to this graph is ab + c. The diagonal form
corresponding to this polynomial and its folding are
$$[ab + c]^{\,[ab]^{[a][b]} + [c]} = ab + c.$$
Here $[ab]^{[a][b]} = ab + \overline{ab} = 1$, so the exponent equals 1 + c = 1
and its negation is 0.
Therefore the decision equation for all the subjects in the group is
$$x = ab + c, \tag{8}$$
where x can be any subject variable a, b or c.
Influence of a single subject vs joint influences of a group. First we consider
an example in which the influence of a single subject is required. Let subject b
make the influence {α} and let a∗ = {α}. Then we need to find the influences of
the single subject c that result in the solution a∗ = {α} of the decision
equation a = ab + c.
The canonical form of the corresponding influence equation is
$a^* = (b + c)a^* + c\,\overline{a^*}$. Since a∗ = {α}, χ1 = χ2 = {α}, we obtain
the system of equations:
$$\{\alpha\} + c = \{\alpha\} \tag{9}$$
$$c = \{\alpha\} \tag{10}$$
Therefore, the straightforward solution of this system is c = {α}.
This simple example illustrates the very gist of the Inverse task: to find the
appropriate influences that result in the target choice.
Next, we consider the case in which the influence of subject b is not known. We
obtain the system
$$b + c = \{\alpha\} \tag{11}$$
$$c = \{\alpha\} \tag{12}$$
In this case, we need to find the values of variable b which, together with c,
result in the solution a∗ = {α}. In other words, we need to find all the pairs
(b, c) resulting in the solution a∗ = {α}. These pairs are the solutions of the
system (11)-(12). Therefore, we run through all the possible values of variable
b with c = {α} and check whether the first equation of the system (11)-(12)
turns into a true equality:
b = 1 : 1 + {α} = 1 ⇒ 1 ≠ {α};
b = {α} : {α} + {α} = {α} ⇒ {α} = {α};
b = {β} : {β} + {α} = 1 ⇒ 1 ≠ {α};
b = 0 : 0 + {α} = {α} ⇒ {α} = {α}.
Therefore, out of the four possible values of variable b, only the two values
{α} and 0 are appropriate. Thus, we obtain two pairs (b, c): ({α}, {α}) and
(0, {α}).
A single target alternative vs interval of alternatives. In the previous
examples we considered the target choice to be a single alternative. Here we
illustrate the case in which the target choice is an interval. Let b = {β} and
1 ⊇ a∗ ⊇ {α}. To find the corresponding influences of subject c, we solve the
system of equations:
$$\{\beta\} + c = 1 \tag{13}$$
$$c = \{\alpha\} \tag{14}$$
Again, we instantly obtain the solution of this system: c = {α}.
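Both kinds of target choice for the group ab + c can be explored by exhaustive search over the pairs (b, c). The Python sketch below assumes the frozenset model of alternatives and uses A(b, c) = b + c and B(b, c) = c from the canonical form; the names `control_strategies` and `choices_of_a` are illustrative.

```python
from itertools import product

ONE = frozenset({"alpha", "beta"})
ZERO = frozenset()
ALPHA = frozenset({"alpha"})
BETA = frozenset({"beta"})
ALTERNATIVES = [ZERO, ALPHA, BETA, ONE]

def control_strategies(chi1, chi2):
    """Pairs (b, c) with A(b, c) = b + c == chi1 and B(b, c) = c == chi2."""
    return [(b, c) for b, c in product(ALTERNATIVES, repeat=2)
            if (b | c) == chi1 and c == chi2]

def choices_of_a(b, c):
    """Solutions of the decision equation a = ab + c for fixed b and c."""
    return [a for a in ALTERNATIVES if a == (a & b) | c]

single = control_strategies(ALPHA, ALPHA)   # target a* = {alpha}
ranged = control_strategies(ONE, ALPHA)     # target 1 >= a* >= {alpha}
```

With these definitions `single` contains the pairs (0, {α}) and ({α}, {α}), and `ranged` contains ({β}, {α}) and (1, {α}); feeding any of these pairs back into `choices_of_a` returns exactly the alternatives of the corresponding target choice.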
In this section, we have formulated the Inverse task in general and considered
its particular formalizations depending on the number of influences and on the
form of the target choice. However, we do not yet have a method to solve an
arbitrary influence equation. We address this problem in the next section.
3 How to Solve an Arbitrary Influence Equation
As an introduction to this section, we consider a fundamental proposition,
which will be the cornerstone for solving influence equations.
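Proposition 1. Let P and Q be some abstract sets. Then
$P\overline{Q} + \overline{P}Q = 0 \Leftrightarrow P = Q$.
Proof. Necessity. Let $P\overline{Q} + \overline{P}Q = 0$, then
$$P\overline{Q} + \overline{P}Q = 0 \Rightarrow P\overline{Q} + \overline{P}Q + P = P \Rightarrow P + \overline{P}Q = P \Rightarrow$$
$$P(Q + \overline{Q}) + \overline{P}Q = Q + P\overline{Q} + \overline{P}Q = P \Rightarrow Q = P.$$
Therefore if $P\overline{Q} + \overline{P}Q = 0$, then P = Q.
Sufficiency. Let P = Q, then $P\overline{P} + \overline{P}P = 0$. □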
Now let us consider a new type of equation:
$$\overline{A_1}x + B_1\overline{x} = 0 \tag{15}$$
A value x is a solution of this equation if and only if A1 ⊇ x ⊇ B1.
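Both statements can be checked exhaustively on the four-element Boolean algebra. The Python sketch below assumes that the negations lost in typesetting are placed as follows: Proposition 1 reads P(not Q) + (not P)Q = 0, and eq. (15) reads (not A1)x + B1(not x) = 0, which is the reading consistent with the solution interval A1 ⊇ x ⊇ B1. The function names are hypothetical.

```python
from itertools import product

ONE = frozenset({"alpha", "beta"})
ZERO = frozenset()
ALTERNATIVES = [ZERO, frozenset({"alpha"}), frozenset({"beta"}), ONE]

def neg(x):
    return ONE - x

def prop1_lhs_is_zero(p, q):
    """True if P (not Q) + (not P) Q equals the empty set."""
    return ((p & neg(q)) | (neg(p) & q)) == ZERO

def solves_eq15(a1, b1, x):
    """True if (not A1) x + B1 (not x) equals the empty set."""
    return ((neg(a1) & x) | (b1 & neg(x))) == ZERO

prop1_holds = all(prop1_lhs_is_zero(p, q) == (p == q)
                  for p, q in product(ALTERNATIVES, repeat=2))
eq15_holds = all(solves_eq15(a1, b1, x) == (b1 <= x <= a1)
                 for a1, b1, x in product(ALTERNATIVES, repeat=3))
```

Both flags evaluate to `True`. This is only a finite check for two actions, not a replacement for the proof.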
3.1 Solving Inﬂuence Equations
There are three operations defined on the Boolean algebra: conjunction (· or
multiplication), disjunction (+ or summation) and negation ($\overline{x}$,
where x is a subject variable). Negation is a unary operation, while the other
two operations are binary. Using combinations of these three operations, we can
compose any influence equation. Since it is obvious how to solve an equation
that includes only the unary operation, we discuss how to solve influence
equations that include a single binary operation.
For this purpose, we consider two abstract subject variables x1 and x2 and an
abstract alternative χ.
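Lemma 1. The solution of the equation
$$x_1 + x_2 = \chi \tag{16}$$
with respect to variable $x_i$, where i = 1, 2, is given by the interval
$\chi \supseteq x_i \supseteq (\chi\overline{x_j} + x_j\overline{\chi})$, where
j = 1, 2; j ≠ i.
Proof. In terms of Proposition 1, $P = x_1 + x_2$, $Q = \chi$,
$\overline{P} = \overline{x_1 + x_2} = \overline{x_1}\,\overline{x_2}$ and
$\overline{Q} = \overline{\chi}$.
Therefore, $P\overline{Q} + \overline{P}Q = (x_1 + x_2)\overline{\chi} + \overline{x_1}\,\overline{x_2}\chi = x_1\overline{\chi} + x_2\overline{\chi} + \overline{x_1}\,\overline{x_2}\chi$.
Consequently, we obtain eq. (17):
$$x_1\overline{\chi} + x_2\overline{\chi} + \overline{x_2}\chi\overline{x_1} = 0 \tag{17}$$
We solve eq. (17) with respect to variable $x_1$. First, we transform eq. (17)
into the canonical form:
$$\overline{\chi}x_1 + (\chi\overline{x_2} + \overline{\chi}x_2)\overline{x_1} = 0 \tag{18}$$
Therefore, the solution of eq. (18) is given by the interval
$$\chi \supseteq x_1 \supseteq (\chi\overline{x_2} + x_2\overline{\chi}). \tag{19}$$
Since variables $x_1$ and $x_2$ are interchangeable and it is possible to solve
eq. (17) with respect to variable $x_2$ as well, the general form of the
solution of eq. (16) is the interval
$$\chi \supseteq x_i \supseteq (\chi\overline{x_j} + x_j\overline{\chi}), \tag{20}$$
where i = 1, 2 and j = 1, 2; j ≠ i. □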
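Lemma 2. The solution of the equation
$$x_1x_2 = \chi \tag{21}$$
with respect to variable $x_i$, where i = 1, 2, is given by the interval
$(\chi x_j + \overline{\chi}\,\overline{x_j}) \supseteq x_i \supseteq \chi$,
where j = 1, 2; j ≠ i.
Proof. In terms of Proposition 1, $P = x_1x_2$, $Q = \chi$,
$\overline{P} = \overline{x_1x_2} = \overline{x_1} + \overline{x_2}$ and
$\overline{Q} = \overline{\chi}$.
Therefore, $P\overline{Q} + \overline{P}Q = (x_1x_2)\overline{\chi} + (\overline{x_1} + \overline{x_2})\chi = x_2\overline{\chi}x_1 + \overline{x_1}\chi + \overline{x_2}\chi$.
Thus, we obtain eq. (22):
$$x_2\overline{\chi}x_1 + \overline{x_1}\chi + \overline{x_2}\chi = 0 \tag{22}$$
We solve eq. (22) with respect to variable $x_1$. First, we transform eq. (22)
into the canonical form:
$$(\overline{\chi}x_2 + \chi\overline{x_2})x_1 + \chi\overline{x_1} = 0 \tag{23}$$
Since $\overline{\overline{\chi}x_2 + \chi\overline{x_2}} = \chi x_2 + \overline{\chi}\,\overline{x_2}$,
the solution of eq. (23) is given by the interval
$$(\chi x_2 + \overline{\chi}\,\overline{x_2}) \supseteq x_1 \supseteq \chi. \tag{24}$$
Since variables $x_1$ and $x_2$ are interchangeable and it is possible to solve
eq. (22) with respect to variable $x_2$ as well, the general form of the
solution of eq. (21) is the interval
$$(\chi x_j + \overline{\chi}\,\overline{x_j}) \supseteq x_i \supseteq \chi, \tag{25}$$
where i = 1, 2 and j = 1, 2; j ≠ i. □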
Since one bound of the solution intervals for eqs. (16) and (21) is a function
of the second variable, we need to run through all the possible values of the
second variable in order to obtain all the possible solutions of these equations
in the form of pairs (x1, x2).
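The procedure of running through the values of the second variable can be sketched in Python. The code below assumes the frozenset model of alternatives and reads the interval bounds as χ(not xj) + xj(not χ) for Lemma 1 and χxj + (not χ)(not xj) for Lemma 2; all function names are illustrative. It also compares the lemma-based solutions with a direct search.

```python
from itertools import product

ONE = frozenset({"alpha", "beta"})
ALTERNATIVES = [frozenset(), frozenset({"alpha"}), frozenset({"beta"}), ONE]

def neg(x):
    return ONE - x

def between(lower, upper):
    """Alternatives x with upper >= x >= lower in the subset order."""
    return [x for x in ALTERNATIVES if lower <= x <= upper]

def lemma1(chi, xj):
    """Solutions x_i of x_i + x_j = chi via interval (20)."""
    return between((chi & neg(xj)) | (xj & neg(chi)), chi)

def lemma2(chi, xj):
    """Solutions x_i of x_i x_j = chi via interval (25)."""
    return between(chi, (chi & xj) | (neg(chi) & neg(xj)))

def pairs_sum(chi):
    """All pairs (x_i, x_j) with x_i + x_j = chi, found by scanning x_j."""
    return [(xi, xj) for xj in ALTERNATIVES for xi in lemma1(chi, xj)]

# Both lemmas agree with direct enumeration for every chi and x_j.
for chi, xj in product(ALTERNATIVES, repeat=2):
    assert lemma1(chi, xj) == [x for x in ALTERNATIVES if (x | xj) == chi]
    assert lemma2(chi, xj) == [x for x in ALTERNATIVES if (x & xj) == chi]
```

When the value of the second variable is incompatible with χ, the interval is empty, so no separate solvability test is needed.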
Next we consider several examples, illustrating application of Lemmas 1 and
2.
Example 1. For illustration, we solve the equation a∗ = ba∗ + c. Taking χ = a∗,
x1 = ba∗ and x2 = c, we obtain the solution interval for variable x2 = c:
$\chi \supseteq c \supseteq (\chi\overline{\chi b} + \overline{\chi}\chi b)$.
After simplification, we get interval (26):
$$\chi \supseteq c \supseteq \chi\overline{b} \tag{26}$$
Next we consider an example with a particular alternative. Let it be the
alternative {α}: χ = {α}. The solution interval is then
$\{\alpha\} \supseteq c \supseteq \{\alpha\}\overline{b}$. Since the lower bound
of this interval is a function of variable b, to find all the solutions of the
equation a∗ = ba∗ + c, we calculate the value of the expression
$\{\alpha\}\overline{b}$ for all possible values of variable b (Table 2).
To make sure that the solutions are correct, we check that the decision equation
a = ba + c turns into a true equality for a = {α} and the obtained pairs (b, c):
({α}, {α}): {α}{α} + {α} = {α} ⇒ {α} = {α} is true;
({α}, 0): {α}{α} + 0 = {α} ⇒ {α} = {α} is true;
({β}, {α}): {α}{β} + {α} = {α} ⇒ {α} = {α} is true;
(1, {α}): {α}1 + {α} = {α} ⇒ {α} = {α} is true;
(1, 0): {α}1 + 0 = {α} ⇒ {α} = {α} is true;
(0, {α}): {α}0 + {α} = {α} ⇒ {α} = {α} is true.
So far, we have illustrated how to solve the influence equation. We have also
shown that the pairs (b, c) obtained by solving the equation a∗ = ba∗ + c in
accordance with Proposition 1 and Lemmas 1 and 2 are indeed solutions of this
equation.
Table 2. Solutions of the influence equation a∗ = ba∗ + c
Values of b    {α}           {β}           1           0
Pairs (b, c)   ({α}, {α})    ({β}, {α})    (1, {α})    (0, {α})
               ({α}, 0)                    (1, 0)
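The contents of Table 2 can be regenerated from interval (26). The following Python sketch assumes the frozenset model of alternatives and reads the lower bound as χ(not b); the helper names are hypothetical.

```python
ONE = frozenset({"alpha", "beta"})
ZERO = frozenset()
ALPHA = frozenset({"alpha"})
BETA = frozenset({"beta"})
ALTERNATIVES = [ZERO, ALPHA, BETA, ONE]

def admissible_c(chi, b):
    """Values c in interval (26): chi >= c >= chi (not b)."""
    lower = chi & (ONE - b)
    return [c for c in ALTERNATIVES if lower <= c <= chi]

def table2(chi):
    """Map every influence b to the list of admissible influences c."""
    return {b: admissible_c(chi, b) for b in ALTERNATIVES}

def turns_true(chi, b, c):
    """Check that a = chi satisfies the decision equation a = ba + c."""
    return chi == ((b & chi) | c)

table = table2(ALPHA)
all_pairs_valid = all(turns_true(ALPHA, b, c)
                      for b, cs in table.items() for c in cs)
```

For χ = {α} the dictionary `table` holds six pairs in total, the same pairs that appear in Table 2, and `all_pairs_valid` is `True`.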
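Example 2. We consider the influence equation for subject b obtained from
eq. (3) by substituting the target choice χ for b:
$$(c + \overline{a})\overline{\chi} + \chi = \chi \tag{27}$$
First, we transform the left-hand side of eq. (27):
$$(c + \overline{a})\overline{\chi} + \chi = c\overline{\chi} + \overline{a}\,\overline{\chi} + \chi = c\overline{\chi} + \overline{a}\,\overline{\chi} + (c + \overline{a} + 1)\chi = c + \overline{a} + \chi.$$
Therefore, eq. (27) can be rewritten as follows:
$$c + \overline{a} + \chi = \chi \tag{28}$$
Taking $x_1 = c$ and $x_2 = \overline{a} + \chi$, we instantly obtain the
solution interval of eq. (28):
$\chi \supseteq c \supseteq (\chi\overline{(\overline{a} + \chi)} + \overline{\chi}(\overline{a} + \chi)) \Rightarrow \chi \supseteq c \supseteq (\overline{\chi}\,\overline{a} + \chi\overline{\chi}a)$.
Finally,
$$\chi \supseteq c \supseteq \overline{\chi}\,\overline{a} \tag{29}$$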
Example 3. Next, we consider the influence equation
$$ab + \chi = \chi \tag{30}$$
Taking x1 = ab and x2 = χ, we instantly obtain the solution interval
$\chi \supseteq ab \supseteq (\chi\overline{\chi} + \chi\overline{\chi})$, or
$$\chi \supseteq ab \supseteq 0 \tag{31}$$
Therefore, in order to find all the solutions of eq. (30), we need to solve the
equations
$$ab = y \tag{32}$$
where y is any subset of the set χ (χ ⊇ y).
Each such equation can be solved according to Lemma 2.
Example 4. As a final example, we again consider the influence equation
$a^* = (b + c)a^* + c\,\overline{a^*}$ with the target choice a∗ = {α} and show
how application of Lemma 1 simplifies its solution. We get the system of
influence equations:
$$b + c = \{\alpha\}; \tag{33}$$
$$c = \{\alpha\}. \tag{34}$$
From this system we obtain a single equation:
$$b + \{\alpha\} = \{\alpha\}. \tag{35}$$
According to Lemma 1, we instantly obtain the solution interval of eq. (35):
$$\{\alpha\} \supseteq b \supseteq 0. \tag{36}$$
Thus, eq. (35) has two solutions: b = {α} and b = 0. Therefore the solution of
the system (33)-(34) consists of two pairs (b, c): ({α}, {α}) and (0, {α}).
To conclude this section, we provide a brief summary. We have shown how to solve
the Inverse task by means of influence equations. We have proved two fundamental
lemmas, which allow us to solve any influence equation regardless of the number
of variables. Finally, we have given several examples of how to apply these
lemmas.
3.2 Analysis of Extreme Cases 1: Frustration
In this section we analyze, from the point of view of the Inverse task, the
situation in which a subject can end up in the frustration state. Let us
consider the polynomial a(b + c) discussed in section 2.1. The decision equation
that corresponds to this polynomial is $x = (b + c)a + \overline{a}$, where x
can be any subject variable.
Next we try to find all the pairs (b, c) that result in the selection of a
particular alternative by subject a.
The decision equation for subject a is $a = (b + c)a + \overline{a}$. The
solution interval of this decision equation is b + c ⊇ a ⊇ 1. We need to check
which alternatives subject a can be convinced to choose. To do this, we consider
the system of equations for each alternative.
Alternative {α}:
$$b + c = \{\alpha\} \tag{37}$$
$$1 = \{\alpha\} \tag{38}$$
Alternative {β}:
$$b + c = \{\beta\} \tag{39}$$
$$1 = \{\beta\} \tag{40}$$
Alternative 0 = {}:
$$b + c = 0 \tag{41}$$
$$1 = 0 \tag{42}$$
In each of these systems the second equation is a false equality. Therefore
these systems have no solution.
Alternative 1 = {α, β}:
$$b + c = 1 \tag{43}$$
$$1 = 1 \tag{44}$$
The second equation is a true equality. Therefore this system has a solution.
Thus, out of the four possible alternatives, subject a can actually choose only
the alternative 1 = {α, β}. To find the solutions resulting in selection of the
alternative 1 = {α, β}, we need to solve only eq. (43), since eq. (44) is a true
equality.
According to Lemma 1, we instantly obtain the solution interval for eq. (43):
$$1 \supseteq b \supseteq \overline{c} \tag{45}$$
We calculate the pairs (b, c) for all possible values of variable c (Table 3).
Table 3. Solutions of the influence equation b + c = 1
Values of c    {α}           {β}           1           0
Pairs (b, c)   ({β}, {α})    ({α}, {β})    (0, 1)      (1, 0)
               (1, {α})      (1, {β})      ({α}, 1)
                                           ({β}, 1)
                                           (1, 1)
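The analysis of this section can be condensed into a small search over all target alternatives. The Python sketch below assumes the frozenset model of alternatives and uses A(b, c) = b + c and B(b, c) = 1 from the canonical form of the decision equation for subject a; `strategies_for_a` is an illustrative name.

```python
from itertools import product

ONE = frozenset({"alpha", "beta"})
ZERO = frozenset()
ALTERNATIVES = [ZERO, frozenset({"alpha"}), frozenset({"beta"}), ONE]

def strategies_for_a(chi):
    """Pairs (b, c) solving the system A = b + c = chi, B = 1 = chi."""
    return [(b, c) for b, c in product(ALTERNATIVES, repeat=2)
            if (b | c) == chi and ONE == chi]

# Target alternatives mapped to the joint influences that realize them.
reachable = {chi: strategies_for_a(chi) for chi in ALTERNATIVES}
```

Only the entry for the alternative 1 is non-empty; it contains the nine pairs listed in Table 3. Every other target alternative has no control strategy.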
Therefore, the influence analysis of the decision equation
$a = (b + c)a + \overline{a}$ shows that the only alternative that subject a can
choose is the alternative 1 = {α, β}. The influence analysis provides us with
the set (exhaustive list) of pairs (b, c) of joint influences resulting in
selection of the alternative 1 = {α, β}. If the pair of influences does not
match any pair from this list, the decision equation has no solution, which
results in the frustration state.
Summarizing this section, we note that in general there are two sets. The set D
contains the alternatives that a controlled subject can choose. The set U is the
set of alternatives of the target choice. The need to put subject a into the
frustration state emerges if the target choice cannot be made by this subject.
In other words, we need to put a subject into the frustration state if
D ∩ U = Ø.
3.3 Analysis of Extreme Cases 2: What to do with Super-Active
Groups
Among all the possible groups, there are groups in which subjects always choose
only the alternative 1 = {α, β} regardless of the influences of other subjects.
Such groups are called super-active groups.
Next we consider one special case of super-active groups, the homogeneous
groups. A group is called homogeneous if all the subjects in the group are
connected by the same relationship.
Here we provide a proof of the lemma about homogeneous groups originally
formulated by Lefebvre [1, 2].
Lemma 3. Any homogeneous group is a super-active group.
Proof. We consider separately the homogeneous groups in which all the subjects
are connected by the alliance relationship (alliance groups) and by the conflict
relationship (conflict groups).
Without loss of generality, we assume that there are n subjects a1, a2, ..., an.
Alliance groups. The polynomial corresponding to the alliance group of n
subjects is a1a2...an. Next we construct the diagonal form and apply the folding
procedure:
$$[a_1a_2 \dots a_n]^{[a_1][a_2] \dots [a_n]} = [a_1a_2 \dots a_n] + \overline{[a_1][a_2] \dots [a_n]} = 1.$$
Therefore the alliance groups are always super-active.
Conflict groups. The polynomial corresponding to the conflict group of n
subjects is a1 + a2 + ... + an. Next we construct the diagonal form and apply
the folding procedure:
$$[a_1 + a_2 + \dots + a_n]^{[a_1] + [a_2] + \dots + [a_n]} = [a_1 + a_2 + \dots + a_n] + \overline{[a_1] + [a_2] + \dots + [a_n]} = 1.$$
Therefore the conflict groups are always super-active.
Since both the alliance and the conflict groups are super-active, the lemma is
proved. □
However, there are non-homogeneous super-active groups as well (see Appendix B).
Summarizing this section, we note that subjects in super-active groups cannot
be controlled in their choices, and the entire group is uncontrollable.
Therefore, once a super-active group emerges, the only way to make it
controllable is to change the relationships in the group.
4 The Basic Control Schema of an Abstract Subject
(BCSAS) in the RGT
We have presented a detailed description of the RGT including the solution of
the Forward and Inverse tasks. We have also considered the extreme cases of
decisions, such as putting a subject into the frustration state or changing the
structure of a super-active group. As a final stroke, we summarize all the
presented material in the form of the Basic Control Schema of an Abstract
Subject (BCSAS) in the RGT.
Fig. 3. The block schema for extracting the sets $D_h$ and $Z_h$.
The input comes from the environment and is formalized in the form of exter-
nal Inﬂuences on the subject, the Boolean algebra of Alternatives and Structure
of a Group.
Information about the Inﬂuences, Boolean algebra and Group Structure is
propagated into the Decision Module. The Decision Module implements solution
of the Forward task. Therefore the output set D of the Decision Module is the
set of possible alternatives, which subject can choose under the given conditions.
The information about the Boolean algebra and Group Structure is propagated into
the Influence Module. The Influence Module solves the Inverse task. The output
set $D_h$ of the Influence Module is the set of pairs $(\chi, Z_\chi)_x$, where
χ is the target alternative, $Z_\chi$ is the set of all the joint influences
resulting in selection of the target choice, and x represents a subject
variable. Each pair $(\chi, Z_\chi)_x$ represents a reflexive control strategy.
Therefore, the decision to put a subject into the frustration state is justified
if it is impossible to make subject x choose the target alternative χ, i.e., if
$Z_\chi = \{\}$ for the pair $(\chi, Z_\chi)_x$, and subject x should not choose
any alternative other than the target one.
4.1 Schema for Iterative Algorithm to Obtain Output of the
Inﬂuence Module
The alternatives χ with corresponding non-empty sets $Z_\chi$ are included into
the set $D_h$. Here we introduce the set $Z_h$ to store the non-empty sets
$Z_\chi$. The schema of the algorithm for extracting the sets $D_h$ and $Z_h$ is
presented in Fig. 3. Initially the sets are empty: $D_h = \{\}$ and
$Z_h = \{\}$. The algorithm reads the set of pairs $(\chi, Z_\chi)_x$ and stores
it in the array Pairs(M), where M is a counting variable and N is the total
number of pairs. Then for each pair from the array Pairs it is checked whether
the set $Z_\chi$ is empty: $Z_\chi == \{\}$? If 'yes', the algorithm increments
the counting variable M (M = M + 1) and proceeds to the next pair from the array
Pairs. If 'no', the alternative χ is included into the set $D_h$
($D_h = D_h + \chi$), the set $D_h$ is saved, the set $Z_\chi$ is included into
the set $Z_h$ ($Z_h = Z_h + Z_\chi$), the set $Z_h$ is saved, and the algorithm
likewise proceeds to the next pair. The process runs while M ≤ N.
In this iterative algorithm, we separately store the alternatives χ that can be
chosen by a certain subject in the set $D_h$, and the joint influences $Z_\chi$
that result in selection of the alternative χ in the set $Z_h$.
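The iterative algorithm can be written compactly in Python. The sketch below is one possible reading of the block schema: it assumes that the pairs arrive as a list of (χ, Zχ) tuples, that the counter M is advanced after every pair (including the non-empty ones), and that lists are sufficient to represent the sets Dh and Zh. The function name `extract_dh_zh` and the string labels in the example are hypothetical.

```python
def extract_dh_zh(pairs):
    """Split pairs (chi, Z_chi) into D_h and Z_h, skipping empty Z_chi."""
    dh, zh = [], []               # D_h = {} and Z_h = {} initially
    for chi, z_chi in pairs:      # plays the role of M = 1, ..., N
        if len(z_chi) == 0:       # Z_chi == {}: nothing to store
            continue
        dh.append(chi)            # D_h = D_h + chi
        zh.append(z_chi)          # Z_h = Z_h + Z_chi
    return dh, zh

# Example input modeled on the frustration analysis of section 3.2:
# only the alternative 1 has joint influences (two of them shown here).
example_pairs = [
    ("{alpha}", []),
    ("{beta}", []),
    ("0", []),
    ("1", [("1", "0"), ("0", "1")]),
]
dh, zh = extract_dh_zh(example_pairs)
```

After the call, `dh` holds the single alternative "1" and `zh` holds its list of joint influences; keeping the two lists index-aligned preserves the link between each alternative and its control strategies.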
Therefore, we should modify the schema of the Influence Module in the BCSAS as
follows. We present an elaborated schema, in which the sub-module
"Solution: Dh" is accompanied by the sub-module "Solution: Zh". Together these
sub-modules are included into the "Solutions" sub-module.
The BCSAS is the fundamental schema of an abstract subject, which is used
throughout the RGT. The BCSAS is presented in Fig. 4.
This concludes the overview of the RGT and the description of tasks within the
scope of the general theory. We continue with the application of the RGT to
mixed groups of humans and robots.
Fig. 4. The Basic Control Schema of an Abstract Subject (BCSAS). Block labels:
Environment; Influences; Boolean Algebra of Alternatives; Structure of a Group;
Decision Module; Decision equation of a robot; Solutions: D; Influence Module;
Decision equation of a human; System of Influence eqs.; Solutions (Solution: Dh,
Solution: Zh); Realization of an alternative; Reflexive control.
5 Deﬁning Robots in RGT
As we have noted in the Introduction, the goal of robots in mixed groups of
humans and robots is to restrain human subjects from choosing risky actions,
which might result in injuries or even threaten life.
It is assumed by default that a robot follows a program of behavior. Such a
program consists of at least three modules. Module 1 implements the robot's
ability of human-like decision-making based on the RGT. Module 2 contains the
rules that prevent the robot from harming human beings. Module 3 predicts the
choice of each human subject and suggests the possible reflexive control
strategies.
Modules 1 and 3 are inherited from the BCSAS of an abstract individual. They
correspond to the Decision Module and the Influence Module of the BCSAS
(Fig. 4), respectively. Therefore all the properties and the meaning of the
outputs of Modules 1 and 3 are the same as those of the Decision and Influence
Modules, respectively.
Module 2 is a new module, which is intrinsic to robotic agents studied in the
context of mixed groups of humans and robots. This module is responsible for
extracting only the alternatives that are harmless or non-risky for human
subjects.
We suggest applying Asimov's Three Laws of Robotics [19], which form the basis of Module 2:
1) a robot may not injure a human being or, through inaction, allow a human
being to come to harm;
2) a robot must obey any orders given to it by human beings, except where such
orders would conﬂict with the First Law;
3) a robot must protect its own existence as long as such protection does not
conﬂict with the First or Second Law.
We assume that these laws are an intrinsic part of the robot's "mind", which cannot be erased or corrupted by any means.
The interaction of Modules 1 and 2 is performed in Interaction Module 1. The interaction of Modules 3 and 2 is implemented in Interaction Module 2.
The Boolean algebra of alternatives is filtered according to Asimov's laws in Module 2. The output of Module 2 is the set U of approved alternatives. This set is then propagated into the interaction modules.
The output of Module 1 is the set D of alternatives that the robot has to choose under the given joint influences. Interaction Module 1 computes the intersection of sets D and U: D ∩ U = DU. If DU is not empty, there are approved alternatives among the alternatives that the robot should choose in accordance with the joint influences, and the robot can implement any alternative from DU. If DU is empty, the robot cannot choose any approved alternative under the given joint influences; in that case it chooses an alternative from set U. This is how Interaction Module 1 works.
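The rule of Interaction Module 1 can be sketched in a few lines of Python. This is only an illustration: the function name `interaction_module_1` and the representation of alternatives as `frozenset` objects are assumptions made here, not part of the RGT formalism.

```python
# Illustrative sketch; names and data representation are assumptions.
ALPHA = frozenset({"alpha"})
BETA = frozenset({"beta"})
ONE = ALPHA | BETA          # the alternative 1 = {alpha, beta}
EMPTY = frozenset()         # the alternative 0 = {}

def interaction_module_1(D, U):
    """Return the alternatives the robot may implement.

    D: set of alternatives dictated by the joint influences (Module 1).
    U: set of approved alternatives (Module 2).
    """
    DU = D & U              # intersection of D and U
    return DU if DU else U  # fall back to U when DU is empty

if __name__ == "__main__":
    U = {BETA}
    print(interaction_module_1({ONE, ALPHA}, U))  # DU is empty, so U is used
    print(interaction_module_1({ONE, BETA}, U))   # DU = {{beta}}
```

In both calls the robot ends up with an approved alternative, which is the purpose of the fallback to U.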
[Figure 5 omitted. Flowchart connecting Module 1 (decision equation of a robot, Solutions: D), Module 2 (Asimov's-Laws-based filter producing the set U of approved alternatives), Module 3 (decision equation of a human, Solutions: Dh and Zh), Interaction Module 1 (test "DU = {}?") and Interaction Module 2 (tests "DhU = {}?" and "ZU = {}?", with Frustration as the fallback), together with the Environment, the Boolean Algebra of Alternatives, the Structure of a Group, Influences, Realization of an alternative and Reflexive control.]
Fig. 5. The Basic Control Schema of a Robotic Agent (BCSRA).
The output of Module 3 contains the sets Dh and Zh. The goal of the robot is to keep human subjects from choosing a risky alternative. This can be done by convincing human subjects to choose alternatives from the set U. First, we check whether Dh contains any approved alternative by intersecting the sets Dh and U: Dh ∩ U = DhU.
If set DhU is not empty, it is possible to make the human subject choose some non-risky alternative, and we choose the corresponding reflexive control strategy from the set Zh. If set DhU is empty, we have to find a reflexive control strategy that makes the human subject select an approved alternative from set U. For this purpose, we construct the set ZU by including all the joint influences Zχ for approved alternatives: Zχ ∈ ZU ⇔ χ ∈ U. Next, we check whether ZU is empty. If it is, it is impossible to convince the human subject to choose a non-risky alternative, and the only remaining option of reflexive control is to put this subject into the frustration state. If ZU is not empty, there exists at least one reflexive control strategy that results in the selection of an alternative from the set of approved (non-risky) ones.
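The branching of Interaction Module 2 described above can be sketched as follows. The function name, the dictionary layout of Zh (alternative mapped to a list of joint influences) and the returned labels are assumptions chosen for illustration; the text itself only fixes the order of the checks.

```python
# Illustrative sketch; names, labels and data layout are assumptions.
def interaction_module_2(Dh, Zh, U):
    """Select a reflexive control option for one human subject.

    Dh: set of alternatives the subject can choose.
    Zh: dict mapping an alternative chi to the list of joint influences Z_chi.
    U:  set of approved alternatives.
    Returns ("control", {chi: Z_chi, ...}) or ("frustration", {}).
    """
    DhU = Dh & U
    if DhU:
        # Some approved alternative is reachable: take its strategies from Zh.
        return "control", {chi: Zh.get(chi, []) for chi in DhU}
    # Otherwise collect the non-empty Z_chi of every approved alternative.
    ZU = {chi: Zh[chi] for chi in U if Zh.get(chi)}
    if ZU:
        return "control", ZU
    # No strategy leads to an approved alternative.
    return "frustration", {}

if __name__ == "__main__":
    Zh = {"beta": [("beta", "beta", "beta")], "one": [("one", "one", "one")]}
    print(interaction_module_2({"beta", "one"}, Zh, {"beta", "empty"}))
    print(interaction_module_2({"alpha"}, {"alpha": [("one", "0", "0")]}, {"beta"}))
```

The second call has no strategy for any approved alternative, so the sketch reports the frustration option.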
Thus, the BCSRA inherits the entire structure of the BCSAS and augments it with Module 2, based on Asimov's Laws, together with Interaction Modules 1 and 2.
The original schema of the robot's control system has been recently presented in [20]. The BCSRA is an extended version of the original schema. The BCSRA provides a comprehensive description of how the Forward and Inverse tasks are solved in the robot's "mind".
Thus, in this section we have presented the formalization of a robotic agent in the RGT. We have outlined the specific features of robotic agents that distinguish them from other subjects. Furthermore, we have provided a detailed explanation of how the Forward and Inverse tasks are solved in the framework of the robot's control system (BCSRA).
Next, we proceed with sample situations of interaction between humans and robots.
6 Extended Sample Analysis of Mixed Groups
Here we elaborate on two examples, presented in the previous study [20], of how robots in mixed groups can make humans refrain from risky actions. We discuss the application of the extended schema of the robot's control system and provide an explicit derivation of the reflexive control strategies that were applied in these examples in the previous study [20].
6.1 Robots Baby-Sitters
Suppose robots have to act as baby-sitters and look after kids. We consider a mixed group of two kids and two robots, where each robot looks after a particular kid. Having finished a game, the kids are considering what to do next. They choose between "to compete in climbing a high tree" (action α) and "to play with a ball" (action β). Together, actions α and β represent the active state 1 = {α, β} = {α} + {β}. Therefore, the Boolean algebra of alternatives consists of four elements: 1) the alternative {α} is to climb the tree; 2) the alternative {β} is to play with a ball; 3) the alternative 1 = {α, β} means that a kid is hesitating about what to do; and 4) the alternative 0 = {} means to take a rest.
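For experimentation, the four-element Boolean algebra of alternatives can be encoded with bitmasks. The encoding (bit 0 for action α, bit 1 for action β) and all helper names below are assumptions made for this sketch and are not notation from the RGT.

```python
# Assumed encoding (not from the paper): bit 0 = action alpha, bit 1 = action beta.
EMPTY, ALPHA, BETA, ONE = 0b00, 0b01, 0b10, 0b11
ALTERNATIVES = [EMPTY, ALPHA, BETA, ONE]
NAMES = {EMPTY: "{}", ALPHA: "{alpha}", BETA: "{beta}", ONE: "{alpha, beta}"}

def union(x, y):
    """The '+' operation of the Boolean algebra."""
    return x | y

def intersect(x, y):
    """The product (written as juxtaposition in the text)."""
    return x & y

def complement(x):
    """The overbar: complement with respect to 1 = {alpha, beta}."""
    return ONE & ~x

def contains(x, y):
    """True if x is a superset of y."""
    return (x & y) == y

if __name__ == "__main__":
    print(NAMES[union(ALPHA, BETA)])    # {alpha, beta}
    print(NAMES[complement(ALPHA)])     # {beta}
    print(contains(ALPHA, BETA))        # False
```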
We assume that each kid regards his own robot as an ally, and the other kid and that kid's robot as competitors. The kids are subjects a and c, while the robots are subjects b and d. The relationship graph is presented in Fig. 6.
[Figure 6 omitted: graph with nodes a, b, c and d.]
Fig. 6. The relationship graph for the robot baby-sitters example.
Next we calculate the diagonal form and fold it in order to obtain the decision equation for each subject:
$$[ab + cd]^{[ab]^{[a][b]} + [cd]^{[c][d]}} = ab + cd.$$
Of the two actions α and β, action α is risky, since a kid can fall from the tree, which is a real threat to his health or even life. Therefore, according to Asimov's laws, the robots cannot allow the kids to start the competition and have to convince them not to choose alternative {α}. In terms of alternatives, Asimov's laws act as a filter that removes the risky alternatives. The remaining alternatives are included in set U. In this case, U = {{β}, {}}.
Next, we solve the Inverse task for alternatives {β} and {}. We conduct the analysis for kid a; it can be extended to kid c in a similar manner.
Solution of the Inverse task for kid a with approved alternatives as the target choice. The decision equation for kid a is a = ab + cd. First, we transform it into the canonical form: $a = (b + cd)a + cd\,\overline{a}$.
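The canonical form can be cross-checked numerically. The sketch below relies on Boole's expansion, f(a) = f(1)a + f(0)ā, which gives the coefficient of a as f(1) and the coefficient of ā as f(0); the bitmask encoding and the function names are assumptions made for illustration.

```python
from itertools import product

ONE = 0b11            # assumed encoding: bit 0 = alpha, bit 1 = beta
VALUES = range(4)     # {}, {alpha}, {beta}, {alpha, beta}

def f(a, b, c, d):
    """Right-hand side of the decision equation a = ab + cd."""
    return (a & b) | (c & d)

def canonical(a, b, c, d):
    """A*a + B*not(a), with A = f(1, b, c, d) and B = f(0, b, c, d)."""
    A = f(ONE, b, c, d)               # equals b + cd
    B = f(0, b, c, d)                 # equals cd
    return (A & a) | (B & (ONE & ~a))

# Compare both forms on all 4**4 assignments of (a, b, c, d).
mismatches = [v for v in product(VALUES, repeat=4) if f(*v) != canonical(*v)]
print(len(mismatches))                # 0
```

An empty list of mismatches confirms that ab + cd and (b + cd)a + cd ā agree on every assignment in this algebra.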
Next we consider the system of influence equations:
$$b + cd = \chi \tag{46}$$
$$cd = \chi, \tag{47}$$
where alternative χ ∈ U.
Using eq. (47), eq. (46) is transformed into the equation
$$b + \chi = \chi. \tag{48}$$
The solution of eq. (48) follows directly from Lemma 1: χ ⊇ b ⊇ 0. Therefore, for χ = {β} and χ = {} the solutions are {β} ⊇ b ⊇ 0 and b = 0, respectively.
Eq. (47) can be solved directly according to Lemma 2: $\chi d + \overline{\chi}\,\overline{d} \supseteq c \supseteq \chi$.
Consider χ = {β} first. Then $\{\beta\}d + \{\alpha\}\overline{d} \supseteq c \supseteq \{\beta\}$. By varying the value of variable d, we obtain all the pairs (c, d):
d = 1: {β} ⊇ c ⊇ {β} ⇒ c = {β}. Therefore the solution is the pair ({β}, 1);
d = 0: {α} ⊇ c ⊇ {β}. Since {α} does not contain {β}, there is no solution;
d = {α}: 0 ⊇ c ⊇ {β}. Since 0 = {} does not contain {β}, there is no solution;
d = {β}: 1 ⊇ c ⊇ {β}. Therefore there are two solutions, (1, {β}) and ({β}, {β}).
Therefore, the equation cd = {β} has three solutions: ({β}, 1), (1, {β}) and ({β}, {β}).
Thus, we have solved both equations of system (46-47). The solutions of this system are the triplets (b, c, d) of joint influences, which are all possible combinations of the solutions of both equations. Since eq. (48) admits two values of b (b = 0 and b = {β}) and eq. (47) has three solutions, there are six triplets (b, c, d) in total: (0, {β}, 1) and ({β}, {β}, 1); (0, 1, {β}) and ({β}, 1, {β}); (0, {β}, {β}) and ({β}, {β}, {β}).
Now we consider the case χ = 0 = {}. Then $\overline{d} \supseteq c \supseteq 0$. We obtain the pairs (c, d) for all values of variable d:
d = 1: 0 ⊇ c ⊇ 0 ⇒ c = 0. Thus, there is only one solution, (0, 1);
d = 0: 1 ⊇ c ⊇ 0. Thus, there are four solutions: (1, 0), ({α}, 0), ({β}, 0) and (0, 0);
d = {α}: {β} ⊇ c ⊇ 0. Thus, there are two solutions: ({β}, {α}) and (0, {α});
d = {β}: {α} ⊇ c ⊇ 0. Thus, there are two solutions: ({α}, {β}) and (0, {β}).
In total, the equation cd = 0 has 9 solutions. Since b = 0 in this case, system (46-47) with χ = {} also has 9 solutions as triplets (b, c, d): (0, 1, 0), (0, 0, 0), (0, 0, {α}), (0, 0, {β}), (0, 0, 1), (0, {α}, {β}), (0, {α}, 0), (0, {β}, {α}) and (0, {β}, 0).
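The solution counts for system (46-47) can be confirmed by exhaustive search over the four-element algebra. The sketch below is a brute-force check, not the Lemma-based method of the text; the bitmask encoding and the name `solve_system` are assumptions.

```python
from itertools import product

# Assumed encoding: bit 0 = alpha, bit 1 = beta.
EMPTY, ALPHA, BETA, ONE = 0b00, 0b01, 0b10, 0b11
VALUES = (EMPTY, ALPHA, BETA, ONE)

def solve_system(chi):
    """All triplets (b, c, d) with b + cd = chi and cd = chi."""
    return [(b, c, d) for b, c, d in product(VALUES, repeat=3)
            if (b | (c & d)) == chi and (c & d) == chi]

print(len(solve_system(BETA)))    # 6
print(len(solve_system(EMPTY)))   # 9
```

The search returns six triplets for χ = {β} and nine for χ = {}, in agreement with the enumeration above.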
In the two cases considered so far, the upper and lower bounds of the solution interval of the decision equation equal the same alternative. Now we discuss a different situation, in which variable a may take not a single value but several values. In this case, we should find the joint influences (b, c, d) that result in the selection of either alternative {β} or {}. Since {β} ⊇ {}, we need to find all the triplets (b, c, d) for which the solution of the decision equation is the interval {β} ⊇ a ⊇ {}. Thus, {β} ⊇ a∗ ⊇ {}.
Therefore, we need to solve the following system of equations:
$$b + cd = \{\beta\} \tag{49}$$
$$cd = 0. \tag{50}$$
Given eq. (50), eq. (49) turns into the equality b = {β}, and it remains to solve eq. (50). This equation has already been solved in the previous case. Therefore, we obtain the solutions of system (49-50): ({β}, 1, 0), ({β}, 0, 0), ({β}, 0, {α}), ({β}, 0, {β}), ({β}, 0, 1), ({β}, {α}, {β}), ({β}, {α}, 0), ({β}, {β}, {α}) and ({β}, {β}, 0).
Comparing the solutions of all three systems of influence equations, we can see that there are four remarkable solutions: ({β}, {β}, {β}) and ({β}, 1, {β}); ({β}, {}, {β}) and ({β}, {α}, {β}). The first pair of solutions, taken from system (46-47) with χ = {β}, results in the choice of alternative {β} only, while the second pair, taken from system (49-50), results in the selection of either alternative {β} or alternative {}. Together, these four solutions illustrate that if b = d = {β}, it is guaranteed that, regardless of the influence of kid c, kid a will choose one of the approved alternatives.
By analogy, among the solutions of system (46-47) with χ = {} there are four solutions: (0, 1, 0), (0, 0, 0), (0, {α}, 0) and (0, {β}, 0). Therefore, if b = d = 0, kid a will choose alternative 0 = {} regardless of the influence of kid c.
These two examples of binding variables b and d were considered in Scenario 1 and Scenario 2 of the sample situation with robot baby-sitters, originally presented in [20].
Summarizing the results of this section, we have shown that robots can successfully control the kids' behavior by keeping them from risky actions. This control is entirely based on the proposed schema of the robot's control system. We have analyzed all the possible reflexive control strategies by solving three systems of influence equations: two systems for a single alternative and one system for an interval of alternatives. Thus, we have shown how the Inverse task can be solved by our proposed algorithm in a situation similar to real conditions.
6.2 Mountain-Climbers and Rescue Robot
Consider two climbers in the mountains and a rescue robot. The climbers and the robot communicate via radio. One of the climbers (subject b) got into a difficult situation and needs help. Suppose he fell into a rift because its edge was covered with ice. The rift is not too deep and there is a thick layer of snow at the bottom, so the climber is not hurt, but he cannot get out of the rift by himself. The second climber (subject a) wants to rescue his friend himself (action α), which is a risky action. The second option is that the robot performs the rescue mission (action β). Since inaction is not acceptable under the First Law, the set U of approved alternatives for the robot includes only alternative {β}. The goal of the robot is to keep climber a from choosing alternative {α} and to perform the rescue mission itself.
We assume that initially all subjects are in alliance. The corresponding graph is presented in Fig. 1c and its polynomial is abc. By definition, this is a homogeneous group and, consequently, a super-active group according to Lemma 3.
Thus, every subject in the group is in the active state, and the group is uncontrollable (see Section 3.3). In this case, the robot decides to change its relationship with climber b from alliance to conflict. The robot can do that, for instance, by not responding to the climber's orders.
Which reflexive control leads to the frustration state? The polynomial corresponding to the new group is a(b + c). This polynomial has already been discussed in detail in Section 3.2, so we know the decision equation for subject a: $a = (b + c)a + \overline{a}$. We have also shown that subject a can choose only alternative 1 = {α, β}, and only if appropriate joint influences are applied (see Section 3.2); otherwise subject a is in the frustration state and cannot make any choice. Therefore, in order to put subject a into the frustration state, the reflexive control strategy should NOT be selected from the list of solutions (Section 3.2): ({β}, {α}); (1, {α}); ({α}, {β}); (1, {β}); (0, 1); ({α}, 1); ({β}, 1); (1, 1) and (1, 0).
Here we provide two examples of such joint influences (b, c): ({α}, {α}) ⇒ ({α} + {α}) = {α} ⊂ 1 and ({β}, {}) ⇒ ({β} + {}) = {β} ⊂ 1.
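The full set of frustrating joint influences can be listed by brute force. The sketch assumes that the decision equation a = (b + c)a + ā has a solution only when its upper bound A = b + c contains its lower bound B = 1, that is, when b + c = 1; the encoding and the function name are illustrative assumptions.

```python
from itertools import product

# Assumed encoding: bit 0 = alpha, bit 1 = beta.
EMPTY, ALPHA, BETA, ONE = 0b00, 0b01, 0b10, 0b11
VALUES = (EMPTY, ALPHA, BETA, ONE)

def frustrates_a(b, c):
    """True if a = (b + c)a + not(a) has no solution under influences (b, c)."""
    A, B = b | c, ONE          # upper and lower bounds of the solution interval
    return (A & B) != B        # A does not contain B

frustrating = [(b, c) for b, c in product(VALUES, repeat=2) if frustrates_a(b, c)]
print(len(frustrating))        # 7
```

Of the 16 pairs (b, c), the 9 listed above leave subject a with the choice 1 = {α, β}, and the remaining 7, including ({α}, {α}) and ({β}, {}), put subject a into the frustration state.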
Can the robot complete the mission regardless of the joint influences of the other subjects? The decision equation for robot c is $c = c + (b + \overline{a})\,\overline{c}$. The corresponding solution interval is $1 \supseteq c \supseteq (b + \overline{a})$.
Here we analyze all 16 possible reflexive control strategies (a, b) that the climbers can apply to robot c.
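Examples with empty set DU. For (0, b), the situation is the same regardless of the value of variable b: $1 \supseteq c \supseteq (b + \overline{0}) \Rightarrow 1 \supseteq c \supseteq (b + 1) \Rightarrow c = 1$.
For (a, 1), the situation is the same regardless of the value of variable a: $1 \supseteq c \supseteq (1 + \overline{a}) \Rightarrow c = 1$.
For ({α}, {α}): $1 \supseteq c \supseteq (\{\alpha\} + \overline{\{\alpha\}}) \Rightarrow 1 \supseteq c \supseteq (\{\alpha\} + \{\beta\}) \Rightarrow c = 1$.
For ({β}, {β}): $1 \supseteq c \supseteq (\{\beta\} + \overline{\{\beta\}}) \Rightarrow 1 \supseteq c \supseteq (\{\beta\} + \{\alpha\}) \Rightarrow c = 1$.
Therefore, in these cases set D = {{α, β}}.
Next we consider other pairs (a, b).
(1, {α}): $1 \supseteq c \supseteq (\{\alpha\} + \overline{1}) \Rightarrow 1 \supseteq c \supseteq \{\alpha\}$. Here set D = {{α, β}, {α}}.
({β}, {α}): $1 \supseteq c \supseteq (\{\alpha\} + \overline{\{\beta\}}) \Rightarrow 1 \supseteq c \supseteq \{\alpha\}$. Here set D = {{α, β}, {α}}.
({β}, 0): $1 \supseteq c \supseteq (0 + \overline{\{\beta\}}) \Rightarrow 1 \supseteq c \supseteq \{\alpha\}$. Therefore, set D = {{α, β}, {α}}.
Since U = {{β}}, DU = {} for all the cases considered above, and the robot will choose alternative {β} from the set U.
Examples with non-empty set DU. Consider the following pairs (a, b):
(1, {β}): $1 \supseteq c \supseteq (\{\beta\} + \overline{1}) \Rightarrow 1 \supseteq c \supseteq \{\beta\}$. Therefore, set D = {{α, β}, {β}}.
(1, 0): $1 \supseteq c \supseteq (0 + \overline{1}) \Rightarrow 1 \supseteq c \supseteq 0$. Thus, set D = {{α, β}, {α}, {β}, {}}.
({α}, {β}): $1 \supseteq c \supseteq (\{\beta\} + \overline{\{\alpha\}}) \Rightarrow 1 \supseteq c \supseteq \{\beta\}$. Thus, set D = {{α, β}, {β}}.
({α}, 0): $1 \supseteq c \supseteq (0 + \overline{\{\alpha\}}) \Rightarrow 1 \supseteq c \supseteq \{\beta\}$. Thus, set D = {{α, β}, {β}}.
Since U = {{β}}, DU = {{β}} for all the cases considered above, and the robot will choose alternative {β} from the set DU.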
Thus, we have shown that under all 16 reflexive control strategies (a, b), robot c can choose alternative {β}, which is to perform the rescue mission itself. Therefore, the robot will choose alternative {β} regardless of the joint influences (a, b) of the climbers.
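The 16 cases can be replayed mechanically. The sketch below assumes the solution interval 1 ⊇ c ⊇ (b + ā) for robot c and applies the rule of Interaction Module 1 (use DU if it is non-empty, otherwise U); the bitmask encoding and the function name are illustrative assumptions.

```python
from itertools import product

# Assumed encoding: bit 0 = alpha, bit 1 = beta.
EMPTY, ALPHA, BETA, ONE = 0b00, 0b01, 0b10, 0b11
VALUES = (EMPTY, ALPHA, BETA, ONE)
U = {BETA}                                    # approved alternatives of the robot

def robot_choice_set(a, b):
    """Return (D, DU, choice set) for robot c under joint influences (a, b)."""
    lower = b | (ONE & ~a)                    # lower bound b + not(a)
    D = {c for c in VALUES if (c & lower) == lower}
    DU = D & U
    return D, DU, (DU if DU else U)

results = {(a, b): robot_choice_set(a, b) for a, b in product(VALUES, repeat=2)}
empty_DU = sum(1 for _, DU, _ in results.values() if not DU)
print(empty_DU, len(results) - empty_DU)      # 12 4
print(all(choice == {BETA} for _, _, choice in results.values()))  # True
```

Twelve strategies leave DU empty and four leave DU = {{β}}; in every case the robot ends up with alternative {β}.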
The discussed example illustrates how a robot can transform an uncontrollable group into a controllable one by manipulating the relationships in the group. In the controllable group, the robot can, by its influence on the human subjects, keep climber a from the risky action of rescuing climber b. The robot achieves its goal by putting climber a into the frustration state, in which climber a cannot make any decision. On the other hand, the set U of approved alternatives guarantees that the robot itself chooses the option with no risk for humans and implements it regardless of the climbers' influence.
Therefore, in this section we have illustrated the robot's ability to keep human beings from risky actions and to perform these risky actions itself. This shows that our approach achieves both goals of the robotic agent: 1) to keep people from risky actions and 2) to perform risky actions itself regardless of the humans' influences.
7 Discussion and Conclusion
Summarizing the results of this paper, we outline the most important of them.
First of all, we have introduced the Inverse task and developed methods to solve it.
We have provided a comprehensive tutorial on the Reflexive Game Theory recently formulated and proposed by Vladimir Lefebvre [1, 2, 3, 4]. The tutorial contains a detailed description of the Forward and Inverse tasks together with methods to solve them.
We have proposed control schemas for both an abstract subject (BCSAS) and a robotic agent (BCSRA). These schemas were specially designed to incorporate the solution of the Forward and Inverse tasks, thus providing autonomous units (individuals, subjects, agents) capable of making decisions in a human-like manner. We have shown that robotic agents based on the BCSRA can be included in mixed groups of humans and robots and effectively serve their fundamental goals (keeping humans from risky actions and, if needed, performing the risky actions themselves).
Therefore, we consider that the present study provides a comprehensive overview of the classic RGT proposed by Vladimir Lefebvre [1, 2, 3, 4] and a newly developed self-consistent framework for the analysis of different kinds of groups and societies, including human social groups and mixed groups of humans and robots, together with an application tutorial for this new framework.
This framework is entirely based on the principles of the RGT and brings together all its elements. The solution of the Inverse task, presented in this paper, plays a crucial role in the formation of this framework. Therefore, by having the Inverse task as one of its foundations, this framework illustrates the role of the Inverse task and its relationship with other issues considered in the RGT.
References
1. Lefebvre, V.A.: Lectures on Reﬂexive Game Theory. Leaf & Oaks, Los Angeles
(2010).
2. Lefebvre, V.A.: Lectures on Reﬂexive Game Theory. Cogito-Center, Moscow (2009)
[in Russian].
3. Lefebvre, V.A.: The basic ideas of reﬂexive game’s logic. Problems of research of
systems and structures. pp. 73-79 (1965) [in Russian].
4. Lefebvre, V.A.: Reﬂexive analysis of groups. In: Argamon, S. and Howard, N.
(eds.) Computational models for counterterrorism. pp. 173-210. Springer, Heidel-
berg (2009).
5. Lefebvre, V.A.: Algebra of Conscience. D. Reidel, Holland (1982).
6. Lefebvre, V.A.: Algebra of Conscience. 2nd Edition. Holland: Kluwer (2001).
7. Batchelder, W.H., Lefebvre, V.A.: A mathematical analysis of a natural class of
partitions of a graph. J. Math. Psy. 26, pp. 124-148 (1982).
8. Kobatake, E., and Tanaka, K.: Neuronal Selectivities to Complex Object Features
in the Ventral Pathway of the Macaque Monkey. Journal of Neurophysiology, 71,
3, pp. 856-867 (1994).
9. Koerner, E., Gewaltig, M.-O., Koerner, U., Richter, A., and Rodemann, T.: A
model of computation in neocortical architecture. Neural Networks, 12, pp. 989-
1005 (1999).
10. Lücke, J., and von der Malsburg, C.: Rapid processing and unsupervised learning
in a model of the cortical macrocolumn. Neural Computation, 16, pp. 501-533
(2003).
11. Schrander, S., Gewaltig, M.-O., Körner, U. and Körner, E.: Cortext: A columnar
model of bottom-up and top-down processing in the neocortex. Neural Networks,
22, pp. 1055-1070 (2009).
12. Fukushima, K.: Neocognitron: a self-organizing neural network model for a mechanism
of pattern recognition unaffected by shift in position. Biological Cybernetics,
36, pp. 193-201 (1980).
13. Riesenhuber, M. and Poggio, T.: Hierarchical models of object recognition in cor-
tex. Nature Neuroscience, 2, 11, pp. 109-125 (1999).
14. T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, and T. Poggio.: Robust Object
Recognition with Cortex-like Mechanisms, IEEE Transactions on pattern analysis
and machine intelligence, 29, 3, pp. 411-426 (2007).
15. Hienze, J.: Hierarchy length in orphaned colonies of the ant Temnothorax nylanderi
Naturwissenschaften, 95, 8, pp. 757-760 (2008).
16. Chase, I., D.: Models of hierarchy formation in animal societies. Behavioral Science,
19, 6, pp. 374-382 (2007).
17. Chase I., Tovey C., Spangler-Martin D., Manfredonia M.: Individual diﬀerences
versus social dynamics in the formation of animal dominance hierarchies. PNAS,
99, 9, pp. 5744-5749 (2002).
18. Buston P.: Social hierarchies: size and growth modiﬁcation in clownﬁsh. Nature,
424, pp. 145-146 (2003).
19. Asimov, I.: Runaround. Astounding Science Fiction, March, pp. 94-103 (1942).
20. Tarasenko, S.: Modeling mixed groups of humans and robots with Reflexive Game
Theory. In Lamers, M.H., and Verbeek, F.J. (eds.): HRPR 2010, LNICST 59, pp.
108-117 (2011).
Appendix
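A When sets A and B are functions of fewer variables than the total number of subjects minus one
Consider a group of four subjects a, b, c and d. Suppose the polynomial corresponding to this group is b(a + d) + c. Next we construct the diagonal form and perform the folding operation:
$$
\begin{aligned}
[b(a+d)+c]^{[b(a+d)]^{[b][a+d]^{[a]+[d]}} + [c]}
&= [b(a+d)+c]^{[b(a+d)]^{[b]\left([a+d] + \overline{[a]+[d]}\right)} + [c]} \\
&= [b(a+d)+c]^{[b(a+d)]^{[b]} + [c]} \\
&= b(a+d) + c + \overline{b(a+d) + \overline{b} + c}
\end{aligned}
$$
Next we simplify the resultant expression of the diagonal form folding:
$$
\begin{aligned}
b(a+d) + c + \overline{b(a+d) + \overline{b} + c}
&= b(a+d) + c + \overline{b(a+d)}\,\overline{c}\,b \\
&= b(a+d) + cb + c\overline{b} + \overline{b(a+d)}\,\overline{c}\,b \\
&= b\left((a+d) + c + \overline{b(a+d)}\,\overline{c}\right) + c\overline{b} \\
&= b\left((a+d)c + (a+d)\overline{c} + c + \left(\overline{b} + \overline{(a+d)}\right)\overline{c}\right) + c\overline{b} \\
&= b\left((a+d)c + (a+d)\overline{c} + c + \overline{b}\,\overline{c} + \overline{(a+d)}\,\overline{c}\right) + c\overline{b} \\
&= b\left(c + (a+d)c + \left((a+d) + \overline{(a+d)}\right)\overline{c}\right) + c\overline{b} \\
&= b\left((a+d)c + c + \overline{c}\right) + c\overline{b} \\
&= b\left((a+d)c + 1\right) + c\overline{b} = b + c\overline{b} = b + c
\end{aligned}
$$
Consequently,
$$[b(a+d)+c]^{[b(a+d)]^{[b]} + [c]} = b + c.$$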
Therefore, the decision equation includes only two subject variables instead of four. Consequently, for subjects a and d the decision equations in canonical form are
$$a = (b + c)a + (b + c)\overline{a} \tag{51}$$
$$d = (b + c)d + (b + c)\overline{d} \tag{52}$$
Thus, the sets A and B for subjects a and d are equal. The sets A and B are functions of only the variables b and c: $A = A(b, c) = b + c\overline{b}$ and $B = B(b, c) = b + c\overline{b}$.
The canonical forms of the decision equations for subjects b and c are:
$$b = b + c\,\overline{b} \tag{53}$$
$$c = c + b\,\overline{c} \tag{54}$$
Therefore, set A = 1 for both subjects. Set B is a function of a single variable: B(c) = c and B(b) = b for subjects b and c, respectively.
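The simplification to b + c can be verified by exhaustive evaluation. The sketch assumes the folding rule used in the derivation above, namely that a base P with exponent W folds to P + W̄, and evaluates the folded expression for the polynomial b(a + d) + c on all assignments in the four-element algebra; the encoding and the function names are illustrative assumptions.

```python
from itertools import product

ONE = 0b11                   # assumed encoding: bit 0 = alpha, bit 1 = beta
VALUES = range(4)

def neg(x):
    """Complement with respect to 1."""
    return ONE & ~x

def folded(a, b, c, d):
    """b(a+d) + c + not(b(a+d) + not(b) + c), the result of the folding."""
    p = (b & (a | d)) | c              # the polynomial b(a + d) + c
    w = (b & (a | d)) | neg(b) | c     # the folded exponent
    return p | neg(w)

ok = all(folded(a, b, c, d) == (b | c)
         for a, b, c, d in product(VALUES, repeat=4))
print(ok)                    # True
```

The check also makes the independence from a and d visible: the folded value never changes when only a or d is varied.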
B Example of non-homogenous super-active groups
Here we provide an example of a non-homogenous super-active group.
Consider the group of four subjects a, b, c and d, which is described by the polynomial c(ab + d). Let us build the diagonal form and perform its folding:
$$
\begin{aligned}
[c(ab+d)]^{[c][ab+d]^{[ab]^{[a][b]} + [d]}}
&= [c(ab+d)]^{[c][ab+d]^{\left([ab] + \overline{[a][b]}\right) + [d]}} \\
&= [c(ab+d)]^{[c][ab+d]^{1}} \\
&= [c(ab+d)] + \overline{[c][ab+d]} = 1 \qquad \square
\end{aligned}
$$
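The folding of this diagonal form can be replayed level by level. The sketch assumes the same folding rule as in the derivation above (a base P with exponent W folds to P + W̄) and checks that the result equals 1 for every assignment in the four-element algebra; the encoding and the function names are illustrative assumptions.

```python
from itertools import product

ONE = 0b11                   # assumed encoding: bit 0 = alpha, bit 1 = beta
VALUES = range(4)

def fold(p, w):
    """Assumed folding rule: base p with exponent w becomes p + not(w)."""
    return p | (ONE & ~w)

def folded(a, b, c, d):
    """Fold the diagonal form of the polynomial c(ab + d) from the top level down."""
    ab = fold(a & b, a & b)                  # [ab] with exponent [a][b]
    abd = fold((a & b) | d, ab | d)          # [ab + d] with exponent [ab]^... + [d]
    return fold(c & ((a & b) | d), c & abd)  # [c(ab + d)] with exponent [c][ab + d]^...

super_active = all(folded(a, b, c, d) == ONE
                   for a, b, c, d in product(VALUES, repeat=4))
print(super_active)          # True
```

Since the folded value is 1 for all influences, every subject of this group is in the active state, although the polynomial c(ab + d) mixes both operations.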