COPS AND INVISIBLE ROBBERS:
THE COST OF DRUNKENNESS
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
Abstract. We examine a version of the Cops and Robber (CR) game in which the robber is
invisible, i.e., the cops do not know his location until they capture him. Apparently this game
(CiR) has received little attention in the CR literature. We examine two variants: in the ﬁrst the
robber is adversarial (he actively tries to avoid capture); in the second he is drunk (he performs
a random walk). Our goal in this paper is to study the invisible Cost of Drunkenness (iCOD),
which is deﬁned as the ratio cti(G)/dcti(G), with cti(G) and dcti(G) being the expected capture
times in the adversarial and drunk CiR variants, respectively.
We show that these capture
times are well deﬁned, using game theory for the adversarial case and partially observable
Markov decision processes (POMDP) for the drunk case. We give exact asymptotic values of
iCOD for several special graph families such as d-regular trees, give some bounds for grids, and
provide general upper and lower bounds for general classes of graphs. We also give an inﬁnite
family of graphs showing that iCOD can be arbitrarily close to any value in [2, ∞). Finally,
we brieﬂy examine one more CiR variant, in which the robber is invisible and “inﬁnitely fast”;
we argue that this variant is signiﬁcantly diﬀerent from the Graph Search game, despite several
similarities between the two games.
1. Introduction
Cops and Robber (CR) is a game introduced by Nowakowski and Winkler [38] and (indepen-
dently) by Quilliot [44]. CR is played on a ﬁxed, undirected, simple and ﬁnite graph G where
K cops pursue a single robber. Both the cops and the robber are located in the vertices of G
and everybody has full information about everybody else’s current location. In every turn of
the game, ﬁrst the cops and then the robber can move to their new locations along the edges
of the G. The cops win if they capture the robber, i.e., if at least one cop is in the same vertex
as the robber; the robber’s goal is to avoid capture. This is the “classical”, extensively studied
version of the CR game. Many other versions have been proposed and studied; for a review of
the literature see [2, 6, 19] and the book [5].
We call the robber of the classical CR game adversarial: he wants to avoid capture for as long
as possible and plays optimally towards this end. In a recent paper [26] we have studied a CR
variant where the robber is drunk, i.e., he performs a random walk on G (he does not attempt
to avoid capture; essentially he is oblivious of the cops). This is really a one-player game. In
[26] we have compared the adversarial and drunk CR variants and have especially studied the
Cost of Drunkenness (COD), i.e., the ratio of “adversarial” capture time and “drunk” expected
capture times.
In the current paper we are concerned with the much less studied version of the CR game, in
which the robber is invisible to the cops (unless they are located in the same vertex). All the
other rules remain the same as in the classical CR and we examine both the adversarial and the
Date: October 17, 2018.
1
arXiv:1201.0946v1  [cs.DM]  4 Jan 2012
2
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
drunk variants. We will call this game cops-and-invisible-robber (CiR). Our main goal is (as in
[26]) to study the “invisible COD” (iCOD).
CiR and, more generally, invisible robber versions of CR have so far received little attention
in the graph theory literature. To the best of our knowledge, the ﬁrst paper that deals with
the invisible robber is [52]. More recent work in which the cops have either zero, incomplete
or intermittent information about the robber’s location include [12, 13, 25, 51]. In addition,
some works [11, 14, 15, 16] examine the case in which the cops’ incomplete observations are
augmented by the use of traps, alarms, radars and so on. The variant of an intermittently
observed robber is related to another variant, where the robber moves with greater speed than
the cops; this is studied in [7]. All these works share a common approach, according to which
cops use a predetermined move sequence by which capture is guaranteed. Because the cops will
only see the robber when they capture him, this sequence does not depend on actual robber
moves, hence it can be computed before the CiR game starts. Furthermore, the robber can also
compute the same sequence, so he is omniscient in the sense that he knows all cop moves before
the game starts. This approach is also used in graph search (GS) games in which the robber
is also endowed with inﬁnite speed. However, we will argue in Section 7 that GS is a diﬀerent
game from CiR (the seminal paper for graph search is [42]; good recent reviews are [2, 6, 19]).
In this paper we are interested in a diﬀerent approach, which improves the cops’ fortunes.
Since the cops will never see the robber, they must predetermine their strategy; but this can be
randomized, so the actual cop moves do not need to be predetermined at the beginning of the
game. Hence, in CiR, at every time step the robber will know previous cop moves but not the
future ones. As we will show in later sections, the use of randomized strategies can, in some
cases, reduce the number of cops necessary to capture the robber, provided that our goal is to
obtain a ﬁnite expected capture time and so cops win the game after a ﬁnite number of steps
with probability one.
Randomized strategies have not received much attention in the graph theoretic CR literature.
From the few papers which explore this approach, we mention [1, 23, 24] (note that in [1] both
cops and robber are invisible to each other until they occupy the same vertex). Brief mentions
of randomized strategies also appear in [11, 51].
On the other hand, there is a large robotics literature which deals with the (more general)
pursuit/evasion problem; while roboticists do not often use the term “cops and robber”, they
have studied pursuit/evasion on graphs using formulations quite similar to CiR, especially for the
case of the drunk robber; see for example [8, 9, 21, 22, 28, 29, 39, 53, 55] and the review [10]. As
expected, this research is more application-oriented. Also, the operations research community
has studied what is essentially the search for an invisible drunk robber in a graph; we only cite
here three representative works here [29, 49, 50].
Most of the works cited in the previous paragraph handle the invisible drunk robber with tools
from the theory of partially observable Markov decision processes (POMDP; see [31, 35, 48]) and
this is the approach we will use in the current paper. For the invisible adversarial robber, we
believe the “natural” treatment is through game theoretic methods; in fact what is required is
the generalization of POMDP’s to stochastic or Markovian games. The subject was introduced
in [46]; some of the subsequent results can be found in [3, 17, 18, 27, 33, 34, 36, 37, 40, 41, 45, 54];
a paper we have found especially useful is [20].
The rest of the paper is organized as follows. In Section 2 we introduce preliminary notation
and results. In Section 3 we give an extended example which illustrates the basic aspects of the
CiR game and, perhaps more importantly, establishes that the iCOD can become arbitrarily
COPS AND INVISIBLE ROBBERS: THE COST OF DRUNKENNESS
3
large. In Section 4 we prove that the value of the CiR game always exists (for both the adversarial
and drunk variant). In Section 5 we provide general upper and lower bounds, which are then
used and tightened in Section 6 to obtain (almost) exact values for special graph classes, among
them d-regular trees and grids; we also introduce the family of broom graphs and use it to show
that the iCOD can be arbitrarily close to any value in [2, ∞). In Section 7 we brieﬂy study CiR
(and compare it to GS) when the robber has “inﬁnite” speed. In Section 8 we summarize and
discuss our conclusions.
2. Preliminaries
Let G = (V, E) be a ﬁxed undirected, simple, connected and ﬁnite graph. We begin by listing
deﬁnitions, notation, assumptions and a useful theorem.
(1) N = {1, 2, 3, . . .} and N0 = {0, 1, 2, . . .}.
(2) We will always use n to denote the number of vertices of G (i.e., |V (G)| = n).
(3) We assume that the cops-and-robber game (both the CR and CiR versions) are played
by two players, denoted by C (the cop player) and R (the robber player).
(4) C controls K cops (K ∈N, we will denote {1, 2, . . . , K} by [K]). Xk
t ∈V denotes the
position of the k-th cop at time t (k ∈[K], t ∈N0); Xt = (X1
t , X2
t , . . . , XK
t ) ∈V K
denotes the vector of all cop positions at time t.
(5) R controls a single robber, whose position at time t ∈N0 is denoted by Yt ∈V .
(6) Actually R is active only in the adversarial variants. In the drunk variants the robber
performs random walk on G (we could say he is controlled by Chance) according to the
following rules
∀u ∈V, Pr (Y0 = u) =
1
|V |
(1)
∀u ∈V, Pr (Yt+1 = u | Yt = v) =

1
|N(v)|
if u ∈N(v)
0
otherwise.
(2)
(N(v) is the neighbourhood of vertex v; N(v) does not include v).
(7) The moving sequence in the adversarial variants is as follows. At t = 0, ﬁrst C chooses
initial positions X0 ∈V K, then R chooses Y0 ∈V . For t ∈N ﬁrst C chooses Xt ∈V K
and then R chooses Yt ∈V . Movements are possible only along edges of G; that is,
{Xk
t , Xk
t+1} ∈E and {Yt, Yt+1} ∈E, for all k ∈[K] and t ∈N0.
(8) The moving sequence in the drunk variants is the same, except that Y0, Y1, Y2, . . . are
chosen by Chance, according to equations (1)-(2).
(9) The capture time is denoted by T and deﬁned as follows
T = min{t : ∃k ∈[K] such that Xk
t = Yt};
that is, T is the ﬁrst time a cop is located at the same vertex as the robber (note that
this can happen during either a cop move or a robber move). If capture never takes
place, then T = ∞. As will be seen in the sequel, capture time will in general be a
random variable, dependent on the strategies used by the cop and the robber (and also
on K).
In what follows, unless stated otherwise, it will be assumed that (a) C’s goal is to capture the
robber as fast as possible and (b) he plays optimally with respect to this goal; we will summarize
these assumptions by saying that C is adversarial. R can be in one of two modes: adversarial
(he plays optimally to avoid capture for as long as possible) or drunk (he simply performs a
4
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
random walk on G). The cops’ locations are always known to the adversarial R (speciﬁcally, at
time t he knows X0, X1, . . . , Xt). The robber can be visible (his location is known to the cops)
or invisible (his location is unknown).
When the robber is visible and adversarial, T is deterministic [26]. The cop number of G
is denoted by c(G) and deﬁned to be the minimum K for which T < ∞(there is always such
a K, less than or equal to |V |). We deﬁne ct (G) to be the “optimal capture time given that
K = c (G) and C and R play optimally” (this deﬁnition requires some care, see [26]). When the
robber is visible and drunk, he performs a random walk on G as indicated by (1)-(2). In [26] we
show that the following quantity is well deﬁned
dct (G) = E (T | when K = c (G) , C plays optimally, R is drunk) .
Hence the cost of (visible) drunkenness is also well deﬁned by
F(G) = ct(G)
dct(G),
and we obviously have F(G) ≥1 (capturing the adversarial robber is at least as hard as capturing
the drunk one, since the former can always choose to behave as if he were drunk).
Let us now turn to the invisible robber. In [26] we have proved the following.
Theorem 2.1. Suppose that c(G) cops perform a random walk on a connected graph G, starting
from any initial position and the robber is adversarial. Then E (T) < ∞.
It follows that c(G) adversarial cops suﬃce to capture the invisible adversarial robber. On
the other hand, capturing the invisible robber is at least as hard as capturing the visible one,
hence c(G) is also the minimum required number of cops. In short, the cop number of a graph
is the same for the visible and invisible CR version (however, the expected capture time T will
generally be bigger in the invisible variant, comparing to the capture time in the visible one).
In the rest of the paper we will always assume that the number of cops is K = c(G).
For the invisible variant of the game we will deﬁne
cti (G) = E (T | K = c (G) , C and R play optimally) ,
(3)
dcti (G) = E (T | K = c (G) , C plays optimally, R is drunk) ,
(4)
Fi(G) = cti(G)
dcti(G) ≥1.
(5)
Fi (G) is the “invisible cost of drunkenness ”(iCOD). Note that we will always assume that
n ≥2 and so cti(G) ≥dcti(G) > 0. The existence of the quantities deﬁned in (3)-(5) and the
meaning of “optimal play” require careful study, which will be deferred to Section 4; we will
ﬁrst present an extended example in Section 3.
3. The Invisible Cost of Drunkenness Can be Arbitrarily Large
To clarify some of the key CiR concepts, we now present an extended example which involves
the N-star graphs SN, illustrated in Figure 1 and deﬁned as follows (note that, for all N, we
have n = |V | = N + 1).
For N ∈N, SN has vertex set V = {0, 1, . . . , N} and edge set
E = {(0, 1), (0, 2), . . . , (0, N)}. Obviously a single cop can catch the visible robber in SN. So
we will study CiR with a single cop, as well.
A robber strategy generates a robber’s move for every time t, based on the information avail-
able to R at time t; this information is the cop moves X0, X1, . . . , Xt and the robber moves
COPS AND INVISIBLE ROBBERS: THE COST OF DRUNKENNESS
5
Figure 1. The N-star SN.
Y0, Y1, . . . , Yt−1. As will become clear, R may gain an advantage by randomizing his strategies.
Hence a robber strategy σR is an inﬁnite sequence of probability distributions conditioned under
previous moves/positions of both cops and robber:
σR = {Pr (Yt | X0 = x0, . . . , Xt = xt, Y0 = y0, . . . , Yt−1 = yt−1)}∞
t=0 .
(6)
The strategies must also be feasible, i.e., only moves along edges can have positive probabilities.
Similarly a cop strategy σC is an inﬁnite sequence of probability distributions conditioned under
previous moves/positions of only the cops (and an initial probability, for t = 0):
σC = {Pr (Xt | X0 = x0, . . . , Xt−1 = xt−1)}∞
t=1 ∪{Pr (X0)} .
(7)
The conditionals must satisfy the feasibility requirement.
Note that conditioning is on cop
moves only, since the robber is invisible to the cops. However, let us stress that the sequence of
previous moves of the cops aﬀects their strategy for the future, since the fact that the robber
was not seen at these vertices before contains important information. In the adversarial variant,
the expected capture time depends on both σR and σC, hence we will write E (T | σR, σC). In
the drunk variant we will write E (T | σC) instead.
Now let us consider optimal cop and robber strategies for SN. We will ﬁrst deal with the
adversarial variant. Let us ﬁx some N ≥2 (so SN is a tree with N leaves) and let C use the
following strategy: he chooses a permutation u1u2 . . . uN of the set [N] with uniform probability
(Pr (u1u2 . . . uN) =
1
N!) and the cop moves as follows
X0 = u1, X1 = 0, X2 = u2, X3 = 0, X4 = u3, . . . , X2N−2 = uN.
In other words, the cop starts at a random leaf and visits every other leaf in a random order
(without repetitions). Therefore there exists a cop strategy bσC of the form (7) which produces
this sequence.
Now, R sees X0 = u1 before placing the robber and he knows that the cop has no incentive to
stay in place, so he infers that X1 = 0. Hence R knows that Y0 must belong to {u2, u3, . . . , uN}
but he has no incentive to prefer one of these (since he is unaware of the order by which they will
be visited by the cop), and thus he will place the robber equiprobably in one of {u2, u3, . . . , uN}.
After the initial placement, R will never move the robber because the only possible move is
into 0 and this would result in a capture (either the robber runs into the cop during an odd
turn, or vice versa during an even turn). Hence the robber will use the strategy bσR which sets
6
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
Y0 = v ∈{u2, u3, . . . , uN} with probability
1
N−1, and Yt = Y0 for t = N. For every initial cop
placement we can compute
E (T | bσR, bσC, X0 = u1) =
1
N −1·2+
1
N −1·4+. . .+
1
N −1·(2N −2) =
2
N −1·(N −1) · N
2
= N
and, since there are N equiprobable choices for u1, we also have
E (T | bσR, bσC) = N.
As we have already argued, bσR gives to R the best possible result (longest expected capture
time) given C uses bσC. In other words,
max
σR E (T | σR, bσC) = E (T | bσR, bσC) = N
from which follows
min
σC max
σR E (T | σR, σC) ≤E (T | bσR, bσC) = N.
(8)
By the same argument, if the cop decided to start at the root and visit the leaves in a randomly
chosen order, the expected capture time of the best robber (choosing uniformly at random any
leaf) would also be N. On the other hand, suppose R uses bσR irrespective of C’s strategy.
Clearly C must use a strategy which visits every vertex and, since he has no reason to prefer
a particular order of visitation, he will do just as well by using bσC (he gains no advantage by
visiting the same vertex twice). Hence we have
N = E (T | bσR, bσC) = min
σC E (T | bσR, σC) ⇒
N = E (T | bσR, bσC) ≤max
σR min
σC E (T | σR, σC) .
(9)
Finally, we know that
max
σR min
σC E (T | σR, σC) ≤min
σC max
σR E (T | σR, σC) .
(10)
From (8)-(10) we see that
max
σR min
σC E (T | σR, σC) = E (T|bσR, bσC) = min
σC max
σR E (T | σR, σC) = N.
In other words, bσR and bσC are optimal strategies for R and C, respectively, and the optimal
expected capture time in the adversarial variant is
cti (SN) = N = n −1,
(11)
for all N ≥2; it is easy to see that (11) also holds for N = 1.
Let us now turn to the drunk variant. In this case only C has to choose a strategy and the
optimal eσC is obvious: he places the cop at u = 0 and just waits there. For any N ≥1, the
robber will start at 0 with probability
1
N+1 or at some u ∈{1, 2, . . . , N} with probability
N
N+1.
In the former case T = 0; in the latter case the robber will move into 0 at t = 1, resulting at
T = 1. Hence
dcti (SN) = E (T | eσC) =
N
N + 1 = n −1
n
.
(12)
From (11) and (12) we get Fi (SN) = N + 1 = n. It follows that the cost of drunkenness Fi(SN)
can attain any integer in N. Our results are summarized in the following.
COPS AND INVISIBLE ROBBERS: THE COST OF DRUNKENNESS
7
Theorem 3.1. For every N ≥1 we have
cti(SN) = N = n −1,
dcti(SN) =
N
N + 1 = n −1
n
,
Fi (SN) = N + 1 = n.
4. The Cost of Drunkenness is Well Defined
In the previous section we have proved that Fi (G) can take arbitrarily large values. Our proof
involved only a particular family of graphs, the N-stars. In this section we will show that iCOD
is well deﬁned for every graph G. The real issue, of course, is whether cti(G) and dcti(G) are
well deﬁned. Settling this question requires the use of game theoretic concepts for cti(G) and
POMDP concepts for dcti(G). For simplicity, we will consider the case of a single cop (K = 1);
the generalization to K > 1 is straightforward. Hence we pick a graph G with c (G) = 1 and
keep it ﬁxed for the rest of the section (but Lemma 4.1 and Theorems 4.2 and 4.4 remain true
for any G).
4.1. Adversarial Robber. Here we show that cti(G) is well-deﬁned. Our notation and analysis
follows very closely [20]. The adversarial variant of CiR is a two player game, which we will call
Γ, played on G. Note that Γ can last an inﬁnite number of turns (the robber is never captured).
We will also make use of auxiliary, truncated games: Γm is the same game as Γ but is played
for a maximum of m turns.
C moves the cop according to a strategy σC. For a rigorous deﬁnition of strategy we need the
following.
(1) AC = V is the set of possible C actions (possible placement of the cop on G).
(2) H(m)
C
⊆V m is the set of feasible m long sequences of cop conﬁgurations.
(3) HC = S∞
m=0 H(m)
C
is the set of all ﬁnite-length feasible cop histories.
(4) P (AC) is the set of all probability functions on C actions.
Hence a cop strategy is a function σC : HC →P (AC), i.e., a function which maps to every
ﬁnite-length history x0, x1, . . . , xt−1 a probability (conditional on x0, x1, . . . , xt−1 when t ≥0)
on the next C move Xt. We are interested in feasible strategies, i.e., those which assign positive
probabilities only to feasible next-step cop conﬁgurations; we will denote the set of all feasible
C strategies by SC.
The situation is almost identical for R. A strategy σR speciﬁes the next robber move at time
t, depending on information available to R at t; this information is the realizations x0, x1, . . . , xt
and y0, y1, . . . , yt−1. We deﬁne the following.
(1) AR = V is the set of possible R actions (possible robber positions).
(2) H(m)
R
⊆V m × V m−1 is the set of feasible m long sequences of cop/robber conﬁgurations.
(3) HR = S∞
m=0 H(m)
R
is the set of all ﬁnite-length feasible cop/robber histories.
Hence a robber strategy is a function σR : HR →P (AR), i.e., a function which maps to every
ﬁnite-length history a probability on the next R move; again, we are interested in feasible strate-
gies, i.e., those which assign positive probabilities only to feasible next-step cop conﬁgurations.
We will denote the set of all feasible R strategies by SR.
A speciﬁc strategy pair (σR, σC), speciﬁes the probabilities p (x0, x1, . . . , xt, y0, y1, . . . yt | σR, σC)
for all cylindrical sets (X0 = x0, X1 = x1, . . . , Xt = xt, Y0 = y0, Y1 = y1, . . . , Yt = yt). Hence, let-
ting eHR = V N×N denote the set of all inﬁnitely long feasible cop/robber histories, (σR, σC)
induces a probability measure on the associated σ-algebra. In short, at the start of the game Γ,
8
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
C chooses σC and R chooses σR, resulting in a well deﬁned expected capture time, conditioned
on σR and σC; this quantity is denoted by E (T | σR, σC, Γ).
The strategies σR and σC can be used in any truncated game Γm as well: C and R will use
them to generate moves only until turn m. Hence the corresponding expected capture time for
Γm is also well deﬁned; it will be denoted by E (T | σR, σC, Γm). Clearly Γ and every Γm are
two-person, zero-sum games.
It is worth emphasizing that C and R choose their strategies simultaneously, before the game
starts. Even though the game rules stipulate that (in every turn) C plays before R, this is
unimportant because (the probabilities of) both players’ moves are speciﬁed by their strategies,
which have been selected before the game starts. (It is a well known fact [4] that, if either player
chooses his optimal strategy then he has no incentive to change it, no matter what strategy the
other player uses.) In fact, the rules can be modiﬁed so that the players play simultaneously: at
the initial turn (t = 0) R plays a “null” move and C plays X0; at all subsequent turns (t ∈N),
R plays Yt−1 and C plays Xt. The game remains the same under these modiﬁed rules.
In Γ, if R uses σR and C uses σC then C pays R E (T | σR, σC, Γ) per game (on the average).
The situation is similar in Γm (for every m ∈N0) except that the game lasts at most m steps
and if the robber has not been caught by the end of the m-th step, he receives a payoﬀof m;
the average payoﬀin Γm is E (T | σR, σC, Γm). In either case (full or truncated game) R tries
to maximize the payoﬀwhile C tries to minimize it.
Consider for a moment pure strategies available to C in the ﬁnite-length game Γm. Each
such strategy, call it sC, will be a collection of deterministic functions mapping ﬁnite length cop
histories x0, x1, . . . , xt−1 to feasible moves:
Xt = f (x0, x1, . . . xt−1) .
Since t ≤m and |V | is ﬁnite, there is a ﬁnite number of cop histories and a ﬁnite number of
pure strategies that C can use. Similarly R has a ﬁnite number of pure strategies. Hence Γm is
a ﬁnite game and has a value [4], which however is generally achieved by mixed strategies bσ(m)
R
and bσ(m)
C :
max
σR∈Sm
R
min
σC∈Sm
C
E (T | σR, σC, Γm) = E

T | bσ(m)
R , bσ(m)
C , Γm

= min
σC∈Sm
C
max
σR∈Sm
R
E (T | σR, σC, Γm)
(13)
(In fact Γm can be summarized by a ﬁnite payoﬀmatrix Q(m) where Q(m)
sr,sC = E (T | sR, sC, Γm)
and sR, sC are understood as classes of pure strategies which are identical up to the m-th turn.)
We sometimes will denote E

T | bσ(m)
R , bσ(m)
C , Γm

by val (Γm), for brevity.
In the inﬁnite-length game Γ there is an inﬁnite number of pure strategies, hence the existence
of a value and optimizing strategies is not guaranteed. Let us deﬁne
val (Γ) = sup
σR∈SR
inf
σC∈SC E (T | σR, σC, Γ)
val (Γ) =
inf
σC∈SC sup
σR∈SR
E (T | σR, σC, Γ) .
Hence R, playing optimally, is guaranteed to receive no less than (arbitrarily close to) val (Γ);
C, playing optimally, is guaranteed to pay no more than (arbitrarily close to) val (Γ); we have
val (Γ) ≤val (Γ). What we want is to show val (Γ) = val (Γ); if equality holds, then we will
denote the common value by val (Γ), the value of the game Γ.
COPS AND INVISIBLE ROBBERS: THE COST OF DRUNKENNESS
9
Existence of val (Γ) follows almost immediately from a theorem proved by Gurevich [20]. A
couple of modiﬁcations of his proof are required. First, we need the following lemma.
Lemma 4.1. Given any graph G, let σC be the strategy according to which the cop performs a
random walk on G. Deﬁne v = supσR∈SR E (T | σR, σC, Γ). Then
lim
m→∞val (Γm) ≤val (Γ) ≤v < ∞.
Proof. From Theorem 2.1 we know that, when the cop random-walks in G, he can catch the
robber in ﬁnite expected time. Hence
sup
σR∈SR
E (T | σR, σC, Γ) = v < ∞.
(14)
Then we have
val (Γ) = sup
σR∈SR
inf
σC∈SC E (T | σR, σC, Γ) ≤sup
σR∈SR
E (T | σR, σC, Γ) = v < ∞.
If we extend the truncated game Γm by one move we get the game Γm+1 and R’s payoﬀcannot
decrease. Hence val (Γm) ≤val (Γm+1), i.e. the sequence {val (Γm)}∞
m=0 is nondecreasing. It is
also bounded, because in the full game Γ, R can do at least as well as in any truncated game Γm.
Hence val (Γm) ≤val (Γm+1) ≤val (Γ), from which follows v = limm→∞val (Γm) ≤val (Γ).
□
Using Lemma 4.1 we can now prove the following.
Theorem 4.2. Given any graph G and the corresponding CiR game Γ played with c (G) cops,
val (Γ) exists and satisﬁes
val (Γ) = lim
m→∞val (Γm) .
Furthermore, there exists a strategy bσC such that
sup
σR∈SR
E (T | σR, bσC, Γ) = val (Γ)
(15)
and, for every ε > 0 there exists an mε and a strategy bσε
R such that
∀m ≥mε : val (Γ) −ε ≤
inf
σC∈SC E (T | bσε
R, σC, Γm) .
(16)
Proof. The theorem is a rephrasing of Gurevich’s Theorem 1 [20] and his proof can also be used
here, except for the following two points.
(1) Gurevich assumes E (T | σR, σC, Γ) is bounded:
∃M : ∀(σR, σC) : |E (T | σR, σC, Γ)| ≤M
This is not necessarily true for CiR. But, as he points out [20, p.372], this assumption
can be removed, provided the sequence {val (Γm)}∞
m=0 is bounded; in the CiR game this
is true because of Theorem 2.1 (as discussed in the proof of Lemma 4.1).
(2) Gurevich studies games in which the two players move simultaneously. In CiR this is
not the case but, since C never sees R’s move, we can (as already mentioned) rearrange
the order of moves and have the moves (Yt−1, Xt) take place simultaneously.
Other than these two points, Gurevich’s proof applies exactly to the current theorem.
□
Remark. From (15) we see that C has an optimal strategy. From (16) we see that, for every
ε > 0, R has an “ε-optimal” strategy, which is uniformly good (i.e., for all m ≥mε).
We now can replace equation (3) with a rigorous deﬁnition of cti (G).
10
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
Deﬁnition 4.3. Given a graph G, we deﬁne the invisible capture time of G to be
cti (G) = val (Γ)
where the game Γ is played on G with c (G) cops.
Hence cti (G) is not necessarily an achieved expected capture time, but it can be approximated
within any ε > 0 by using strategies bσε
R and bσC.
4.2. Drunk Robber. We now turn to the drunk variant of CiR. Our goal is to rigorously deﬁne
dcti (G). Since a single player is involved, the situation is simpler than in the adversarial variant.
In fact, as we will now explain, the drunk variant of CiR is a partially observed Markov decision
process (POMDP) [31, 35, 48].
The robber process Yt is a Markov chain on V , with transition probability matrix P. As
explained in [26], the joint process (Xt, Yt) is also a Markov process on the extended state space
(V × V ) ∪{λ}, where λ is the capture state. (Xt, Yt) is governed by the controlled transition
probability matrix P (Ut), where Ut = Xt is the control variable (selected by C) which changes
the transition probabilities [26]. In particular, P (Ut) assigns probability 1 to transitions from
“diagonal” states (Xt, Xt) to the capture state λ. C’s goal is to select the sequence U0, U1, . . .
so as to minimize expected capture time. The state (Xt, Yt) is partially observable by C, since
he never knows Yt.
We will use the notation of Section 4.1 denoting, however, the full one-player game by Γ and
the truncated one-player game by Γm. We also need the following facts.
(1) SC, the space of cop strategies, is compact. This is so by Tychonoﬀ’s theorem, since SC
is a product of probability spaces and the probabilities are on ﬁnite event sets.
(2) For every m ∈N0, E
 T | σC, Γm

is a continuous function of σC (with domain SC). This
is the case because E
 T | σC, Γm

depends continuously on a ﬁnite number of variables.
C must select a strategy σC which minimizes
E
 T | σC, Γ

= E
 ∞
X
t=0
1 (Xt ̸= Yt) | σC, Γ
!
;
here 1 (Xt ̸= Yt) is the indicator function of the event Xt ̸= Yt; this is a typical inﬁnite horizon,
undiscounted POMDP problem. Such problems have been studied by several authors [43, 47]
who prove the existence of a minimizing strategy for a quite general setup using rather involved
proofs. We present a simpler and shorter proof, based on the reduction of Gurevich’s argu-
ment [20] to the one-player case. The proof is useful to illustrate the issues involved, and may
also have some independent interest.
Note that, similarly to the adversarial variant, Pr
 T | σC, Γm

is well deﬁned (for every σC)
and can be extended to Pr
 T | σC, Γ

. Hence E
 T | σC, Γm

, E
 T | σC, Γ

are well deﬁned for
every σC ∈SC. Let us deﬁne
val
 Γm

=
inf
σC∈SC E
 T | σC, Γm

,
val
 Γ

=
inf
σC∈SC E
 T | σC, Γ

.
We then have the following.
Theorem 4.4. Given any graph G and the corresponding CiR game Γ, we have
val
 Γ

= lim
m→∞val
 Γm

.
COPS AND INVISIBLE ROBBERS: THE COST OF DRUNKENNESS
11
Furthermore, there exists a strategy eσC such that
E
 T | eσC, Γ

= val
 Γ

.
Proof. We have
inf
σC∈SC E
 T | σC, Γm

≤
inf
σC∈SC E
 T | σC, Γm+1

≤
inf
σC∈SC E
 T | σC, Γ

.
Hence
val
 Γm

≤val
 Γm+1

≤val
 Γ

≤E
 T | σC, Γ

< ∞,
where σC is the random-walking strategy, which is guaranteed to capture the robber in ﬁnite
expected time. Hence v = limm→∞val
 Γm

exists and v ≤val
 Γ

.
Now, in val
 Γm

= infσC E
 T | σC, Γm

, the inﬁmum is achieved by some eσ(m)
C
(since
E
 T | σC, Γm

is a continuous function of the σC probabilities, which take values in a com-
pact set). Deﬁne
Km =

σC : E
 T | σC, Γm

≤v
	
.
Since E

T | eσ(m)
C , Γm

≤v, every Km is nonempty. Also, val
 Γm

≤val
 Γm+1

≤v implies
that Km+1 ⊆Km. As mentioned, E
 T | σC, Γm

is a continuous function on SC and SC is
compact. For every m, Km is the preimage of the compact set [0, v], hence Km is compact. It
follows that
K∞=
∞
\
m=0
Km
is nonempty. So let us take some eσC ∈K∞. Then
E
 T | eσC, Γ

=
lim
m→∞
m
X
t=0
t · Pr
 T = t | eσC, Γ

=
lim
m→∞
m
X
t=0
t · Pr
 T = t | eσC, Γm

= lim
m→∞E
 T | eσC, Γm

≤v.
In other words, we have val
 Γ

≤v. Hence val
 Γ

= v and is achieved by eσC.
□
We now provide a formal deﬁnition of dcti (G) (to replace equation (4)).
Deﬁnition 4.5. Given a graph G, we deﬁne the invisible drunk capture time of G to be
dcti (G) = val
 Γ

where the game Γ is played on G.
4.3. Discussion. The results of Section 4.2 show that for every graph G there is an optimal
average capture time dcti(G) and the cop player can catch the drunk robber in dcti(G) turns
of the game (on the average) if he plays the optimal strategy eσC.
The actual computation
of dcti(G) and eσC is the subject of ongoing research in the POMDP community. While the
problem is well understood in principle (an optimality equation must be solved, similar to the
one presented in [26] for the CR problem) and despite much eﬀort, currently available POMDP
algorithms are not computationally viable, even for moderate-size graphs (the interested reader
is referred to [32, 35] for an extensive discussion of the issues involved).
The results of Section 4.1 yield similar conclusions for the adversarial robber, with one diﬀer-
ence. Namely, while an optimal strategy bσC is available to C, the best R can do is to approximate
12
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
cti(G) within ε (for any ε > 0) by using an ε-optimal strategy bσε
R. In the computational direc-
tion, even less progress has been achieved than in the drunk robber case (see [45]).
We consider the development of eﬃcient CiR algorithms important, especially from the ap-
plications point of view. Further, since exact algorithms quickly become intractable (even for
relatively small graphs), we believe that the solution will be obtained by algorithms which are
approximate but have performance guarantees.
Such algorithms will probably make use of
domain-speciﬁc heuristics.
5. Some Bounds for General Graphs
In this section we provide a lower bound on dcti(G) and an upper bound on cti(G); both
bounds hold for any G. These results have independent interest and will also be used in Section 6
to obtain dcti(G) and cti(G) for special graph families. As already stated, we assume that K,
the number of cops, equals c(G).
5.1. General lower bound of dcti(G). Let G = (V, E) be any graph and (Xt)t∈N0 any search-
ing schedule for K ∈N cops (the kth cop, k ∈[K], moves from Xk
t−1 to Xk
t at time t ∈N). For
t ∈N0, let Zt = S
k∈[K]{Xk
t } be the set of vertices occupied by the cops at the end of turn t.
We now deﬁne certain conditional probabilities which will be used to obtain the lower bound
on dcti(G).
¯pt(v) : Pr(“at t-th turn, after the cop move, robber is at v” | “robber has not been captured”),
ˆpt(v) : Pr(“at t-th turn, after the robber move, robber is at v” | “before a possible capture”),
pt(v) : Pr(“at the completion of t-th turn, robber is at v” | “robber has not been captured”).
The following remarks should make clearer the meaning of the above probabilities. Eﬀectively,
we break each turn of the game into three phases.
(1) In the ﬁrst phase, the cops move and they may capture the robber or not. The probability
that the robber is at v, given that he has not been captured, is ¯pt(v).
(2) In the second phase, the robber moves but a possible capture (i.e., if he entered a vertex
occupied by a cop) is not yet eﬀected. Hence ˆpt(v) is the probability that the robber is
at v (even if v contains a cop).
(3) In the ﬁnal phase possible captures (i.e., if the robber ran into a cop) are eﬀected and
pt(v) is the probability that the robber is in v given that no capture took place.
So ﬁnally, the probability that the robber is caught at time t ∈N is
ct =
X
u∈Zt

pt−1(u) + ˆpt(u)

.
We will now write the equations which govern the evolution of ¯pt(v), ˆpt(v), pt(v). It is easy
to see that, for every v ∈V , ¯p0(v) = 0 (the robber has not entered the graph yet). Note that
¯p0 is not a probability distribution; one can think of it as a useful function to get the desired
recursion started. It is easy to see that ˆpt(v) = 1/n. Regarding p0(v) we have p0(v) = 0 for
v ∈Z0, and
p0(v) =
ˆp0(v)
1 −P
u∈Z0 ˆp0(u) =
1/n
1 −|Z0|/n =
1
n −|Z0|
for v ∈V \ Z0.
Suppose now that, at a given time t ∈N, the game is still on; that is, the robber is still hiding
somewhere on the graph G. Suppose that we know the distribution of the position of the drunk
COPS AND INVISIBLE ROBBERS: THE COST OF DRUNKENNESS
13
robber, pt−1 : V →[0, 1], at the end of the previous turn (let us repeat that this probability
is conditional on the robber not having been captured). The cops move from Xt−1 to Xt, and
so they capture the robber with probability P
u∈Zt pt−1(u). Conditioning on the fact that the
robber is still not captured, let ¯pt(v) be the probability that the robber is at vertex v after this
cop move. We have ¯pt(v) = 0 for v ∈Zt, and
¯pt(v) =
pt−1(v)
1 −P
u∈Zt pt−1(u)
(17)
for v ∈V \ Zt.
Now, the robber performs a step of his random walk. Let ˆpt(v) be the probability that he is
at vertex v after this move; that is,
ˆpt(v) =
X
u∈N(v)
¯pt(u)
deg(u).
(18)
The probability of the robber being captured at the completion of his move is P
u∈Zt ˆpt(u).
Assuming, as before, that the robber is still lucky at the end of turn t, we get the formula for
the distribution of the robber’s position at the end of turn t (once again, conditioned on the
robber not having been captured). For v ∈Zt: pt(v) = 0; for v ∈V \ Zt:
pt(v) =
ˆpt(v)
1 −P
u∈Zt ˆpt(u).
(19)
Eqs.(17)-(19) along with the initial conditions for t = 0, describe the evolution of ¯pt(v), ˆpt(v),
pt(v) for a given strategy of the cops. Now, we are ready to state a lower bound for dcti(G) for
a general graph G.
Lemma 5.1. Let G be any graph on n > c(G) vertices with maximum degree ∆= ∆(G) and
minimum degree δ = δ(G) such that
∆K
δ(n−K) ≤1/24. Then,
dcti(G) ≥δ · (n −K)
7e∆· c(G) .
Proof. We will use the notation introduced earlier in this subsection. Let G = (V, E) be any
graph, and suppose that K = c(G) cops try to catch the drunk robber on this graph. We
can assume that n > K; the statement is trivially true otherwise. Let us deﬁne the following
deterministic function: M0 =
∆/δ
n−K, and for t ∈N
Mt =
Mt−1
1 −2KMt−1
.
We will show that for any t ∈N0 and any vertex v ∈V we have
max
n
¯pt(v), ˆpt(v), pt(v)
o
≤Mt
deg(v)
∆
.
(20)
We prove the claim by induction. Since for each v ∈V we have
max
n
¯p0(v), ˆp0(v), p0(v)
o
≤
1
n −K = M0
δ
∆≤M0
deg(v)
∆
,
14
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
the base case (t = 0) holds. Suppose now that (20) holds for t −1 ∈N0. It follows immediately
from (17) that for any v ∈V
¯pt(v) ≤
pt−1(v)
1 −P
u∈Zt pt−1(u) ≤Mt−1
deg(v)
∆
1 −KMt−1
≤Mt
deg(v)
∆
.
From (18) we get that
ˆpt(v)
=
X
u∈N(v)
¯pt(u)
deg(u) ≤
X
u∈N(v)
Mt−1
deg(u)
∆
deg(u)(1 −KMt−1)
=
X
u∈N(v)
Mt−1
∆(1 −KMt−1) = Mt−1
deg(v)
∆
1 −KMt−1
≤Mt
deg(v)
∆
for any v ∈V , so the very same bound holds for ˆpt(v). Finally, from (19) we get that
pt(v) ≤
ˆpt(v)
1 −P
u∈Zt ˆpt(u) ≤Mt−1
deg(v)
∆
1 −KMt−1

1 −
KMt−1
1 −KMt−1
−1
=
Mt−1
deg(v)
∆
1 −2KMt−1
= Mt
deg(v)
∆
,
and the proof of the claim is complete.
Suppose that Mt−1 <
3∆
δ(n−K) (which implies that Ms <
3∆
δ(n−K) for 0 ≤s < t, since Ms is
increasing with s). Then,
Mt ≤Mt−1

1 −
6∆K
δ(n −K)
−1
≤. . . ≤M0

1 −
6∆K
δ(n −K)
−t
≤
∆
δ(n −K) exp

7∆Kt
δ · (n −K)

,
where the last inequality follows since
1
1−x ≤e7x/6 for x ≤1/4 and
∆K
δ(n−K) ≤1/24. Therefore, it
takes at least τ = δ(n−K)
7∆K
steps for Mt to reach
3∆
δ(n−K). During this time period, the probability
that we catch the robber at a given time t ≤τ is
ct =
X
u∈Zt

pt−1(u) + ˆpt(u)

≤2KMt ≤
6K∆
δ(n −K).
Hence, the probability that the robber is still not caught at time τ is at least

1 −
6K∆
δ(n −K)
τ
≥e−1.
Finally, we get that the expected capture time is at least τ/e, and the proof is ﬁnished.
□
5.2. General upper bound of cti(G). In this subsection, we investigate the adversarial robber
case providing a universal upper bound for the capture time.
Lemma 5.2. Let G be any graph of n vertices with maximum degree ∆= ∆(G), cop number
c(G), and diameter D = D(G). Let bT denote the maximum number of steps it takes for c(G)
cops to catch the visible robber, when they use an optimal strategy (independently of the position
of the robber). Then,
cti(G) ≤( bT + D) · (∆+ 1)
bT · n.
(21)
COPS AND INVISIBLE ROBBERS: THE COST OF DRUNKENNESS
15
Proof. We will introduce a cop strategy by which the adversarial robber will be captured in at
most ( bT + D) · (∆+ 1) bT · n time steps (even if he knows the strategy in advance and plays
optimally against it).
The cop strategy is executed in rounds, each round consisting of one or more steps of the
game. In each round the cops ﬁrst move to those vertices from which they can capture the
visible robber (independently of his position) in at most bT steps. (Such a bT < ∞exists for
every G, since c(G) cops suﬃce to capture the visible robber in ﬁnite time.) Note that in the
ﬁrst round the cops start immediately from the aforementioned vertices. At any rate, once the
cops are there, they uniformly at random guess the current position of the robber, and then at
each of the next bT steps they uniformly at random guess the behaviour of the robber and apply
their optimal move for this guessed behaviour. The round is over bT steps after the cops arrived
at their optimal starting position. If the robber has not been captured by then, the cops start
the next round, following the same strategy.
In each round, it takes at most D steps for the cops to go back to the optimal starting position;
after that they move bT additional steps in which the robber might be caught. In order to catch
the robber during one round, it suﬃces that the cops guess correctly the position of the robber
at the start of the round (which happens with probability 1
n) and also guess correctly the move
of the robber in each of the following bT steps. Since the robber at each vertex has at most ∆+1
choices (he can move to any of the at most ∆neighbours, or he can stand still), in each step the
probability of the cops making the right choice is
1
∆+1. Thus, capture in a round happens with
probability at least
1
n·(∆+1) b
T . Since the bounds on the number of a round hold independently
of the starting point, consecutive rounds are independent, and the expected number of rounds
until the robber is caught is at most n · (∆+ 1) bT. Hence, cti ≤( bT + D) · (∆+ 1) bT · n and the
proof is complete.
□
Remark. The bound of (21) does not mean that cti(G) is O(n). Both ∆and bT depend on G
and hence on n, the number of vertices.
6. The Cost of Drunkenness for Special Graph Families
We will now show that, by restricting ourselves to some particular graph families, we can
obtain non-trivial bounds on the iCOD. Paths and cycles were considered in [26], so we state
the results only (Subsection 6.1). The complete d-ary tree of depth L is studied next (Subsec-
tion 6.2), and then we study the grid (Subsection 6.3). Finally, in Subsection 6.4, we give an
example of a family of graphs (brooms) showing that Fi(G) can be arbitrarily close to any value
in [2, ∞).
6.1. Paths and Cycles. The following two theorems have been proved in [26].
Theorem 6.1. Let Pn be the path of n vertices. Then cti (Pn) = n −1 and
n
2

1 −O
log n
n

≤dcti(Pn) ≤n −1
2
.
In particular, dcti = (1 + o(1)) · n
2 and the cost of drunkenness is
Fi(G) = cti(Pn)
dcti(Pn) = 2 + o(1).
16
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
Theorem 6.2. Let Cn be the cycle of n vertices. Then cti (Cn) = n−1
2
and
n
4

1 −O
log n
n

≤dcti(Cn) ≤n −1
4
.
In particular, dcti(Cn) = (1 + o(1))n
4 and the cost of drunkenness is
Fi(G) = cti(Cn)
dcti(Cn) = 2 + o(1).
6.2. Trees. We restrict ourselves to Td,L, the complete d-ary tree of depth L, for d ≥2. This
tree has n = dL+1−1
d−1
vertices and M = dL leaves. We know that one cop suﬃces to capture
the robber on any tree, so let us consider a game played by a single cop and a robber. In this
subsection, we will show the following results.
Theorem 6.3. Let G = Td,L be the complete d-ary tree of depth L with n = dL+1−1
d−1
vertices.
Then,
2LdL−1 · (1 −o(1))d −1
d
≤cti(G) ≤2LdL−1 · (1 −o(1)).
In particular, cti(G) = Θ(n log n).
Theorem 6.4. Let G = Td,L be the complete d-ary tree of depth L with n = dL+1−1
d−1
vertices.
Then,
dcti(G) = Θ(n).
The following corollary is an immediate implication of these two theorems.
Corollary 6.5. Let G = Td,L be the complete d-ary tree of depth L with n = dL+1−1
d−1
vertices.
The cost of drunkenness of G is
Fi(G) = cti(G)
dcti(G) = Θ(log n).
Proof of Theorem 6.3. Since c(G) = 1, the (invisible) robber is chased by a single cop.
To
simplify the notation we use Xt = X1
t for a position of the cop at time t (this time the vector
Xt has one coordinate).
In order to give an upper bound on cti(G), we provide a strategy for the cop and show that,
independently of the robber’s behaviour, in expectation, the cop will catch the robber after
at most a certain number of steps. Denote by a preleaf a vertex at distance 1 from any leaf.
Consider the following strategy of the cop:
(1) Start at the root,
(2) Choose uniformly at random a preleaf v and go there,
(3) Choose a random permutation of the leaves below v and visit them in this order, always
returning to the preleaf v,
(4) Return to the root,
(5) Repeat from step 2 down.
Similarly to Lemma 5.2, a round is a sequence of steps starting at the root, visiting a preleaf and
all its leaves and going back to the root. Note that each round consists of 2L + 2(d −1) steps.
Observe that the cop’s strategy in step (2) is equivalent to choosing at each layer 0 ≤i ≤L−2 a
random vertex among all neighbours at layer i + 1 with probability 1
d. Note that, independently
of the robber’s strategy, he is caught in each round with probability
1
dL−1. Indeed, provided that
COPS AND INVISIBLE ROBBERS: THE COST OF DRUNKENNESS
17
the robber is in the subtree below the cop (at some step 0 ≤i ≤L −2), the probability that
this is also true at the next round is 1/d. If this property is preserved from the beginning of
the round until step L −2, the robber has no chance to survive and is caught after visiting all
leaves.
Since all rounds are independent, the expected number of rounds needed to capture the robber
using this strategy is dL−1, and thus
cti ≤dL−1(2L + 2(d −1)) −L −(d −1),
(22)
since in the last round the cop does not have to return to the root (and so he saves L moves),
and he has to visit only half of the leaves (in expectation) before catching the robber (and so
he saves another 2(d −1) −2 d
2 = d −1 moves, in expectation).
For the lower bound, we provide a strategy for the robber against which any cop will need
at least a certain number of rounds, in expectation. The robber strategy depends on the cop’s
moves and can be brieﬂy described as follows: the robber always tries to be (after his move)
at distance 2 from the cop. As a result, the distance between players is never larger than 3.
Moreover, the robber is trying to stay at the layer above the cop, if it is possible. Let us now
describe the robber strategy in more detail.
(1) For t = 0. After the cop decides to start at X0, the robber selects a vertex Y0 at distance
2 from the cop.
(a) If X0 is located at layer i ≥2, then Y0 is the unique vertex at the layer i −2.
(b) If X0 is at the ﬁrst layer, then the robber chooses (uniformly at random) a vertex
that is also at the ﬁrst layer but is diﬀerent from X0.
(c) Finally, if X0 is the root of Td,L, then the robber chooses (uniformly at random) any
vertex at the second layer.
(2) For t ≥1. The cop moves from Xt−1 to Xt, the robber is at Yt−1, and is about to move.
There are a number of possibilities to deal with.
(a) dist(Xt, Yt−1) = 0: the game ends.
(b) dist(Xt, Yt−1) = 1 and no neighbour of Yt−1 is at distance 2 from Xt (the robber is
at a leaf): the robber stays at the same vertex; that is, Yt = Yt−1.
(c) dist(Xt, Yt−1) = 1 and there is a neighbour of Yt−1 at distance 2 from Xt: the robber
chooses (uniformly at random) a vertex at distance 2 from Xt that is as close to the
root as possible.
(d) dist(Xt, Yt−1) = 2: the robber does nothing; that is, Yt = Yt−1.
(e) dist(Xt, Yt−1) = 3: there is a unique neighbour of Yt−1 at distance 2 from Xt; the
robber goes there.
It is easy to check that, under this strategy, the distance between the cop and the robber never
becomes greater than 3.
Assume that the robber uses the randomized strategy described above; we will show that,
even if the cop is aware of this, the capture time will be Ω(n log n) and so cti(G) will be of at
least the same order. Observe that the the robber will only be caught in a leaf. Moreover, any
optimal cop strategy must start either at the root or at a neighbour of the root, as otherwise
the robber chooses a vertex above the cop and the cop is forced to move towards the root. If
the cop starts at the root, then he catches the robber on a leaf if, when in layers 0 ≤i ≤L −2,
he chooses always the subtree of the robber. The advantage of starting at a neighbour of the
root is this: since the cop knows the strategy used by the robber, he infers that the robber is in
one of the d−1 other subtrees (not d, as before). This is a much bigger advantage comparing to
18
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
the additional step back to the root. Hence, this strategy is slightly better and we will consider
only this one. (Of course, this applies to the ﬁrst round of moves only, so starting from the root
has exactly the same asymptotic expected capture time.) Note that the probability of choosing
the right subtree after going back to the root is
1
d−1 · ( 1
d)L−2, since at consecutive layers there is
no possibility to obtain partial information about the position of the robber except by checking
leaves. The cop does not necessarily have to check all leaves (although this is clearly the best
strategy), and thus in order to get a lower bound we will not count the time it takes to check
these leaves (we count only the time it takes him to check one leaf). Also, after having exploited
all leaves of a preleaf (we can assume the cop did this, since we do not count these extra steps),
the cop has to turn back to the root to continue exploiting other subtrees, as otherwise the
robber will never be caught.
By the same argument, avoiding the subtree the cop comes from, the probability that in the
next sequence of steps from the root to a leaf the cop catches the robber is again
1
d−1( 1
d)L−2.
Since one way to one leaf and back takes at least 2L steps, the expected capture time for any
cop strategy is at least
cti(G) ≥2L · (d −1) · dL−2 −L,
(23)
since in the last round the cop does not have to get back to the root. Thus, combining (22)
and (23), we see that cti(G) = Θ(LdL) = Θ(n log n).
□
Proof of Theorem 6.4. We consider the drunken robber performing a random walk on G = Td,L,
starting from a vertex chosen uniformly at random. The lower bound follows from Lemma 5.1,
since ∆(G) = d + 1, δ(G) = 1, and c(G) = 1. We have
dcti(G) ≥
n −1
7e · (d + 1) = Ω(n).
(24)
For an upper bound on dcti(G), we analyze the expected capture time of a cop standing still at
the root vertex. Denote by ej the expected capture time if the robber starts at level j (the root
is considered to be at level 0). By deﬁnition, e0 = 0, eL = 1 + eL−1 and
ej = 1 +
1
d + 1ej−1 +
d
d + 1ej+1
for 1 ≤j ≤L −1. Since eL−1 = 1 +
d
d+1(1 + eL−1) +
1
d+1eL−2, we have eL−1 = 2d + 1 + eL−2.
Similarly, eL−2 = 2d2 + 2d + 1 + eL−3, and in general
ej = 2
L−j
X
k=0
dk −1 + ej−1
for 1 ≤j ≤L −1. Thus,
e1
=
2
L−1
X
k=0
dk −1,
e2
=
2
L−2
X
k=0
dk −1 + 2
L−1
X
k=0
dk −1,
eL−1
=
2
L−1
X
r=1
r
X
k=0
dk −(L −1) = 2
L−1
X
r=1
dr+1 −1
d −1
−L + 1
COPS AND INVISIBLE ROBBERS: THE COST OF DRUNKENNESS
19
and,
eL
=
2
L−1
X
r=1
dr+1 −1
d −1
−L + 2 =
2
d −1
L
X
r=2
dr −2L −2
d −1 −L + 2
=
2dL+1 −2
(d −1)2 −2 + 2d
d −1 −2L −2
d −1 −L + 2
=
2dL+1 −2
(d −1)2 −2d + 2L
d −1
−L + 2.
Since ei ≥ei−1 for all 1 ≤i ≤L,
dcti(G) ≤eL = O(n).
(25)
Combining (24) with (25), we obtain dcti(G) = Θ(n), and the proof is complete.
□
6.3. Grids. In this subsection, we investigate the square grid PN□PN. The result can be easily
generalized to rectangle grids. The adversarial robber case seems to be non-trivial. Our upper
bound follows from the general upper bound provided in Lemma 5.2. We have no good lower
bound so determining the order of cti(PN□PN) for a grid remains an open problem. We do
much better in the drunk robber case showing that dcti(PN□PN) is linear.
Theorem 6.6. Let PN□PN be the square N × N-grid with n = N 2 vertices. Then
cti(PN□PN) = O(n3/25
√n).
Proof. Since D = 2N, bT = N −1, ∆= 4, it follows from Lemma 5.2 that
cti(PN□PN) ≤(3N −1)n5N−1 = O(n3/25
√n),
and the result holds.
□
Theorem 6.7. Let PN□PN be the square N × N-grid with n = N 2 vertices. Then
dcti(PN□PN) = Θ(n).
Proof. Since c(PN□PN) = 2, ∆= 4, and δ = 2, it follows from Lemma 5.1 that dcti(G) ≥
n
28e =
Ω(n) and thus the lower bound follows.
On the other hand, for the upper bound, consider the strategy of two cops standing still at
antipodal corners of the grid. Assume, without loss of generality, that the ﬁrst cop is at the
top right corner of the grid, and the second one at the bottom left corner. For i ∈{1, 2},
denote by di(t) = dist(Xi
t, Yt) the distance of the i-th cop from the robber at time t.
Set
di(t) = 0, if the robber was caught by time t by cop i. Note that di(t) ≤2N for any t. Since
the robber is performing a random walk, on any interior point of the grid and also on the two
antipodal corners not occupied by the cops, with probability 1
2 the value of di(t) increases by
1 and d3−i(t) simultaneously decreases by 1 (for both i ∈{1, 2}). On the remaining points of
the border, the probability that di(t) increases by 1 and simultanouesly d3−i decreases by 1 is
2
3 if di(t) < d3−i(t), and it is 1
3 otherwise, for both i ∈{1, 2}. Thus, the robber’s movements
can be seen as moves on the truncated integer line {0, 2N}, corresponding to the distance of
(say) the ﬁrst cop, and the ﬁrst time the robber hits any of the two endpoints, he is caught
by one cop. Our goal is now to couple the standard random walk (the walk starting at any
integer position, and going with probability 1
2 to the right and with probability 1
2 to the left) on
this truncated integer line with the robber’s movements, taking into account the special border
conditions of the grid. Let (Zt) be a sequence of independent random variables, where for each t,
20
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
Pr(Zt = 1) = Pr(Zt = −1) = 1
2. Let (Bt) be another sequence of random variables coupled with
Zt, taking into account the diﬀerent transition possibilities on the border: provided that the
robber is not yet caught by time t, set Bt = 0 if Yt−1 is either in the inside of the grid or one of
the two corner points not occupied by the cops, and otherwise set Bt = (d1(t) −d1(t −1) −Zt).
Interpreting a positive value of Zt as a step of the robber going down or going left and a negative
value as a step going up or going right at time t, and recalling that d1 increases if and only if d2
decreases, we have
d1(t)
=
d1(0) + Pt
j=1 Zj + Pt
j=1 Bj
d2(t)
=
d2(0) −Pt
j=1 Zj −Pt
j=1 Bj,
provided that the game is still on. Since Pt
j=1 Bj is the same for both values, at any time t
either
d1(t) ≤d1(0) +
t
X
j=1
Zj ≤2N +
t
X
j=1
Zj
or
d2(t) ≤d2(0) −
t
X
j=1
Zj ≤2N −
t
X
j=1
Zj
holds. Set now t = cn for some constant c > 0. Since |di(t) −di(t −1)| ≤1 for any i ∈{1, 2}
and any t, the robber is caught by the ﬁrst cop by time cn, if Pcn
j=1 Zj ≤−2N, whereas in
the second case he is caught by the second cop if Pcn
j=1 Zj ≥2N. Since Pr(Pcn
j=1 Zj ≥2N) =
Pr(Pcn
j=1 Zj ≤−2N), in both cases the robber is caught with probability at least the one that
the simple random walk on the integers at time cn is at a position bigger or equal to 2N (recall
that the simple random walk is the standard random walk starting at the origin). Denote by
qk(r) the probability that the simple random walk on the integers, starting at position k ∈N
has at least once reached the position 0 by time r. By Theorem 2.17 of [30]
qk(r) ≥1 −12k
√r .
Applying the theorem with k = 2N = 2√n(1 + o(1)) and r = cn we see that with probability
at least (1 + o(1))(1 −24
√c) the simple random walk starting at 2N passes the origin by time
cn. In other words, with at least this probability, there exists some time r ≤2N such that
Pr
j=1 Zj ≥2N. Conditioning under this event, by symmetry we get that with probability at
least 1
2, the value of Pcn
j=1 Zj ≥2N. Thus, with probability at least
c′ = (1 + o(1))1
2

1 −24
√c

,
di(cn) = 0 for some i ∈{1, 2}. This holds independently of the starting position of the robber,
and thus, if the robber is not yet caught by time cn, the same lower bound holds for the
probability of being caught in the next round of cn steps.
Hence, the expected number of
rounds until the robber is caught is at most 1
c′, and thus dcti(PN□PN) ≤1
c′cn = O(n).
□
6.4. Brooms. In this section we consider a family of graphs which we call brooms. The broom
graph B(c, n), with n vertices and a parameter c (0 < c ≤1) is illustrated in Fig. 2. It consists
of a path with cn vertices, joined at one endpoint (the center of the broom) with a star of
(1 −c)n vertices. The end of the broom is the other endpoint of the path.
COPS AND INVISIBLE ROBBERS: THE COST OF DRUNKENNESS
21
Figure 2. The broom graph B(c, n).
Bounding cti(B(c, n)) and dcti(B(c, n)) is interesting as an illustration of the game-theoretic
approach. Perhaps more importantly, Corollary 6.10 shows that, for large n and appropriately
selected c, Fi(B(c, n)) can be arbitrarily close to any value in [2, ∞). Note that Corollary 6.10
holds for every (suﬃciently large) n, i.e., it is a property of the entire broom family.
Theorem 6.8. dcti(B(c, n)) = (1 + o(1))c2n
2 .
Theorem 6.9. cti(B(c, n)) = (1 + o(1))n.
Corollary 6.10. For any 0 < c ≤1,
Fi(B(c, n)) = cti(B(c, n))
dcti(B(c, n)) = (1 + o(1)) 2
c2.
Hence, for every a ∈[2, ∞) there exists c ∈(0, 1] such that Fi(B(c, n)) = (1 + o(1))a.
Proof of Theorem 6.8. One possible strategy for the cop is the following: start at the center of
the broom, wait there for one step, and then go to the end of the broom. Since the robber’s
position is distributed uniformly at random, and the robber has to move in every step, we have
the following cases.
(1) With probability 1−c the robber starts on a leaf of the star and is caught in O(1) steps.
(2) With probability c the robber starts on any vertex of the path and (starting at t ≥2)
the cop starts moving towards the end. In this case the problem is reduced to catching a
drunk robber on a path of cn vertices and this, by Theorem 6.1, takes (1 + o(1))cn
2 steps
in expectation
Hence
dcti(B(c, n)) ≤(1 −c) · O(1) + c · (1 + o(1))cn
2 = (1 + o(1))c2n
2 .
On the other hand, this is clearly the best strategy: with probability c the robber is on one
of the vertices of the path and, since the vertices of the star do not help, in this case, again
by Theorem 6.1, the expected time to be caught is (1 + o(1))cn/2.
Hence, dcti(B(c, n)) =
(1 + o(1))c2n
2 .
□
Proof of Theorem 6.9. For an upper bound we assign the following strategy for the cop. He
starts at the end of the broom, moves to the center and, once there, he cheks all leaves in
random order without repetitions. The robber’s best response is to start at a randomly selected
leaf and stay there until caught (actually he could move into a diﬀerent leaf or even outside of
the star as long as the cop is “suﬃciently distant”, but this will not change expected capture
time). The expected capture time under these strategies is the sum of two terms:
22
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
(1) the cop needs cn steps to go from the end to the center;
(2) once there, the cop will on the average need to check (1 −c) n/2 leaves before he captures
the robber and every such check requires two steps (one from the center to the leaf and
one back), except the very last move which requires only one step.
Hence
cti (B (c, n)) ≤cn + 2 (1 −c) n/2 = n.
A lower bound for cti(B(c, n)) can be established by describing a robber strategy and proving
that it is optimal. Before we describe such a strategy we need some additional notation. Let us
assign the “coordinate” 0 to the end of the broom and the “coordinate” cn to the center. The
cop’s initial position X0 can then be written as bn, where b ∈[0, c] (actually this excludes the
possibility that X0 is a leaf but, as we will soon see, we need not concern ourselves with this
case). Note that b is a parameter of the cop’s strategy; also, the robber will observe b before he
makes his ﬁrst move.
Before the game starts, the robber announces that he will use the following strategy: he will
go to the end of the broom with probability q = q(b) = b, and to a randomly chosen leaf with
probability 1 −q. The cop is aware of this and his only reasonable responses (after having
started at bn) are the following.
(1) With probability p, the cop can go towards the end of the broom, and then back to the
center to sweep all leaves.
(2) With probability 1 −p he can go to the center, sweep a randomly chosen x-fraction of
the leaves (for some x ∈[0, 1]), then move to the end of the broom, then go back to the
center, sweeping all leaves (of course, for x = 1, he will have captured the robber by the
time he reaches the end, at the latest; so he will not need to revisit the center).
We do not consider the possibility that the cop starts at a leaf because this does not inﬂuence
the asymptotic behavior of the expected capture time; this justiﬁes our previous claim that the
cop’s initial position can be parametrized by bn.
It is easily seen that the above family of strategies (parametrized by (b, p, x)) guarantees
capture (of course capture may take place before the full schedule is executed). Note that if c is
exactly 1, we have a path, and cti(B(c, n)) = (1+o(1))n. Also, if b is exactly 0, the cop starts at
the end of the broom, and the robber’s only reasonable strategy is to hide at a randomly chosen
leaf. In this case, the expected capture time is also (1 + o(1))n. Excluding these cases from the
following analysis, we can break down the expected capture time into the following cases.
R starts at
C moves towards
Prob.
Distance Travelled
end
end
bp
bn
a leaf
end
(1 −b) p
(b + c + (1 −c)) n
end
center
b (1 −p)
((c −b) + 2x(1 −c) + c)n
a leaf
center
(1 −b) (1 −p) x
((c −b) + x(1 −c)) n
a leaf
center
(1 −b) (1 −p) (1 −x)
((c −b) + (2x(1 −c) + 2c + (1 −c)))) n
Table 1. The possible ways in which the robber can be captured.
The case in which the robber hides in the leaves and the cop starts by moving towards the
center is broken down to two subcases (and hence requires two rows in the above table).
COPS AND INVISIBLE ROBBERS: THE COST OF DRUNKENNESS
23
(1) In the ﬁrst subcase, the cop checks an x-fraction of the leaves and captures the robber,
before visiting the end of the broom. This happens with probability x and the average
number of leaves checked is x · (1 −c) · n/2.
(2) In the second subcase, the cop checks x-fraction of the leaves, fails to capture the robber,
visits the end of the broom and then returns to check all the leaves (and so capture the
robber with certainty). This happens with probability 1 −x and the number of leaves
checked is x · (1 −c) · n during the ﬁrst visit and (1 −c) · n/2 (on the average) during
the second visit.
The expected capture time for a given value of c is E(T) = (1 + o(1))fc(p, b, x) · n and
fc(p, b, x) · n is obtained by multiplying the entries of the last two columns of Table 1, adding
and performing some algebra. We ﬁnally get
fc (b, p, x)
=
(−1 + b + c + p −bc −bp −cp + bcp) x2 +
(1 + b −3c −p + bc −bp + 3cp −bcp) x + (2c −2b + 2bp −2cp + 1) .
Hence fc(b, p, x) is a quadratic function in x, and (after some factorizations) can be rewritten
as follows: fc(b, p, x) = a2x2 + a1x + a0, with
a2
=
−(1 −p)(1 −b)(1 −c)
a1
=
(1 −p)(1 −3c + bc + b)
a0
=
2(1 −p)(c −b) + 1.
Since a2 is negative (unless p = 1, where the whole function is zero), the parabola is downward
concave, and thus it achieves its minimum either at x = 0 or at x = 1. If x = 0, fc(b, p, x) =
a0 = 1 + 2(1 −p)(c −b) ≥1. If x = 1, fc(b, p, x) = a2 + a1 + a0 = 1, and thus the minimum
is achieved there. Thus, cti(B(c, n)) ≥(1 −o(1))n; this, together with the upper bound, shows
that the robber strategy is optimal and that cti(B(c, n)) = (1 + o(1))n.
□
Remark. The broom B(c, n) is a combination of the path Pn and the star SN: high c values
make B(c, n) more like a path and low values more like a star.
Accordingly, high c values
bring Fi(B(c, n)) closer to Fi(Pn) = 2 and low values increase Fi(B(c, n)) unboundedly, which
is similar to the behavior of Fi(SN) for large N.
7. The “Infinite Speed”Robber
In this section we change the rules of the game slightly. Suppose the robber is still invisible
but now is endowed with “inﬁnite speed”. In this case, what is iCOD?
We use the term “inﬁnite speed”for brevity. What we actually assume is that the robber has
speed s ∈N, i.e., during his turn he can traverse at most s edges (as long as he does not go
through a vertex containing a cop) and we examine what happens in the limit when s tends to
inﬁnity. Let us note that the adversarial robber can choose to traverse fewer edges or even stay
in place. As for the drunk robber, he simply performs s steps of a random walk on G; if during
one of these steps he runs into a cop, he is captured. Finally, the cops will still have unit speed.
We will use the notation cti (G, s) to denote the expected capture time given that the robber
is invisible, adversarial and has speed s; dcti (G, s) and Fi (G, s) are deﬁned similarly. Obviously
we have cti (G, 1) = cti (G), dcti (G, 1) = dcti (G), Fi (G, 1) = Fi (G). We are interested in the
limits
lim
s→∞cti (G, s) ,
lim
s→∞dcti (G, s) ,
lim
s→∞Fi (G, s) .
24
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
In the case of drunk robber, it is actually quite easy to ﬁnd lims→∞dcti (G, s). Suppose C
has c(G) cops, and he keeps them stationary at some vertices u1, u2, . . . , uc(G). As s becomes
large, the robber essentially performs a random walk on G with ui (i ∈[c(G)]) being absorbing
vertices. Since a random walker visits every vertex of G in ﬁnite expected time, the probability
that the robber will hit an absorbing state during the ﬁrst turn (conditional on not starting at
absorbing state) approaches 1 as s →∞. Hence lims→∞dcti (G, s) = n−c(G)
n
.
Next let us note that, for the adversarial robber, s does not need to increase all the way to
inﬁnity. With s = |V | = n, the robber can in one turn reach any vertex in G. In other words
lims→∞cti (G, s) = cti (G, n). In short,
lim
s→∞Fi (G, s) = lims→∞cti (G, s)
lims→∞dcti (G, s) = cti (G, n)
n−c(G)
n
=
n
n −c(G)cti (G, n) .
Hence we now must study cti (G, n). Let us ﬁrst establish that cti (G, n) exists. The game
theoretic analysis of Section 4 holds for any value of s; a feasible robber strategy now speciﬁes
(probabilities on) moves which traverse at most s edges without crossing a cop-occupied vertex.
Hence cti (G, n) is well deﬁned.
It is easy to see that lims→∞Fi (G, s) can take any value in N. For example, for the path Pn we
see immediately that lims→∞Fi (Pn, s) = lims→∞
n
n−1cti (Pn, s) = n. As an additional example,
let us consider lims→∞Fi (SN, s) on the star SN. When CiR is played on SN, the adversarial
inﬁnite-speed robber can move to any leaf in his turn; hence he has more options than the
unit-speed robber, who must remain in his original vertex for the entire game. However, let C
control a single cop, who starts at vertex 0, searches a randomly selected leaf (with repetitions)
at odd times and returns to 0 at even times. It is easy to see that this strategy is optimal and
its expected capture time (i.e., cti (G, s)) is (for any s ≥2):
cti (G, s) =

∞
X
t=1
N −1
N
t−1 1
N (2t −1)

−1 = 2N −1.
(One time step is subtracted because after capture the cop does not need to return to vertex 0.)
For the drunk robber we will have (for all s): dcti (G, s) =
N
N+1. Hence
lim
s→∞Fi (SN, s) = (2N −1) · (N + 1)
N
= Θ(n).
The example of the inﬁnite speed robber on SN illustrates an additional interesting point.
Recall the graph search (GS) game [42]. Similarly to CiR, GS involves a team of searchers (cops)
and a fugitive (robber) who is invisible, adversarial and inﬁnitely fast [19]. GS and CiR diﬀer
in one respect: in GS the fugitive is assumed to reside in the edges, not the vertices of G. But
there are GS versions of node search and mixed search. Furthermore, in “classical”GS the cops
are not restricted to move along edges; but there is a variant of GS, called internal search, in
which this restriction is imposed. Let us simply state, without giving details, that the capture
mechanism of CiR is equivalent to that of internal mixed GS (the interested reader can check
this fact using [19] and the references therein).
We denote by c∞
i (G) the minimum number of cops required to capture the invisible, adver-
sarial and inﬁnitely fast robber on G; we denote by s (G) the minimum number of searchers
required to guarantee capture of the fugitive in G, using
internal mixed search. One would
assume that these two games are equivalent and that s (G) = c∞
i (G). However this is not the
case: in the SN example we have c∞
i (SN) = 1 < 2 = s (SN). Hence, at least for some graphs,
COPS AND INVISIBLE ROBBERS: THE COST OF DRUNKENNESS
25
we can have c∞
i (G) < s (G). The reason for this is that in GS the fugitive is assumed to know
in advance the entire itinerary of the searchers (i.e., he is omniscient), while in CiR the robber
only knows the past cop moves (and it suﬃces that the cops have a strategy that guarantees
capture in ﬁnite time with probability one, not to have a strategy that catches the robber in
ﬁnite number of steps.)
8. Conclusion
We have examined CiR, the invisible-robber version of the CR game. Our main interest was
in the cost of drunkenness Fi (G) which, as we have shown, exists for every graph G and can
get arbitrarily close to any value in [2, ∞). To establish these results we have used concepts
from game theory and the theory of partially observable Markov decision processes. For several
families of graphs including stars, d-regular trees and grids we found (almost) exact bounds, and
moreover we gave general upper and lower bounds. Finally, we have brieﬂy examined the CiR
variant in which the robber is inﬁnitely fast and we have found, somewhat surprisingly, that the
corresponding cop number ci (G) can be smaller than the (internal mixed) search number s (G).
Our work can be extended in several ways. We ﬁrst state some open problems.
(1) For a general graph G prove: |V (G)| ≥2 ⇒Fi(G) ≥2. We conjecture (but have not
been able to prove) that this is true, because we have not found any graph G with both
|V (G)| ≥2 and Fi(G) < 2.
(2) For the square grid, ﬁnd an asymptotically exact value of cti(PN□PN). We have obtained
a subexponential upper bound on cti(PN□PN) but we have not been able to ﬁnd a
nontrivial lower bound. By “nontrivial” we mean a tight bound; e.g., if a superpolynomial
lower bound is found we will be very close to obtaining an asymptotically exact value
for cti(PN□PN).
(3) As mentioned, Fi (G) can get arbitrarily close to any value in [2, ∞). On the other hand,
there are values in [2, ∞) which cannot be actually achieved by Fi(G) (for example Fi(G)
cannot equal an irrational number). Prove that: for any a ∈[2, ∞) ∩Q there exists a G
such that a = Fi (G).
(4) Find necessary and suﬃcient conditions for c (G) = c∞
i (G).
The equality holds for
paths, stars and cliques; for what other types of graphs does it hold?
(5) Find necessary and suﬃcient conditions for c∞
i (G) = s (G), where s(G) is the internal
mixed search number of G.
Finally, we list several future research directions.
(1) We have obtained tight Fi(G) bounds for paths, cycles, trees and grids. Can similar
bounds be obtained for additional graph families (e.g., planar, bipartite, series-parallel,
chordal, high girth, geometric)? What about random graphs?
(2) The use of algorithms becomes necessary for graphs which cannot be treated analytically.
We want to develop tractable algorithms for the computation of cti (G), dcti (G),Fi (G)
and also of optimal search strategies. It is well established in the literature that exact
algorithms are computationally intractable; hence we intend to develop approximate
algorithms, along with performance guarantees.
(3) We have studied the cost of drunkenness under a game theoretic model of the adversarial
robber. As mentioned in Section 1, there is an alternative approach, in which the cops
optimize their strategy under the assumption of an omniscient robber. We want to study
the cost of drunkenness under this “worst-case” assumption.
26
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
(4) We have studied the cost of drunkenness, but our results can also be used to initiate a
study (for both the adversarial and drunk variants) of the cost of invisibility. This can be
expressed by the ratios cti(G)
ct(G) , dcti(G)
dct(G) , the behavior of which we will study in the future.
Acknowledgement: We thank D. Thilikos (several discussions with him led us to the study
of the CiR game) and A. Gurel-Gurevich (he provided help with Theorem 4.2).
References
[1] M. Adler, H. Racke, N. Sivadasan, C. Sohler, and B. Vocking, Randomized pursuit-evasion in graphs, Lecture
Notes in Computer Science 2380 (2002), 901–912.
[2] B. Alspach, Searching and sweeping graphs: a brief survey, Le Matematiche 59 (2006), no. 1–2, 5–37.
[3] T. Bewley and E. Kohlberg, The asymptotic theory of stochastic games, Mathematics of Operations Research
1 (1976), no. 3, 197–208.
[4] K.G. Binmore, Game theory: a very short introduction, vol. 173, Oxford University Press, USA, 2007.
[5] A. Bonato and R. Nowakowski, The game of cops and robbers on graphs, AMS, 2011.
[6] A. Bonato and B. Yang, A survey of graph searching, Springer, 2011.
[7] J˜A Chalopin, V. Chepoi, and N. Nisse, Cop and robber games when the robber can hide and ride, Arxiv
preprint arXiv:1001.4457 (2010).
[8] T.H. Chung and J.W. Burdick, A decision-making framework for control strategies in probabilistic search,
IEEE International Conference on Robotics and Automation. ICRA, IEEE, 2007, pp. 4386–4393.
[9]
, Multi-agent probabilistic search in a sequential decision-theoretic framework, IEEE International
Conference on Robotics and Automation. ICRA, IEEE, 2008, pp. 146–151.
[10] T.H. Chung, G.A. Hollinger, and V. Isler, Search and pursuit-evasion in mobile robotics, Autonomous Robots
31 (2011), no. 4, 299–316.
[11] N. Clarke, The eﬀects of replacing cops and searchers with technology, Master’s thesis, Dalhousie University,
1999.
[12] N.E. Clarke, A game of cops and robber played with partial information, Congressus numerantium 166
(2004), 145–159.
[13]
, A witness version of the cops and robber game, Discrete Mathematics 309 (2009), no. 10, 3292–3298.
[14] N.E. Clarke and E.L. Connon, Cops, robber, and alarms, Ars Combinatoria 81 (2006), 283–296.
[15] N.E. Clarke and R.J. Nowakowski, Cops, robber, and photo radar, Ars Combinatoria 56 (2000), 97–104.
[16]
, Cops, robber and traps, Utilitas Mathematica 60 (2001), 91–98.
[17] H. Everett, Recursive games, Contributions to the Theory of Games 3 (1957), 47–78.
[18] J.A. Filar and K. Vrieze, Competitive Markov decision processes, Springer Verlag, 1997.
[19] F.V. Fomin and D.M. Thilikos, An annotated bibliography on guaranteed graph searching, Theoretical Com-
puter Science 399 (2008), no. 3, 236–245.
[20] O. Gurel-Gurevich, Pursuit–evasion games with incomplete information in discrete time, International Jour-
nal of Game Theory 38 (2009), no. 3, 367–376.
[21] G. Hollinger, A. Kehagias, and S. Singh, Probabilistic strategies for pursuit in cluttered environments with
multiple robots, IEEE International Conference on Robotics and Automation, ICRA, IEEE, 2007, pp. 3870–
3876.
[22] G. Hollinger, S. Singh, J. Djugash, and A. Kehagias, Eﬃcient multi-robot search for a moving target, The
International Journal of Robotics Research 28 (2009), no. 2, 201.
[23] V. Isler, S. Kannan, and S. Khanna, Randomized pursuit-evasion with local visibility, SIAM Journal on
Discrete Mathematics 20 (2007), no. 1, 26–41.
[24] V. Isler and N. Karnad, The role of information in the cop-robber game, Theoretical Computer Science 399
(2008), no. 3, 179–190.
[25] D.V. Jeliazkova, Aspects of the cops and robber game played with incomplete information, Master’s thesis,
Acadia University, 2006.
[26] A. Kehagias and P. Pralat, Some remarks on cops and drunk robbers, Arxiv preprint arXiv:1106.2414 (2011).
COPS AND INVISIBLE ROBBERS: THE COST OF DRUNKENNESS
27
[27] A. Krausz and U. Rieder, Markov games with incomplete information, Mathematical Methods of Operations
Research 46 (1997), no. 2, 263–279.
[28] H. Lau, S. Huang, and G. Dissanayake, Optimal search for multiple targets in a built environment, IEEE/RSJ
International Conference on Intelligent Robots and Systems, IROS, IEEE, 2005, pp. 3740–3745.
[29]
, Discounted mean bound for the optimal searcher path problem with non-uniform travel times, Eu-
ropean journal of operational research 190 (2008), no. 2, 383–397.
[30] D.A. Levin, Y. Peres, and E.L. Wilmer, Markov chains and mixing times, American Mathematical Society,
2009.
[31] M.L. Littman, A.R. Cassandra, and L.P. Kaelbling, Eﬃcient dynamic-programming updates in partially
observable Markov decision processes, Tech. Report CS-95-19, Comp. Science Dep., Brown University, 1996.
[32] W.S. Lovejoy, A survey of algorithmic methods for partially observed Markov decision processes, Annals of
Operations Research 28 (1991), no. 1, 47–65.
[33] J.F. Mertens, Stochastic games, Handbook of game theory with economic applications 3 (2002), 1809–1832.
[34] J.F. Mertens and A. Neyman, Stochastic games, International Journal of Game Theory 10 (1981), no. 2,
53–66.
[35] G.E. Monahan, A survey of partially observable Markov decision processes: Theory, models, and algorithms,
Management Science 28 (1982), no. 1, 1–16.
[36] A. Neyman and S. Sorin, Stochastic games and applications, NATO Science Series C, vol. 570, Springer,
2003.
[37] A.S. Nowak, Universally measurable strategies in zero-sum stochastic games, The Annals of Probability 13
(1985), no. 1, 269–287.
[38] R. Nowakowski and P. Winkler, Vertex-to-vertex pursuit in a graph, Discrete Mathematics 43 (1983), no. 2-3,
235–239.
[39] S.C.W. Ong et al., Planning under uncertainty for robotic tasks with mixed observability, The International
Journal of Robotics Research 29 (2010), no. 8, 1053.
[40] M. Orkin, An approximation theorem for inﬁnite games, Proc. Amer. Math. Soc 36 (1972), no. 1, 212–216.
[41]
, Recursive matrix games, Journal of Applied Probability 9 (1972), no. 4, 813–820.
[42] T. Parsons, Pursuit-evasion in a graph, Lecture Notes in Mathematics, pp. 426–441, Springer, 1978.
[43] S.D. Patek, Partially observed stochastic shortest path problems with approximate solution by neurodynamic
programming, IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans 37
(2007), no. 5, 710–720.
[44] A. Quilliot, Jeux et pointes ﬁxes sur les graphes, these de 3eme cycle, Ph.D. thesis, Universite de Paris VI,
1985, pp. 131–145.
[45] T.E.S. Raghavan and J.A. Filar, Algorithms for stochastic games - a survey, Mathematical Methods of
Operations Research 35 (1991), no. 6, 437–472.
[46] L.S. Shapley, Stochastic games, Proceedings of the National Academy of Sciences of the United States of
America 39 (1953), no. 10, 1095.
[47] S. Singh and V. Krishnamurthy, The optimal search for a Markovian target when the search path is con-
strained: the inﬁnite-horizon case, IEEE Transactions on Automatic Control 48 (2003), no. 3, 493–497.
[48] E.J. Sondik, The optimal control of partially observable Markov processes over the inﬁnite horizon: Dis-
counted costs, Operations Research 26 (1978), no. 2, 282–304.
[49] L.D. Stone, Theory of optimal search, Military Applications Section, Operations Research Society of Amer-
ica, 1989.
[50]
, What’s happened in search theory since the 1975 Lanchester Prize?, Operations Research 37 (1989),
no. 3, 501–506.
[51] A. Tang, Zero visibility cops, Master’s thesis, Dahlousie University, 2004.
[52] R. Tosic, Vertex-to-vertex search in a graph, Graph Theory (Dubrovnik 1985), 1985, pp. 233–237.
[53] R. Vidal, O. Shakernia, H.J. Kim, D.H. Shim, and S. Sastry, Probabilistic pursuit-evasion games: theory,
implementation, and experimental evaluation, IEEE Transactions on Robotics and Automation 18 (2002),
no. 5, 662–669.
[54] N. Vieille, Stochastic games: Recent results, Handbook of game theory with economic applications 3 (2002),
1833–1850.
[55] E.M. Wong, F. Bourgault, and T. Furukawa, Multi-vehicle Bayesian search for multiple lost targets, IEEE
International Conference on Robotics and Automation, ICRA, IEEE, 2005, pp. 3169–3174.
28
ATHANASIOS KEHAGIAS, DIETER MITSCHE, AND PAWE L PRA LAT
Department of Mathematics, Physics and Computer Sciences, Aristotle University of Thes-
saloniki, Thessaloniki GR54124, Greece
E-mail address: kehagiat@auth.gr
Department of Mathematics, Ryerson University, Toronto, ON, Canada, M5B 2K3
E-mail address: dmitsche@ryerson.ca
Department of Mathematics, Ryerson University, Toronto, ON, Canada, M5B 2K3
E-mail address: pralat@ryerson.ca