Sensor management: Past, Present, and Future

Alfred O. Hero III, Fellow, IEEE and Douglas Cochran, Senior Member, IEEE

Abstract: Sensor systems typically operate under resource constraints that prevent the simultaneous use of all resources all of the time. Sensor management becomes relevant when the sensing system has the capability of actively managing these resources; i.e., changing its operating configuration during deployment in reaction to previous measurements. Examples of systems in which sensor management is currently used or is likely to be used in the near future include autonomous robots, surveillance and reconnaissance networks, and waveform-agile radars. This paper provides an overview of the theory, algorithms, and applications of sensor management as it has developed over the past decades and as it stands today.

Index Terms: Active adaptive sensors, Plan-ahead sensing, Sequential decision processes, Stochastic control, Multi-armed bandits, Reinforcement learning, Optimal decision policies, Multi-stage planning, Myopic planning, Information-optimized planning, Policy approximation, Radar waveform scheduling

I. INTRODUCTION

Advances in sensor technologies in the last quarter of the 20th century led to the emergence of large numbers of controllable degrees of freedom in sensing devices. Large numbers of traditionally hard-wired characteristics, such as center frequency, bandwidth, beamform, sampling rate, and many other aspects of sensors’ operating modes, started to be addressable via software command. The same period brought remarkable advances in networked systems as well as deployable autonomous and semi-autonomous vehicles instrumented with wide ranges of sensors and interconnected by networks, leading to configurable networked sensing systems.
These trends, which affect a broad range of sensor types, modalities, and application regimes, have continued to the present day and appear unlikely to abate: new sensing concepts are increasingly manifested with device technologies and system architectures that are well suited to providing agility in their operation.

A. O. Hero is with the Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109-2122, USA. D. Cochran is with the School of Mathematical and Statistical Sciences and the School of Electrical, Computer, and Energy Engineering, Arizona State University, Tempe, AZ 85287-5706, USA.

The term “sensor management,” as used in this paper, refers to control of the degrees of freedom in an agile sensor system to satisfy operational constraints and achieve operational objectives. To accomplish this, one typically seeks a policy for determining the optimal sensor configuration at each time, within constraints, as a function of information available from prior measurements and possibly other sources. With this perspective, the paper casts sensor management in terms of formulation and approximation of optimal planning policies. This point of view has led to a rich vein of research activity that extends and blends ideas from control, information theory, statistics, signal processing, and other areas of mathematical, statistical, and computational sciences and engineering. Our viewpoint is also slanted toward sensor management in large-scale surveillance and tracking systems for civilian and defense applications. The approaches discussed have much broader utility, but the specific objectives, constraints, sensing modalities, and dynamical models considered in most of the work summarized here have been drawn from this application arena.
Within its scope of attention, the intention of this paper is to provide a high-level overview; references are given to guide the reader to derivations of mathematical results, detailed descriptions of algorithms, and specifications of application scenarios and systems. The list of references, while extensive, is not exhaustive; rather it is representative of key contributions that have shaped the field and led to its current state. Moreover, there are several areas relevant or related to sensor management that are not within the scope of this survey. These include purely heuristic approaches to sensor management and scheduling as well as adaptive search methods, clinical treatment planning, human-in-the-loop systems such as relevance feedback learning, robotic vision and autonomous navigation (path planning), compressive and distilled sensing, and robust sensing based on non-adaptive approaches.

arXiv:1109.2363v1 [stat.AP] 12 Sep 2011

The most comprehensive recent survey on sensor management of which the authors are aware is the 2008 book [1]. This volume consists of chapters written collaboratively by numerous current contributors to the field specifically to form a perspicuous overview of the main methods and some noteworthy applications. The 1998 survey paper by A. Cassandra [2], while not devoted to sensor management, describes a few applications of partially observed Markov decision process (POMDP) methods in the general area of sensor management and scheduling, thereby illustrating conceptual connections between sensor management and the many other POMDP applications summarized in the paper. The earlier 1982 survey paper by G. E. Monahan [3] does not consider sensor management applications, but gives an excellent overview of the base of theory and algorithms for POMDPs as they were understood a few years before sensor management was becoming established as an appreciable area of research. A 2000 paper by G. W. Ng and K. H.
Ng [4] provides an overview of sensor management from the perspective of sensor fusion as it stood at that time. This point of view, although not emphasized in this paper or in [1], continues to be of interest in the research literature. Another brief survey from this period is given by X.-X. Liu et al. in [5], and a short survey of emerging sensor concepts amenable to active sensor management is given in [6].

Several doctoral dissertations on the topic of sensor management have been written in the past fifteen years. Most of these include summaries of the state of the art and relevant literature at the time they were composed. Among these are the dissertations of G. A. McIntyre (1998) [7], D. Sinno (2000) [8], C. M. Kreucher (2005) [9], R. Rangarajan (2006) [10], D. Blatt (2007) [11], J. L. Williams (2007) [12], M. Huber (2009) [13], and K. L. Jenkins (2010) [14].

The remainder of this paper is organized as follows. Section II describes the basic goals and defines the main components of a sensor management system. In Section III, the emergence of sensor management is recounted within a historical context that includes both the advancement of statistical methods for sequential definition, collection, and analysis of samples and the rise of sensor technologies and sensing applications enabling and calling for sensor management. Section IV gives an overview of some of the current state of the art and trends in sensor management and Section V describes some of the future challenges and opportunities faced by researchers in the field.

II. DESCRIPTION OF SENSOR MANAGEMENT

The defining function of sensor management is dynamic selection of a sensor, from among a set of available sensors, to use at each time during a measurement period in order to optimize some metric of performance. Time is usually partitioned into a sequence of epochs and one sensor is to be chosen in each epoch, thereby creating a discrete-time problem.
The term “sensor management” most often refers to closed-loop solutions to problems of this nature; i.e., the next sensor to employ is chosen while the sensor system is in operation and in view of the results obtained from prior sensor measurements. The term “sensor scheduling” is sometimes used to refer to feed-forward schemes for sensor selection, though this usage is not standardized and the two expressions are used interchangeably in some literature. In current applications of sensor management, and especially in envisioned future applications, the sensors available for selection in each time epoch are actually virtual sensors, each representing one choice of configuration parameters affecting the physical configurations and operating modes of a collection of sensors, sensor suites, sensor platforms, and the way data are processed and communicated among interconnected subsystems. With this perspective, selecting a sensor really means determining the values to which the available controllable degrees of freedom in a sensor system should be set.

Figure 1 illustrates the basic elements and operation of a closed-loop sensor management system. Once a sensor is selected and a measurement is made, information relevant to the sensing objective is distilled from the raw sensor data. This generally entails fusion of data representing disparate sensing modalities (e.g., optical and acoustic) and other properties, and further combining it with information gleaned from past measurements and possibly also side information from sources extrinsic to the sensor system. The fusion and signal processing components of the loop may produce ancillary information, such as target tracks or decisions about matters external to the sensor manager (e.g., direct an aircraft to take evasive action to avoid collision).
For the purposes of sensor management, they must yield a state of information on the basis of which the merit of each possible sensor selection in the next time epoch may be quantified. Such quantification takes many forms in current approaches, from statistical (e.g., mean risk or information gain) to purely heuristic. From this point, the sensor manager must optimize its decision as to which sensor to select for the next measurement.

[Figure 1: block diagram with blocks labeled Sensor Selector (S1, S2, S3), System, Signal Processing, Sensor Fusion, Predict Performance, and Optimization; output labeled "Estimates, tracks, decisions".]

Fig. 1. Conceptual block diagram of a sensor management system. The sensor selector selects among sensor actions S1, S2, and S3 based on the output of the optimizer. The optimizer attempts to optimize a system performance metric, such as information gain or mean risk associated with decisions or estimates produced by signal processing algorithms that operate on fused sensor data.

The notion of state is worthy of a few additional words. Heuristically, the state of information should represent all that is known about the scenario being sensed, or at least all that is relevant to the objective. Often this includes information about the physical state of the sensor system itself (e.g., the position and orientation of the air vehicle carrying one of the video sensors), which may constrain what actions are possible in the next step and thus the set of virtual sensors available to select in the upcoming epoch. Knowledge of the physical state frequently has utility extrinsic to the sensor manager, so some literature distinguishes physical and information states and their coupled dynamical models as depicted in Figure 2.
This diagram evinces the similarity of sensor management and feedback control in many important respects, and indeed control theory is an important ingredient in current perspectives on sensor management. But sensor management entails certain aspects that give it a distinctive character. Chief among these is the role of sensing. In traditional feedback control, sensors are used to ascertain information about the state of a dynamical plant. This information informs the control action through a control law or policy which in turn affects the state. In sensor management, the state of information is directly affected by the control action; i.e., rather than helping to decide what control action to invoke, the act of sensing is itself the control action.

[Figure 2: block diagram with labels Sensor Management, Physical Dynamics, Information Dynamics, Control, and State.]

Fig. 2. A control-theoretic view of sensor management casts the problem as that of optimally controlling a state, sometimes regarded as consisting of separate information and physical components, through the selection of measurement actions.

Sensor management is motivated and enabled by a small number of essential elements. The following paragraphs describe these and explain the roles they play in the current state of the subject. First, a summary of waveform-agile radar is given to provide the context of a current application for the more general descriptions that follow.

A. Sensor management application – Waveform-agile radar

Among the most well developed focus applications of sensor management is real-time closed-loop scheduling of radar resources. The primary feature of radar systems that makes them well suited for sensor management is that they offer several controllable degrees of freedom. Most modern radars employ antenna arrays for both the transmitter and receiver, which often share the same antenna.
This allows the illumination pattern on transmit as well as the beam pattern on receive to be adjusted simply by changing parameters in a combining algorithm. This ability has been capitalized upon, for example, by adaptive signal processing techniques such as adaptive beamforming on both transmit and receive and more recently by space-time adaptive processing (STAP). The ability of the transmitter to change waveforms in a limited way, such as switching between a few pre-defined waveforms in a library, has existed in a few radar systems for decades. Current radar concepts allow transmission of essentially arbitrary waveforms, with constraints coming principally from hardware limitations such as bandwidth and amplifier power. They also remove traditional restrictions that force the set of transmit antenna elements to be treated as a phased array (i.e., all emitting the same waveform except for phase factors that steer the beam pattern), thereby engendering the possibility of the transmit antennas simultaneously emitting completely different waveforms. This forms the basis of one form of so-called multi-input multi-output (MIMO) radar.

Two more aspects of the radar application stand out in making it a good candidate for sensor management. One is that pulse-Doppler radars have discrete time epochs intrinsically defined by their pulse repetition intervals and often also by their revisit intervals [15]. Also, in radar target tracking applications there are usually well defined performance metrics and well developed dynamical models for the evolution of the targets’ positions, velocities, and other state variables. These metrics and models directly enhance the sensor manager’s ability to quantitatively predict the value of candidate measurements before they are taken.

In view of these appealing features, it is no surprise that radar applications have received a large amount of attention as sensor management has developed.
The idea of changing the transmitted waveform in a radar system in an automated fashion in consideration of the echo returns from previously transmitted waveforms dates to at least the 1960s, though most evidence of this is anecdotal rather than being documented in the research literature. The current generation of literature on closed-loop waveform management as a sensor management application began with papers of D. J. Kershaw and R. J. Evans [16], [17] and S. M. Sowelam and A. H. Tewfik [18], [19] in the mid-1990s, roughly corresponding to the ascension of sensor management literature in broader contexts. Among the early sensor management papers that focused on closed-loop beam pattern management were those of V. Krishnamurthy and Evans in the early 2000s [20], [21]. Several contributions by numerous authors on these and related radar sensor management applications have appeared in the past decade. Among the topics addressed in this recent literature are radar waveform scheduling for target identification [22], target tracking [23], clutter and interference mitigation [24], [25], and simultaneously estimating and tracking parameters associated with multiple extended targets [26]. There has also been recent interest in drawing insights for active radar and sonar sensor management from biological echolocation systems [27] and in designing optimal libraries of waveforms for use with radar systems that support closed-loop waveform scheduling [28].

B. Controllable Degrees of Freedom

Degrees of freedom in a sensor system over which control can be exercised with the system in operation provide the mechanism through which sensors can be managed. In envisioned applications, they comprise diverse sets of parameters, including physical configuration of the sensor suite, signal transmission characteristics such as waveform or modulation type, and signal reception descriptors ranging from simple on/off state to sophisticated properties like beamform.
They also include algorithmic parameters that affect local versus centralized processing trade-offs, data sharing protocols and communication schemes, and typically numerous signal processing choices.

Many characteristics of current and anticipated sensor systems that are controllable during real-time operation were traditionally associated with subsystems that were designed independently. Until relatively recently, transduction of physical phenomena into electrical signals, analog processing, conversion to digital format, and digital processing at various levels of information abstraction were optimized according to performance criteria that were often only loosely connected with the performance of the integrated system in its intended function. Further, integrated operation of such subsystems generally consisted of passing data downstream from one to the next in a feed-forward fashion. Integrated real-time authority over controllable degrees of freedom spanning all of this functionality not only allows joint optimization of systemic performance metrics but also accommodates adaptation to changing objectives. In the radar sensor management example, the ease and immediacy of access (i.e., via software command) to crucial operating parameters such as antenna patterns and waveforms provides the means by which a well conceived algorithm can manage the radar in each time epoch.

C. Constraints

The utility of sensor management emerges when it is not possible to process, or even collect, all the data all the time. Operating configurations of individual sensors or entire sensor systems may be intrinsically mutually exclusive; e.g., the transmitter platform can be in position A or in position B at the time the next waveform is emitted, but not both. One point of view on configurable sensors, discussed in [29], imagines an immense suite of virtual sensor systems, each defined by a particular operating configuration of the set of physical sensors that comprises the suite.
Limitations preventing an individual sensor from being in multiple configurations at the same time are seen as constraints to be respected in optimizing the configuration of the virtual sensor suite. This is exactly the case in the waveform-agile radar example, where only one waveform can be transmitted on each antenna element at any given time.

Restrictions on communications and processing resources almost always constrain what signal processing is possible in networked sensor applications. Collecting all raw data at a single fusion center is seldom possible due to bandwidth limitations, and often to constraints imposed by the life and current production of batteries as well. So it is desirable to compress raw data before transmission. But reducing the data at the nodes requires on-board processing, which is typically also a limited resource.

D. Objective Quantification

When controllable degrees of freedom and constraints are present, sensor management is possible and warranted. In such a situation, one would hope to treat the selection of which sensing action to invoke as an optimization problem. But doing so requires the merit of each possible selection to be represented in such a way that comparison is possible; e.g., by the value of a cost or objective functional. The value of a specified set of data collection and processing choices generally depends on what is to be achieved. For example, one set of measurements by a configurable chemical sensor suite may be of great value in determining whether or not an analyte is an explosive, but the best data to collect to determine the species of a specimen already known to be an explosive may be quite different. Moreover, the objective may vary with time or state of knowledge: once a substance is determined to be an explosive, the goal shifts to determining what kind of explosive it is, then how much is present, then precisely where it is located, etc.
Consequently, predictively quantifying the value of the information that will be obtained by the selection of a particular sensing action is usually difficult and, at least in principle, requires a separate metric for each sensing objective that the system may be used to address. The use of surrogate metrics, such as information gain discussed in Section IV, has proven effective in some applications. With this approach, the role of a metric designed specifically for a particular sensing objective is undertaken by a proxy, usually based on information theoretic measures, that is suited to a broader class of objectives. This approach sacrifices specificity in exchange for relative simplicity and robustness, especially to model mismatch.

Management of radar beamforms and waveforms for target tracking, though not trivial, is one of the most tractable settings for objective quantification. The parameters can be chosen to optimize some function of the track error covariance, such as its expected trace or determinant, at one or more future times; e.g., after the next measurement, after five measurement epochs, or averaged over the next ten epochs. Computation or approximation of such functions is assisted by the tracker’s underlying model for the dynamical evolution of the target states. The use and effectiveness of waveform management in such applications is discussed in [1, Ch. 10], which also cites numerous references.

III. HISTORICAL ROOTS OF SENSOR MANAGEMENT

It has long been recognized that appropriate collection of data is essential in the design of experiments to test hypotheses and estimate quantities of interest. R. A. Fisher’s classical work [30], which encapsulated most of the ideas on statistical design of experiments developed through the first part of the 20th century, primarily addressed the situation in which the composition of the sample to be collected is to be determined in advance of the experiment.
In the early 1950s, the idea of using closed-loop strategies in experiment design emerged in connection with sequential design of experiments. In his 1951 address to the Meeting of the American Mathematical Society [31], H. Robbins observed:

“A major advance now appears to be in the making with the creation of a theory of the sequential design of experiments, in which the size and composition of the samples are not fixed in advance but are functions of the observations themselves.”

Robbins attributes the first application of this idea to Dodge and Romig in 1929 [32] in the context of industrial quality control. They proposed a double sampling scheme in which an initial sample is collected and analyzed, then a determination about whether to collect a second sample is based on analysis of the first sample. This insight was an early precursor to the development of sequential analysis by Wald and others during the 1940s [33], and ultimately to modern methods in statistical signal processing such as sequential detection [34]. In the interim, H. Chernoff made substantial advances in the statistical study of optimal design of sequences of experiments, particularly for hypothesis testing and parameter estimation [35], [36]. Many results in this vein are included in his 1972 book [37]. Also in 1972, V. V. Fedorov’s book [38] presented an overview of key results, many from his own research, in optimal experimental design up to that time. The relevance of a portion of Fedorov’s work to the current state of sensor management is noted in Section IV.

One view of the raison d’être for sensors, particularly among practitioners of sensor signal processing, is to collect samples to which statistical tests and estimators may be applied. From this perspective, the advancement of sensor signal processing over the latter half of the 20th century paralleled that of experimental design.
By the early 1990s, a rich literature on detection, estimation, classification, target tracking and related problems had been compiled. Nearly all of this work was predicated on the assumption that the data were given and the goal was to process it in ways that are optimally informative in the context of a given application. There were a few notable cases in which it was assumed the process of data collection could be affected in a closed-loop fashion based on data already collected. In sequential detection theory, for example, the data collection is continued or terminated at a given time instant (i.e., binary feedback) depending on whether a desired level of confidence about the fidelity of the detection decision is supported by data already collected. An early example of closed-loop data collection involving a dynamic state was the “measurement adaptive problem” treated by L. Meier et al. in 1967 [39]. This work sought to simultaneously optimize control of a dynamic plant and the process of collecting measurements for use in feedback. Another is given in a 1972 paper of M. Athans [40] that considers optimal closed-loop selection of the linear measurement map in a Kalman filtering problem.

One of the first contexts in which the term “sensor management” was used in the sense of this discussion¹ was in automating control of the sensor systems in military aircraft (see, e.g., [42]). In this application, the constrained resource is the attention of the pilot, particularly during hostile engagement with multiple adversaries, and the objective of sensor management is to control sensor resources in such a way that the most important information (e.g., the most urgent threats) is emphasized in presentation to the pilot.
Applications associated with situational awareness for military aircraft continue to be of interest, and this early vein of application impetus expanded throughout the 1990s to include scheduling and management of aircraft-based sensor assets for surveillance and reconnaissance missions (see, e.g., [43], [44] and [1, Ch. 11]).

Footnote 1: The phrase comes up in various literature in ways that are related to varying degrees to our use in this paper. To maintain focus, we have omitted loosely related uses of the term, such as in clinical patient screening applications [41].

Also beginning in the 1980s, sensor management was actively pursued under the label of “active vision” for applications in robotics [45]. This work sought to exercise feedback control over camera direction and sometimes other basic parameters (e.g., zoom or focal distance) to improve the ability of robotic vision systems to contribute to navigation, manipulation, and other tasks entailed in the robot’s intended functionality.

The rapid growth of interest in sensor management beginning in the 1990s can be attributed in large part to developments in sensor and communications technologies. New generations of sensors, encompassing numerous sensing modalities, are increasingly agile. Key operating parameters, once hard-wired, can be almost instantly changed by software command. Further, transducers can be packaged with A/D converters and microprocessors in energy efficient configurations, in some cases on a single chip, creating sensors that permit on-board adaptive processing involving dynamic orchestration of all these components. At the same time, the growth of networks of sensors and mobile sensor platforms is contributing even more controllable degrees of freedom that can be managed across entire sensor systems. From a purely mathematical point of view, it is almost always advantageous to collect all available data in one location (i.e., a “fusion center”) for signal processing.
In today’s sensor systems, this is seldom possible because of constraints on computational resources, communication bandwidth, energy, deployment pattern, platform motion, and many other aspects of the system configuration. Even highly agile sensor devices are constrained to choose only one configuration from among a large collection of possibilities at any given time.

These years spawned sensor management approaches based on modeling sensor management as a decision process, a perspective that underpins most current methods as noted in Section IV. Viewing sensor management in this way enabled tapping into a corpus of knowledge on control of decision processes, Markov decision processes in particular, that was already well established at the time [46]. Initial treatments of sensor management problems via POMDPs, beginning with D. Castañón’s 1997 paper [47], were followed shortly by other POMDP-based ideas such as the work of J. S. Evans and Krishnamurthy published in 2001–2002 [48], [49]. These were the early constituents of a steady stream of contributions to the state of the art summarized in Section IV-B. A formidable obstacle to the practicality of the POMDP approach is the computational complexity entailed in its implementation, particularly for methods that look more than one step ahead. Consequently, the need for approximation schemes and the potential merit of heuristics to provide computational tractability was recognized from the earliest work in this vein.

The multi-armed bandit (MAB) problem is an important exemplar of a class of multi-stage decision problems where actions yielding large immediate rewards must be balanced with others whose immediate rewards are smaller, but which hold the potential for greater long-term payoff. While two-armed and MAB problems had been studied in previous literature, the origin of index policy solutions to MAB problems dates to J. C. Gittins in 1979 [50].
As discussed in Section IV-C, under certain assumptions, an index solution assigns a numerical index to each possible action at the current stage of an infinitely long sequence of plays of a MAB. The indices can be computed by solving a set of simpler one-armed bandit problems and their availability reduces the decision at each stage to choosing the action with the largest index. The optimality of Gittins’ index scheme was addressed by P. Whittle in 1980 [51].

As with POMDPs, the MAB perspective on sensor management started receiving considerable research attention around 2000. Early applications of MAB methodology to sensor management include the work of Krishnamurthy and R. J. Evans [20], [21] who considered a multi-armed bandit model with Markov dynamics for radar beam scheduling. The 2002 work of R. Washburn et al. [52], although written in the context of more general dynamic resource management problems, was influential in the early development of MAB approaches to sensor management.

A theory of information based on entropy concepts was introduced by C. E. Shannon in his classic 1948 paper [53] and was subsequently extended and applied by many others, mostly in connection with communication engineering. Although Shannon’s theory is quite different from that of Fisher, sensor management has leveraged both in various developments of information-optimized methods. These were introduced specifically to sensor management in the early 1990s by J. Manyika and H. Durrant-Whyte [54] and by W. W. Schmaedeke [55]. As remarked in Section IV, information-based ideas were applied to particular problems related to sensor management even earlier. Fisher’s information theory was instrumental in the development of the theory of optimal design of experiments, and numerous examples of applications of this methodology have appeared since 2000; e.g., [56], [57].
Measures of information led to sensor management schemes based on information gain, which developed into one of the central thrusts of sensor management research over the past decade. Some of this work is summarized in Section IV-D, and a more complete overview of these methods is provided in [1, Ch. 3].

From foundations drawing on several more classical fields of study, sensor management has developed into a well-defined area of research that stands today at the crossroads of the disciplines upon which it has been built. Key approaches that are generally known to researchers in the area are discussed in the following section of this paper. But sensor management is an active discipline, with new work and new ideas appearing regularly in the literature. Some noteworthy recent developments include work by V. Gupta et al., which introduces random scheduling algorithms that seek optimal mean steady-state performance in the presence of probabilistically modeled effects [58], [59], [60]. K. L. Jenkins et al. very recently proposed the use of random set ideas, similar to those applied in some approaches to multi-target tracking, in sensor management [61], [62]. These preliminary investigations have resulted in highly efficient algorithms for certain object classification problems. Also very recently, D. Hitchings et al. introduced new stochastic control approximation schemes to obtain tractable algorithms for sensor management based on receding horizon control formulations [63]. They also proposed a stochastic control approach for sensor management problems with large, continuous-valued state and decision spaces [64]. Despite ongoing progress, sensor management still holds many unresolved challenges. Some of these are discussed in Section V.

IV. STATE OF THE ART IN SENSOR MANAGEMENT

The theory of decision processes provides a unifying perspective for the state of the art in sensor management research today.
A decision process, described in more detail below, is a time sequence of measurements and control actions in which each action in the sequence is followed by a measurement acquired as a result of the previous action. With this perspective, the design of a sensor manager is formulated as the specification of a decision rule, often called a policy, that generates realizations of the decision process. An optimal policy will generate decision processes that, on the average, will maximize an expected reward; e.g., the negative mean-squared tracking error or the probability of detection. A sound approach to sensor management will either approximate an optimal policy in some way or else attempt to analyze the performance of a proposed heuristic policy. In this section we will describe some current approaches to design of sensor management policies. The starting point is a formal definition of a decision process.

A. Sensor management as a decision process

Assume that a sensor collects a data sample $y_{t+1}$ at time $t$ after taking a sensing action $a_t$. It is typically assumed that the possible actions are selected from a finite action space $\mathcal{A}$, which may change over time. The selected action $a_k$ depends only on past samples $\{y_k, y_{k-1}, \ldots, y_1\}$ and past actions $\{a_{k-1}, a_{k-2}, \ldots, a_0\}$, and the initial action $a_0$ is determined offline. The function that maps previous data samples and actions to current actions is called a policy. That is, at any time $t$, a policy specifies a mapping $\gamma_t$ and, for a specific set of samples, an action $a_t = \gamma_t(\{a_k\}_{k<t}, \{y_k\}_{k \le t})$. [...] and a state sequence $\{s_k\}_{k>0}$, describing the environment or a target in the environment. The state $s_k$ might be continuous (e.g., the position of a moving target) or discrete (e.g., $s_k = 1$ when the target is moving and $s_k = 0$ when it is not moving). It is customary to model the state as random and the data sample $y_k$ as having been generated by the state $s_k$ in some random manner.
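The decision-process view described above can be sketched in a few lines of Python. This is a minimal illustration, not a method from the paper: the action names `scan` and `dwell`, the assumed measurement error probabilities, the heuristic policy, and the fixed state sequence are all hypothetical. The state follows the discrete example in the text (1 when the target is moving, 0 when it is not), and each sample is generated from the current state in a random manner that depends on the most recent action.

```python
import random


def simulate(policy, sense, initial_action, state_sequence, rng):
    """Generate one realization of a decision process.

    policy(samples, actions) -> next action, from all past samples and actions
    sense(state, action, rng) -> data sample generated randomly by the state
    """
    actions = [initial_action]  # a_0 is determined offline
    samples = []                # y_1, y_2, ...
    for state in state_sequence:
        # The sample y_{t+1} is acquired as a result of the action a_t.
        samples.append(sense(state, actions[-1], rng))
        # The next action depends only on past samples and past actions.
        actions.append(policy(samples, actions))
    return actions, samples


def sense(state, action, rng):
    """Hypothetical sensor: 'dwell' is assumed more reliable than 'scan'."""
    error_prob = 0.1 if action == "dwell" else 0.3
    return 1 - state if rng.random() < error_prob else state


def policy(samples, actions):
    """Hypothetical heuristic: dwell after a motion report, otherwise scan."""
    return "dwell" if samples[-1] == 1 else "scan"


states = [0, 0, 1, 1, 1, 0, 1, 1]  # assumed state sequence s_1, ..., s_8
actions, samples = simulate(policy, sense, "scan", states, random.Random(0))
```

Here `policy` plays the role of the mapping from past samples and actions to the current action; replacing it with a different function changes the realization generated without changing the structure of the process.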
In this case, there exists a conditional distribution of the state sequence given the data sequence and the average reward at time t can be defined through the statistical expectation E [ R t ( { a k } k