Multi-Objective Design of State Feedback Controllers Using
Reinforced Quantum-Behaved Particle Swarm Optimization
Kaveh Hassani∗and Won-Sook Lee†
School of Computer Science and Electrical Engineering
University of Ottawa, Canada
Abstract
In this paper, a novel and generic multi-objective design paradigm is proposed which
utilizes quantum-behaved PSO(QPSO) for deciding the optimal conﬁguration of the
LQR controller for a given problem considering a set of competing objectives. There
are three main contributions introduced in this paper as follows. (1) The standard
QPSO algorithm is reinforced with an informed initialization scheme based on the
simulated annealing algorithm and Gaussian neighborhood selection mechanism. (2)
It is also augmented with a local search strategy which integrates the advantages of
memetic algorithm into conventional QPSO. (3) An aggregated dynamic weighting
criterion is introduced that dynamically combines the soft and hard constraints with
control objectives to provide the designer with a set of Pareto optimal solutions and
lets her to decide the target solution based on practical preferences. The proposed
method is compared against a gradient-based method, seven meta-heuristics, and
the trial-and-error method on two control benchmarks using sensitivity analysis
and full factorial parameter selection and the results are validated using one-tailed
T-test. The experimental results suggest that the proposed method outperforms
opponent methods in terms of controller eﬀort, measures associated with transient
response and criteria related to steady-state.
1
Introduction
Optimal control theory refers to the controller design patterns that simultaneously sat-
isfy the physical constraints of the controlled process and optimize some predetermined
performance criteria. The evolution of optimal control theory has led to the emergence of
linear quadratic regulators (LQR) – an optimal multivariable feedback control approach
that improves the stability and minimizes the excursion in state trajectories of a system
while requiring minimum controller eﬀort. Applying LQR technique to a controllable
linear time-invariant (LTI) system results in a set of optimal feedback gains that mini-
mizes a quadratic criterion and stabilizes the system [31]. The LQR approach has been
utilized in vast variety of real world engineering applications such as but not limited to
missile guidance [46,47], ﬂight control [26], multiple spacecraft formation [50], control-
ling unmanned vehicles [18], active car suspension [45], ABS break system [22], power
∗email: kaveh.hassani@uottawa.ca; Corresponding author
†email: wslee@uottawa.ca
The complete paper appears in Applied Soft Computing
DOI: http://dx.doi.org/10.1016/j.asoc.2015.12.024
1
arXiv:1607.00765v1  [cs.NE]  4 Jul 2016
converters [38,40], active power ﬁlter [25], and tuning PID controllers [28]. Essentially,
LQR controllers minimize a quadratic cost function also known as performance index
that consists of two penalty matrices including state (Q) and control (R) weighting ma-
trices. These two parameters are main design parameters to be selected by the designer
and greatly inﬂuence the behavior of the controller. It is worth noting that it is not a
trivial task to decide these two matrices. In general, deciding the LQR parameters for
a given process is a continuous, multimodal and multi-objective optimization problem.
For the simplicity purposes, in most of the applications the problem is modeled as a
single-objective which limits the designer to only one conﬁguration. In the proposed
approach, the problem is considered as a multi-objective problem and as a result the
designer is provided with a set of Pareto optimal solutions which lets her to decide the
target solution based on some practical preferences.
Traditionally, weighting matrices are determined by the trial-and-error method in
which a domain expert adjusts the weighting matrices intuitively and then reﬁnes them
iteratively to obtain a satisfactory performance. This method is not feasible for high-
dimensional systems and even for simpler systems is labor-intensive and time-consuming.
Bryson’s method [23] is another iterative method in which initial state and feedback
variables are normalized with respect to their largest permissible and then are uti-
lized to initialize the weighting matrices. Then, similar to trial-and-error method, the
weighting matrices are gradually reﬁned to approach the minimum index value. Pole
placement [44] is another popular technique for determining the weighting matrices in
which the matrices are decided based on the given poles. This approach neither guaran-
tees a good performance nor the satisfaction of the constraints. Other approaches have
been proposed as well such as utilizing asymptotic modal properties [15] and express-
ing the system as an explicit function of the weighting matrix elements [57]. Yet they
suﬀer from the similar deﬁciencies. These classic approaches are labor-intensive, time-
consuming and do not guarantee the expected performance. They only aim to minimize
the quadratic performance index and ignore the other competing or incommensurable
control objectives such as minimizing the overshoot, rise-time, settling-time, and the
steady-state error.
In order to tackle these problems, some studies have utilized soft computing tech-
niques including but not limited to particle swarm optimization (PSO) [56], artiﬁcial
bee colony (ABC) [5], ant colony optimization (ACO) [9], genetic algorithm (GA) [61],
diﬀerential evolution (DE) [34], memetic algorithm (MA) [64], artiﬁcial immune sys-
tems (AIS) [43], imperialist competitive algorithm (ICA) [42], neural networks [60], and
fuzzy systems [55]. These methods can explore the search hyperspace in an informed
manner and converge to the optimal solutions in a few iterations using a combination
of knowledge sharing and individual explorations. The main problem with most of the
computational intelligence techniques is that they are prone to premature convergence
which causes them to get trapped within the local optima.
In this paper, a novel and generic multi-objective design paradigm is proposed which
utilizes a global convergence guaranteed variation of particle swarm optimization (PSO)
called quantum-behaved PSO (QPSO) for deciding the optimal conﬁguration of the
LQR controller for a given problem considering a set of competing objectives including
the quadratic performance index, overshoot, rise-time, settling-time, steady-state error,
and integrated absolute error. The proposed method is called reinforced multi-objective
quantum-behaved PSO (RMO-QPSO). The rationale behind the selection of QPSO as
the core optimizer is as follows. (1) The experimental studies suggest that PSO as the
predecessor of QPSO outperforms GA, DE, MA, ACO, and ABC in terms of success
2
rate, solution quality and processing time [6, 11]. It is also experimentally shown that
PSO is scalable, requires less computational resources, and its processing time grows at a
linear rate with respect to the size of the problem [1]. (2) it is theoretically guaranteed
that the QPSO converges to the global optimum; (3) it is less sensitive to the bias
problem as it has only one parameter, and (3) it outperforms other PSO variations in
ﬁnding the optimal solutions [51].
There are three main contributions introduced in this paper as follows. (1) The
standard QPSO algorithm is reinforced with an informed initialization based on the
simulated annealing and Gaussian neighborhood selection mechanism.
(2) It is also
augmented with a local search strategy which integrates the advantages of memetic
algorithm into conventional QPSO. (3) Finally, an aggregated dynamic weighting crite-
rion is introduced that dynamically combines the soft and hard constraints with control
objectives to provide the designer with a set of Pareto optimal solutions and lets her
to decide the target solution based on practical preferences.
As far as the authors’
knowledge is concerned, this is the ﬁrst time that a multi-objective derivation of PSO
is applied for deciding the conﬁguration of LQR controllers.
Without loss of generality, RMO-QPSO is utilized to decide the conﬁguration of LQR
controllers for two control benchmarks including: (1) stabilizing an inverted pendulum
system, and (2) controlling a ﬂight landing system. In order to have comparative studies,
nine diﬀerent techniques (i.e.
one trial-and-error based method, one gradient based
technique, and seven stochastic meta-heuristics) including the trail-and-error method,
Levenberg-Marquardt optimization (LM), GA, DE, ABC, PSO, QPSO, chaotic PSO
(CPSO), and adaptive inertia weighted PSO (AIWPSO) are utilized to decide the LQR
parameters of the same benchmarks. In order to mitigate the bias problem and have
a fair comparison among the diﬀerent techniques, full factorial parameter selection and
sensitivity analysis [29] are exploited for all of the applied techniques to ﬁnd the best
parameter setting including population size, iteration number, etc. The comparative
results are also validated using one-tailed T-test to investigate whether the experimental
results are statistically signiﬁcant.
The paper is organized as follows. Section 2 provides the mathematical foundation
of LQR controllers. In section 3 an overview of related works is presented. In section
4, we investigate the concept of multi-objective QPSO. In section 5, we present the
proposed technique for optimal tuning of LQR controllers and in section 6 we discuss
the experimental results. Section 7 concludes the paper.
2
Linear Quadratic Regulators
The LQR is an optimal multivariable feedback control approach that improves the sta-
bility and minimizes the excursion in the state trajectories of a system while requiring
minimum controller eﬀort. From a Mathematical point of view, for a controllable LTI
system with a state-space model shown in 1, the LQR approach constructs a linear state
feedback law as depicted in 2.
 ˙x(t) = Ax(t) + Bu(t)
y(t) = Cx(t) + Du(t)
(1)
u(t) = −Kx(t)
(2)
In these equations, x(t) denotes an n-dimensional state vector,y(t) presents an r-
dimensional output vector, and u(t) is an m-dimensional control vector. K ∈ℜn×m is
3
the optimal state-feedback gain matrix. The control law in 2 minimizes the quadratic
performance index shown in 3 which integrates the state and control energies through
the time. In other words, it minimizes the distance between the process outputs and
the desired outputs with minimum control energy.
J =
Z ∞
0
(xT Qx + uT Ru)dt
(3)
In 3 Q ∈ℜn × n denotes a symmetric positive semi-deﬁnite state weighting (state
penalty) matrix and R ∈ℜm × m denotes a symmetric positive deﬁnite control weight-
ing (control penalty) matrix. The control gain matrix K is given by 4.
K = R−1BT P
(4)
where P is a unique symmetric positive semi-deﬁnite solution to the algebraic Riccati
equation shown in 5.
PA + AT P + Q −PBR−1BT P = 0
(5)
3
Related Works
One of the most successful yet less applied techniques for optimal designing of LQR
controllers is PSO. The superiority of PSO over GA in ﬁnding optimal weighting matrices
has been experimentally shown in some studies [13, 63]. In [37] PSO, GA and trial-
and-error methods are utilized to adjust LQR weighting matrices which is applied to
controlling an aircraft landing ﬂare system. It is concluded that LQR design based on
PSO is more eﬃcient and robust compared to other methods. In [59] authors compared
the performance of the ordinary LQR, the LQR with prescribed degree of stability
(LQRPDS) and the PSO-based LQR in controlling distribution static compensator and
showed that PSO-based design results in best performance under diﬀerent operating
conditions.
In [49] a method is proposed to determine the weighting matrices using PSO with
pole region constraint for controlling a ﬂexible-link manipulator. In [37] it is suggested
that PSO-based LQR produces better result compared to trial-and-error approach for
the active suspension system. In [42] authors applied ordinary LQR, PSO-based LQR,
AWPSO-based LQR and ICA-based LQR for optimal load frequency control and con-
cluded that AWPSO-based LQR outperforms other approaches in terms of settling-time
and maximum overshoot. A PSO-based optimal real-time LQR controller for stabilizing
an inverted pendulum system is proposed in [14]. In [3] wavelet-PSO is proposed for
tuning LQR controllers and is applied to an optimal structural control. In [24] it is con-
cluded that contrary to trial-and-error approach, a PSO-based state feedback controller
does not have sub-optimal performance in case of partial state feedback. Hassani et
al. [17] proposed a quantum-behaved PSO algorithm for tuning a given LQR controller.
They exploited a conventional weighted aggregation of control objectives which can only
provide the designer with one solution. More studies on PSO-based LQR design can be
found in [26,48,56,62].
They are a few studies that exploit multi-objective design for tuning LQR controllers.
In [33] a multi-objective evolutionary algorithm is presented to control a double inverted
pendulum system. They concluded that multi-objective approach results in shorter ad-
justing time and smaller amplitude value deviating from steady-state in comparison
4
with the pole assignment design approach. [36] also applies a multi-objective evolution-
ary algorithm to stabilize a double inverted pendulum system. Their results suggest
that multi-objective approach results in shorter adjusting time and smaller amplitude
value deviating from steady-state in comparison to non-dominated sorting GA. As far as
the authors’ knowledge is concerned, it is the ﬁrst time that a multi-objective derivative
of the PSO algorithm is applied for optimal design of LQR controllers.
4
Multi-Objective Quantum-Behaved PSO
4.1
Multi-Objective Optimization
MOO is a common practice in engineering applications and refers to optimizing two
or more competing, confronting or incommensurable objective functions simultaneously.
In contrast to single-objective problems in which optimal solution is well-deﬁned, in
multi-objective problems it is not possible to reach the global optimum for all objectives
at the same point. Instead there is a whole set of optimal trade-oﬀs which forms the
solution set called Pareto optimal set. Although it is possible to treat a problem with
multi-objective nature as a single-objective problem, it is not an eﬃcient approach due to
two main reasons: (1) the objectives must be aggregated to a single weighted objective,
and (2) the optimizer must run for several times to obtain the Pareto solutions.
A
MOO problem over an n-dimensional search space X is deﬁned as ﬁnding the vector
X∗= [x∗
1, x∗
2, . . . , x∗
n] ∈X such that:
 Minimize
fi(x),
i = 1 . . . k
Subject to
gj(x) ⩽0,
j = 1 . . . m
(6)
fi(x) and gj(x) are vectors of k objective functions and m constraints deﬁned over
X, respectively. Mostly it is not possible to ﬁnd such a global and acceptable solution
for all conﬂicting objectives. A feasible solution is to ﬁnd a set of solutions in a way
that each solution satisﬁes 6 without being dominated by other solutions. A solution
u = [u1, . . . , uk] is said to dominate solution v = [v1, . . . , vk] if and only if fi(u) ⩽
fi(v)∀i = 1 . . . k and fi(u) < fi(v) for at least one objective function and is denoted by
u ≺v. A feasible solution x is Pareto optimal if and only if there is no other solution
that dominates x. The set of all Pareto optimal solutions is referred as Pareto optimal
set and the set of corresponding objective function values is called the Pareto front. The
goal of a given MOO method is to ﬁnd the Pareto optimal set which results in the best
possible Pareto front regarding the given objectives and constraints.
4.2
Particle Swarm Optimization
The PSO algorithm [10] is a population-based optimization technique that explores
the search space by simulating the choreographed motion of the birds based on socio-
cognitive characteristics of ﬂocking individuals. A PSO algorithm consists of a swarm
of |S| particles each containing a position and a velocity vectors and a scalar ﬁtness
value. For a D-dimensional objective function, the position and velocity vectors of each
particle are D-dimensional as well. The position vector represents the position of the
particle within the D-dimensional search space and encodes a potential solution to the
optimization problem at hand. The velocity vector determines the direction and mag-
nitude of the changes in position vector through time. Each particle shares information
with other particles through the best position seen so far in the swarm history and
5
follows its trajectory toward the global optimum based on Newtonian mechanics. A
domain-dependent ﬁtness function is used to determine how strong a given particle is
(i.e. how close the particle is to the global optimum). The velocity vector of the particle
i in a D-dimensional space is updated in through time using 7.
⃗vi(t + 1) = w(t) ×⃗vi(t) + cp × ⃗rp ⊗(⃗xi,pbest(t) −⃗xi(t)) + cg × ⃗rg ⊗(⃗xgbest(t) −⃗xi(t)) (7)
⃗vi(t) denotes the D-dimensional velocity vector of the particle i whose values are
deﬁned within the range of [vmin, vmax]. w(t) is the inertia weight that controls the
momentum of the particle by weighting the contribution of its previous velocity values.
The inertia weight is equivalent to an inﬁnite impulse response low pass ﬁlter. cp and cg
are two positive parameters known as cognitive and social learning rates that determine
the weight of cognitive aspects and social pressure in decision making process, respec-
tively. rp and rg are D-dimensional vectors of randomly generated values by a uniform
distribution in the range of [0, 1]. ⊗denotes point-wise vector multiplication operator.
⃗xi,pbest is the best position seen by the particle i up to time t (pbest), xgbest(t) is the
best position seen by whole swarm up to time t (gbest), and ⃗xi(t) is the position vector
of particle i in time t. The inertia factor is computed by 8.
w(t) = wmax −(wmax −wmin) × t/T
(8)
wmax and wmin are maximum and minimum values of inertia, respectively. t and T
denote current iteration and maximum number of iterations, respectively. The position
vector of particle i is updated using 9.
⃗xi(t + 1) = ⃗xi(t) + ⃗vi(t + 1)
(9)
Without loss of generality, the best personal and the best global positions for a
minimization problem are computed by 10 and 11, respectively. In these equations, f
denotes the ﬁtness function (i.e. performance index or cost function).
⃗xi,pbest(t + 1) =
 ⃗xi,pbest(t),
f (⃗xi(t + 1)) ≥f (⃗xi,pbest(t))
⃗xi(t + 1),
f (⃗xi(t + 1)) ≤f (⃗xi,pbest(t))
(10)
⃗xgbest(t + 1) = arg min
⃗xi,pbest
f (⃗xi,pbest (t + 1))
(11)
The PSO algorithm has been exploited in optimizing various engineering problems
such as electric power systems [2], wireless-sensor networks [27], image processing [4,35],
robotics [21,41], to name only a few.
4.3
Quantum-behaved PSO
The main drawback of the PSO algorithm is that it does not guarantee the global
convergence and is prone to the premature convergence [58]. The quantum-behaved
PSO (QPSO) [52–54] is a global convergence guaranteed optimizer that belongs to the
bare-bones PSO (BBPSO) category in which particles do not have a velocity vector and
their position is updated by sampling a prob-ability distribution of interest. The QPSO
was inspired by quantum mechanics and trajectory analysis of PSO [51]. In [7], the
trajectory analysis of the particles showed that each particle oscillates around its local
attractor. It was also mathematically shown that if the upper limits of cognitive and
6
social learning rates are selected properly, each particle will converge toward its local
attractor. The local attractor is interpreted as the center of gravity toward which the
particle careens while declining its kinetic energy. If the search space is stationary (i.e.
which is the case in most of the practical applications) there will be no periodic orbits
also known as unstable equilibria in the search hyperspace. Although satisfying this
condition guarantees the convergence, there is no systematic way to select those limits.
The local attractor of the ith particle denoted by pi is computed using 12.
⃗pi(t) = (cp × ⃗xi,pbest(t) + cg × ⃗xgbest(t))
cp + cg
(12)
Contrary to PSO algorithm in which the particles follow their trajectory toward
their local attractor based on Newtonian mechanics, in QPSO algorithm particles obey
the quantum mechanics in which trajectory is not a valid concept. It is assumed that
particles are attracted to a quantum potential ﬁeld which is centered on their local
attractors. In QPSO, the state of a particle is expressed by its wave function Ψ (x, t).
The probability of particle i being in position xi is computed by probability density
function |Ψ (x, t) |2 whose form depends on the potential ﬁeld in which the particle is
moving. One possible model is quantum Delta potential well model [52]. In this model,
it is assumed that a particle moves in a Delta potential well ﬁeld in the search space with
center p calculated by 12. In the quantum model, only the probability density function
of the particle position is available while for computing the objective function the exact
position of the particle is required. To address this issue,Monte Carlo sampling method
is used to simulate the measure-ment process from the wave function which estimates
the position of the ith particle as expressed by 13.
⃗xi(t + 1) = ⃗pi(t) ± |⃗xi(t) −⃗pi(t)| ⊗ln (1 \ ⃗u)
g
(13)
In this equation, pi(t) denotes the local attractor of particle i computed using 12,
u is a D-dimensional randomly generated vector following a uniform distribution in the
range of [0, 1], and g is a control parameter greater than ln
√
2. The QPSO algorithm is
summarized in Algorithm 1.
|T| and |S| denote the number of iterations and swarm size,respectively. U[0, 1] is a
uniform random number in the range of [0, 1] and f(x) is an arbitrary objective function.
The QPSO algorithm has some advantages over other variations of the PSO algorithm.
It is less sensitive to the bias problem as it has only one parameter to be tuned. It
is theoretically guaranteed that the QPSO converges to the global optimum. It is also
experimentally shown that the QPSO algorithm outperforms other PSO variations in
ﬁnding the optimal solutions and has stronger exploring capabilities [51].
For more
comprehensive evaluations on the QPSO algorithm, please refer to [51]. The QPSO
algorithm has been successfully applied to a vast variety of engineering problems such
as but not limited to system identiﬁcation [12], image processing [30], power systems [65],
neural network training [32], brain-computer interfacing [16], etc.
5
Proposed Approach
5.1
Individual Representation
In order to encode the problem of deciding the optimal weights of a given LQR, each
particle is considered as a potential solution that encodes the elements of weighting
7
Algorithm 1: The standard QPSO algorithm.
Result: returns the position vector of the global best particle
begin
Initialize the current positions randomly
for t = 1 to T do
for i = 1 to |S| do
Calculate ﬁtness f (⃗xi(t))
Update personal best (⃗xi,pbest(t)) using 10
Update global best (⃗xgbest(t)) using 11
cp, cg ∼U[0, 1]
Compute the local attractor ⃗pi(t) using 12
for d = 1 to |D| do
u ∼U[0, 1]
L = (1/g) × |⃗xi,d(t) −⃗pi,d(t)|
⃗xi,d(t) = ⃗pi,d(t) −L × ln (1/u) with probability 0.5
otherwise ⃗xi,d(t) = ⃗pi,d(t) + L × ln (1/u)
return ⃗xgbest
matrices into its position vector. The state and control weighting matrices consist of n2
and m2 elements respectively which results in the total number of n2 + m2 elements to
be decided. Finding the optimum solution within a (n2 + m2)-dimensional space is a
challenging task due to the curse-of-dimensionality eﬀect (e.g. 250 features for a system
with 15 state and 5 control variables). A practical solution which is widely exploited
in the real world applications is to deﬁne both matrices in diagonal form to reduce the
search space to (n+m) dimensions. Hence, as shown in 14, the particles’ position vector
is deﬁned as an (n+m) dimensional vector that represents the concatenation of diagonal
state and control weighting matrices.
⃗xi(t) = [xi,1(t), . . . , xi,n+m(t)] = [Q1,1(t), . . . , Qn,n(t), R1,1(t), . . . , Rm,m(t)]
(14)
This simple trick reduces the number of features from n2 + m2 to n + m and as a
result enhances the exploration capabilities and reduces the computational costs while
reserving the accuracy.
5.2
Evaluation Criteria
To decide the evaluation criteria, the optimization goals are categorized to hard con-
straints and soft objectives. The former category refers to those constraints that must
be satisﬁed. If a solution does not satisfy a given hard constrain, it is considered as
an infeasible solution. On the other hand, those solutions that satisfy the hard con-
straints are feasible solutions that compete within the feasible sub-space to reach the
better regions by optimizing the soft objectives. According to the deﬁnition in Section
2, Q and R matrices must be symmetric positive matrices and therefore those solutions
with negative values are considered as infeasible solutions. This is the only hard con-
straint addressed in the proposed approach. The infeasible solutions are treated using
a stochastic approach inspired by local search strategy in MA [20]. These solutions are
8
either repaired with probability Pr or debilitated by being penalized with probability
(1Pr). The repair probability is deﬁned by 15.
Pr (⃗xi(t)) = 1 −


n+m
X
j=1
H (xi,j(t))
n + m


(15)
In this equation, xi(t) denotes the position vector of particle i in time t, n and m
represent the cardinality of diagonal Q and R matrices respectively and H(.) is the
Heaviside step function that maps negative values to zero and positive values to one.
The repair probability decreases proportional to the number of negative values within
the position vector of a given particle. The repair strategy is thoroughly discussed in
Section 5.3.
Three classes of soft objectives are considered as follows. (1) A given LQR controller
must minimize the quadratic performance index J deﬁned by 3. (2) An optimal con-
troller also should minimize the time-domain control objectives that correspond to the
transient response. The transient response of a system may be described using two com-
peting factors including the swiftness and the closeness of the response to the desired
response. The swiftness of the response is measured by the rise-time Tr. The closenes-
sis measured by the overshoot OS and settling-time Ts. (3) The steady-state behavior
of the system is addressed by minimizing the steady-state error Ess. Furthermore, in
order to augment the multi-objective aspect of the evaluation criteria, each particle
is rewarded using16 which is the diﬀerence between the number of solutions that are
dominated by that particle and the number of solutions that dominate the particle.
R (⃗xi) =


X
∀x∈S,x̸=xi
xi ≺x

−


X
∀x∈S,x̸=xi
xi ≻x


(16)
The domination reward is normalized using a sigmoid function and aggregated with
the violation penalty into a uniform weighted penalty-reward factor deﬁned by 17.
fP−R (⃗xi) =

1
1 + e−|S|×R(⃗xi)

× Pr (⃗xi(t)) × φ
(17)
In this equation, fP−R denotes the penalty-reward factor and φ denotes the penalty
for violating the hard constraint. In order to aggregate the soft objectives in a multi-
objective fashion, dynamic weighted aggregation (DWA) method in exploited in which
the gradual changes (set by the change frequency F) of the weights force the particles
to keep moving on the Pareto front. For a problem with two competing objectives, the
dynamics of the weights is deﬁned by 18.
w1(t) = | sin (2πt/F) |,
w2(t) = 1 −w1(t)
(18)
It has been shown that DWA outperforms bang-bang weighted aggregation (BWA)
in case of convex Pareto front and is identical to BWA with concave Pareto front. DWA
is also competitive with population-based non-Pareto approaches such as VEGA and
VEPSO in terms of optimality while outperforms them in terms of eﬃciency [39]. The
aggregated multi-objective criterion based on DWA is deﬁned as shown in 19.
f (⃗xi) = fP−R (⃗xi) × ((w1(t) × (log10(J) + Ess)) + ((w2(t) × (OS + Ts −TR))))
(19)
9
As shown in 19, the quadratic performance index is normalized using a logarithmic
operation.
This is because the quadratic performance index is mostly much bigger
than other elements in real world applications. The dynamically aggregated objective
function shown in 19 lets the designer to decide the priority between minimizing the
control eﬀort and steady-state response, and optimizing the transient response while it
automatically satisﬁes the hard constraint of the system. In other words, it provides the
designer with a set of Pareto optimal solutions and lets her to decide the target solution
based on practical preferences.
5.3
Optimization process
The standard QPSO algorithm introduced in [52] is augmented with two mechanisms in-
cluding informed initialization and repairing strategy. Mostly, population-based heuris-
tics construct an initial population of random individuals and then exploit an iterative
optimization to push the population toward the global optima.The informed initializa-
tion refers to initializing the population based on a priori knowledge and has been shown
that is more eﬃcient than naive and random initialization [8]. In order to reinforce the
algorithm with informed initialization, simulated annealing search [19] is exploited which
is a fast search strategy that can escape the local optima due to its stochastic nature.
This algorithm optimizes the problem at hand by emulating the cooling process of a
solid until it reaches the minimum energy conﬁguration.
In this algorithm, in each
step a random neighbor of the current state is chosen. In case that the selected neigh-
bor is ﬁtter than the cur-rent state, the search continues exploring from the neighbor
state.Otherwise, it moves to the neighbor state with a probability proportional to the
current temperature and the diﬀerence between the ﬁtness values of the current and
the neighbor states. This prob-ability decreases as the temperature reduces through the
time. The proposed informed initialization based on simulated annealing is shown in
Algorithm 2.
Algorithm 2: The informed initialization strategy.
Result: returns a diverse and semi-optimal set of solutions
begin
Initialize population:⃗xi ∼U[0, Xmax],
i = 1 . . . |S|
for i = 1 to |S| do
w1, w2 ∼U[0, 1]
for t = 1 to Tini do
for d = 1 to D do
Succd (⃗xi) = min (max (N (⃗xi,d, σ) , 0) , Xmax) with probability
pSucc
Calculate f (⃗xi) and Succ (⃗xi) using 14 with current w1 and w2
∆E = f (⃗xi) −f (Succ (⃗xi))
T = e−αt
if ∆E > 0 then
f (⃗xi) ←f (Succ (⃗xi))
else
f (⃗xi) ←f (Succ (⃗xi)) with probability e∆E/T
return ⃗xi,
i = 1 . . . |S|
10
The neighbor state (Succ(xi)) is selected using Gaussian mutation operator adopted
from evolutionary computation in which N(x, σ) is a normal distribution of random
variable x with standard deviation of σ. The simulated annealing search is performed
for Tini iterations on each particle which is fairly smaller than the number of iterations
within the main optimizer. This process results in an initial swarm that is close to the
Pareto optimal. It is noteworthy that to provide the initial swarm with a good degree
of diversity,the aggregation weights are set randomly for each particle rather than using
18.
The repair strategy is introduced to map the infeasible solutions into the feasible sub-
space. As mentioned in Section 5.2, a particle with negative values within its position
vectors violates the described hard constraint and is considered as an infeasible solution.
A naive repair strategy is to randomly replace the negative values with random positive
values. A more profound strategy is to exploit the strong feasible solutions to guide the
infeasible particles to those regions of feasible sub-space that are more likely to be close to
the Pareto optimal. For this purpose, the cross-over and tournament selection operations
are adopted from evolutionary computations. The repair strategy is as follows. For
each infeasible particle, two feasible particles that are strongest in a randomly selected
neighborhood are selected. Then, for each negative value embedded within the infeasible
particles, the corresponding values of two selected particles are combined in a linear
weighted manner. The negative value of the particle is then overwritten by the resulted
value. The algorithm of repair strategy is shown in Algorithm 3.
Algorithm 3: The repair strategy to map the infeasible particles to the feasible
sub-space.
Data: ⃗xi : position of infeasible particle i
Result: returns a feasible solution
begin
⃗p1, ⃗p2 ←Tournament-Selection()
for d = 1 to D do
λ ∼U[0, 1]
if ⃗xi,d < 0 then
⃗xi,d = λ⃗p1,d + (1 −λ) ⃗p2,d with probability 0.5
otherwise ⃗xi,d = (1 −λ) ⃗p1,d + λ⃗p2,d
return ⃗xi
Using informed initialization and repair strategy, we introduce our proposed rein-
forced MO-QPSO algorithm for optimal tuning of LQR controllers in Algorithm 4.
In this algorithm, |T|, |S|, and D denote the number of iterations, swarm size, and
the number of diagonal elements of the weighting matrices, respectively. As shown, the
proposed algorithm returns all non-dominant particles which build the Pareto optimal
set. These Pareto optimal solutions are then presented to the designer and she decides
the ﬁnal solution based on some domain-dependent and practical considerations and
her intuitions and expertise. The schematic of the integration of proposed RMO-QPSO
algorithm into the closed-loop system is illustrated in Figure 1.
11
Algorithm 4: Proposed reinforced RMO-QPSO algorithm for optimal tuning of
LQR controllers.
Result: returns set of optimal diagonal Q and R matrices
begin
Informed-Initialization(): Algorithm 2
for t = 1 to |T| do
for i = 1 to |S| do
Calculate P from Riccati equation: 5
Calculate the feedback gain K: 4
Simulate the closed-loop system: 2
Calculate ﬁtness f (⃗xi(t)): 19
Repair (⃗xi(t)) with probability Pr (⃗xi(t)) : 15 and Algorithm 3
Update personal best ⃗xi,pbest(t): 10
Update global best ⃗xgbest(t): 11
cp, cg ∼U[0, 1]
Compute the local attractor using ⃗pi(t): 12
for i = 1 to |S| do
u ∼U[0, 1]
L = (1/g) × |⃗xi,d(t) −⃗pi,d(t)|
⃗xi,d(t) = ⃗pi,d(t) −L × ln (1/u) with probability 0.5
otherwise ⃗xi,d(t) = ⃗pi,d(t) + L × ln (1/u)
return all non-dominant particles
Plant
B
A
C
  
D
RMO-QPSO
K
+
+
+
y
x
u
ẋ
 
+
-
ur
+
Figure 1: Schematic of the proposed RMO-QPSO within the closed-loop system.
12
θ 
lcosθ 
l
u
x
y
lsinθ 
mg
M
w
h
u
q
θ 
µ
 
δ
 
(a)
(b)
Figure 2: Schematics of evaluation benchmarks. (a) Inverted pendulum system with
two degrees of freedom. (b) Aircraft landing ﬂare system.
6
Experimental results
6.1
Modeling
Without loss of generality, two classic control benchmarks are used for evaluating the
introduced method.
These benchmarks are an inverted pendulum system with two
degrees of freedom and an aircraft landing ﬂare system. The schematic of these two
systems is illustrated in Figure 2.
The inverted pendulum system consists of a pendulum mounted on a movable cart
which is restricted to linear motion. The control objective is to maintain the unstable
equilibrium position by moving the cart along a horizontal track. The pendulum initially
starts in an upright position with two acting forces including a vertical gravity force (mg)
and an external horizontal force u. The state space representation of the system derived
from Lagrange equation and linearized using Taylor series expansion is depicted in 20.


˙x
˙υ
˙θ
˙ω

=


0
1
0
0
0
0
−mg
M
0
0
0
0
1
0
0
g(m+M)
lM
0

×


x
υ
θ
ω

+


0
1
M
0
−1
lM

×

u

(20)
In this equation, x denotes the position of the cart and v is its linear velocity, θ is the
angle of the pendulum with respect to the vertical direction, and ω is its angular velocity.
The simulation parameters are set as follows. The mass of the cart and pendulum are
set to 0.5 kg and 0.2 kg, respectively. The length of the pendulum and the gravitational
acceleration are set to 0.6 m and 9.81 ms2, respectively. The initial condition of the
system is deﬁned as [x(0)υ(0)θ(0)ω(0)]T = [0009]T . The second system is an aircraft
landing ﬂare system with the simpliﬁed motion shown in 21.
13


˙u
˙w
˙q
˙θ
˙h
˙e


=


0.058
0.065
0
0.171
0
1
0.303
0.685
1.109
0
0
0
0.072
0.685
0.947
0
0
0
0
0
1
0
0
0
0
−1
0
1.133
0
0
0
0
0
0
0
0.571


×


u
w
q
θ
h
e


+


0
0
0.119
0.054
0
0.074
1.117
0
0.115
0
0
0
0
0
0
0
0.571
0


×


µ
γ
δ


(21)
In this equation, u denotes the longitudinal velocity, w presents the vertical velocity,
q is the angular velocity about pitch with respect to the ground, θ denotes the pitch with
respect to the ground, h is the height, e is the forward acceleration caused by throttle
action, µ is the elevator angle, δ is the spoiler angle, and γ is the throttle value. The
simulated landing trajectory is described by h′ + 0.2h = 0 and the initial condition is
deﬁned as [u(0)w(0)q(0)θ(0)h(0)e(0)]T = [52.513150.5]T . Considering the dynamics of
this system, we use the integrated absolute error(IAE) measure rather than steady-state
error. Hence, in 19 we replace the Ess term with IAE measure. Using landing trajectory
and considering h′ = w + 1.133 × θ from 21, the IAE measure is computed using 22.
IAE =
Z tf
0
|e(t)|dt =
Z tf
0
|w(t) + 1.133 × θ + 0.2 × h(t)|dt
(22)
6.2
Experimental Setup with Sensitivity Analysis
In order to conduct comparative studies on the performance of the proposed algorithm,
the two benchmark systems are stabilized using the trail-and-error, LM, GA, DE, ABC,
PSO, CPSO, AIWPSO, and RMO-QPSO methods. In order to eliminate the eﬀect of
bias (i.e.sensitivity of the meta-heuristics to their parameter setting) in the comparisons,
sensitivity analysis based on full factor design [29] is utilized. Although full factorial
design is not feasible for systems with high degrees of freedom such as NN but it is
completely tractable for most of the meta-heuristics as there are a few parameters to
be decided.
In order to carry out this task, each system is stabilized using a ﬁxed
set of parameter values for ﬁve times and the performance criterion is averaged. This
process is repeated for all possible parameter settings and as soon as all the settings
are evaluated, the setting that has resulted in the best performance is selected as the
parameter setting for the meta-heuristic at hand. In order to generate the all possible
parameter settings, a systematic and incremental approach is utilized. For each given
parameter, a predetermined range is deﬁned and then starting from the lower boundary,
the value of the parameter is incremented by a deﬁned step size until it reaches the
upper boundary. These ranges and step sizes are deﬁned as follows.
The range of iterations T for all competing methods except trail-and-error is set to
[25,150] with step size of 25. For LM optimizer, the sensitivity analysis is performed on
the blending factor λ within the range of [5,15] and step size of 1. For population-based
meta-heuristics, the range of population size |S| is set to[10,100] with increment factor
of 10. For the GA, single point uniform cross-over, tournament selection mechanism
with the neighborhood radius of 5 chromosomes, and Gaussian mutation operators are
utilized. The cross-over probability pc is deﬁned in the range of [0.3, 0.9] with step size
of 0.1 and the mutation probability pm is deﬁned in the range of [0.05, 0.3] with the
step size of 0.05. Considering that contrary to GA, DE mostly relies on the mutation
operator, the parameter setting for DE is deﬁne by switching the ranges of the cross-over
14
Table 1: Acquired parameter settings for simulations by sensitivity analysis.
Method
Parameter setting
Trail-error
—
LM
λ ←10,
T ←150
GA
|S| = 70,
T ←150,
pc ←0.9,
pm ←0.1
DE
|S| = 50,
T ←150,
pc ←0.15,
pm ←0.8
ABC
|S| = 30,
T ←175
PSO
|S| = 40,
T ←50,
cp ←0.7,
cg ←1.5
CPSO
|S| = 20,
T ←175
AIWPSO
|S| = 40,
T ←70,
wmin ←0.05,
wmax ←0.95
MO-QPSO
|S| = 20,
T ←75,
g ←1.5ln
√
2
and mutation probabilities used in GA. The ABC setting is as follows. The percentage
of onlooker bees is set to 50% of the colony, the employed bees are set to 50% of the
colony and the number of scout bees is selected as one. In the PSO algorithm, the range
of the both cognitive and social parameters is set to [0.5, 2.0] with a step size of 0.1. The
CPSO and AIWPSO settings are identical to the PSO setting. The chaotic sequence
ofCPSO is generated by logistic maps. Finally, the control parameter of RMO-QPSO
g is set to the range of [0,1]. As suggested in literature [52], the control parameter is
updated by coeﬃcient of ln
√
2.The results of parameter settings for diﬀerent optimizers
using full factorial sensitivity analysis are summarized in Table 1. As shown, the best
parameter settings for two benchmarks are selected using full factorial design. These
acquired parameter settings will mitigate the bias sensitivity of the opponent methods
and will result in fair comparisons.
6.3
Simulation Results
The nine competitive methods are applied to stabilize the two benchmarks based on
the parameter settings decided by sensitivity analysis. The trial-and-error method and
the MA optimizer are run for one time as they do not have stochastic nature. On the
other hand, the stochastic meta-heuristics are run for ﬁfty times to approximate their
behavior based on the acquired averages and standard deviations. The results are shown
in Table 2.
As shown in Table 2, in case of inverted pendulum system,results indicate that all of
the methods have successfully eliminated the steady-state error. In terms of overshoot
the trial-and-error out-performs other methods. This is due to the slow response of
the designed controller by expert. In terms of the rise-time RMO-QPSO and CSPO
outperform other methods where as in terms of settling-time RMO-QPSO reaches the
steady state faster than others. Also,RMO-QPSO and DE reach the best quadratic
performance index.In controlling the ﬂight landing system, RMO-QPSO outperforms
other techniques in terms of integrated absolute error, overshoot,rise-time and settling
time. In terms of minimizing the quadratic index CPSO and RMO-QPSO perform best.
The results suggest in the inverted pendulum system, RMO-QPSO enhances the
rising time, settling time, and the quadratic performance index by 4%, 8%, and 2%,
respectively in comparison to the second best optimizer.
Considering the fairly fast
dynamics of the inverted pendulum system, these enhancements can boost the stabi-
lization process in turn. Also, the results suggest that for the ﬂight landing system,
15
Table 2: Acquired results from stabilizing two benchmark problems (inverted pendulum
on the left and ﬂight landing on the right).
Optimizer
Ess
OS
Tr
Ts
J
IAE
OS
Tr
Ts
J
Trial-error
Mean
0
0.23
0.46
2.09
26,000
31.45
0.22
9.25
18.23
24,500
SD
–
–
–
–
–
–
–
–
–
–
LM
Mean
0
0.51
0.49
1.84
24,200
26.08
0.21
9.03
14.32
26,700
SD
–
–
–
–
–
–
–
–
–
–
GA
Mean
0
0.85
0.54
1.19
22,000
20.32
0.19
6.53
12.31
24,200
SD
0
0.06
0.07
0.12
740.9
4.58
0.07
1.46
2.34
512.5
DE
Mean
0
0.62
0.46
1.11
20,500
14.35
0.11
2.51
11.22
24,220
SD
0
0.04
0.04
0.08
230.5
5.26
0.04
0.84
0.98
251.3
ABC
Mean
0
0.43
0.47
1.13
20,570
10.78
0.17
4.76
11.84
20,550
SD
0
0.02
0.06
0.08
268.0
1.56
0.04
0.79
1.87
98.5
PSO
Mean
0
0.89
0.48
1.11
21,600
11.21
0.18
5.23
11.97
22,000
SD
0
0.07
0.09
0.06
231.6
3.35
0.06
0.62
1.36
112.0
CPSO
Mean
0
0.46
0.43
1.15
20,700
10.06
0.19
4.89
11.93
19,300
SD
0
0.02
0.02
0.07
212.3
3.24
0.11
0.97
0.76
104.3
AIWPSO
Mean
0
0.59
0.47
1.20
20,850
9.64
0.15
3.91
11.46
20,580
SD
0
0.07
0.05
0.07
225.0
2.01
0.05
1.12
1.12
115.6
RMO-QPSO
Mean
0
0.67
0.43
1.02
20,500
8.89
0.07
1.42
7.27
19,300
±
0
0.10
0.09
0.70
220.1
6.21
0.23
1.03
2.32
472.1
RMO-QPSO enhances the integrated absolute error, over-shoot, rising time, settling
time, and the quadratic performance index by 8%, 50%, 70%, 50%, and 6%, respec-
tively. It is noteworthy that due to slower dynamics of the ﬂight landing system, the
proposed algorithm has resulted in greater enhancements in comparison with the in-
verted pendulum system. All in all, the results suggest that RMO-QPSO outperforms
trail-and-error,gradient-based and stochastic techniques. It is also shown that ABC,DE,
CPSO and AIWPSO achieve better performance in comparison to trial-and-error, ML,
GA and PSO.
In order to prove that the results shown in Table 2 carry statistical signiﬁcance,
pairwise one-tailed t-test is performed between the results of the RMO-QPSO and the
other stochastic methods.
The results of the t-test are summarized in Table 3.
In
a standard t-test, the null hypothesis is rejected if the corresponding p-value is less
than 0.05. As shown in Table 3,the t-tests performed among the results of diﬀerent
objective of RMO-QPSO and opponent stochastic methods reveals that all p-values are
smaller than the threshold value. In other words, all the performed t-tests reject the
null hypothesis and hence the simulation results are in fact statistically valid.
The responses of the pendulum’s angle and its angular velocity to the initial condi-
tions are depicted in Figure 3. The responses of the spoiler angle, elevator angle, and
throttle value of the ﬂight landing system to the initial conditions are illustrated in Fig-
ure 4. It is noteworthy that for the purpose of readability, the diagrams only illustrate
the responses of the systems under the control of four best performed methods.
16
Table 3: Acquired results from stabilizing two benchmark problems (inverted pendulum
on the left and ﬂight landing on the right).
Optimizer
OS
Tr
Ts
J
IAE
OS
Tr
Ts
J
GA
p-value
0.02
0.01
0.00
0.00
0.00
0.01
0.00
0.02
0.01
DE
p-value
0.01
0.00
0.00
0.00
0.03
0.00
0.02
0.00
0.01
ABC
p-value
0.00
0.02
0.04
0.03
0.00
0.03
0.02
0.04
0.00
PSO
p-value
0.00
0.02
0.00
0.00
0.02
0.04
0.00
0.03
0.00
CPSO
p-value
0.00
0.00
0.04
0.01
0.00
0.03
0.01
0.00
0.04
AIWPSO
p-value
0.03
0.04
0.00
0.00
0.02
0.00
0.00
0.01
0.00
7
Conclusion
In this paper, a novel and generic multi-objective design paradigm is proposed which
utilizes a global convergence guaranteed variation of particle swarm optimization (PSO)
called quantum-behaved PSO (QPSO) for deciding the optimal conﬁguration of the
LQR controller for a given problem considering a set of competing objectives. There
are three main contributions introduced in this paper as follows.
(1) The standard
QPSO algorithm is reinforced with an informed initialization scheme based on the sim-
ulated annealing algorithm and Gaussian neighborhood selection mechanism. (2) It is
also augmented with a local search strategy which integrates the advantages of memetic
algorithm into conventional QPSO. (3) Finally, an aggregated dynamic weighting crite-
rion is introduced that dynamically combines the soft and hard constraints with control
objectives to provide the designer with a set of Pareto optimal solutions and lets her
to decide the target solution based on practical preferences.
As far as the authors’
knowledge is concerned, this is the ﬁrst time that a multi-objective derivation of PSO
is applied for deciding the conﬁguration of LQR controllers.
The proposed method is compared against a gradient-based method, seven meta-
heuristics, and the trial-and-error method on two control benchmarks including inverted
pendulum system and ﬂight landing system using sensitivity analysis and full factorial
parameter selection and the results are validated using one-tailed T-test. The exper-
imental results suggest that in inverted pendulum system, RMO-QPSO enhances the
rising time, settling time, and the quadratic performance index by 4%, 8%, and 2%,
respectively in comparison to the second best optimizer. Also, the results suggest that
for the ﬂight landing system, RMO-QPSO enhances the integrated absolute error, over-
shoot, rising time, settling time, and the quadratic performance index by 8%, 50%, 70%,
50%, and 6%, respectively. All in all, the results suggest that RMO-QPSO outperforms
trail-and-error, gradient-based and stochastic techniques. It is also shown that ABC,
DE, CPSO and AIWPSO achieve better performance in comparison to trial-and-error,
ML, GA and PSO. A possible direction for future studies is to enhance the RMO-QPSO
in a way that it can address the online optimization of the LQR based on the signal
stream.
17
1.0
0.6
0.4
0.2
0
-0.2
0.8
Angle
Time
0
1
2
3
4
5
6
DE
CPSO
ABC
MO-QPSO
10
6
4
2
0
-2
8
Angular velocity 
Time
0
1
2
3
4
5
6
DE
CPSO
ABC
MO-QPSO
(a)
(b)
Figure 3: The responses of inverted pendulum system to the initial conditions.
(a)
Response of the angle. (b) Response of the angular velocity.
-5
-10
-15
-20
-25
0
Spoiler angle
Time
0
5
10
15
20
25
30
ABC
AIWPSO
DE
MO-QPSO
-7
0
-1
-2
-3
-4
2
1
-5
-6
Elevator angle
Time
0
5
10
15
20
25
30
ABC
AIWPSO
DE
MO-QPSO
10
5
0
-5
15
Throttle value
Time
0
5
10
15
20
25
30
(a)
(b)
(c)
ABC
AIWPSO
DE
MO-QPSO
Figure 4: The responses of the ﬂight landing system to the initial conditions.
(a)
Response of the spoiler angle. (b) Response of the elevator angle. (c) Response of the
throttlevalue.
18
References
[1] Bahriye Akay. A study on particle swarm optimization and artiﬁcial bee colony
algorithms for multilevel thresholding. Applied Soft Computing, 13(6):3066–3091,
2013.
[2] Mohammed R AlRashidi and Mohamed E El-Hawary. A survey of particle swarm
optimization applications in electric power systems. IEEE Transactions on Evolu-
tionary Computation, 13(4):913–918, 2009.
[3] Fereidoun Amini, N Khanmohammadi Hazaveh, and A Abdolahi Rad. Wavelet pso-
based lqr algorithm for optimal structural control using active tuned mass dampers.
Computer-Aided Civil and Infrastructure Engineering, 28(7):542–557, 2013.
[4] Akhilesh Chander, Amitava Chatterjee, and Patrick Siarry. A new social and mo-
mentum component adaptive pso algorithm for image segmentation. Expert Systems
with Applications, 38(5):4998–5004, 2011.
[5] Sun Changhao and Haibin Duan.
Artiﬁcial bee colony optimized controller for
unmanned rotorcraft pendulum. Aircraft Engineering and Aerospace Technology,
85(2):104–114, 2013.
[6] Pinar Civicioglu and Erkan Besdok. A conceptual comparison of the cuckoo-search,
particle swarm optimization, diﬀerential evolution and artiﬁcial bee colony algo-
rithms. Artiﬁcial Intelligence Review, 39(4):315–346, 2013.
[7] Maurice Clerc and James Kennedy. The particle swarm-explosion, stability, and
convergence in a multidimensional complex space. IEEE transactions on Evolu-
tionary Computation, 6(1):58–73, 2002.
[8] Dipankar Dasgupta, German Hernandez, Deon Garrett, Pavan Kalyan Vejandla,
Aishwarya Kaushal, Ramjee Yerneni, and James Simien. A comparison of multi-
objective evolutionary algorithms with informed initialization and kuhn-munkres
algorithm for the sailor assignment problem. In Proceedings of the 10th annual
conference companion on Genetic and evolutionary computation, pages 2129–2134.
ACM, 2008.
[9] Ali Douik, Liouane Hend, and Hassani Messaoud. Optimised eigenstructure assign-
ment by ant system and lqr approaches. IJCSA, 5(4):45–56, 2008.
[10] Russ C Eberhart, James Kennedy, et al. A new optimizer using particle swarm
theory. In Proceedings of the sixth international symposium on micro machine and
human science, volume 1, pages 39–43. New York, NY, 1995.
[11] Emad Elbeltagi, Tarek Hegazy, and Donald Grierson.
Comparison among ﬁve
evolutionary-based optimization algorithms.
Advanced engineering informatics,
19(1):43–53, 2005.
[12] Gao Fei, Li Zhuo-Qiu, and Tong Heng-Qing.
Parameters estimation online for
lorenz system by a novel quantum-behaved particle swarm optimization. Chinese
Physics B, 17(4):1196, 2008.
19
[13] S Amir Ghoreishi and Mohammad Ali Nekoui. Optimal weighting matrices de-
sign for lqr controller based on genetic algorithm and pso. In Advanced Materials
Research, volume 433, pages 7546–7553. Trans Tech Publ, 2012.
[14] Liu Guoping, Xiao Genfu, and Yang Xiaohui. The lqr real-time control for the
inverted pendulum based on pso. In Electrical and Control Engineering (ICECE),
2010 International Conference on, pages 2363–2366. IEEE, 2010.
[15] Charles Harvey and Gunter Stein.
Quadratic weights for asymptotic regulator
properties. IEEE Transactions on Automatic Control, 23(3):378–387, 1978.
[16] Kaveh Hassani and Won-Sook Lee. An incremental framework for classiﬁcation
of eeg signals using quantum particle swarm optimization. In 2014 IEEE Inter-
national Conference on Computational Intelligence and Virtual Environments for
Measurement Systems and Applications (CIVEMSA), pages 40–45. IEEE, 2014.
[17] Kaveh Hassani and Won-Sook Lee. Optimal tuning of linear quadratic regulators
using quantum particle swarm optimization. In Proceedings of the International
Conference on Control, Dynamic Systems, and Robotics (CDSR14), 2014.
[18] Wilmar Hernandez and Norberto Ca˜nas.
Non-linear control of an autonomous
ground vehicle. In IECON 2011-37th Annual Conference on IEEE Industrial Elec-
tronics Society, pages 2678–2683. IEEE, 2011.
[19] Lester Ingber.
Simulated annealing: Practice versus theory.
Mathematical and
computer modelling, 18(11):29–57, 1993.
[20] Hisao Ishibuchi, Tadashi Yoshida, and Tadahiko Murata. Balance between genetic
search and local search in memetic algorithms for multiobjective permutation ﬂow-
shop scheduling.
IEEE transactions on evolutionary computation, 7(2):204–223,
2003.
[21] Wisnu Jatmiko, Kosuke Sekiyama, and Toshio Fukuda. A pso-based mobile robot
for odor source localization in dynamic advection-diﬀusion with obstacles envi-
ronment: theory, simulation and measurement. IEEE Computational Intelligence
Magazine, 2(2):37–51, 2007.
[22] Tor Arne Johansen, Idar Petersen, Jens Kalkkuhl, and Jens Ludemann.
Gain-
scheduled wheel slip control in automotive brake systems. IEEE Transactions on
Control Systems Technology, 11(6):799–811, 2003.
[23] MA Johnson and MJ Grimble. Recent trends in linear optimal quadratic multivari-
able control system design. In IEE Proceedings D-Control Theory and Applications,
volume 134, pages 53–71. IET, 1987.
[24] Srinivas Bhaskar Karanki, Mahesh K Mishra, and B Kalyan Kumar. Particle swarm
optimization-based feedback controller for uniﬁed power-quality conditioner. IEEE
transactions on power delivery, 25(4):2814–2824, 2010.
[25] Bachir Kedjar and Kamal Al-Haddad. Dsp-based implementation of an lqr with
integral action for a three-phase three-wire shunt active power ﬁlter. IEEE Trans-
actions on Industrial Electronics, 56(8):2821–2828, 2009.
20
[26] Arsalan H Khan, Zhang Weiguo, Shi Jingping, and Zeashan H Khan. Optimized
reconﬁgurable modular ﬂight control design using swarm intelligence.
Procedia
Engineering, 24:621–628, 2011.
[27] Raghavendra V Kulkarni and Ganesh Kumar Venayagamoorthy. Particle swarm
optimization in wireless-sensor networks: A brief survey. IEEE Transactions on
Systems, Man, and Cybernetics, Part C (Applications and Reviews), 41(2):262–
267, 2011.
[28] E Vinodh Kumar and Jovitha Jerome. Lqr based optimal tuning of pid controller
for trajectory tracking of magnetic levitation system. Procedia Engineering, 64:254–
264, 2013.
[29] Yooyoung Lee, James J Filliben, Ross J Micheals, and P Jonathon Phillips. Sensitiv-
ity analysis for biometric systems: A methodology based on orthogonal experiment
designs. Computer Vision and Image Understanding, 117(5):532–550, 2013.
[30] Xiujuan Lei and Ali Fu. Two-dimensional maximum entropy image segmentation
method based on quantum-behaved particle swarm optimization algorithm. In 2008
Fourth International Conference on Natural Computation, volume 3, pages 692–696.
IEEE, 2008.
[31] Frank L Lewis and Vassilis L Syrmos. Optimal control. John Wiley & Sons, 1995.
[32] Shouyi Li, Ronggui Wang, Weiwei Hu, and Jianqing Sun. A new qpso based bp
neural network for face detection. In Fuzzy information and engineering, pages
355–363. Springer, 2007.
[33] Yong Li, Jianchang Liu, and Yu Wang. Design approach of weighting matrices for
lqr based on multi-objective evolution algorithm. In Information and Automation,
2008. ICIA 2008. International Conference on, pages 1188–1192. IEEE, 2008.
[34] Hend Liouane, Ibtissem Chiha, Ali Douik, and Hassani Messaoud. Probabilistic
diﬀerential evolution for optimal design of lqr weighting matrices. In 2012 IEEE
International Conference on Computational Intelligence for Measurement Systems
and Applications (CIMSA) Proceedings, pages 18–23. IEEE, 2012.
[35] Madhubanti Maitra and Amitava Chatterjee. A hybrid cooperative–comprehensive
learning based pso algorithm for image segmentation using multilevel thresholding.
Expert Systems with Applications, 34(2):1341–1350, 2008.
[36] Mohammad Ali Nekoui and Hassan Heidari Jame Bozorgi. Weighting matrix se-
lection method for lqr design based on a multi-objective evolutionary algorithm.
In Advanced Materials Research, volume 383, pages 1047–1054. Trans Tech Publ,
2012.
[37] Arfah Syahida Mohd Nor, Hazlina Selamat, and Ahmad Jais Alimin.
Optimal
controller design for a railway vehicle suspension system using particle swarm op-
timization. Jurnal Teknologi, 54(1):71–84, 2012.
[38] Carlos Olalla, Ramon Leyva, Abdelali El Aroudi, and Isabelle Queinnec. Robust
lqr control for pwm converters: an lmi approach. IEEE Transactions on industrial
electronics, 56(7):2548–2558, 2009.
21
[39] Konstantinos E Parsopoulos and Michael N Vrahatis. Particle swarm optimization
method in multiobjective problems. In Proceedings of the 2002 ACM symposium
on Applied computing, pages 603–607. ACM, 2002.
[40] M Bayati Poodeh, S Eshtehardiha, and M Namnabat. Optimized state controller
on dc-dc converter. In 2007 7th Internatonal Conference on Power Electronics,
pages 153–158. IEEE, 2007.
[41] Jim Pugh and Alcherio Martinoli. Inspiring and modeling multi-robot search with
particle swarm optimization. In 2007 IEEE Swarm Intelligence Symposium, pages
332–339. IEEE, 2007.
[42] Elyas Rakhshani. Intelligent linear-quadratic optimal output feedback regulator for
a deregulated automatic generation control system. Electric Power Components and
Systems, 40(5):513–533, 2012.
[43] SA Panimadai Ramaswamy, Ganesh K Venayagamoorthy, and SN Balakrishnan.
Optimal control of class of non-linear plants using artiﬁcial immune systems: Ap-
plication of the clonal selection algorithm. In 2007 IEEE 22nd International Sym-
posium on Intelligent Control, pages 249–254. IEEE, 2007.
[44] Mehrdad Saif. Optimal linear regulator pole-placement by weight selection. Inter-
national Journal of Control, 50(1):399–414, 1989.
[45] Yahaya Md Sam, Mohd Ruddin Hj Ab Ghani, and Nasarudin Ahmad. Lqr controller
for active car suspension. In TENCON 2000. Proceedings, volume 1, pages 441–444.
IEEE, 2000.
[46] Andrey V Savkin, PUBUDU N Pathirana, and FARHAN A Faruqi. Problem of
precision missile guidance: Lqr and h control frameworks. IEEE Transactions on
Aerospace and Electronic Systems, 39(3):901–910, 2003.
[47] Andrey V Savkin, PUBUDU N Pathirana, and FARHAN A Faruqi. Problem of
precision missile guidance: Lqr and h control frameworks. IEEE Transactions on
Aerospace and Electronic Systems, 39(3):901–910, 2003.
[48] Mahmud Iwan Solihin, Rini Akmeliawati, et al. Pso-based optimization of state
feedback tracking controller for a ﬂexible link manipulator. In Soft Computing and
Pattern Recognition, 2009. SOCPAR’09. International Conference of, pages 72–76.
IEEE, 2009.
[49] Mahmud Iwan Solihin, Ari Legowo, Rini Akmeliawati, et al. Comparison of lqr and
pso-based state feedback controller for tracking control of a ﬂexible link manipula-
tor. In Information Management and Engineering (ICIME), 2010 The 2nd IEEE
International Conference on, pages 354–358. IEEE, 2010.
[50] Scott R Starin, RK Yedavalli, and Andrew G Sparks. Design of a lqr controller
of reduced inputs for multiple spacecraft formation ﬂying. In American Control
Conference, 2001. Proceedings of the 2001, volume 2, pages 1327–1332. IEEE, 2001.
[51] Jun Sun, Wei Fang, Xiaojun Wu, Vasile Palade, and Wenbo Xu. Quantum-behaved
particle swarm optimization: Analysis of individual particle behavior and parameter
selection. Evolutionary computation, 20(3):349–393, 2012.
22
[52] Jun Sun, Bin Feng, and Wenbo Xu. Particle swarm optimization with particles
having quantum behavior. In Congress on Evolutionary Computation, 2004.
[53] Jun Sun, Wenbo Xu, and Bin Feng. A global search strategy of quantum-behaved
particle swarm optimization. In Cybernetics and Intelligent Systems, 2004 IEEE
Conference on, volume 1, pages 111–116. IEEE, 2004.
[54] Jun Sun, Wenbo Xu, and Bin Feng.
Adaptive parameter control for quantum-
behaved particle swarm optimization on individual level. In 2005 IEEE Interna-
tional Conference on Systems, Man and Cybernetics, volume 4, pages 3049–3054.
IEEE, 2005.
[55] Chin-Wang Tao, Jin-Shiuh Taur, and YC Chen. Design of a parallel distributed
fuzzy lqr controller for the twin rotor multi-input multi-output system. Fuzzy Sets
and Systems, 161(15):2081–2103, 2010.
[56] Shang-Jeng Tsai, Chih-Li Huo, Ying-Kuei Yang, and Tsung-Ying Sun. Variable
feedback gain control design based on particle swarm optimizer for automatic ﬁghter
tracking problems. Applied Soft Computing, 13(1):58–75, 2013.
[57] J Tyler and F Tuteur. The use of a quadratic performance index to design multi-
variable control systems. IEEE Transactions on Automatic Control, 11(1):84–92,
1966.
[58] Frans Van Den Bergh.
An analysis of particle swarm optimizers.
PhD thesis,
University of Pretoria, 2006.
[59] P Harsha Vardhana, B Kalyan Kumar, and Mahesh Kumar. A robust controller
for dstatcom. In 2009 International Conference on Power Engineering, Energy and
Electrical Drives, pages 546–551. IEEE, 2009.
[60] Lukasz Wiklendt, Stephan Chalup, and Rick Middleton. A small spiking neural net-
work with lqr control applied to the acrobot. Neural Computing and Applications,
18(4):369–375, 2009.
[61] Chaiporn Wongsathan and Chanapoom Sirima. Application of ga to design lqr
controller for an inverted pendulum system. In Robotics and Biomimetics, 2008.
ROBIO 2008. IEEE International Conference on, pages 951–954. IEEE, 2009.
[62] Xin Xiong and Zhou Wan. The simulation of double inverted pendulum control
based on particle swarm optimization lqr algorithm. In 2010 IEEE International
Conference on Software Engineering and Service Sciences, pages 253–256. IEEE,
2010.
[63] Xian Qiang Zeng, Lan Jing, Ze En Yao, and Yu Hui Guo. A pso-based lqr controller
for accelerator pwm power supply. In Advanced Materials Research, volume 490,
pages 71–75. Trans Tech Publ, 2012.
[64] Jing Zhang, Lixiang Zhang, and Jing Xie. Application of memetic algorithm in
control of linear inverted pendulum. In 2011 IEEE International Conference on
Cloud Computing and Intelligence Systems, pages 103–107. IEEE, 2011.
23
[65] Zhang Zhisheng. Quantum-behaved particle swarm optimization algorithm for eco-
nomic load dispatch of power system. Expert Systems with Applications, 37(2):1800–
1803, 2010.
24