Research Article
An SQP Method for Solving the Nonlinear Minimax Problem
Department of Computational Science and Mathematics, Guilin Institute of Electronic Technology, Guilin 541004, People's Republic of China
The minimax optimization problem can be stated as
min Mf(x), x ∈ Rn,  where Mf(x) = max{fj(x), j = 1~m}   (1)
where fj(j=1~m) are real-valued functions defined on Rn.
The objective function Mf(x) has discontinuous first partial derivatives at points where two or more of the functions fj(x) attain the value Mf(x), even if all fj(x) (j=1~m) have continuous first partial derivatives; for example, f1(x) = x and f2(x) = −x give Mf(x) = |x|, which is not differentiable at x = 0. Thus the well-known gradient methods cannot be applied directly to minimize Mf(x). Excellent work on this topic and its applications has been reported in [2-7].
The minimax problem (1) is equivalent to the following nonlinear programming:
min z  s.t. fj(x) − z ≤ 0, j = 1~m, (x, z) ∈ Rn×R   (2)
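As a concrete illustration of this equivalence (our illustration, not part of the original paper), the smooth program (2) can be handed directly to a general-purpose NLP solver. The sketch below uses SciPy's SLSQP on a small hypothetical instance; the function `solve_minimax` and the two quadratic test functions are ours.

```python
# Sketch (ours, not from the paper): solve min_x max_j f_j(x) through the
# smooth reformulation (2), min_{(x,z)} z s.t. f_j(x) <= z, with SciPy's SLSQP.
import numpy as np
from scipy.optimize import minimize

def solve_minimax(fs, x0):
    n = len(x0)
    v0 = np.append(x0, max(f(x0) for f in fs))            # variables v = (x, z)
    cons = [{'type': 'ineq', 'fun': (lambda v, f=f: v[n] - f(v[:n]))}
            for f in fs]                                   # z - f_j(x) >= 0
    res = minimize(lambda v: v[n], v0, constraints=cons, method='SLSQP')
    return res.x[:n], res.x[n]                             # x*, Mf(x*)

# Hypothetical instance: minimax of two shifted paraboloids.
fs = [lambda x: x[0]**2 + x[1]**2,
      lambda x: (x[0] - 2.0)**2 + x[1]**2]
x_star, z_star = solve_minimax(fs, np.array([5.0, 3.0]))
print(x_star, z_star)   # both functions are equal at the solution, near (1, 0)
```

At the solution both functions are active, which is exactly the situation in which Mf loses differentiability and the reformulation (2) pays off.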
According to (2), the K-T condition of (1) is obtained as follows:
∑j∈I uj∇fj(x) = 0,  ∑j∈I uj = 1,  uj ≥ 0,  uj(fj(x) − Mf(x)) = 0, j ∈ I   (3)
where I = {1, 2, ..., m} and u = (u1, ..., um) is the corresponding multiplier vector.
The active set is denoted as follows:
I(x) = {j ∈ I : fj(x) = Mf(x)}   (4)
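In floating-point arithmetic the equality in (4) can only be identified up to a tolerance; a small hypothetical helper (ours, not the paper's) illustrates the definition:

```python
# Hypothetical helper (ours): the active set (4), identified with a tolerance.
def active_set(fs, x, eps=1e-8):
    """Indices j with f_j(x) within eps of Mf(x) = max_j f_j(x); cf. (4)."""
    vals = [f(x) for f in fs]
    Mf = max(vals)
    return [j for j, v in enumerate(vals) if Mf - v <= eps]

# With f1(x) = x^2 and f2(x) = 2 - x, both functions attain Mf(1) = 1 at x = 1.
fs = [lambda x: x**2, lambda x: 2.0 - x]
print(active_set(fs, 1.0))   # -> [0, 1]
print(active_set(fs, 0.0))   # -> [1]   (Mf(0) = 2 is attained only by f2)
```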
Many approaches have been proposed for solving problem (1); most of them transform the minimax problem into the nonlinear programming problem (2) and solve it by well-established methods. Vardi[8] uses slack variables to handle the inequality constraints of (2) and then solves only an equality-constrained minimization problem, taking advantage of a trust-region strategy.
Charalambous and Conn[9] propose an algorithm that solves the minimax problem directly, in which two distinct search directions must be computed. The first, the horizontal direction, attempts to reduce Mf(x) while, at the same time, keeping those functions whose values are close to Mf(x) approximately equal. The second, the vertical direction, attempts to decrease, by means of linearization, the error to within which those functions are equal to Mf(x). Then, under the assumptions that the minimax solution is unique, that fj (j=1~m) are convex and twice differentiable and that the strict complementarity condition is satisfied, global convergence is proven. However, the computational effort is large and the rate of convergence is at most linear.
Due to their superlinear convergence rate, sequential quadratic programming (SQP) methods are considered among the most effective methods for solving nonlinear programming problems. A new algorithm taking advantage of the idea of SQP methods is proposed by Zhou and Tits[10] to solve problem (1). There, in order to avoid the Maratos effect, a nonmonotone line search is performed along a direction d obtained by solving a QP subproblem. However, with such a line search and direction d, it cannot be ensured that the full step of one is accepted. For this reason, it is necessary to perform a nonmonotone arc search based on a correction obtained by solving another QP subproblem. Moreover, its convergence rate is only two-step superlinear.
An SQP-type algorithm is proposed by Xue[11] to solve problem (1). However, in order to obtain the one-step superlinear convergence rate, two additional assumptions, which are too strong, are required there: firstly, that the entire sequence generated by the algorithm converges to the K-T point of (1); secondly, that the step size is always equal to one after finitely many iterations.
In this paper, a new SQP algorithm is proposed to solve problem (1) directly according to the K-T condition (3). This algorithm overcomes the shortcomings just pointed out. It performs a monotone line search along a direction obtained by solving only one QP subproblem. With such a particular line search, we point out that, because of the intrinsic properties of the minimax problem (1) in contrast with general nonlinear programming, and unlike [10], no Maratos-like effect occurs and the search direction need not be corrected. Global convergence is obtained without the above-mentioned strong assumptions of [9]. Under suitable conditions weaker than those in [11], a one-step superlinear convergence rate is proven.
Description of algorithm: The following algorithm is proposed for solving problem (1).
Algorithm A
Step 1-Initialization and data: choose x0 ∈ Rn, a symmetric positive definite matrix B0 ∈ Rn×n and α ∈ (0, ½); set k = 0.
Step 2-Computation of the main search direction dk: Compute (zk, dk) by solving the following quadratic programming subproblem at xk:
min z + ½dT Bkd  over (z, d) ∈ R×Rn,  s.t. fj(xk) − Mf(xk) + ∇fj(xk)Td ≤ z, j ∈ I   (5)
Let λk be the corresponding multiplier vector. If (zk, dk) = (0, 0), STOP.
Step 3-The line search: Compute tk, the first number t in the sequence {1, ½, ¼, ...} satisfying
Mf(xk + tdk) ≤ Mf(xk) + αtzk   (6)
Step 4-Updates: Compute a new symmetric positive definite matrix Bk+1. Like Vardi[8], we use Powell's modification[12]:
Bk+1 = Bk − (BkskskT Bk)/(skT Bksk) + (ŷkŷkT)/(skT ŷk)   (7)
where:
sk = tkdk,  yk = ∑j∈I λjk(∇fj(xk + sk) − ∇fj(xk)),  ŷk = θkyk + (1 − θk)Bksk,
θk = 1 if skT yk ≥ 0.2skT Bksk, and θk = 0.8skT Bksk/(skT Bksk − skT yk) otherwise   (8)
Let xk+1 = xk + tkdk, k = k + 1. Go back to Step 2.
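The four steps above can be sketched in code as follows. This is our illustrative reading of Algorithm A, not the authors' implementation: the QP subproblem (5) is solved with SciPy's SLSQP as a stand-in for a dedicated QP solver, and, lacking the QP multipliers, the vector y in the Powell-damped update is approximated by the gradient difference of the currently maximal function. All names, safeguards and the toy instance are ours.

```python
# Illustrative sketch of Algorithm A (our reading, not the authors' code):
# QP subproblem (5) solved with SciPy's SLSQP as a stand-in for a QP solver,
# Armijo-type search (6), and a Powell-damped BFGS update for Bk.
import numpy as np
from scipy.optimize import minimize

def sqp_minimax(fs, grads, x0, alpha=0.25, tol=1e-6, max_iter=100):
    """Minimize max_j f_j(x), given functions fs and their gradients grads."""
    n = len(x0)
    x, B = np.asarray(x0, dtype=float), np.eye(n)
    for _ in range(max_iter):
        Mf = max(f(x) for f in fs)
        # Step 2: solve QP (5) over v = (z, d).
        cons = [{'type': 'ineq',
                 'fun': (lambda v, f=f, g=g: v[0] - (f(x) - Mf + g(x) @ v[1:])),
                 'jac': (lambda v, g=g: np.concatenate(([1.0], -g(x))))}
                for f, g in zip(fs, grads)]
        res = minimize(lambda v: v[0] + 0.5 * v[1:] @ B @ v[1:],
                       np.zeros(n + 1), constraints=cons, method='SLSQP')
        z, d = res.x[0], res.x[1:]
        if z + 0.5 * d @ B @ d > -tol:   # (zk, dk) ~ (0, 0): K-T point, STOP
            return x
        # Step 3: first t in {1, 1/2, 1/4, ...} with Mf(x+t*d) <= Mf(x)+alpha*t*z.
        t = 1.0
        while t > 1e-12 and max(f(x + t * d) for f in fs) > Mf + alpha * t * z:
            t *= 0.5
        # Step 4: Powell-damped BFGS update of B. Here y merely approximates the
        # Lagrangian gradient difference via the currently maximal f_j (our
        # simplification; the paper uses the QP multipliers).
        s = t * d
        sBs = s @ B @ s
        if sBs <= 1e-16:                 # numerical safeguard (ours)
            return x
        jmax = max(range(len(fs)), key=lambda j: fs[j](x + s))
        y = grads[jmax](x + s) - grads[jmax](x)
        theta = 1.0 if s @ y >= 0.2 * sBs else 0.8 * sBs / (sBs - s @ y)
        yh = theta * y + (1.0 - theta) * (B @ s)
        B = B - np.outer(B @ s, B @ s) / sBs + np.outer(yh, yh) / (s @ yh)
        x = x + s
    return x

# Hypothetical toy instance: Mf has a kink at the minimizer x* = (0, 0).
fs = [lambda x: (x[0] - 1.0)**2 + x[1]**2,
      lambda x: (x[0] + 1.0)**2 + x[1]**2]
grads = [lambda x: np.array([2.0 * (x[0] - 1.0), 2.0 * x[1]]),
         lambda x: np.array([2.0 * (x[0] + 1.0), 2.0 * x[1]])]
x_star = sqp_minimax(fs, grads, np.array([2.0, 1.0]))
print(x_star)
```

On this toy instance both functions are active at the minimizer, with common value 1, so the iterates approach a point where Mf is nonsmooth, which is the case the algorithm is designed for.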
Global convergence of algorithm: It is first shown that Algorithm A is well defined. The following general assumptions hold throughout the paper.
A1: fj(j=1~m) are continuously differentiable.
A2: For all x∈Rn, the vectors are linearly independent.
A3: There exist two constants 0<a≤b, such that a||d||2≤dTBkd≤b||d||2, for all k, for all d∈Rn.
Lemma 1: The solution (zk,dk) of the quadratic subproblem (5) is unique and zk+½(dk)T Bkdk≤0.
Proof: Since (0, 0) is a feasible point of (5) with objective value 0 and Bk is positive definite, it is clear that the claim holds.
Lemma 2: Let (zk,dk) be the solution of the QP (5) at xk, the following are equivalent:
(i) | (zk,dk)=(0,0); |
(ii) | zk=0; |
(iii) | dk=0; |
(iv) | zk+½(dk)T Bkdk=0. |
Proof: (i)⇒(ii): clear. (ii)⇒(iii): since zk = 0, Lemma 1 gives ½(dk)T Bkdk ≤ 0, so dk = 0. (iii)⇒(iv): since I(xk) = {j | fj(xk) = Mf(xk), j ∈ I} is nonempty, take j ∈ I(xk); the corresponding constraint of (5) with dk = 0 implies that zk ≥ 0. So zk + ½(dk)T Bkdk ≥ 0 and, by Lemma 1, zk + ½(dk)T Bkdk = 0. (iv)⇒(i): since zk + ½(dk)T Bkdk = 0, the point (0, 0) attains the optimal value of (5); by the uniqueness in Lemma 1, (zk, dk) = (0, 0).
Lemma 3: If (zk, dk) = (0, 0), then xk is a K-T point of (1). If xk is not a K-T point of (1), then zk < 0 and dk ≠ 0.
Proof: From (5), we have
Bkdk + ∑j∈I λjk∇fj(xk) = 0,  ∑j∈I λjk = 1,  λjk ≥ 0,
λjk(fj(xk) − Mf(xk) + ∇fj(xk)Tdk − zk) = 0, j ∈ I   (9)
If (zk,dk)=(0,0), then
∑j∈I λjk∇fj(xk) = 0,  ∑j∈I λjk = 1,  λjk ≥ 0,  λjk(fj(xk) − Mf(xk)) = 0, j ∈ I   (10)
It shows that xk is a K-T point.
If xk is not a K-T point, then (zk, dk) ≠ (0, 0). So, by Lemma 1 and Lemma 2,
zk + ½(dk)T Bkdk < 0,  zk < −½(dk)T Bkdk ≤ 0,
and dk ≠ 0. The claim holds.
Lemma 4: Algorithm A is well defined.
Proof: We only need to show that the line search yields a step tk = (½)^i for some finite i = i(k). Denote
ajk(t) = fj(xk + tdk) − Mf(xk) − αtzk, j ∈ I.
For j ∈ I\I(xk), since fj(xk) < Mf(xk), the continuity of fj implies that there exists some t̄j > 0 such that ajk(t) ≤ 0 for all t ∈ (0, t̄j].
For j ∈ I(xk), from (5) and zk < 0, the fact that α < 1 implies that there exists some t̄j > 0 such that ajk(t) ≤ 0 for all t ∈ (0, t̄j].
Define t̄ = min{t̄j, j ∈ I} > 0. It is clear that the line search condition (6) is satisfied for all t ∈ (0, t̄].
In the sequel, we prove that any accumulation point x* of the sequence {xk} generated by the algorithm must be a K-T point of problem (1). Since there are only finitely many choices of sets I(xk) ⊆ I, we may assume that there exists a subsequence K such that
I(xk) ≡ I*, ∀k ∈ K   (11)
where I* is a constant set.
Lemma 5: If xk → x*, Bk → B*, k ∈ K, then dk → 0, k ∈ K.
Proof: First of all, in view of (6) and Lemma 3, it is evident that {Mf(xk)} is monotonically decreasing. Hence, considering {xk}k∈K → x* and the continuity of Mf(x), it holds that
Mf(xk) → Mf(x*), k → ∞   (12)
Suppose by contradiction that d* ≠ 0, where (z*, d*) is an accumulation point of {(zk, dk)}k∈K; then z* < 0. Let k ∈ K, k → ∞ in (5).
The corresponding QP problem (5) at x* is
min z + ½dT B*d  s.t. fj(x*) − Mf(x*) + ∇fj(x*)Td ≤ z, j ∈ I   (13)
It is clear that (z*, d*) is a feasible solution of (13) and B* is positive definite. So, we may let (z̄, d̄) be the unique solution of (13). Passing to the limit as k ∈ K, k → ∞, we obtain that (z*, d*) attains the optimal value of (13). So, it can be seen that (z*, d*) = (z̄, d̄), thereby,
and for k∈K, k large enough, we have
(14) |
From (14), similarly to the proof of Lemma 4, we can conclude that the step size tk obtained by the line search is bounded away from zero on K, i.e.,
tk ≥ t* := inf{tk : k ∈ K} > 0, ∀k ∈ K.
So, from (6), (12), we get
(15) |
This is a contradiction, which shows that dk → 0, k ∈ K.
According to Lemma 3, Lemma 5 and the fact that (z*, d*) is the solution of (13), we can get the following convergence theorem.
Theorem 1: Algorithm A either stops at a K-T point xk of (1) in finitely many iterations, or generates an infinite sequence {xk} all of whose accumulation points are K-T points of (1).
Rate of convergence: We now strengthen the regularity assumptions on the functions involved. Assumption A1 is replaced by:
A1: | The functions fj (j=1~m) are twice continuously differentiable. We also make the following additional assumptions: |
A4: | The sequence {xk} generated by the algorithm possesses an accumulation point x* (in view of Theorem 1, a K-T point). |
A5: | Bk → B*, k → ∞. |
A6: | The matrix given in (16) is nonsingular, where u* is the corresponding K-T multiplier vector of x*: |
(16) |
A7: | The strict complementary slackness condition is satisfied at the K-T point pair (x*, u*), i.e., uj* > 0 for all j ∈ I(x*). |
Theorem 2: The Kuhn-Tucker point x* is isolated.
Proof: The proof is similar to that of [13], but with some technical differences, since assumption A2 on the linear independence constraint qualification differs from that of general nonlinear programming; the details are omitted.
Lemma 6: The entire sequence {xk} converges to the K-T point x*, i.e., xk → x*, k → ∞.
Proof: From (5) and (6), we have that
(17) |
So, from (12), it holds that
Thereby, according to assumptions A6, A7, Theorem 2 and a proposition of Panier and Tits[14], we have xk → x*, k → ∞.
Lemma 7: For k → ∞, we have
dk → 0, λk → u*, Lk ≡ I(x*), zk = O(||dk||)
where:
Lk = {j ∈ I : fj(xk) − Mf(xk) + ∇fj(xk)Tdk − zk = 0}   (18)
Proof: According to Lemma 5 and xk → x*, Bk → B*, k → ∞, it holds that
dk → 0, zk → 0, k → ∞.
Since (x*, u*) is a K-T point pair of (1), we have
Denote
From A2, it is clear that
and
(19) |
At the same time, denote
From (9), we have
thereby,
(20) |
Since
it is obvious, for k large enough, that
So,
In addition, according to dk → 0, zk → 0 and (18), it is clear that Lk ⊆ I(x*).
Conversely, for j ∈ I(x*), assumption A7 implies that uj* > 0. So λjk > 0 for k large enough; thereby, it holds that
fj(xk) − Mf(xk) + ∇fj(xk)Tdk − zk = 0, i.e., j ∈ Lk.
So, Lk ≡ I(x*).
Thereby, from I(xk) ⊆ I* ≡ Lk and A2, it is clear that zk = O(||dk||).
In order to obtain superlinear convergence, we make the following assumption:
A8: | ∇²fj(x) (j=1~m) are Lipschitz continuous on some ball B(x*, ε) of radius ε > 0 about x*, i.e., there exist constants vj > 0 such that ||∇²fj(x) − ∇²fj(y)|| ≤ vj||x − y||, ∀x, y ∈ B(x*, ε). |
Lemma 8: Under assumptions A1-A8, ||xk+dk-x*||=o(||xk-x*||) if and only if the following condition holds:
A9: | The sequence of matrices {Bk}satisfies |
Proof: The proof is similar to that of the corresponding theorem of Xue[11].
Lemma 9: If assumptions A1-A9 are true, then, for k large enough, tk≡1.
Proof: Let
sj = fj(xk + dk) − Mf(xk) − αzk = fj(xk) − Mf(xk) + ∇fj(xk)Tdk − αzk + o(||dk||), j ∈ I.
Firstly, we prove, for k large enough, that
sj ≤ 0, ∀j ∈ I   (21)
Denote
I(xk + dk) = {j ∈ I | fj(xk + dk) = Mf(xk + dk)};
it is obvious, for k large enough, that I(xk+dk)⊆I*.
For j ∈ I\I(x*), the facts fj(x*) < Mf(x*), dk → 0 and zk = O(||dk||) imply that sj ≤ 0 for k large enough.
For j ∈ I*\I(xk + dk), from (5), the facts α < 1, dk → 0 and zk = O(||dk||) imply that sj ≤ 0 for k large enough.
From the above analysis, in order to prove that (21) holds, it remains only to prove, for k large enough, that
fj(xk + dk) = Mf(xk + dk) ≤ Mf(xk) + αzk, ∀j ∈ I(xk + dk),
i.e.,
Meanwhile, it has been proven that
Thereby, it suffices to prove that
i.e.,
From (9) and Lk≡I(x*), we have
(22) |
Denote
from (22) and A8, we get
Since α ∈ (0, ½), it holds that s ≤ 0. So, (21) holds for k large enough. Thereby, we have, for k large enough, that
Mf(xk+dk)≤Mf(xk)+αzk
i.e., tk ≡ 1. The claim holds.
From Lemma 8 and Lemma 9, we can get the following theorem:
Theorem 3: Under all the above-mentioned assumptions, the algorithm is one-step superlinearly convergent, i.e., ||xk+1 − x*|| = o(||xk − x*||).
Numerical experiments: Numerical experiments based on the algorithm have been carried out; the results show that the algorithm is effective.
In the experiments, α = 0.25 and B0 = I, the n×n identity matrix; Bk is updated by the modified BFGS formulas (7) and (8). In the following tables, IP denotes the initial point, NIT the number of iterations of the algorithm, and AS the approximate solution.
Problem 1: ([9]). Minimize the maximum of the following three functions:
Problem 2: ([11]). Minimize the maximum of the following six functions:
Problem 3: Rosen-Suzuki Problem ([9,11]). Minimize the maximum of the following functions:
Problem 4: Wong problem 1 ([9,11]). Minimize the maximum of the following functions:
Problem 5: Wong problem 2 ([11]). Minimize the maximum of the following functions:
Detailed information about the solutions of the above problems is listed in the following tables:
Table 1: | The information of the solution to Problem 1 |
Table 2: | The information of the solution to Problem 2 |
Table 3: | The information of the solution to Problem 3 |
Table 4: | The information of the solution to Problem 4 |
Table 5: | The information of the solution to Problem 5 |
The author would like to thank the referee for his helpful comments. This work was supported in part by the NNSF (103G1003) and GIETSF (D20350) of China.