Research Article
An SQP Method for Solving the Nonlinear Minimax Problem
Department of Computational Science and Mathematics, Guilin Institute of Electronic Technology, Guilin 541004, People's Republic of China
The minimax optimization problem can be stated as
min Mf(x), x ∈ Rn,  where Mf(x) = max{fj(x), j = 1~m}   (1)
where fj(j=1~m) are real-valued functions defined on Rn.
The objective function Mf(x) has discontinuous first partial derivatives at points where two or more of the functions fj(x) attain the value Mf(x), even if all fj(x) (j=1~m) have continuous first partial derivatives; for example, f1(x) = x and f2(x) = −x give Mf(x) = |x|, which is not differentiable at x = 0. Thus the well-known gradient methods cannot be applied directly to minimize Mf(x). Excellent work on this topic and its applications has been reported in [2-7].
The minimax problem (1) is equivalent to the following nonlinear programming:
min z  s.t. fj(x) − z ≤ 0, j = 1~m, (x, z) ∈ Rn×R   (2)
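As a concrete illustration of this equivalence (our illustration, not part of the original paper), the smooth program (2) can be handed directly to a general-purpose NLP solver. The sketch below uses SciPy's SLSQP on a small hypothetical instance; the function `solve_minimax` and the two quadratic test functions are ours.

```python
# Sketch (ours, not from the paper): solve min_x max_j f_j(x) through the
# smooth reformulation (2), min_{(x,z)} z s.t. f_j(x) <= z, with SciPy's SLSQP.
import numpy as np
from scipy.optimize import minimize

def solve_minimax(fs, x0):
    n = len(x0)
    v0 = np.append(x0, max(f(x0) for f in fs))            # variables v = (x, z)
    cons = [{'type': 'ineq', 'fun': (lambda v, f=f: v[n] - f(v[:n]))}
            for f in fs]                                   # z - f_j(x) >= 0
    res = minimize(lambda v: v[n], v0, constraints=cons, method='SLSQP')
    return res.x[:n], res.x[n]                             # x*, Mf(x*)

# Hypothetical instance: minimax of two shifted paraboloids.
fs = [lambda x: x[0]**2 + x[1]**2,
      lambda x: (x[0] - 2.0)**2 + x[1]**2]
x_star, z_star = solve_minimax(fs, np.array([5.0, 3.0]))
print(x_star, z_star)   # both functions are equal at the solution, near (1, 0)
```

At the solution both functions are active, which is exactly the situation in which Mf loses differentiability and the reformulation (2) pays off.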
According to (2), the K-T condition of (1) is obtained as follows:
∑j∈I uj∇fj(x) = 0,  ∑j∈I uj = 1,  uj ≥ 0,  uj(fj(x) − Mf(x)) = 0, j ∈ I   (3)
where I = {1, 2, ..., m} and u = (u1, ..., um) is the corresponding multiplier vector.
The active set is denoted as follows:
I(x) = {j ∈ I : fj(x) = Mf(x)}   (4)
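In floating-point arithmetic the equality in (4) can only be identified up to a tolerance; a small hypothetical helper (ours, not the paper's) illustrates the definition:

```python
# Hypothetical helper (ours): the active set (4), identified with a tolerance.
def active_set(fs, x, eps=1e-8):
    """Indices j with f_j(x) within eps of Mf(x) = max_j f_j(x); cf. (4)."""
    vals = [f(x) for f in fs]
    Mf = max(vals)
    return [j for j, v in enumerate(vals) if Mf - v <= eps]

# With f1(x) = x^2 and f2(x) = 2 - x, both functions attain Mf(1) = 1 at x = 1.
fs = [lambda x: x**2, lambda x: 2.0 - x]
print(active_set(fs, 1.0))   # -> [0, 1]
print(active_set(fs, 0.0))   # -> [1]   (Mf(0) = 2 is attained only by f2)
```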
Many approaches have been proposed for solving problem (1); most of them transform the minimax problem into the nonlinear programming problem (2) and solve it by well-established methods. Vardi[8] uses slack variables to handle the inequality constraints of (2) and then solves only an equality-constrained minimization problem, taking advantage of a trust-region strategy.
Charalambous and Conn[9] propose an algorithm that solves the minimax problem directly, in which two distinct search directions must be computed. The first, the horizontal direction, attempts to reduce Mf(x) while, at the same time, keeping those functions whose values are close to Mf(x) approximately equal. The second, the vertical direction, attempts to decrease, by means of linearization, the error to within which those functions are equal to Mf(x). Then, under the assumptions that the minimax solution is unique, that fj (j=1~m) are convex and twice differentiable and that the strict complementarity condition is satisfied, global convergence is proven. However, the computational effort is large and the rate of convergence is at most linear.
Due to their superlinear convergence rate, sequential quadratic programming (SQP) methods are considered among the most effective methods for solving nonlinear programming problems. A new algorithm taking advantage of the idea of SQP methods is proposed by Zhou and Tits[10] to solve problem (1). There, in order to avoid the Maratos effect, a nonmonotone line search is performed along a direction d obtained by solving a QP subproblem. However, with such a line search and direction d, it cannot be ensured that the full step of one is accepted. For this reason, it is necessary to perform a nonmonotone arc search based on a correction obtained by solving another QP subproblem. Moreover, its convergence rate is only two-step superlinear.
An SQP-type algorithm is proposed by Xue[11] to solve problem (1). However, in order to obtain the one-step superlinear convergence rate, two additional assumptions, which are too strong, are required there: firstly, that the entire sequence generated by the algorithm converges to the K-T point of (1); secondly, that the step size is always equal to one after finitely many iterations.
In this paper, a new SQP algorithm is proposed to solve problem (1) directly according to the K-T condition (3). This algorithm overcomes the shortcomings just pointed out. It performs a monotone line search along a direction obtained by solving only one QP subproblem. With such a particular line search, we point out that, because of the intrinsic properties of the minimax problem (1) in contrast with general nonlinear programming, and unlike [10], no Maratos-like effect occurs and the search direction need not be corrected. Global convergence is obtained without the above-mentioned strong assumptions of [9]. Under suitable conditions weaker than those in [11], a one-step superlinear convergence rate is proven.
Description of algorithm: The following algorithm is proposed for solving problem (1).
Algorithm A
Step 1-Initialization and data: choose x0 ∈ Rn, a symmetric positive definite matrix B0 ∈ Rn×n and α ∈ (0, ½); set k = 0.
Step 2-Computation of the main search direction dk: Compute (zk, dk) by solving the following quadratic programming subproblem at xk:
min z + ½dT Bkd  over (z, d) ∈ R×Rn,  s.t. fj(xk) − Mf(xk) + ∇fj(xk)Td ≤ z, j ∈ I   (5)
Let λk be the corresponding multiplier vector. If (zk, dk) = (0, 0), STOP.
Step 3-The line search: Compute tk, the first number t in the sequence {1, ½, ¼, ...} satisfying
Mf(xk + tdk) ≤ Mf(xk) + αtzk   (6)
Step 4-Updates: Compute a new symmetric positive definite matrix Bk+1. Like Vardi[8], we use Powell's modification[12]:
Bk+1 = Bk − (BkskskT Bk)/(skT Bksk) + (ŷkŷkT)/(skT ŷk)   (7)
where:
sk = tkdk,  yk = ∑j∈I λjk(∇fj(xk + sk) − ∇fj(xk)),  ŷk = θkyk + (1 − θk)Bksk,
θk = 1 if skT yk ≥ 0.2skT Bksk, and θk = 0.8skT Bksk/(skT Bksk − skT yk) otherwise   (8)
Let xk+1 = xk + tkdk, k = k + 1. Go back to Step 2.
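The four steps above can be sketched in code as follows. This is our illustrative reading of Algorithm A, not the authors' implementation: the QP subproblem (5) is solved with SciPy's SLSQP as a stand-in for a dedicated QP solver, and, lacking the QP multipliers, the vector y in the Powell-damped update is approximated by the gradient difference of the currently maximal function. All names, safeguards and the toy instance are ours.

```python
# Illustrative sketch of Algorithm A (our reading, not the authors' code):
# QP subproblem (5) solved with SciPy's SLSQP as a stand-in for a QP solver,
# Armijo-type search (6), and a Powell-damped BFGS update for Bk.
import numpy as np
from scipy.optimize import minimize

def sqp_minimax(fs, grads, x0, alpha=0.25, tol=1e-6, max_iter=100):
    """Minimize max_j f_j(x), given functions fs and their gradients grads."""
    n = len(x0)
    x, B = np.asarray(x0, dtype=float), np.eye(n)
    for _ in range(max_iter):
        Mf = max(f(x) for f in fs)
        # Step 2: solve QP (5) over v = (z, d).
        cons = [{'type': 'ineq',
                 'fun': (lambda v, f=f, g=g: v[0] - (f(x) - Mf + g(x) @ v[1:])),
                 'jac': (lambda v, g=g: np.concatenate(([1.0], -g(x))))}
                for f, g in zip(fs, grads)]
        res = minimize(lambda v: v[0] + 0.5 * v[1:] @ B @ v[1:],
                       np.zeros(n + 1), constraints=cons, method='SLSQP')
        z, d = res.x[0], res.x[1:]
        if z + 0.5 * d @ B @ d > -tol:   # (zk, dk) ~ (0, 0): K-T point, STOP
            return x
        # Step 3: first t in {1, 1/2, 1/4, ...} with Mf(x+t*d) <= Mf(x)+alpha*t*z.
        t = 1.0
        while t > 1e-12 and max(f(x + t * d) for f in fs) > Mf + alpha * t * z:
            t *= 0.5
        # Step 4: Powell-damped BFGS update of B. Here y merely approximates the
        # Lagrangian gradient difference via the currently maximal f_j (our
        # simplification; the paper uses the QP multipliers).
        s = t * d
        sBs = s @ B @ s
        if sBs <= 1e-16:                 # numerical safeguard (ours)
            return x
        jmax = max(range(len(fs)), key=lambda j: fs[j](x + s))
        y = grads[jmax](x + s) - grads[jmax](x)
        theta = 1.0 if s @ y >= 0.2 * sBs else 0.8 * sBs / (sBs - s @ y)
        yh = theta * y + (1.0 - theta) * (B @ s)
        B = B - np.outer(B @ s, B @ s) / sBs + np.outer(yh, yh) / (s @ yh)
        x = x + s
    return x

# Hypothetical toy instance: Mf has a kink at the minimizer x* = (0, 0).
fs = [lambda x: (x[0] - 1.0)**2 + x[1]**2,
      lambda x: (x[0] + 1.0)**2 + x[1]**2]
grads = [lambda x: np.array([2.0 * (x[0] - 1.0), 2.0 * x[1]]),
         lambda x: np.array([2.0 * (x[0] + 1.0), 2.0 * x[1]])]
x_star = sqp_minimax(fs, grads, np.array([2.0, 1.0]))
print(x_star)
```

On this toy instance both functions are active at the minimizer, with common value 1, so the iterates approach a point where Mf is nonsmooth, which is the case the algorithm is designed for.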
Global convergence of algorithm: It is first shown that Algorithm A is well defined. The following general assumptions hold throughout the paper.
A1: fj(j=1~m) are continuously differentiable.
A2: For all x∈Rn, the vectors are linearly independent.
A3: There exist two constants 0<a≤b, such that a||d||2≤dTBkd≤b||d||2, for all k, for all d∈Rn.
Lemma 1: The solution (zk,dk) of the quadratic subproblem (5) is unique and zk+½(dk)T Bkdk≤0.
Proof: Since (0, 0) is a feasible point of (5) with objective value 0 and Bk is positive definite, it is clear that the claim holds.
Lemma 2: Let (zk,dk) be the solution of the QP (5) at xk, the following are equivalent:
(i) | (zk,dk)=(0,0); |
(ii) | zk=0; |
(iii) | dk=0; |
(iv) | zk+½(dk)T Bkdk=0. |
Proof: (i)⇒(ii): clear. (ii)⇒(iii): since zk = 0, Lemma 1 gives ½(dk)T Bkdk ≤ 0, so dk = 0. (iii)⇒(iv): since I(xk) = {j | fj(xk) = Mf(xk), j ∈ I} is nonempty, take j ∈ I(xk); the corresponding constraint of (5) with dk = 0 implies that zk ≥ 0. So zk + ½(dk)T Bkdk ≥ 0 and, by Lemma 1, zk + ½(dk)T Bkdk = 0. (iv)⇒(i): since zk + ½(dk)T Bkdk = 0, the point (0, 0) attains the optimal value of (5); by the uniqueness in Lemma 1, (zk, dk) = (0, 0).
Lemma 3: If (zk, dk) = (0, 0), then xk is a K-T point of (1). If xk is not a K-T point of (1), then zk < 0 and dk ≠ 0.
Proof: From (5), we have
Bkdk + ∑j∈I λjk∇fj(xk) = 0,  ∑j∈I λjk = 1,  λjk ≥ 0,
λjk(fj(xk) − Mf(xk) + ∇fj(xk)Tdk − zk) = 0, j ∈ I   (9)
If (zk,dk)=(0,0), then
∑j∈I λjk∇fj(xk) = 0,  ∑j∈I λjk = 1,  λjk ≥ 0,  λjk(fj(xk) − Mf(xk)) = 0, j ∈ I   (10)
It shows that xk is a K-T point.
If xk is not a K-T point, then (zk, dk) ≠ (0, 0). So, by Lemma 1 and Lemma 2,
zk + ½(dk)T Bkdk < 0,  zk < −½(dk)T Bkdk ≤ 0,
and dk ≠ 0. The claim holds.
Lemma 4: Algorithm A is well defined.
Proof: We only need to show that the line search yields a step tk = (½)^i for some finite i = i(k). Denote
ajk(t) = fj(xk + tdk) − Mf(xk) − αtzk, j ∈ I.
For j ∈ I\I(xk), since fj(xk) < Mf(xk), the continuity of fj implies that there exists some t̄j > 0 such that ajk(t) ≤ 0 for all t ∈ (0, t̄j].
For j ∈ I(xk), from (5) and zk < 0, the fact that α < 1 implies that there exists some t̄j > 0 such that ajk(t) ≤ 0 for all t ∈ (0, t̄j].
Define t̄ = min{t̄j, j ∈ I} > 0. It is clear that the line search condition (6) is satisfied for all t ∈ (0, t̄].
In the sequel, we prove that any accumulation point x* of the sequence {xk} generated by the algorithm must be a K-T point of problem (1). Since there are only finitely many choices of sets I(xk) ⊆ I, we may assume that there exists a subsequence K such that
I(xk) ≡ I*, ∀k ∈ K   (11)
where I* is a constant set.
Lemma 5: If xk → x*, Bk → B*, k ∈ K, then dk → 0, k ∈ K.
Proof: First of all, in view of (6) and Lemma 3, it is evident that {Mf(xk)} is monotonically decreasing. Hence, considering {xk}k∈K → x* and the continuity of Mf(x), it holds that
Mf(xk) → Mf(x*), k → ∞   (12)
Suppose by contradiction that d* ≠ 0, where (z*, d*) is an accumulation point of {(zk, dk)}k∈K; then z* < 0. Let k ∈ K, k → ∞ in (5).
The corresponding QP problem (5) at x* is
min z + ½dT B*d  s.t. fj(x*) − Mf(x*) + ∇fj(x*)Td ≤ z, j ∈ I   (13)
It is clear that (z*, d*) is a feasible solution of (13) and B* is positive definite. So, we may let (z̄, d̄) be the unique solution of (13). Passing to the limit as k ∈ K, k → ∞, we obtain that (z*, d*) attains the optimal value of (13). So, it can be seen that (z*, d*) = (z̄, d̄), thereby,
and for k∈K, k large enough, we have
(14) |
From (14), similarly to the proof of Lemma 4, we can conclude that the step size tk obtained by the line search is bounded away from zero on K, i.e.,
tk ≥ t* := inf{tk : k ∈ K} > 0, ∀k ∈ K.
So, from (6), (12), we get
(15) |
This is a contradiction, which shows that dk → 0, k ∈ K.
According to Lemma 3, Lemma 5 and the fact that (z*, d*) is the solution of (13), we can get the following convergence theorem.
Theorem 1: Algorithm A either stops at a K-T point xk of (1) in finitely many iterations, or generates an infinite sequence {xk} all of whose accumulation points are K-T points of (1).
Rate of convergence: We now strengthen the regularity assumptions on the functions involved. Assumption A1 is replaced by:
A1: | The functions fj (j=1~m) are twice continuously differentiable. We also make the following additional assumptions: |
A4: | The sequence {xk} generated by the algorithm possesses an accumulation point x* (in view of Theorem 1, a K-T point). |
A5: | Bk → B*, k → ∞. |
A6: | The matrix given in (16) is nonsingular, where u* is the corresponding K-T multiplier vector of x*: |
(16) |
A7: | The strict complementary slackness condition is satisfied at the K-T point pair (x*, u*), i.e., uj* > 0 for all j ∈ I(x*). |
Theorem 2: The Kuhn-Tucker point x* is isolated.
Proof: The proof is similar to that of [13], but with some technical differences, since assumption A2 on the linear independence constraint qualification differs from that of general nonlinear programming; the details are omitted.
Lemma 6: The entire sequence {xk} converges to the K-T point x*, i.e., xk → x*, k → ∞.
Proof: From (5) and (6), we have that
(17) |
So, from (12), it holds that
Thereby, according to assumptions A6, A7, Theorem 2 and a proposition of Panier and Tits[14], we have xk → x*, k → ∞.
Lemma 7: For k → ∞, we have
dk → 0, λk → u*, Lk ≡ I(x*), zk = O(||dk||)
where:
Lk = {j ∈ I : fj(xk) − Mf(xk) + ∇fj(xk)Tdk − zk = 0}   (18)
Proof: According to Lemma 5 and xk → x*, Bk → B*, k → ∞, it holds that
dk → 0, zk → 0, k → ∞.
Since (x*, u*) is a K-T point pair of (1), we have
Denote
From A2, it is clear that
and
(19) |
At the same time, denote
From (9), we have
thereby,
(20) |
Since
it is obvious, for k large enough, that
So,
In addition, according to dk → 0, zk → 0 and (18), it is clear that Lk ⊆ I(x*).
Conversely, for j ∈ I(x*), assumption A7 implies that uj* > 0. So λjk > 0 for k large enough; thereby, it holds that
fj(xk) − Mf(xk) + ∇fj(xk)Tdk − zk = 0, i.e., j ∈ Lk.
So, Lk ≡ I(x*).
Thereby, from I(xk) ⊆ I* ≡ Lk and A2, it is clear that zk = O(||dk||).
In order to obtain superlinear convergence, we make the following assumption:
A8: | ∇²fj(x) (j=1~m) are Lipschitz continuous on some ball B(x*, ε) of radius ε > 0 about x*, i.e., there exist constants vj > 0 such that ||∇²fj(x) − ∇²fj(y)|| ≤ vj||x − y||, ∀x, y ∈ B(x*, ε). |
Lemma 8: Under assumptions A1-A8, ||xk+dk-x*||=o(||xk-x*||) if and only if the following condition holds:
A9: | The sequence of matrices {Bk}satisfies |
Proof: The proof is similar to that of the corresponding theorem of Xue[11].
Lemma 9: If assumptions A1-A9 are true, then, for k large enough, tk≡1.
Proof: Let
sj = fj(xk + dk) − Mf(xk) − αzk = fj(xk) − Mf(xk) + ∇fj(xk)Tdk − αzk + o(||dk||), j ∈ I.
Firstly, we prove, for k large enough, that
sj ≤ 0, ∀j ∈ I   (21)
Denote
I(xk + dk) = {j ∈ I | fj(xk + dk) = Mf(xk + dk)};
it is obvious, for k large enough, that I(xk+dk)⊆I*.
For j ∈ I\I(x*), the facts fj(x*) < Mf(x*), dk → 0 and zk = O(||dk||) imply that sj ≤ 0 for k large enough.
For j ∈ I*\I(xk + dk), from (5), the facts α < 1, dk → 0 and zk = O(||dk||) imply that sj ≤ 0 for k large enough.
From the above analysis, in order to prove that (21) holds, it remains only to prove, for k large enough, that
fj(xk + dk) = Mf(xk + dk) ≤ Mf(xk) + αzk, ∀j ∈ I(xk + dk),
i.e.,
Meanwhile, it has been proven that
Thereby, it suffices to prove that
i.e.,
From (9) and Lk≡I(x*), we have
(22) |
Denote
from (22) and A8, we get
Since α ∈ (0, ½), it holds that s ≤ 0. So, (21) holds for k large enough. Thereby, we have, for k large enough, that
Mf(xk+dk)≤Mf(xk)+αzk
i.e., tk ≡ 1. The claim holds.
From Lemma 8 and Lemma 9, we can get the following theorem:
Theorem 3: Under all the above-mentioned assumptions, the algorithm is one-step superlinearly convergent, i.e., ||xk+1 − x*|| = o(||xk − x*||).
Numerical experiments: Numerical experiments based on the algorithm have been carried out; the results show that the algorithm is effective.
In the experiments, α = 0.25 and B0 = I, the n×n identity matrix; Bk is updated by the modified BFGS formulas (7) and (8). In the following tables, IP denotes the initial point, NIT the number of iterations of the algorithm, and AS the approximate solution.
Problem 1: ([9]). Minimize the maximum of the following three functions:
Problem 2: ([11]). Minimize the maximum of the following six functions:
Problem 3: Rosen-Suzuki Problem ([9,11]). Minimize the maximum of the following functions:
Problem 4: Wong problem 1 ([9,11]). Minimize the maximum of the following functions:
Problem 5: Wong problem 2 ([11]). Minimize the maximum of the following functions:
Detailed information about the solutions of the above problems is listed in the following tables:
Table 1: | The information of the solution to Problem 1 |
Table 2: | The information of the solution to Problem 2 |
Table 3: | The information of the solution to Problem 3 |
Table 4: | The information of the solution to Problem 4 |
Table 5: | The information of the solution to Problem 5 |
The author would like to thank the referee for his helpful comments. This work was supported in part by the NNSF (103G1003) and GIETSF (D20350) of China.