Chapter 2 - Classical Theory of Maxima and Minima
- Introduction
- Analytical Methods without Constraints
- Analytical Methods Applicable for Constraints
- Necessary and Sufficient Conditions for Constrained Problems
- Closure
- References
- Problems
Introduction
The classical theory of maxima and minima (analytical methods) is concerned with finding the maxima or minima, i.e., extreme points, of a function. We seek to determine the values of the n independent variables x1, x2, ..., xn at which a function reaches its maximum and minimum points. Before starting the development of the mathematics used to locate these extreme points, let us examine the surface of a function of two independent variables, y(x1, x2), that could represent the economic model of a process. This should help visualize the location of the extreme points. An economic model is illustrated in Figure 2-1(a), where the contours of the function are represented by the curved lines. A cross section of the function along line S through the points A and B is shown in Figure 2-1(b), and in Figure 2-1(c) the first derivative of y(x1, x2) along line S through points A and B is given.
In this example, point A is the global maximum in the region and is located at the top of a sharp ridge. Here the first derivative is discontinuous. A second but smaller maximum is located at point B (a local maximum). At point B the first partial derivatives of y(x1, x2) are zero, and B is called a stationary point. It is not necessary for stationary points to be maxima or minima, as illustrated by stationary point C, a saddle point. In this example, the minima do not occur in the interior of the region but on the boundary at points D and E (local minima). To determine the global minimum, it is necessary to compare the values of the function at these points.
In essence, the problem of determining the maximum profit or minimum cost for a system using the classical theory becomes one of locating all of the local maxima or minima and then comparing the individual values to determine the global maximum or minimum. The example has illustrated the places to look, which are:
- at stationary points (first derivatives are zero)
- on the boundaries
- at discontinuities in the first derivative
When the function and its derivatives are continuous, the local extreme points will occur at stationary points in the interior of the region. However, it is not necessary that all stationary points be local extreme points, since saddle points can occur also.
Analytical Methods without Constraints
Locating Local Maxima and Minima (Necessary Conditions)
Based on geometric intuition from the previous example, we can understand the famous Weierstrass' theorem (11,12) which guarantees the existence of maxima and minima. It states:
Every function which is continuous in a closed domain possesses a maximum and minimum value either in the interior or on the boundary of the domain.
The proof is by contradiction.
There is another theorem (13) which tells how to locate extreme points in the interior of a region of a continuous function. It states:
A continuous function of n variables attains a maximum or a minimum in the interior of a region, only at those values of the variables for which the n partial derivatives either vanish simultaneously (stationary points) or at which one or more of these derivatives cease to exist (i.e., are discontinuous).
The proof involves examining the Taylor Series expansion at the points where the partial derivatives either vanish or cease to exist.
Thus the problem becomes one of locating points where the partial derivatives are zero or where some of them are discontinuous. The stationary points can be located by solving the algebraic equations that result from setting the partial derivatives of the function equal to zero. Also, the equations must be examined for points of discontinuity, and this has to be accomplished by inspection.
Evaluating Local Maxima and Minima (Sufficient Conditions)
As we have seen, it is not necessary for all stationary points to be local maxima and minima, since there is a possibility of saddle or inflection points. Now we need to develop procedures to determine if stationary points are maxima or minima. These sufficient conditions will be developed for one independent variable first and then extended for two and n independent variables, using the same concepts. Once the local maxima and minima are located, it is necessary to compare the individual points to locate the global maximum and minimum.
Sufficient Conditions for One Independent Variable
To develop criteria establishing whether a stationary point is a local maximum or minimum, we begin by performing a Taylor series expansion about the stationary point xo.
y(x) = y(xo) + y'(xo)(x - xo) + ½y''(xo)(x - xo)² + higher order terms
Now, select x sufficiently close to xo so the higher order terms become negligible compared to the second order terms. Since the first derivative is zero at the stationary point, the above equation becomes
y(x) = y(xo) + ½y''(xo)(x - xo)² (2-1)
We can determine if xo is a local maximum or minimum by examining the value of y''(xo), since (x - xo)² is always positive. If y''(xo) is positive, then the term ½y''(xo)(x - xo)² will always add to y(xo) in equation (2-1) for x taking on values that are less than or greater than xo. For this case y(xo) is a local minimum. This is summarized in the following:
______________________________________________________________
If y''(xo) > 0 then y(xo) is a minimum
y''(xo) < 0 y(xo) is a maximum
y''(xo) = 0 no statement can be made
______________________________________________________________
If the second derivative is zero, it is necessary to examine higher order derivatives. In general, if y''(xo) = y'''(xo) = ... = y⁽ⁿ⁻¹⁾(xo) = 0, the Taylor series expansion becomes
y(x) = y(xo) + (1/n!)y⁽ⁿ⁾(xo)(x - xo)ⁿ (2-2)
If n is even, then (x - xo)ⁿ is always positive, and the result is:
______________________________________________________________
If y⁽ⁿ⁾(xo) > 0 then y(xo) is a minimum
y⁽ⁿ⁾(xo) < 0 y(xo) is a maximum
______________________________________________________________
If n is odd, then (x - xo)ⁿ changes sign as x moves from x < xo to x > xo, and thus there is an inflection point. These results can be summarized in the following theorem (1).
If at a stationary point the first and possibly some of the higher derivatives vanish, then the point is or is not an extreme point, according as the first non-vanishing derivative is of even or odd order. If it is even, there is a maximum or minimum according as the derivative is negative or positive.
The proof of this theorem follows the discussion given above. The following example illustrates the principles discussed.
Example 2-1
Locate the extreme points of the following two functions:
y(x) = x⁴/4 - x²/2
y'(x) = x³ - x = x(x² - 1) = x(x - 1)(x + 1) = 0
Stationary points are x = 0, 1, -1
y''(x) = 3x² - 1
y''(0) = -1, a maximum
y''(1) = 2, a minimum
y''(-1) = 2, a minimum

y(x) = x⁵
y'(x) = 5x⁴ = 0; the stationary point is x = 0
y''(x) = 20x³, y''(0) = 0
y'''(x) = 60x², y'''(0) = 0; no statement can be made
y⁽⁴⁾(x) = 120x, y⁽⁴⁾(0) = 0
y⁽⁵⁾(x) = 120, y⁽⁵⁾(0) = 120; n = 5 is odd, and the stationary point is an inflection point
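When the algebra gets tedious, the higher-order derivative test is easy to script. The following is a minimal sketch using SymPy (added here for illustration; it is not part of the original text) that classifies the stationary points of the first function in Example 2-1 by finding the first non-vanishing derivative at each one.

```python
import sympy as sp

x = sp.symbols('x')
y = x**4/4 - x**2/2

# Stationary points: roots of y'(x) = 0
for x0 in sp.solve(sp.diff(y, x), x):
    # Find the order of the first non-vanishing derivative at x0
    n = 2
    while sp.diff(y, x, n).subs(x, x0) == 0:
        n += 1
    d = sp.diff(y, x, n).subs(x, x0)
    if n % 2 == 1:
        kind = "inflection point"
    else:
        kind = "minimum" if d > 0 else "maximum"
    print(f"x = {x0}: order {n} derivative = {d} -> {kind}")
```

Running this prints a maximum at x = 0 and minima at x = -1 and x = 1, in agreement with the hand calculation above.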
Sufficient Conditions for Two Independent Variables
To develop the criteria for a local maximum or minimum at a stationary point xo = (x10, x20) of a function of two variables, a Taylor series expansion is made about this point.

y(x1, x2) = y(x10, x20) + yx1(x1 - x10) + yx2(x2 - x20) + ½[yx1x1(x1 - x10)² + 2yx1x2(x1 - x10)(x2 - x20) + yx2x2(x2 - x20)²] + higher order terms (2-3)

where the subscripts x1 and x2 indicate partial differentiation with respect to those variables and evaluation at the stationary point.
Again we select (x1, x2) sufficiently close to (x10, x20) so the higher order terms become negligible compared to the second order terms. Also, the first derivatives are zero at the stationary point. Thus equation (2-3) can be written in matrix-vector notation as
y(x) = y(xo) + ½(x - xo)ᵀHo(x - xo) (2-5)
where Ho is the matrix of second partial derivatives evaluated at the stationary point xo and is called the Hessian matrix.
The term in the bracket of equation (2-5) is called a differential quadratic form, and y(xo) will be a minimum or a maximum according as this term is always positive or always negative. Based on this concept, it can be shown (1) that xo is a minimum or a maximum if the following conditions hold. If they do not hold, xo could be a saddle point and is not a maximum or a minimum.

If yx1x1 > 0 and yx1x1 yx2x2 - (yx1x2)² > 0, then y(xo) is a minimum.
If yx1x1 < 0 and yx1x1 yx2x2 - (yx1x2)² > 0, then y(xo) is a maximum.

An illustration of the above results is given in Example 2-2. The term in the bracket of equation (2-5) is an example of a quadratic form. It will be necessary to describe a quadratic form briefly before giving the sufficient conditions for maxima and minima for n independent variables.
Sign of a Quadratic Form
To perform a similar analysis for a function with more than two independent variables, it is necessary to determine what is called the sign of the quadratic form. The general quadratic form (1) is written as:

Q(A, x) = xᵀAx = Σi Σj aij xi xj (2-6)

where the aij are the components of the symmetric matrix A, i.e., aij = aji.
It turns out (1) that we can determine if Q is always positive or negative, for all finite values of xi and xj, by evaluating the signs of the Di, the determinants of the leading principal submatrices of A:

Di = determinant of the upper left-hand i × i submatrix of A, for i = 1, 2, ..., n (2-7)
The important results that will be used subsequently are:
______________________________________________________________________________
If Di > 0 for i = 1, 2, ..., n, then A is positive definite and Q(A, x) > 0.
If Di < 0 for i = 1, 3, 5, ... and
Di > 0 for i = 2, 4, 6, ..., then A is negative definite and Q(A, x) < 0.
If the Di fit neither pattern, then the sign of Q(A, x) depends on the values of xi and xj.
______________________________________________________________________________
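These determinant tests are straightforward to mechanize for a numeric matrix. A minimal NumPy sketch (added for illustration; the function name is arbitrary):

```python
import numpy as np

def definiteness(A):
    """Classify a symmetric matrix A by its leading principal minors Di."""
    n = A.shape[0]
    D = [np.linalg.det(A[:i, :i]) for i in range(1, n + 1)]
    if all(d > 0 for d in D):
        return "positive definite: Q(A,x) > 0"
    # Alternating signs: D1 < 0, D2 > 0, D3 < 0, ...
    if all((d < 0) if i % 2 == 0 else (d > 0) for i, d in enumerate(D)):
        return "negative definite: Q(A,x) < 0"
    return "indefinite or semidefinite: sign of Q depends on x"

print(definiteness(np.array([[2.0, 0.0], [0.0, 3.0]])))    # positive definite
print(definiteness(np.array([[-2.0, 0.0], [0.0, -3.0]])))  # negative definite
```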
Sufficient Conditions for N Independent Variables
The result of the previous two sections can be extended to the case of n independent variables by considering the Taylor series expansion for n independent variables around stationary point xo:

y(x) = y(xo) + Σj yxj(xo)(xj - xjo) + ½ Σj Σk yxjxk(xo)(xj - xjo)(xk - xko) + higher order terms (2-8)

Again select x sufficiently close to xo, so the higher order terms become negligible compared to the second order terms. Also, the first derivatives are zero at the stationary point. Thus, equation (2-8) can be written in matrix-vector notation as:
y(x) = y(xo) + ½(x - xo)ᵀHo(x - xo) (2-9)
where x is the column vector of independent variables, and Ho, the matrix of second partial derivatives evaluated at the stationary point xo, is the Hessian matrix. This is the same equation as equation (2-5), which was written for two independent variables.
The second term on the right-hand side of equation (2-9) is called a differential quadratic form, as shown below.
Q[Ho, (x - xo)] = (x - xo)ᵀHo(x - xo) (2-11)
Equation (2-11) corresponds to equation (2-6) in the previous section, and the determinants |Hoi| of the i × i leading principal submatrices of Ho correspond to the Di of equation (2-7).
We can now use the same procedure in evaluating the character of the stationary points for n independent variables. For example, if the term containing the Hessian matrix is always positive for perturbations of the independent variables around the stationary point, then the stationary point is a local minimum. For this differential quadratic form to be always positive, |Hoi| > 0 for i = 1, 2, ..., n. The same reasoning can be applied for a local maximum, and the results for these two cases are summarized below:
___________________________________________________________________
y(xo) is a minimum if |Hoi| > 0 for i = 1, 2, ..., n
y(xo) is a maximum if |Hoi| < 0 for i = 1, 3, 5, ...
|Hoi| > 0 for i = 2, 4, 6, ...
___________________________________________________________________
If zeros occur in place of some of the positive or negative numbers in the tests above (semidefinite quadratic form), then there is insufficient information to determine the character of the stationary point (1). As discussed in Avriel (10), higher order terms may have to be examined, or local exploration can be performed. If the test is not met (indefinite quadratic form), then the point is neither a maximum nor a minimum (1). The following theorem from Cooper (7) summarizes these results. It states:
If y(x) and its first two partial derivatives are continuous, then a sufficient condition for y(x) to have a relative minimum (maximum) at xo, where ∂y(xo)/∂xj = 0 for j = 1, 2, ..., n, is that the Hessian matrix be positive definite (negative definite).
The proof of this theorem employs arguments similar to those given above.
The following example illustrates these methods.
Example 2-2
The flow diagram of a simple process is shown in Figure 2-2 (2), where the hydrocarbon feed is mixed with recycle and compressed before being passed into a catalytic reactor. The product and unreacted material are separated by distillation, and the unreacted material is recycled. The pressure P, in psi, and recycle ratio R must be selected to minimize the total annual cost for the required production rate of 10⁷ pounds per year. The feed is brought up to pressure at an annual cost of $1000P, mixed with the recycle stream, and fed to the reactor at an annual cost of $4 × 10⁹/PR. The product is removed in a separator at a cost of $10⁵R per year, and the unreacted material is recycled in a recirculating compressor which consumes $1.5 × 10⁵R annually. Determine the optimal operating pressure, recycle ratio, and total annual cost; and show that the cost is a minimum.
Solution: The equation giving the total operating cost is:
C ($/yr) = 1000P + 4 × 10⁹/PR + 2.5 × 10⁵R
Equating the partial derivatives of C with respect to P and R to zero gives two algebraic equations to be solved for P and R.
∂C/∂P = 1000 - 4 × 10⁹/P²R = 0
∂C/∂R = 2.5 × 10⁵ - 4 × 10⁹/PR² = 0
Solving simultaneously gives
P = 1000 psi and R = 4
Substituting to determine the corresponding total operating cost gives
C = $3 × 10⁶ per year
C(P, R) is a minimum if ∂²C/∂P² > 0 and (∂²C/∂P²)(∂²C/∂R²) - (∂²C/∂P∂R)² > 0.
Performing the appropriate partial differentiation and evaluating at the stationary point (P = 1000, R = 4) gives
∂²C/∂P² = 8 × 10⁹/P³R = 2
∂²C/∂R² = 8 × 10⁹/PR³ = 1.25 × 10⁵
∂²C/∂P∂R = 4 × 10⁹/P²R² = 250
so that |Ho1| = 2 > 0 and |Ho2| = (2)(1.25 × 10⁵) - (250)² = 1.875 × 10⁵ > 0.
Thus, the stationary point is a minimum since both determinants are positive.
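The arithmetic in this example can be checked symbolically. A minimal SymPy sketch, assuming the cost equation above, recovers the stationary point and the two principal minors of the Hessian:

```python
import sympy as sp

P, R = sp.symbols('P R', positive=True)
C = 1000*P + 4*10**9/(P*R) + 250000*R    # total annual cost, $/yr

# Necessary conditions: both first partial derivatives equal to zero
sol = sp.solve([sp.diff(C, P), sp.diff(C, R)], [P, R], dict=True)[0]

H = sp.hessian(C, (P, R)).subs(sol)      # Hessian at the stationary point
minors = [H[:1, :1].det(), H.det()]      # |Ho1| and |Ho2|
print(sol, minors, C.subs(sol))          # P=1000, R=4, both minors > 0, C = 3e6
```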
Analytical Methods Applicable for Constraints
To this point the independent variables could take on any value. In actuality, the values of the independent variables are limited, since they usually represent physical quantities such as flow rates, temperatures, pressures, process unit capacities, and available resources. Consequently, there are constraints on the variables, if nothing more than the fact that they must be non-negative. In many cases they are bounded within limits as dictated by the process equipment and related by equations such as material balances. The constraints on the variables can take the form of equations and inequalities.
Methods to locate the stationary points of functions (economic models) subject to equality constraints (e.g., material and energy balance equations) will be developed, and examples illustrating the techniques will be given. Inequality constraints can be converted to equality constraints, and then these procedures for equality constraints can be applied with some additional considerations.
Let us illustrate the conversion of an inequality constraint to an equality constraint using a simple example to help visualize the concept of slack variables. Figure 2-3 is an example of an equality and an inequality constraint for a distillation column. The material balance, which says that the feed rate to the column must equal the sum of the overhead and bottom products at steady state, is the equality constraint. The upper limit on the capacity of the distillation column, which was set when the equipment was designed, is the inequality constraint. This inequality constraint can be converted to an equality constraint by adding a slack variable S, as S², to ensure that a positive number has been added to the equation.
F + S² = 50,000 (2-13)
The term slack is used to represent the difference between the optimum and the upper limit on the capacity. It represents the unused, excess, or slack capacity of the process unit. For example, if Fopt = 30,000 barrels per day, then S² = 20,000 barrels per day, a slack of 20,000 barrels per day; and the constraint is said to be loose, i.e., the inequality holds. If Fopt = 50,000 barrels per day, then there is no slack, and the constraint is said to be tight, i.e., the equality holds. This will be discussed in more detail later in the chapter. Also, if there were a lower limit on F, e.g., F ≥ 10,000, the same procedure would apply except that S² would be subtracted from F. The equation would be F - S² = 10,000, and S is called a surplus variable.
We can now state a general optimization problem with n independent variables and m equality constraints, where the objective is to optimize (maximize or minimize) the economic model y(x) subject to m constraint equations fi(x) = 0.
optimize: y(x1, x2, ..., xn) (2-14)
subject to: fi(x1, x2, ..., xn) = 0 (2-15)
for i = 1, 2, ..., m
There must be fewer equality constraints than independent variables to be able to optimize y(x), i.e., n > m. If m = n, the values of the xj's are uniquely determined, and there is no optimization problem. Also, if m > n, the problem is said to be overdetermined, because there are more equations than unknowns. There is no optimization problem for this case either.
There are three methods of locating the optimum points of the function y(x1, x2, ..., xn) of n independent variables subject to m constraint equations fi(x1, x2, ..., xn) = 0. These are: direct substitution, solution by constrained variation, and the method of Lagrange multipliers. We will find that direct substitution cannot always be used, and the method of Lagrange multipliers will be the one most frequently employed.
Direct Substitution
This simply means to solve the m constraint equations for m of the variables and to substitute the results directly into the function to be optimized. This gives an equation (economic model) with (n - m) unknowns, and the previous techniques for unconstrained optimization are applicable.
Unfortunately, it is not always possible to perform the algebraic manipulations required for these substitutions when the constraint equations are somewhat complicated. Consequently, it is necessary to resort to the following methods.
Constrained Variation
This method (3,14) is used infrequently, but it furnishes a theoretical basis for important multivariable numerical search methods, such as the generalized reduced gradient. It is best illustrated for the case of two independent variables by considering the example shown in Figure 2-4. There is a local minimum of the constrained system at point A and a local maximum at point B. The maximum of the unconstrained system is at C.
At point A the curve y(x1, x2) = 1 and the curve f(x1, x2) = 0 are tangent and have the same slope. This means that differential changes, dx1 and dx2, produce the same change in the dependent variables y(x1, x2) and f(x1, x2). This can be expressed as:

-(∂y/∂x1)/(∂y/∂x2) = -(∂f/∂x1)/(∂f/∂x2) (2-16)

We will need the total derivatives of y and f to combine with equation (2-16) to obtain the final result. Using the first terms in a Taylor series expansion for y and f gives:

dy = (∂y/∂x1)dx1 + (∂y/∂x2)dx2
df = (∂f/∂x1)dx1 + (∂f/∂x2)dx2 (2-17)

At the minimum, point A, and the maximum, point B, dy is equal to zero; and the constraint is satisfied, i.e., f = 0 and df = 0.
Combining equations (2-16) and (2-17) gives the following result.

(∂y/∂x1)(∂f/∂x2) - (∂y/∂x2)(∂f/∂x1) = 0 (2-18)

This is an algebraic equation, and it is to be solved in combination with the constraint equation to locate the stationary points. It should be remembered that in this case ∂y/∂x1 and ∂y/∂x2 are not necessarily zero. They are zero in the unconstrained case at point C, however.
This technique is illustrated with the following example. Then the extension to the general case for n independent variables will be given.
Example 2-3
Find the stationary points of the following function using the method of constrained variation
optimize: y(x) = x1x2
subject to: f(x) = x1² + x2² - 1 = 0
The first partial derivatives are:
∂y/∂x1 = x2, ∂y/∂x2 = x1, ∂f/∂x1 = 2x1, ∂f/∂x2 = 2x2
Substituting into equation (2-18) gives
x2 · 2x2 - x1 · 2x1 = 0 or x2² - x1² = 0
which is solved simultaneously with the constraint equation
x1² + x2² - 1 = 0
The result is:
x1 = ±(½)^½ and x2 = ±(½)^½
for the values of the independent variables at the stationary points.
In general we are interested in finding the stationary points of a function y(x1, x2, ..., xn) subject to m constraint equations fi(x1, x2, ..., xn) = 0, where i = 1, ..., m and n > m. The same reasoning applies in (n + 1)-dimensional space as it did in the three-dimensional space above, and this results in the following equations:

dy = Σj (∂y/∂xj)dxj = 0 and dfi = Σj (∂fi/∂xj)dxj = 0 for i = 1, 2, ..., m (2-19)

The set of equations given in equation (2-19) above can be solved for (n - m) equations to go with the m constraint equations to locate the stationary points. The (n - m) equations corresponding to equation (2-18) of the two independent variable case can be written in terms of (n - m) Jacobian determinants, which are:

∂(y, f1, ..., fm)/∂(xk, x1, ..., xm) = 0 for k = m+1, m+2, ..., n (2-20)

The Jacobian determinant for the first equation above (k = m+1) is:

| ∂y/∂xm+1    ∂y/∂x1   ...  ∂y/∂xm  |
| ∂f1/∂xm+1   ∂f1/∂x1  ...  ∂f1/∂xm | = 0
|   ...         ...    ...    ...   |
| ∂fm/∂xm+1   ∂fm/∂x1  ...  ∂fm/∂xm |

A total of n equations are solved for the stationary points, i.e., the (n - m) equations generated by (2-20) above and the m constraint equations. A derivation of these results is given by Beveridge and Schechter (6); it involves using Cramer's rule and eliminating the dxi's. Similar results are given for this general case in the text by Wilde and Beightler (4). However, a different nomenclature is used, and the results are extended to include Lagrange multipliers.
To illustrate the use of the Jacobian determinants, consider the following example, which obtains equation (2-18).
Example 2-4
optimize: y(x1,x2)
subject to: f(x1, x2) = 0
For this problem there are two independent variables (n = 2) and one constraint (m = 1), so the evaluation of one Jacobian determinant is required:

| ∂y/∂x1  ∂y/∂x2 |
| ∂f/∂x1  ∂f/∂x2 | = 0

Expanding this determinant gives:

(∂y/∂x1)(∂f/∂x2) - (∂y/∂x2)(∂f/∂x1) = 0

This is the same as equation (2-18), which was solved with the constraint equation for the stationary point values of x1 and x2 in Example 2-3.
Lagrange Multipliers
The most frequently used method for handling constraints is to employ Lagrange multipliers. The technique will be presented using two independent variables and one constraint equation to illustrate the concepts. Then the procedure will be extended to the general case of n independent variables and m constraint equations. For the case of two independent variables, we have:
Optimize: y(x1, x2) (2-22)
Subject to: f(x1, x2) = 0
We want to show how the Lagrange multiplier arises and that the constrained problem can be converted into an unconstrained problem. The profit function and the constraint equation are expanded in a Taylor series. Then, using the first order terms gives:

dy = (∂y/∂x1)dx1 + (∂y/∂x2)dx2
df = (∂f/∂x1)dx1 + (∂f/∂x2)dx2 = 0

This form of the constraint equation will be used to eliminate dx2 in the profit function. Solving for dx2 gives:

dx2 = -[(∂f/∂x1)/(∂f/∂x2)]dx1

This equation is substituted into the equation for dy to obtain:

dy = (∂y/∂x1)dx1 - (∂y/∂x2)[(∂f/∂x1)/(∂f/∂x2)]dx1

and rearranging gives:

dy = [∂y/∂x1 - (∂y/∂x2)/(∂f/∂x2) · (∂f/∂x1)]dx1

Now we can define λ as the value of -(∂y/∂x2)/(∂f/∂x2) at the stationary point of the constrained function. This ratio of partial derivatives, λ, is a constant at the stationary point, and the above equation can be written as

dy = [∂y/∂x1 + λ(∂f/∂x1)]dx1

At the stationary point dy = 0, and this leads to:

∂y/∂x1 + λ(∂f/∂x1) = 0

Now if L is defined as L = y + λf, the above gives:

∂L/∂x1 = ∂y/∂x1 + λ(∂f/∂x1) = 0

This is one of the necessary conditions to locate the stationary points of the unconstrained function L, which is constructed from the profit function y(x1, x2) and the constraint equation f(x1, x2) = 0. The same manipulations can be repeated to obtain the other necessary condition:

∂L/∂x2 = ∂y/∂x2 + λ(∂f/∂x2) = 0

Therefore, the constrained problem can be converted to an unconstrained problem by forming the Lagrangian, or augmented, function and solving this problem by the previously developed method of setting the first partial derivatives equal to zero. This gives two equations to solve for the three unknowns x1, x2, and λ at the stationary point. The third equation to be used is the constraint equation. In fact, the Lagrange multiplier is sometimes treated as another variable, since ∂L/∂λ = 0 gives the constraint equation. The example used for the method of constrained variation will be used to illustrate these ideas.
Example 2-5:
Find the stationary points for the following constrained problem using the method of Lagrange multipliers.
Optimize: y(x) = x1x2
Subject to: f(x) = x1² + x2² - 1 = 0
The Lagrangian, or augmented, function is formed as shown below.
L(x1, x2, λ) = x1x2 + λ(x1² + x2² - 1)
The following equations are obtained from setting the first partial derivatives equal to zero.
∂L/∂x1 = x2 + 2λx1 = 0
∂L/∂x2 = x1 + 2λx2 = 0
∂L/∂λ = x1² + x2² - 1 = 0
Solving the previous equations simultaneously gives the following stationary points:
maxima: x1 = (½)^½, x2 = (½)^½, λ = -½
        x1 = -(½)^½, x2 = -(½)^½, λ = -½
minima: x1 = (½)^½, x2 = -(½)^½, λ = ½
        x1 = -(½)^½, x2 = (½)^½, λ = ½
The types of the stationary points, i.e., maxima, minima, or saddle points, were determined by inspection for this problem. Sufficient conditions for constrained problems will be discussed subsequently in this chapter.
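A computer algebra system reproduces these four stationary points directly from the necessary conditions; a minimal SymPy sketch (an illustration, not part of the original solution):

```python
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam', real=True)
L = x1*x2 + lam*(x1**2 + x2**2 - 1)      # Lagrangian for Example 2-5

# Necessary conditions: all first partials of L equal to zero
eqs = [sp.diff(L, v) for v in (x1, x2, lam)]
for pt in sp.solve(eqs, [x1, x2, lam], dict=True):
    print(pt, "y =", (x1*x2).subs(pt))   # four points, lam = -1/2 or 1/2
```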
The development of the Lagrangian function for the case of n independent variables and m constraint equations is a direct extension of that for two independent variables and one constraint equation, and Avriel (10) gives a concise derivation of this result (see Problem 2-14). The Lagrangian, or augmented, function is formed as previously, and for every constraint equation there is a Lagrange multiplier. This is shown below:
optimize: y(x), x = (x1, x2, ..., xn)ᵀ
(2-23)
subject to: fi(x) = 0 for i = 1,2,...,m
where n > m
The Lagrangian, or augmented, function is formed from the constrained problem as follows:

L(x, λ) = y(x) + Σi λi fi(x) for i = 1, 2, ..., m (2-24)

To locate the stationary points of the constrained problem, the first partial derivatives of the Lagrangian function with respect to the xj's and λi's are set equal to zero (necessary conditions). There are (n + m) equations to be solved for the (n + m) unknowns: the n xj's and the m λi's.
It is sometimes said that the method of Lagrange multipliers requires more work than the method of constrained variation, since an additional m equations have to be solved for the values of the Lagrange multipliers. However, additional and valuable information is obtained from knowing the values of the Lagrange multipliers, as will be seen. The following simple example gives a comparison among the three techniques.
Example 2-6
For the process in Example 2-2 (2), it is necessary to maintain the product of the pressure and recycle ratio equal to 9000 psi. Determine the optimal values of the pressure and recycle ratio and the minimum cost within this constraint by direct substitution, constrained variation, and Lagrange multipliers.
Again, the problem is to minimize C.
C = 1000P + 4 × 10⁹/PR + 2.5 × 10⁵R
However, C is subject to the following constraint equation.
PR = 9000
Direct Substitution: Solving the constraint above for P and substituting into the
objective function gives:
C = 9 × 10⁶/R + (4/9) × 10⁶ + 2.5 × 10⁵R
Setting dC/dR = 0 and solving gives:
R = 6 and P = 1500 psi.
The corresponding cost is:
C = $3.44 × 10⁶ per year
which is greater than the unconstrained system, as would be expected.
Constrained Variation: Using equation (2-18) with f = PR - 9000, the equations to be solved for this case are:

[1000 - 4 × 10⁹/P²R]P - [2.5 × 10⁵ - 4 × 10⁹/PR²]R = 0
PR - 9000 = 0
The first equation simplifies to:
P = 250R
which, when solved simultaneously with the second equation gives the same results
as direct substitution.
Lagrange Multipliers: The Lagrangian, or augmented, function is:
L = 1000P + 4 × 10⁹/PR + 2.5 × 10⁵R + λ(PR - 9000)
Setting the partial derivatives of L with respect to P, R, and λ equal to zero gives:
∂L/∂P = 1000 - 4 × 10⁹/P²R + λR = 0
∂L/∂R = 2.5 × 10⁵ - 4 × 10⁹/PR² + λP = 0
∂L/∂λ = PR - 9000 = 0
Solving the above simultaneously gives the same results as the two previous methods and a value for the Lagrange multiplier.
P = 1500, R = 6, λ = -117.3
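The Lagrange-multiplier system in this example can also be handed to a computer algebra system. A sketch using SymPy (an illustration, not part of the original solution):

```python
import sympy as sp

P, R = sp.symbols('P R', positive=True)
lam = sp.symbols('lam', real=True)

# Lagrangian for Example 2-6
L = 1000*P + 4*10**9/(P*R) + 250000*R + lam*(P*R - 9000)
eqs = [sp.diff(L, v) for v in (P, R, lam)]
sol = sp.solve(eqs, [P, R, lam], dict=True)
print(sol)   # P = 1500, R = 6, lam = -9500/81, about -117.3
```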
Method of Steepest Ascent
A further application of the method of Lagrange Multipliers is developing the method of steepest ascent (descent) for a function to be optimized. This result will be valuable when search methods are discussed.
To illustrate the direction of steepest ascent, a geometric representation is shown in Figure 2-5. To obtain the direction of steepest ascent, we wish to obtain the maximum value of dy, where y(x1, x2, ..., xn) is a function of n variables. Also, there is a constraint equation relating dx1, dx2, ..., dxn and ds, as shown in Figure 2-5 for two independent variables.
The problem is:

maximize: dy = Σj (∂y/∂xj)dxj (2-25)
subject to: Σj (dxj)² - (ds)² = 0 (2-26)
To obtain the maximum value of dy, the Lagrangian function is formed as follows:

L = Σj (∂y/∂xj)dxj + λ[Σj (dxj)² - (ds)²] (2-27)

Differentiating L with respect to the independent variables dxj and equating to zero gives:

∂L/∂(dxj) = ∂y/∂xj + 2λdxj = 0 for j = 1, 2, ..., n (2-28)

These n equations are solved simultaneously with the constraint equation for the values of dxj and λ. Solving for λ gives:

λ = ±[Σk (∂y/∂xk)²]^½ / 2ds

and solving for dxj gives:

dxj = ±[ds/[Σk (∂y/∂xk)²]^½](∂y/∂xj) (2-29)

The term in the brackets is not a function of j, and consequently dxj is proportional to ∂y/∂xj. The positive sign indicates the direction of steepest ascent, and the negative sign indicates the direction of steepest descent.
If a constant k is used to represent the term in the brackets in equation (2-29), this equation can be written as:

dxj = ±k(∂y/∂xj) (2-30)

If a finite difference approximation is used for dxj = (xj - xjo) and ∂y/∂xj is evaluated at xo, then the following equation gives the gradient line.

xj = xjo ± k(∂y/∂xj) evaluated at xo, for j = 1, 2, ..., n (2-31)

This equation can be written in vector notation, in terms of the gradient of y evaluated at xo, ∇y(xo), as:
x = xo ± k∇y(xo) (2-32)
If the positive sign is used, then movement is along the line in the direction of steepest ascent, and if the negative sign is used then movement is along the line in the direction of steepest descent. The following example illustrates the use of the method of steepest descent on a simple function.
Example 2-7:
Find the minimum along the direction of steepest descent of the function given below, starting at the point xo = (1, 1).
y = x1² + x2²
Gradient line (steepest descent):
x = xo - k∇y(xo)
or, for two independent variables,
x1 = x1o - k(∂y/∂x1)
x2 = x2o - k(∂y/∂x2)
Evaluating the partial derivatives at the starting point (1, 1) gives
∂y/∂x1 = 2x1 = 2 and ∂y/∂x2 = 2x2 = 2
The gradient line is
x1 = 1 - 2k
x2 = 1 - 2k
Substituting the gradient line into the function to be minimized gives:
y = (1 - 2k)² + (1 - 2k)² = 2(1 - 2k)²
Computing dy/dk will locate the minimum along the gradient line, i.e.,
dy/dk = -8(1 - 2k) = 0, which gives k = ½
The corresponding values of x1 and x2 are
x1 = 1 - 2(½) = 0 and x2 = 1 - 2(½) = 0
It turns out that the minimum along the gradient line is also the minimum for the function in this problem, because it is the sum of squares.
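The line search in this example is easy to reproduce numerically. A minimal sketch using NumPy and SciPy (added for illustration), where minimize_scalar performs the one-dimensional search for k:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def y(x):
    return x[0]**2 + x[1]**2

def grad_y(x):
    return np.array([2.0*x[0], 2.0*x[1]])

x0 = np.array([1.0, 1.0])
g = grad_y(x0)                          # gradient at x0 is (2, 2)

# Minimize y along the steepest-descent line x = x0 - k*grad_y(x0)
res = minimize_scalar(lambda k: y(x0 - k*g))
print(res.x, x0 - res.x*g)              # k = 0.5, landing at the origin
```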
The method of steepest ascent is the basis for several search techniques which are described in Chapter 6, e.g., Steep Ascent Partan. It should be noted that when dealing with physical systems, the direction of steepest ascent (descent) may be only a direction of steep ascent (descent) depending on the scales used to represent the independent variables. This is discussed and illustrated by Wilde (5), and Wilde and Beightler (4).
Economic Interpretation of the Lagrange Multipliers
The values of the Lagrange multipliers at the optimum provide additional and important information. If the constraint equations are written with parameters bi on the right-hand side, the Lagrange multipliers give the change in the profit function with respect to these parameters, i.e., ∂y/∂bi. Many times the right-hand sides of the constraint equations represent the availability of raw materials, demand for products, or capacities of process units. Consequently, it is frequently important to know how the optimal solution is affected by changes in availability, demand, and capacities. As we shall see, the Lagrange multipliers are given the names "shadow prices" and "dual activity" in linear programming, where these changes are analyzed by sensitivity analysis.
The following brief derivation obtains the result that ∂y/∂b = -λ for the case of one constraint and two independent variables; the extension to m constraint equations with n independent variables is comparable.
optimize: y(x1, x2)
(2-33)
subject to: f(x1, x2) = b
First, we can obtain the following equation from the profit function by the chain rule.

∂y/∂b = (∂y/∂x1)(∂x1/∂b) + (∂y/∂x2)(∂x2/∂b) (2-34)

Also, we can obtain the next equation from the constraint equation, written as f - b = 0, by the chain rule.

(∂f/∂x1)(∂x1/∂b) + (∂f/∂x2)(∂x2/∂b) - 1 = 0 (2-35)

Then the equation from the constraint is multiplied by the Lagrange multiplier λ and added to the equation from the profit function to give:

∂y/∂b + λ = [∂y/∂x1 + λ(∂f/∂x1)](∂x1/∂b) + [∂y/∂x2 + λ(∂f/∂x2)](∂x2/∂b) = (∂L/∂x1)(∂x1/∂b) + (∂L/∂x2)(∂x2/∂b) (2-36)

The values of ∂L/∂x1 and ∂L/∂x2 are zero at the stationary point (necessary conditions), and consequently ∂y/∂b = -λ. Thus, the change in the profit function y with respect to the right-hand side of the constraint b is equal to the negative of the Lagrange multiplier. Comparable results can be obtained for the case of n independent variables and m constraint equations using a similar procedure and arguments (7):

∂y/∂bi = -λi for i = 1, 2, ..., m (2-37)
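This sensitivity result can be checked on Example 2-6, where λ = -117.3 at b = 9000. The sketch below (using SymPy; the symbol b stands for the right-hand side of the constraint PR = b) eliminates P, optimizes over R, and differentiates the optimal cost with respect to b:

```python
import sympy as sp

b = sp.symbols('b', positive=True)
P, R = sp.symbols('P R', positive=True)
C = 1000*P + 4*10**9/(P*R) + 250000*R

# Optimal cost as a function of the right-hand side b of PR = b
C_b = C.subs(P, b/R)                         # eliminate P with the constraint
R_opt = sp.solve(sp.diff(C_b, R), R)[0]      # optimal R = sqrt(b/250)
C_opt = sp.simplify(C_b.subs(R, R_opt))

print(sp.N(sp.diff(C_opt, b).subs(b, 9000)))  # about 117.3 = -lam, as expected
```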
In the next section, we will see that the Lagrange Multiplier is also a key factor in the analysis of problems with inequality constraints.
Inequality Constraints
An additional complication arises when seeking the optimum value of a profit or cost function if inequality constraints are included. Although the same procedures are used, it will be necessary to consider two cases for each inequality constraint equation. One case is when the Lagrange Multiplier is zero, and the other is when the Lagrange Multiplier is not zero. This is best illustrated by the following example with one inequality constraint equation as shown below.
Optimize: y(x)
(2-38)
Subject to: f(x) ≤ 0
As described previously, the procedure is to add a slack variable xs, as xs², and form the Lagrangian function:
L(x, λ) = y(x) + λ[f(x) + xs²] (2-39)
Then the first partial derivatives with respect to the xi's, xs, and λ are set equal to zero to obtain a set of equations to be solved for the stationary points. To illustrate the complication, the equation obtained for the slack variable is:

∂L/∂xs = 2λxs = 0 (2-40)

The result is two cases, i.e., either λ = 0 and xs ≠ 0, or λ ≠ 0 and xs = 0. If λ = 0 and xs ≠ 0, the inequality holds, and the constraint is said to be "loose," "passive," or "inactive." If λ ≠ 0 and xs = 0, then the equality holds, and the constraint is said to be "tight" or "active." The following example illustrates this situation using a modification of the previous simple process.
Example 2-8:
For the process the cost function is:
C = 1000P + 4 × 10⁹/PR + 2.5 × 10⁵R
However, C is subject to the inequality constraint equation
PR ≤ 9000
Adding the slack variable S, as S², and forming the Lagrangian function gives:
L = 1000P + 4 × 10⁹/PR + 2.5 × 10⁵R + λ(PR + S² - 9000)
Setting the first partial derivatives of L with respect to P, R, S, and λ equal to zero gives the following four equations:
∂L/∂P = 1000 - 4 × 10⁹/P²R + λR = 0
∂L/∂R = 2.5 × 10⁵ - 4 × 10⁹/PR² + λP = 0
∂L/∂S = 2λS = 0
∂L/∂λ = PR + S² - 9000 = 0
The two cases are λ ≠ 0, S = 0 and λ = 0, S ≠ 0. For the case of λ ≠ 0, S = 0, the equality PR = 9000 holds, i.e., the constraint is active. This was the solution obtained in Example 2-6, and the results were:
C = $3.44 × 10⁶ per year, P = 1500 psi, R = 6, λ = -117.3
For the case of λ = 0, S ≠ 0, the constraint is an inequality, i.e., inactive. This was the solution obtained in Example 2-2, and the results were:
C = $3.0 × 10⁶ per year, P = 1000 psi, R = 4, S = (5000)^½
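The two-case logic is easy to script for this example. A minimal SymPy sketch (added for illustration) that tries the inactive case (λ = 0) first and checks feasibility:

```python
import sympy as sp

P, R = sp.symbols('P R', positive=True)
C = 1000*P + 4*10**9/(P*R) + 250000*R

# Case lam = 0 (S != 0): solve the unconstrained problem, then check feasibility
free = sp.solve([sp.diff(C, P), sp.diff(C, R)], [P, R], dict=True)[0]
slack = 9000 - free[P]*free[R]      # S**2 for the constraint PR <= 9000
print(free, "S**2 =", slack)        # P = 1000, R = 4, S**2 = 5000: feasible,
                                    # so the constraint is inactive at the optimum
```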
The example above had only one inequality constraint and two cases to consider. However, with several inequality constraints, locating the stationary points can become time-consuming, for the possibilities must be searched exhaustively. A procedure for this evaluation has been given by Cooper (7) and Walsh (8) as follows:
1. Solve the problem of optimizing y(x) ignoring the inequality constraints, i.e., having all positive slack variables. Designate this solution xo. If xo satisfies the constraints as inequalities, an optimum has been found.
2. If one or more constraints are not satisfied, select one of the constraints to be an equality, i.e., active (the slack variable for this constraint is zero), and solve the problem. Call this solution x1. If x1 satisfies all of the constraints, an optimum has been found.
3. If one or more constraints are not satisfied, repeat step 2 until every inequality constraint has been treated as an equality constraint (slack variable being zero) in turn.
4. If step 3 did not yield an optimum, select combinations of two inequality constraints at a time to be equalities, and solve the problem. If one of these solutions satisfies all of the constraints, an optimum has been found.
5. If step 4 did not yield an optimum, select combinations of three inequality constraints at a time to be equalities, and solve the problem. If one of these solutions satisfies all of the constraints, an optimum has been found. If not, try combinations of four inequality constraints at a time to be equalities, etc.
The above procedure applies assuming that the stationary point located is a maximum or a minimum of the constrained problem. However, there is a possibility that several stationary points will be located; some could be maxima, some minima, and others saddle points. In Problem 2-6, four stationary points were found; two are maxima, one a minimum, and one a saddle point. Also, from equation (2-40), for each inequality constraint where the strict inequality holds, the slack variable is positive and the Lagrange multiplier is zero. For each inequality constraint where the equality holds, the slack variable is zero and the Lagrange multiplier is not zero.
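For small problems this enumeration can be automated. The sketch below applies it with SymPy to a small hypothetical problem (minimize (x1 - 2)² + (x2 - 1)² subject to x1 + x2 ≤ 2 and x1 - x2 ≤ 0, chosen only for illustration), trying every combination of active constraints and keeping the solutions that satisfy all constraints; the printed multipliers can then be checked against the Kuhn-Tucker sign conditions discussed in the next section.

```python
from itertools import combinations
import sympy as sp

x1, x2 = sp.symbols('x1 x2', real=True)
y = (x1 - 2)**2 + (x2 - 1)**2            # hypothetical objective
cons = [x1 + x2 - 2, x1 - x2]            # constraints in g(x) <= 0 form

for r in range(len(cons) + 1):
    for active in combinations(range(len(cons)), r):
        # One multiplier per active constraint; inactive ones have lam = 0
        lams = list(sp.symbols(f'lam0:{r}', real=True)) if r else []
        L = y + sum(l*cons[i] for l, i in zip(lams, active))
        eqs = [sp.diff(L, x1), sp.diff(L, x2)] + [cons[i] for i in active]
        for s in sp.solve(eqs, [x1, x2] + lams, dict=True):
            if all(cons[i].subs(s) <= 0 for i in range(len(cons))):
                print("active set:", active, s)
```

For this problem only the case with both constraints active survives the feasibility check, giving x = (1, 1) with both multipliers equal to 1.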
In the next section necessary and sufficient conditions for constrained problems are described to determine the character of stationary points. This will be similar to and an extension of the previous discussion for unconstrained problems.
Necessary and Sufficient Conditions for Constrained Problems
The necessary conditions have been developed by Kuhn and Tucker (14) for a general nonlinear optimization problem with equality and inequality constraints. This optimization problem written in terms of minimizing y(x) is:
Minimize: y(x) (2-41)
Subject to: fi(x) ≤ 0 for i = 1, 2, ..., h (2-42)
fi(x) = 0 for i = h+1, ..., m (2-43)
where y(x) and fi(x) are twice continuously differentiable real valued functions.
Any value of x that satisfies the constraint equations (2-42) and (2-43) is called a feasible solution to the problem in the Kuhn-Tucker theory. Then, to locate points that can potentially be local minima of equation (2-41) and satisfy the constraint equations (2-42) and (2-43), the Kuhn-Tucker necessary conditions are used. These conditions are written in terms of the Lagrangian function for the problem, which is:

L(x, λ) = y(x) + Σ λi[fi(x) + xn+i²] (i = 1, ..., h) + Σ λi fi(x) (i = h+1, ..., m) (2-44)

where the xn+i's are the surplus variables used to convert the inequality constraints to equalities.
The necessary conditions for a constrained minimum are given by the following theorem (7,8,10,14).
In order to minimize y(x) subject to fi(x) ≤ 0, i = 1, 2, ..., h and fi(x) = 0, i = h+1, ..., m, the necessary conditions for the existence of a relative minimum at x* are:

1. ∂L(x*, λ)/∂xj = 0 for j = 1, 2, ..., n
2. fi(x*) ≤ 0 for i = 1, 2, ..., h
3. fi(x*) = 0 for i = h+1, ..., m
4. λi fi(x*) = 0 for i = 1, 2, ..., h
5. λi ≥ 0 for i = 1, 2, ..., h
6. λi unrestricted in sign for i = h+1, ..., m (2-45)
Examining these conditions, the first one is setting the first partial derivatives of the Lagrangian function with respect to the independent variables x1, x2, ..., xn equal to zero to locate the Kuhn-Tucker point x*. The second and third conditions are repeating the inequality and equality constraint equations, which must be satisfied at the minimum of the constrained problem. The fourth condition is another way of expressing λi xn+i = 0, i = 1, 2, ..., h, from setting the partial derivatives of the Lagrangian function with respect to the surplus variables equal to zero. Either λi ≠ 0 and xn+i = 0 (constraint is active) or λi = 0 and xn+i ≠ 0 (constraint is inactive). Thus, the product of the Lagrange multiplier and the constraint equation set equal to zero is an equivalent statement, and this is called the complementary slackness condition (15). The fifth condition comes from examining equation (2-37), i.e., ∂y(x*)/∂bi = -λi. The argument is that as bi is increased the constraint region is enlarged, and this cannot result in a higher value for y(x*), the minimum in the region. However, it could result in a lower value of y(x*), and correspondingly ∂y(x*)/∂bi would be negative, i.e., as bi increases, y(x*) could decrease. Therefore, if ∂y(x*)/∂bi is negative, then the Lagrange multiplier λi must be positive for equation (2-37) to be satisfied. This condition is called a constraint qualification, as will be discussed subsequently. For the sixth condition, it has been shown by Bazaraa and Shetty (15) that the Lagrange multipliers associated with the equality constraints are unrestricted in sign, and there is not an argument comparable to the one given above for the Lagrange multipliers associated with the inequality constraints.
For the problem of maximizing y(x) subject to inequality and equality constraints, the problem is as follows:
Maximize: y(x) (2-46)
Subject to: fi(x) ≤ 0 for i = 1, 2, ..., h (2-47)
fi(x) = 0 for i = h+1, ..., m (2-48)
For this problem the Kuhn-Tucker conditions are:

1. ∂L(x*, λ)/∂xj = 0 for j = 1, 2, ..., n
2. fi(x*) ≤ 0 for i = 1, 2, ..., h
3. fi(x*) = 0 for i = h+1, ..., m
4. λi fi(x*) = 0 for i = 1, 2, ..., h
5. λi ≤ 0 for i = 1, 2, ..., h
6. λi unrestricted in sign for i = h+1, ..., m (2-49)
These conditions are the same as the ones for minimizing given by equation (2-45),
except the inequality is reversed for the Lagrange Multipliers in the fifth condition.
Also, the inequality constraints are written as less than or equal to zero for convenience
in the subsequent discussion on sufficient conditions.
The following example illustrates the Kuhn-Tucker necessary conditions for a simple problem.
Example 2-9:
Locate the five Kuhn-Tucker points of the following problem, and determine their character, i.e., maximum, minimum, or saddle point.
optimize: y = x1x2
subject to: x1 + x2 ≤ 1
-x1 + x2 ≤ 1
-x1 - x2 ≤ 1
x1 - x2 ≤ 1
A diagram of the above equations is given in Figure 2-6. The function being optimized is the classic saddle point function, which is constrained by planes.
The first step in the procedure is to locate the stationary points by ignoring the inequality constraints, i.e., λ1 = λ2 = λ3 = λ4 = 0. If this point satisfies the constraints as inequalities, an optimum may have been found. For this problem:

∂y/∂x1 = x2 = 0 and ∂y/∂x2 = x1 = 0

The Kuhn-Tucker point is xo = (0, 0), and evaluating its character by the unconstrained sufficiency conditions gives the following result.

|Ho1| = yx1x1 = 0 and |Ho2| = yx1x1 yx2x2 - (yx1x2)² = (0)(0) - (1)² = -1

The point xo = (0, 0) is a saddle point, and the constraints are satisfied.
Proceeding to step two, one constraint equation at a time is selected, and the character of the Kuhn-Tucker point is determined. Beginning with the first constraint equation as an equality, i.e., λ1 ≠ 0, and considering the other three as inequalities, i.e., λ2 = λ3 = λ4 = 0, gives the following equation for the Lagrangian function:

L = x1x2 + λ1(x1 + x2 - 1)

Solving ∂L/∂x1 = x2 + λ1 = 0, ∂L/∂x2 = x1 + λ1 = 0, and x1 + x2 - 1 = 0 gives:
x1 = ½, x2 = ½, λ1 = -½, y(½, ½) = ¼
The sign of the Lagrange Multiplier is negative, and by the Kuhn-Tucker necessary
conditions, the point can be a maximum, since the other constraint equations are satisfied
as inequalities.
The procedure is repeated for the other three constraint equations, each considered individually as an equality. The results for the Kuhn-Tucker points are summarized in the following table:
y:          ¼     -¼     ¼     -¼     0
x1:         ½     -½    -½      ½     0
x2:         ½      ½    -½     -½     0
λ:         -½      ½    -½      ½     0
Character: max    min   max    min   saddle
The character of each of the stationary points is based on the Kuhn-Tucker necessary conditions. By examining Figure 2-6, we can confirm their character. However, constraint qualifications and sufficient conditions are needed to give a general method, and this is discussed next.
In the theorems developed by Kuhn and Tucker (14), the constraint equations must satisfy certain conditions at the Kuhn-Tucker points, and these conditions are called constraint qualifications. As given in Bazaraa and Shetty (15), there are several forms of constraint qualifications; and one, according to Gill et al. (16), is important for nonlinear constraints. This is the condition that the gradients of the constraint equations at the Kuhn-Tucker point be linearly independent. This constraint qualification is required for the necessary conditions given by equations (2-45) and (2-49). As an example, Kuhn and Tucker (14) constructed the constraint equations:
f1 = (1 - x1)³ - x2 ≥ 0
f2 = x1 ≥ 0
f3 = x2 ≥ 0
These do not satisfy this condition at the point x1* = 1 and x2* = 0. At this point ∇f1 = [-3(1 - x1)², -1] = (0, -1), ∇f2 = (1, 0), and ∇f3 = (0, 1) are not linearly independent. At such a point the necessary conditions may fail to hold, and Kuhn and Tucker (14) give arguments that this constraint qualification is required to ensure the existence of the Lagrange multipliers at the optimum point. Verification of the constraint qualifications for a general nonlinear programming problem is almost an impossible task, according to Avriel (10). He states that, fortunately, in practice the constraint qualification usually holds, and it is justifiable to use the existence of the Lagrange multipliers as a basis for having the necessary conditions hold.
The same concepts used for unconstrained problems are followed to develop the sufficient conditions for constrained problems. This involves expanding the Lagrangian function in a Taylor series about the Kuhn-Tucker point located using the necessary conditions. The Taylor series is simplified by neglecting third and higher order terms to give a function that contains only terms involving second partial derivatives evaluated at the Kuhn-Tucker point. This gives a differential quadratic form, and a test similar to the one for the unconstrained problem is obtained to determine if the Kuhn-Tucker point is a maximum, minimum, or saddle. The sufficient conditions for the case of both inequality and equality constraints are more elaborate than if only equality constraints are involved. We have space to give only the appropriate theorems and describe their development and use. Further details are given by Avriel(10), Bazaraa and Shetty(15) and Reklaitis, et. al.(17).
Considering the case of only equality constraints first, the Lagrangian function for n independent variables and m equality constraint equations is given by the following equation.

L(x, λ) = y(x) + Σi λi fi(x) for i = 1, 2, ..., m

Expanding the Lagrangian function in a Taylor series about the Kuhn-Tucker point x* gives:

L(x, λ*) = L(x*, λ*) + Σj Lxj(x*, λ*)(xj - xj*) + ½ Σj Σk Lxjxk(x*, λ*)(xj - xj*)(xk - xk*) + higher order terms

This equation is comparable to equation (2-8), and the subscripts xj and xk indicate partial differentiation with respect to those variables. Again, the first partial derivatives are zero at the Kuhn-Tucker point by the necessary conditions, and x is selected sufficiently close to x* such that the higher order terms are negligible when compared to the second order terms. For feasible points, fi(x) = 0 and L(x, λ*) = y(x); this gives the following equation, which is comparable to equation (2-9) for the unconstrained case.

y(x) = y(x*) + ½ Σj Σk Lxjxk(x*, λ*)(xj - xj*)(xk - xk*)
As previously, we need to determine if the term in the brackets remains positive (minimum), remains negative (maximum) or changes sign (saddle point) for small feasible changes in x about x*. The term in the bracket is a differential quadratic form.
To determine if the quadratic form is always positive or always negative, results comparable to those given by equation (2-7) are required, with the extension that the constraints also be satisfied, i.e., for feasible values of x. A theorem is given by Avriel (10) which establishes these conditions, and this theorem is then applied to the differential quadratic form of the Lagrangian function. The result, after Avriel (10), is the following theorem for the sufficient conditions of the optimization problem with only equality constraints. In this theorem the second partial derivatives of the Lagrangian function evaluated at the Kuhn-Tucker point x* are Lxjxk(x*, λ*), written as Ljk; and the first partial derivatives of the constraint equations evaluated at the Kuhn-Tucker point x* are ∂fj(x*)/∂xk, written as fjk.
Let y(x) and fi(x) = 0, i = 1, 2, ..., m, be twice continuously differentiable real valued functions. If there exist vectors x* and λ* such that ∂L(x*, λ*)/∂xj = 0 for j = 1, 2, ..., n, and if

(-1)^m Dp > 0, where Dp = det | [Ljk] (p × p)   [fjk]ᵀ |
                              | [fjk] (m × p)     0    |

for p = m+1, ..., n, then y(x*) has a strict local minimum at x*, such that:
fi(x*) = 0, i = 1, 2, ..., m
Here [Ljk] is the upper left p × p submatrix of second partial derivatives of the Lagrangian, and [fjk] contains the first partial derivatives of the m constraint equations with respect to x1, ..., xp.
The proof of this theorem is given by Avriel (10) and follows the discussion of the concepts given above. The comparable result for a strict maximum is obtained by changing (-1)^m in the above theorem to (-1)^p, according to Avriel (10). The following example illustrates the application of this theorem.
Example 2-10
Consider the following problem.
optimize: x1² + 2x2² + 3x3²
subject to: x1 + 2x2 + 4x3 - 12 = 0
2x1 + x2 + 3x3 - 10 = 0
Forming the Lagrangian function and differentiating with respect to x1, x2, x3, λ1, and λ2 gives the following set of equations to be solved for the Kuhn-Tucker point.
Lx1 = 2x1 + λ1 + 2λ2 = 0
Lx2 = 4x2 + 2λ1 + λ2 = 0
Lx3 = 6x3 + 4λ1 + 3λ2 = 0
Lλ1 = x1 + 2x2 + 4x3 - 12 = 0
Lλ2 = 2x1 + x2 + 3x3 - 10 = 0
Solving the above equation set simultaneously gives the following values for the Kuhn-Tucker point.
x1 = 112/81, x2 = 118/81, x3 = 52/27, λ1 = -80/27, λ2 = 8/81
From the necessary conditions of equations (2-45) or (2-49), the Lagrange multipliers are unrestricted in sign, and the value of the determinant from the theorem on sufficiency conditions is required to determine the character of this point. The partial derivatives needed for this evaluation are:
L11 = 2  L12 = 0  L13 = 0
L21 = 0  L22 = 4  L23 = 0
L31 = 0  L32 = 0  L33 = 6
f11 = 1  f12 = 2  f13 = 4
f21 = 2  f22 = 1  f23 = 3
The determinant is (m = 2, n = 3, p = 3; only one in this case):

     | 2  0  0  1  2 |
     | 0  4  0  2  1 |
D3 = | 0  0  6  4  3 | = 162
     | 1  2  4  0  0 |
     | 2  1  3  0  0 |

The value of (-1)^m D3 = D3 is positive, and the Kuhn-Tucker point is a minimum.
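The 5 × 5 determinant is easy to verify numerically; a minimal NumPy sketch (added for illustration) of the bordered matrix for this example:

```python
import numpy as np

# Example 2-10: m = 2 constraints, n = 3 variables, p = 3
L = np.diag([2.0, 4.0, 6.0])      # L_jk, second partials of the Lagrangian
F = np.array([[1.0, 2.0, 4.0],    # f_jk, gradients of the two constraints
              [2.0, 1.0, 3.0]])

D3 = np.linalg.det(np.block([[L, F.T],
                             [F, np.zeros((2, 2))]]))
print(D3)    # 162 > 0, and (-1)**m = +1, so x* is a strict local minimum
```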
The sufficient conditions for problems with equality and inequality constraints, equations (2-41), (2-42), and (2-43), are summarized in the following theorem. A number of mathematical concepts and theorems are required to obtain this result. These are given in some detail by Avriel (10), Bazaraa and Shetty (15), and Reklaitis et al. (17); it is not feasible to describe them in the space available here.
Let y(x), fi(x) ≥ 0, i = 1, 2, ..., h, and fi(x) = 0, i = h+1, ..., m, be twice continuously differentiable real-valued functions. If there exist vectors x* and λ* satisfying the Kuhn-Tucker necessary conditions, and if the differential quadratic form of the Lagrangian, zᵀ∇²L(x*, λ*)z, is positive for every nonzero vector z orthogonal to the gradients of the equality constraints and of the binding inequality constraints at x*, then y(x) has a strict local minimum at x*.
The following example illustrates the application of this theorem.
Example 2-11
Consider the following problem, after Reklaitis et al. (17).
Minimize: (x1 - 1)² + x2²
Subject to: -x1 + x2² ≥ 0
Applying the theorem gives:
2(x1 - 1) + λ = 0
2x2 - 2x2λ = 0
λ(-x1 + x2²) = 0
λ ≥ 0
Solving this set of equations gives x1 = 0, x2 = 0, λ = 2 for the Kuhn-Tucker point. Then applying the sufficient conditions gives the following requirement at x* = (0, 0):
2z1² - 2z2² > 0
However, this inequality cannot be satisfied for all finite values of z, and the second-order sufficiency conditions show that the point is not a minimum.
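These computations can be reproduced symbolically. A minimal SymPy sketch (added for illustration), writing the constraint in the form g = x1 - x2² ≤ 0 so that the Lagrangian is L = y + λg:

```python
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam', real=True)
y = (x1 - 1)**2 + x2**2
g = x1 - x2**2                  # the constraint -x1 + x2**2 >= 0, as g <= 0
L = y + lam*g

# Stationary conditions with the constraint active (g = 0)
sols = sp.solve([sp.diff(L, x1), sp.diff(L, x2), g], [x1, x2, lam], dict=True)
print(sols)                     # includes the Kuhn-Tucker point x=(0,0), lam=2

H = sp.hessian(L, (x1, x2)).subs({x1: 0, x2: 0, lam: 2})
print(H)                        # diag(2, -2): the indefinite form 2z1**2 - 2z2**2
```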
In summary, the necessary and sufficient conditions for nonlinear programming problems have been described and illustrated with examples. References have been given for more details for this theory.
An important special case is when the economic model is concave, and all of the constraint equations are convex and are inequalities. This is known as convex programming. "A function is concave if linear interpolation between its values at any two points of definition yields a value not greater than its actual value at the point of interpolation; such a function is the negative of a convex function" according to Kuhn and Tucker (14). Thus, the convex programming problem can be stated as follows.
Maximize: y(x) (2-53)
Subject to: fi(x) ≤ 0 for i = 1, 2, ..., m (2-54)
The necessary and sufficient conditions for the maximum of the concave function y(x) subject to the convex constraints fi(x) ≤ 0, i = 1, 2, ..., m, are the Kuhn-Tucker conditions given by equation (2-49).
The theorem from Cooper(7) that establishes the above result is:
If y(x) is a strictly concave function and fi(x), i = 1, 2, ..., m are convex functions
which are continuous and differentiable, then the Kuhn-Tucker conditions (2-49) are
sufficient as well as necessary for a global maximum.
The proof of this theorem uses the definition of convex and concave functions and
the fact that the Lagrangian function can be formulated as the sum of concave functions,
which is concave.
These concepts and results for the Kuhn-Tucker conditions and those given previously will be valuable in our discussion of modern optimization procedures in the following chapters. Those interested in further theoretical results are referred to the references at the end of the chapter and Chapter 6. Also, in industrial practice we will see that the concepts from the Kuhn-Tucker conditions are used in computer programs for advanced multivariable search methods to optimize economic and process models which are too elaborate for the algebraic manipulations required to use these theories directly.
Closure
In this chapter we have discussed the necessary and sufficient conditions to evaluate the character of stationary points for unconstrained and constrained optimization problems. It was necessary to confine the illustration of these procedures to simple algebraic models. Even though we are not able to apply these procedures directly to the optimization of industrial processes, the concepts developed in this chapter are used many times over in the following chapters.
It is worthwhile to attempt to solve the following unconstrained economic model from the design of horizontal vapor condensers in evaporators used in water desalination plants to see one of the major limitations of the classical theory of maxima and minima. The problem is to minimize the cost given by the following equation.
C = aN^(-7/6)D^(-1)L^(-4/3) + bN^(-0.2)D^(0.8)L^(-1) + cNDL + dN^(-1.8)D^(-4.8)L (2-56)
In this equation the cost is in dollars per year; N is the number of tubes in the condenser; D is the nominal diameter of the tubes in inches; L is the tube length in feet; and a, b, c, and d are coefficients that vary with the fluids involved and the construction costs. Avriel and Wilde (18) give further details about the significance of each term. This equation is typical of the form that is obtained from assembling correlations of equipment costs and related process operating conditions for preliminary cost estimates.
Differentiating this equation with respect to the three independent variables N, D, and L, and setting the results equal to zero gives the following three equations to be solved for the values of N, D, and L that would give the minimum cost.
∂C/∂N = (-7a/6)N^(-13/6)D^(-1)L^(-4/3) - 0.2bN^(-1.2)D^(0.8)L^(-1) + cDL - 1.8dN^(-2.8)D^(-4.8)L = 0
∂C/∂D = -aN^(-7/6)D^(-2)L^(-4/3) + 0.8bN^(-0.2)D^(-0.2)L^(-1) + cNL - 4.8dN^(-1.8)D^(-5.8)L = 0
∂C/∂L = (-4a/3)N^(-7/6)D^(-1)L^(-7/3) - bN^(-0.2)D^(0.8)L^(-2) + cND + dN^(-1.8)D^(-4.8) = 0
(2-57)
There is no straightforward way to solve this relatively complicated set of three nonlinear algebraic equations other than numerically with a root-finding procedure. This illustrates one of the major limitations of the classical methods, i.e., if the variables in the economic model have fractional exponents, then a set of nonlinear algebraic equations is obtained that will probably require an iterative solution using a computer. However, as will be seen in the next chapter on geometric programming, we will be able to obtain the optimal solution for this economic model by solving a set of linear algebraic equations. We will take advantage of the mathematical structure of the problem to find the optimum readily. In fact, this will be true of most of the modern methods; they take advantage of the mathematical structure of the optimization problem to quickly find the best values of the independent variables and the maximum profit or minimum cost.
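To make the difficulty concrete, the cost surface can at least be minimized numerically. The sketch below uses SciPy with placeholder coefficients a = b = c = d = 1, chosen purely for illustration; realistic values depend on the fluids and construction costs (18):

```python
import numpy as np
from scipy.optimize import minimize

a, b, c, d = 1.0, 1.0, 1.0, 1.0   # placeholder coefficients, illustration only

def cost(v):
    N, D, L = v
    return (a*N**(-7/6)*D**(-1.0)*L**(-4/3) + b*N**(-0.2)*D**0.8/L
            + c*N*D*L + d*N**(-1.8)*D**(-4.8)*L)

# Bounded quasi-Newton search in place of solving equations (2-57) directly
res = minimize(cost, x0=[10.0, 1.0, 10.0], method='L-BFGS-B',
               bounds=[(1e-3, None)]*3)
print(res.x, res.fun)
```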
References
1. Hancock, H., Theory of Maxima and Minima, Dover Publications, Inc., New York (1960).
2. Wilde, D. J. Ind. Eng. Chem., 57 (8):18 (1965).
3. Smith, C. L., R. W. Pike and P. W. Murrill, Formulation and Optimization of Mathematical Models, International Textbook Co., Scranton, Pa. (1970).
4. Wilde, D. J. and C. S. Beightler, Foundations of Optimization, Prentice-Hall, Inc., Englewood Cliffs, N.J. (1967).
5. Wilde, D. J., Optimum Seeking Methods, Prentice-Hall, Inc., Englewood Cliffs, N.J. (1965).
6. Beveridge, G. S. G., and R. S. Schechter, Optimization Theory and Practice, McGraw-Hill Book Co., New York (1970).
7. Cooper, Leon, Mathematical Programming for Operations Researchers and Computer Scientists, Ed. A. G. Holtzman, Marcel Dekker, Inc., New York (1981).
8. Walsh, G. R. Methods of Optimization, John Wiley and Sons, Inc., New York (1979).
9. Sivazlian, B. D. and L. E. Stanfel, Optimization Techniques in Operations Research, Prentice-Hall, Inc., Englewood Cliffs, N.J. (1975).
10. Avriel, M. Nonlinear Programming, Analysis and Methods, Prentice Hall, Inc., Englewood Cliffs, N. J. (1976).
11. Courant, R. and D. Hilbert, Methods of Mathematical Physics, vol.I, p. 164, Interscience Publishers, Inc., New York (1953).
12. Burley, D. M., Studies in Optimization, John Wiley and Sons, Inc., New York (1974).
13. Sokolnikoff, I. S., and R. M. Redheffer, Mathematics of Physics and Modern Engineering, Second Edition, McGraw - Hill Book Co., New York (1966).
14. Kuhn, H. W., and A. W. Tucker, "Nonlinear Programming," Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, Ed. Jerzy Neyman, p. 481 - 92, University of California Press, Berkeley, California (1951).
15. Bazaraa, M. S., and C. M. Shetty, Nonlinear Programming - Theory and Algorithms, John Wiley & Sons, Inc., New York (1979).
16. Gill, P. E., W. Murray, and M. H. Wright, Practical Optimization, Academic Press, New York (1981).
17. Reklaitis, G. V., A. Ravindran and K. M. Ragsdell, Engineering Optimization: Methods and Applications, John Wiley and Sons, Inc., New York (1983).
18. Avriel, M. and D. J. Wilde, "Optimal Condenser Design by Geometric Programming", I & EC Process Design and Development, Vol. 6, No. 2, p. 256 (April, 1967).
Problems
2-1. Locate the stationary points of the following functions and determine their character.
a. y = x⁴/2 - x²/2
b. y = x⁷
c. y = x1² + x1x2 + x2²
d. y = 2x1² + 3x2² + 4x3² - 8x1 - 12x2 - 24x3 + 110
2-2. Find the global maximum of the function
y(x1, x2) = 5(x1 - 3)² - 12(x2 + 5)² + 6x1x2
in the region
0 ≤ x1 ≤ 10
0 ≤ x2 ≤ 5
2-3. Use the Jacobian determinants and obtain the two equations to be solved with
the constraint equation for the following problem
Optimize: y(x1, x2, x3)
Subject to: f(x1, x2, x3) = 0
2-4. Solve the following problem by the method of constrained variation and the method of Lagrange multipliers, evaluating x1, x2, and the Lagrange multiplier λ at the optimum.
Maximize: x1 + x2
Subject to: x1² + x2² = 1
2-5. Solve the following problem by the method of Lagrange multipliers and give the character of the stationary point.
Minimize: y = 2x1² - 4x1x2 + 4x2² + x1 - 3x2
Subject to: 10x1 + 5x2 ≤ 3
2-6. Consider the following problem
Optimize: -x1² - 2x1 + x2²
Subject to: x1² + x2² - 1 ≤ 0
a. Obtain the equation set to be solved to locate the stationary points of the above problem using the method of Lagrange multipliers. Convert the inequality constraint to an equality constraint with the slack variable x3 as x3²; why?
b. Show that the following are solutions to the algebraic equations obtained in part a.
Stationary Points
       A       B      C    D
x1    -½      -½     -1    1
x2   √3/2   -√3/2     0    0
x3     0       0      0    0
λ     -1      -1      0    2
c. Based on the value of the function being optimized, state whether stationary points
A through D are maximum, minimum, or saddle points.
2-7. The cost of operation of a continuous, stirred-tank reactor is given by the following equation.
CT = Cf cAo q + CmV
The total operating cost CT ($/hr) is the sum of the cost of the feed, Cf cAo q, and
the cost of mixing, CmV. The following gives the values for the reactor.
Cf = $5.00/lb-mole of A, cost of feed
cAo = 0.04 lb-mole/ft³, initial concentration of A
q = volumetric flow rate of feed to the reactor in ft³/hr
Cm = $0.30/hr-ft³, cost of mixing
V = volume of reactor in ft³
We are interested in obtaining the minimum operating cost and the optimal values of
the feed rate, q; reactor volume, V; and concentration in the reactor, cA. The following
first order reaction takes place in the reactor
A → B
where the rate of formation of B, rB is given by
rB = kcA
where k = 0.1 hr⁻¹
a. If 10 lb-moles per hour of B are to be produced, give the two material balance
constraint equations which restrict the values of the independent variables. (There
is no B in the feed stream.)
b. Form the Lagrangian function and perform the appropriate differentiation to obtain the set of equations that would be solved for the optimal values of the independent variables. How many equations and variables are obtained?
c. Solve for the optimal values of the reactor volume, V; feed rate, q; and concentration of A in the product, CA.
2-8. Solve the following problem by the method of Lagrange multipliers, and determine the character of the stationary point.
Optimize: y = 2x1² + 2x1x2 + x2² - 20x1 - 14x2
Subject to: x1 + 3x2 ≤ 5
2x1 - x2 ≤ 3
2-9.(9) Solve the following problem by the method of Lagrange multipliers, and determine the character of the stationary point.
Optimize: (1/3)x1 + x2
Subject to: -x1 + x2 ≤ 0
x1 + x2 ≤ 3
2-10. The total feed rate to three chemical reactors in parallel is 1100 pounds per hour. Each reactor is operating with a different catalyst and conditions of temperature and pressure. The profit function for each reactor has the feed rate as the independent variable, and the parameters in the equation are determined by the catalyst and operating conditions. The profit functions for each reactor are given below.
P1 = 0.2F1 - 2(F1/100)²
P2 = 0.2F2 - 4(F2/100)²
P3 = 0.2F3 - 6(F3/100)²
Determine the maximum profit and the optimal feed rate to each reactor.
2-11. Solve the following problem by the method of Lagrange multipliers and determine the character of the stationary points.
maximize: 3x1² + 2x2²
subject to: x1² + x2² ≤ 25
9x1 - x2² ≤ 27
2-12. Find the stationary points of the following problem, and determine their character, i.e., maximum, minimum, or saddle point.
Optimize: 2x1² + x2² - 5x1 - 4x2
Subject to: x1 + 3x2 ≤ 5
2x1 - x2 ≤ 4
2-13. The rate of return (ROR) is defined as the interest rate where the net present value (NPV) is zero for a specified number of years n and initial cash flow CFo. This can be formulated as an optimization problem, as follows:
Minimize: (NPV)²
For the case of constant cash flows CFj = A, develop the equation to determine the
rate of return. The net present value is given by the following equation.
NPV = -|CFo| + A[1 - (1 + i)⁻ⁿ]/i
2-14. Derive the Lagrangian function for n independent variables and m constraint equations, equation (2-24). Begin by multiplying the constraint equations given in equation (2-19) by the Lagrange multipliers λ1, λ2, ..., λm. Then add all of the equations, rearrange terms, and obtain the result as equation (2-24).
2-15. For the sufficient conditions of the equality constraint problem, to determine if the quadratic form is positive or negative definite, the signs of the roots of a polynomial can be evaluated. This characteristic polynomial is obtained by evaluating the following determinant, which includes the second partial derivatives of the Lagrangian function evaluated at the Kuhn-Tucker point, Lxjxk(x*, λ*), written as Ljk for simplicity, and the first partial derivatives of the constraint equations evaluated at the Kuhn-Tucker point, ∂fj(x*)/∂xk, written as fjk for simplicity.
The following results are used to evaluate the type of stationary point. First, evaluate the roots of P(α) using the above equation. If each root of P(α) is positive, then x* is a maximum. If each root of P(α) is negative, then x* is a minimum. Finally, if the roots are of mixed sign, then x* is a saddle point.
Use the results given in Example 2-10 in the above determinant, and confirm the character of the Kuhn-Tucker point by this method. (There is a comparable sufficient condition test for the unconstrained problem which is described by Sivazlian and Stanfel (9).)