Constrained Extrema

In practice, we wish to optimize a function considering some existing constraints. In economics and engineering, the constrain may be due to limited funds, materials, or energy. If we wish to find the distance from a point P=(x0,y0) to a line ax+by=c, we should find the minimum value of d(x,y)=(xx0)2+(yy0)2 while (x,y) satisfies ax+by=c. Suppose T(x,y,z) represents the temperature in space, and we want to find the maximum temperature on a surface given by g(x,y,z)=0. When the constraint (also called the side condition) equation is given, we can solve it for one of the variables, say z=ϕ(x,y), and replace it in the function T. Then the problem would reduce to finding the extremum value of the function T(x,y,ϕ(x,y)), which now depends only on two independent variables. We have already applied this method in this example, where we optimized the value of the function on the boundary of a region. To practice how to use this method, let’s consider the following example.

Example 1
We want to make a rectangular box without a top, and of given volume V. If the least amount of material is to be used, determine the design specifications.

Solution
Let
x= length of the box
y= width of the box
z= height of the box
where x,y and z are in the interval (0,). The specified volume is V=xyz The amount of material for constructing this box is proportional to its surface area S=xy+2xz+2yz. So we want to minimize S(x,y,z) subject to xyz=V.

From xyz=V we obtain z=Vxy and then we plug it into the formula of S: S=xy+2Vy+2Vx. Now S is expressed as a function of only two variables. To determine the relative extremum, we differentiate and set the partial derivatives equal to zero:
Sx=y2Vx2=0,Sy=x2Vy2=0
or equivalently:
{x2y=2Vxy2=2V If we divide the first question by the second one, we obtain x/y=1. Therefore: x2y=x2x=x3=2Vx=y=(2V)1/3 From these values of x and y, we get z=Vxy=(2V)1/32 Using the second derivative test, we can show that these values of x,y and z give a relative minimum of S. Because S as either x0+, y0+, x, or y, we can conclude that the relative minimum is also the absolute minimum.

Lagrange Multipliers

When the constraint is given implicitly by g(x,y,z)=c, it is not always possible or easy to solve the constraint equation for one of the variables (express x,y or z as a function of the remaining variables). The problem can be more complicated when there is more than one constraint. In such cases, an alternative procedure is a method called Lagrange multipliers.

To explain this method, let’s start with an example. Suppose we want to find the shortest and longest distance between the point (1,1) and the curve C

g(x,y)=(xy2)218+(x+y)232=1.

The distance between a point (x,y) and (1,1) is given by f(x,y)=(x1)2+(y+1)2. So we want to maximize and minimize f(x,y) subject to g(x,y)=1. First let’s sketch the curve C and some level curves of f (Fig. 1).

Figure. 1: The orange lines show contour lines of f with different values. The function f takes on extreme values at four points A,B,P, and Q.

 

To extremize f(x,y) subject to g(x,y)=1, we have to find the largest and smallest value of k such that the level curve f(x,y)=k intersects g(x,y)=1. Among the level curves that intersect g(x,y)=1, the minimum value of f(x,y) occurs at the points P and Q where f(x,y) has a value of 3 (see Fig. 1). At these points, the constraint curve g(x,y)=1 and the level curve f(x,y)=3 just touch each other; in other words, f=3 and g=1 have a common tangent line at P and a common tangent at Q.

Note that g(x,y)=1 is the level curve of z=g(x,y). Because at each point f is perpendicular to the level curves of f and similarlyg is perpendicular to the level curves of g, a common tangent at P means that f(P) and g(P) are parallel. That is, there is a number λ1 such that
f(P)=λ1g(P).
Similarly there is a number λ1 such that
f(Q)=λ1g(Q).

We also observe that the maximum value of f(x,y) subject to g(x,y)=1 occurs where the constraint curve and a level curve (here f=4) touch each other (In Fig. 1, they are denoted byA and B). Thus f(A)=μ1g(A), and f(B)=μ2g(B), for some μ1 and μ2. Therefore, to find the maximum or minimum of f(x,y) subject to the constraint g(x,y)=1 , we look for a point x0 such that
f(x0)=λg(x0)
for some λ. This is the method of Lagrange multiplier. But why this is true?

Suppose the constraint curve C is parameterized by some functions:1 x=X(t),y=Y(t) If, in the equation of f, x and y are replaced by X(t) and Y(t), then the distance between (1,1) and points on C becomes a function of t F(t)=f(x(t),y(t)). Therefore, the extreme values of the distance occur where F(t)=0. From the chain rule we know
F(t)=dFdt=fxdXdt+fydYdx
We can write the equation F(t)=0 as

RxdXdt+RydYdx=0(fx,fy)(dXdt,dYdt)=0
This means at the extreme points the gradient vector (fx,fy) is perpendicular to (X(t),Y(t)). Recall that (X(t),Y(t)) is tangent to the curve C.

On the other hand C is a level curve of g. Therefore, at each point g is perpendicular to C. Therefore at the extreme points, g and f are parallel. The simplest version of the method of Lagrange multipliers is as follows.

Theorem 1. Suppose URn is an open set and f:UR and g:UR are two continuously differentiable functions2. Let S={xU| g(x)=c} be the level set for g with value c. If f|S denoting “f restricted to S,” has a relative extremum on S at x0U, and g(x0)0, then there exists a real number λ such that
f(x0)=λg(x0).

 

Read the proof

 

  • The number λ in the above theorem is called a Lagrange multiplier.
  • λ might be zero.

Note that to find the extremum of f|S, we have n+1 unknowns (n components of x0 and λ) and n+1 equations:
(i)fx1(x0)=λgx1(x0)fxn(x0)=gxn(x0)g(x0)=c}(n+1) equations

Example 2
Find the extrema of the function f(x,y)=xy subject to the constraint 14x2+19y2=1.

Solution
Let g(x,y)=14x2+19y2, so S consists of all points (x,y) such that g(x,y)=1. We have:
f(x,y)=(y,x),g(x,y)=(12x,29y)

Note that g=0 if and only if (x,y)=(0,0). Thus if f subject to the constraint has an extremum at (x0,y0), we must have f(x0,y0)=λg(x0,y0). Equations i take on the form:
y0=12λx0,x0=29λy0,14x02+19y02=1.
There are different ways to solve the above system of equations. One way is to replace y0 from the first equation in the second equation
x0=29λy0=2λ9(λ2x0)=19λ2x0
This leads to
x0x0λ29=x0(1λ29)=0x=0 or λ=±3.
If x=0, then substituting in g(x0,y0)=1, we obtain 1402+19y02=1y0=±3. If λ=±3, it follows from y0=12λx0 that y0=±32x0. Substituting into g(x0,y0), we get:
14x02+1994x02=112x02=1x0=±2.
If x0=2, then y0=±322, and if x0=2, then y0=±322.

Now we list the points we found and the values of f at these points:

\large (x_0,y_0) (0,3) (0,3) (2,322) (2,322) (2,322) (2,322)  
f(x0,y0)=x0y0 0 0 3 3 3 3  

The function f attains its maximum value 3 at (2,32/2) and (2,32/2), and its minimum value -3 at (2,32/2) and (2,32/2). Note that at these points the level curve g=1 is tangent to one of the level curves of f (see Fig. 2).

(a) (b)
Figure 2. (a) The graph of f(x,y); the blue curve shows f restricted to the ellipse g(x,y)=1. (b) The level curves of  f and g; the extrema of f subject to the constraint g(x,y)=1 occur at points where the level curves of f are tangent to the level curve g(x,y)=1.
Let’s go back to the example of making an open box and try to solve it using the method of Lagrange multipliers.

Example 3
We want to make a rectangular box without a top, and of given volume V. If the least amount of material is to be used, determine the design specifications using Lagrange multipliers.

Solution
Again Let x,y and z be the length, width, and height of the box, respectively. We want to minimize S=xy+2xz+2yz subject to xyz=V, where V is the given volume.

If we let g(x,y,z)=xyz, then
S=(y+2z,x+2z,2x+2y),g=(yz,xz,xy)
Note that g=0 if and only if (x,y,z)=(0,0,0) which is not on the level surface g(x,y,z)=V. So if S has an extremum at (x0,y0,z0), we have
S(x0,y0,z0)=λg(x0,y0,z0).
Hence our equations become
{y+2z=λyz(i)x+2z=λxz(ii)2x+2y=λxy(iii)xyz=V(iv)
Because x,y and z are nonzero, if we divide the first equation by yz, the second equation by xz and the third one xy, we get:
{1z+2y=λ(i)1z+2x=λ(ii)2y+2x=λ(iii)xyz=V(iv)
Then
Eq. (i),  Eq. (ii)x=y
Eq. (ii),  Eq (iii), and x=yz=12x
x=y=2z,  Eq (4)12x3=Vx=y=2z=(2V)1/3.
This is the result we previously obtained in Example 1.

The following example shows an application of Lagrange multipliers in economics and when there are many variables.
Example 4
Assume there are n commodities with prices per unit p1,,pn. Assume we have L dollars to spend on these commodities. This means we have the constraint p1x1+pnxn=L where x1,,xn are the amounts of the commodities. Assume the utility is given by the Cobb-Douglas function U=f(x1,,xn)=Kx1α1xnαn where K,α1,,αn are positive constants. Find the maximum utility we can achieve.

Solution
Let g(x1,,xn)=p1x1++pnxn. So we seek for the maximum value of f(x1,,xn) subject to g(x1,,xn)=L. Because xi0 (for i=1,,n) the constraint is a part of the level set g(x1,,xn)=L. We note that when the amount of a commodity xi approaches zero, the utility approaches zero. Hence, the utility takes on its maximum value when xi’s are positive. Because g(x1,,xn)=(p1,,pn)0, if f attains its maximum at a point, we have f(x1,,xn)=λg(x1,,xn); that is,
{α1Kx1α11x2α2,xnαn=λp1αnKx1α1x2α2,xnαn1=λpn
If we multiply both sides of the first equation by x1, the second equation by x2, , and the n-th equation by xn, we get:
{α1f(x1,,xn)=λp1x1αnf(x1,,xn)=λpnxn
Therefore: p1α1x1==pnαnxn This result makes sense because it says the amount of commodity xi is proportional to its power and inversely proportional to its price, pi.

Let αipixi=C, then
p1x1++pnxn=p1α1p1C++pnα1pnC=L
or
C(α1++αn)=Ci=1nαi=LC=Li=1nαi.
Consequently,
xi=αipiLi=1nαn. Finally we conclude the maximum value of f is:
K(αiLpii=1nαi)α1(αnLpni=1nαi)αn=K(Li=1nαi)i=1nαi(α1p1)α1(αnpn)αn

The following example shows when maximizing or minimizing a function subject to a constraint, we should also investigate the points where the gradient of the constraint is zero g(x)=0.

Example 5
Find the nearest point on the curve y3x2=0 to the point (0,1).

Solution
If we plot the level curve
g(x,y)=y3x2=0 (Fig. 3), it is clear that (0,0) is the nearest point on this curve to the point (0,1), and the distance is 1. If we want to use the method of Lagrange multipliers, we need to minimize the square of distance f(x,y)=(x0)2+(y(1))2 subject to the constraint g(x,y)=y3x2=0.

Figure 3.

Here
f(x,y)=(2x,2y+2),g(x,y)=(2x,3y2).
Equations i take on the form
{2x0=2λx02y0+2=3λy02y03x02=0
However the solution, (x0,y0)=(0,0), does not satisfy the second equation. The reason is attributed to the fact that at this point the gradient of g is 0, g(0,0)=(0,0), so we cannot use Theorem 1. In addition, the level curve is not smooth at (0,0).

If a function f subject to a constraint g(x)=c attains its extreme value at a point x0, then x0 is one of the four types of the point:

  1. x0 is a point where f(x0)=λg(x0),
  2. x0 is a point where g(x0)=0,
  3. x0 is a rough point of f or g where f(x0) or g(x0) does not exist, or
  4. x0 is on the boundary of the level set g(x)=c.

Multiple Constraints

The method of Lagrange multipliers extends to the case of multiple constraints: say to f(x,y,z) subject to two constraints
(ii)g1(x,y,z)=c1,andg2(x,y,z)=c2.
Geometrically this means we seek the extreme values of f(x,y,z) on a curve C formed by the intersection of two level surfaces g1(x,y,z)=c1 and g2(x,y,z)=c2 (Fig. 4).

Figure 4. Because g1 is perpendicular to the surface g1=c1 and g2 is perpendicular to the surface g2=c2, both g1 and g2 are perpendicular to C. At a point where f has a maximum or minimum, also vecf is perpendicular to C.

 

Suppose f subject to these constraints, f|C, has an extremum at P=(x0,y0,z0), and the functions f,g1, and g2 have continuous first partial derivatives near P. Analogous to the case of one constraint, we can argue that f is perpendicular to C at P. Additionally because the gradient is perpendicular to the level surface, both g1 and g2 are perpendicular to C. If g1(x0,y0,z0) and g2(x0,y0,z0) are neither zero vectors nor collinear vectors, then f(x0,y0,z0) lie in a plane spanned by the two vectors g1(x0,y0,z0) and g2(x0,y0,z0) and hence can be expressed as a linear combination of these two vectors, say:
(iii)f(x0,y0,z0)=λ1g1(x0,y0,z0)+λ2g2(x0,y0,z0)
In this method, Equations ii and iii must be solved simultaneously for five unknowns x0,y0,z0,λ1 and λ2:
{fx(x0,y0,z0)=λ1g1x(x0,y0,z0)+λ2g2x(x0,y0,z0)fy(x0,y0,z0)=λ1g1y(x0,y0,z0)+λ2g2y(x0,y0,z0)fz(x0,y0,z0)=λ1g1z(x0,y0,z0)+λ2g2z(x0,y0,z0)g1(x,y,z)=c1g2(x,y,z)=c2

Example 6
Find the extreme points of the function f(x,y,z)=x+4y2z on the curve of the intersection of the cylinder x2+y2=4 and the plane 2yz=3 (Fig. 5).

Figure 5.
Solution
Here we are given two constraints:
g1(x,y,z)=x2+y2=4,g2(x,y,z)=2yz=3. Note that
f=(1,4,2),g1=(2x,2y,0),andg2=(0,2,1).
The vector g1=(0,0,0) only when x=0 and y=0, which clearly does not satisfy the constraint g1=4. Thus, the two vectors g1 and g2 are clearly not parallel. Therefore, any constrained critical point (x0,y0,z0) must satisfy
f(x0,y0,z0)=λ1g1(x0,y0,z0)+λ2g2(x0,y0,z0).
It follows that we must solve the following system of equations for five unknowns
x,y,z,λ1 and λ2:
{1=2xλ1+04=2yλ1+2λ22=0λ2x2+y2=42yz=3
From the third equation, we know λ2=2. So the second equation becomes 2yλ1=0. Because from the first equation, λ10, it follows from 2yλ1=0 that y=0. If we substitute y=0 in the fourth and fifth equations, we get x=±2 and z=1. Therefore, f may have extrema at (±2,0,1).

The condition x2+y2=4 implies 2x,y2. Then it follows from 2yz=3 that 7z1. This means the constraint set, S, is bounded. Because the constraint set S is closed and bounded, and f is a continuous function, f|S assumes its maximum and minimum ( Theorem 1 in the previous section). Here we have only two potentials, therefore one of them (2,0,1) is the maximum point and the other one (2,0,1) is the minimum point.

By similar reasoning, we obtain equations for minimizing or maximizing
f(x1,,xn) subject to several constraints
(iv){g1(x1,,xn)=c1g2(x1,,xn)=c2gm(x1,,xn)=cm
where m<n. Assume f(x1,,xn) has a relative extremum at x0 when the variables are restricted by the constraint equations iv. If f,g1,, and gm have continuous first partial derivatives at all points near x0 and if each gi(x0) is not a linear combination of the other gj(x0) (ji), then there exists m real numbers λ1,,λm such that
f(x0)=λ1g1(x0)++λmgm(x0).


1 For instance, in this specific example, ↩
{x=X(t)=32cost+22sinty=Y(t)=32cost+22sint(0t2π)

2 In other words, f and g have continuous partial derivatives.↩